dclerk Memory Corruption

Fairlight fairlite at fairlite.com
Fri Sep 4 16:17:09 PDT 2020


Your methodology looks like exactly what I use when I do it.  The
difference is, rebuilding indexes stopped helping the last time I ran
into it.
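
The backtracking step in the quoted message below (follow the file descriptor back to its open call without skipping over a close) can be sketched in a few lines of Python. This is only an illustration, not anyone's actual tooling: the function name and regexes are made up for the example, it assumes plain strace output with no -f PID prefixes or timestamps, and it only understands open/openat/close, so descriptors created by dup, socket, pipe and friends are not tracked.

```python
import re

# Patterns for the handful of strace lines we care about (assumed plain format).
OPEN_RE = re.compile(r'^open(?:at)?\((?:[^,"]+,\s*)?"([^"]*)".*\)\s*=\s*(\d+)\s*$')
CLOSE_RE = re.compile(r'^close\((\d+)\)\s*=\s*0')
IO_RE = re.compile(r'^(lseek|read)\((\d+),')
SIGNAL_RE = re.compile(r'^--- (SIG[A-Z]+)')


def files_open_at_signal(trace_lines, signal="SIGSEGV"):
    """Replay open/close calls up to the first `signal` line.

    Returns (fd_to_path, suspect_path), where suspect_path is the file
    behind the last lseek/read seen before the signal, or None.
    """
    fd_to_path = {}
    last_io_fd = None
    for raw in trace_lines:
        line = raw.strip()
        m = SIGNAL_RE.match(line)
        if m and m.group(1) == signal:
            return fd_to_path, fd_to_path.get(last_io_fd)
        m = OPEN_RE.match(line)
        if m:
            # A successful open binds the returned descriptor to the path.
            fd_to_path[int(m.group(2))] = m.group(1)
            continue
        m = CLOSE_RE.match(line)
        if m:
            # A successful close frees the number for reuse by a later open.
            fd_to_path.pop(int(m.group(1)), None)
            continue
        m = IO_RE.match(line)
        if m:
            last_io_fd = int(m.group(2))
    # Signal never seen: report the table, but no suspect.
    return fd_to_path, None
```

Feeding it the lines of one /tmp/strace.PID file, for example files_open_at_signal(open(path)), gives the descriptor table as it stood when the fault hit, so a reused descriptor number resolves to the file that was actually open at that moment.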

The index files are B+ trees, if that helps you.  I know how the algorithm
works.  There's an excellent visualiser for how it works.  How they
implemented it on disk is a bit of a semi-opaque black box.  It's
documented, but I always seem to have trouble going from algorithm -> file
structure on disk.  At least so far, but I haven't taken a bunch of time to
nail it, because I don't really do things on spec anymore.  I may still
do it in the future, but I have neither the time nor energy anytime soon.
My Copious Free Time[tm] is better spent practising guitar and working on
music for the foreseeable future.
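
For anyone who does want to poke at the on-disk side, here is a rough Python sketch for pulling out and hex-dumping the block a trace points at. It says nothing about the real index layout: the helper names are invented for the example, and the 1024-byte block size is only an assumption taken from the read length in the quoted trace. Under that assumption, offset 136192 works out to exactly 133 * 1024, i.e. the start of block 133.

```python
def block_index(offset, block_size=1024):
    """Return (block number, offset within block) for a byte offset.

    block_size=1024 is an assumption based on the read length in the trace.
    """
    return divmod(offset, block_size)


def read_block(path, offset, length=1024):
    """Read `length` bytes at `offset`, like the lseek + read pair in the trace."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def hexdump(data, base=0, width=16):
    """Format bytes as hex-dump lines; `base` is the file offset of data[0]."""
    pad = width * 3 - 1  # width of a full row of "xx " groups, minus trailing space
    lines = []
    for i in range(0, len(data), width):
        chunk = data[i:i + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{base + i:08x}  {hex_part:<{pad}}  {text}")
    return lines
```

block_index(136192) returns (133, 0). Dumping the same block from a faulty copy and a rebuilt copy of the index, e.g. hexdump(read_block(path, 136192), base=136192) for each, at least makes the two easy to compare side by side, even without knowing what the fields mean.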

m->


On Fri, Sep 04, 2020 at 09:45:50PM +0000, Seijyaku thus spoke:
> dclerk is being run from a script, so I modified the script to write trace files; see the example below:
> strace -e trace=all dclerk {clerk config} > /dev/null 2>> /tmp/strace.${$}
> 
> I let that run for 30 seconds and then collected my trace files to look for faults, which will appear as:
> --- SIGABRT
> --- SIGSEGV   <-- This is the segmentation fault I'm interested in
> 
> Just before the segmentation fault, there is a seek and read request on file descriptor 8: {These two lines do not change between failures}
> lseek(8, 136192, SEEK_SET)              = 136192
> read(8, ".\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1024) = 1024
> 
> I tracked file descriptor 8 back to an open request for the assumed bad index. Do be careful: file descriptor numbers are reused, so make sure that you don't pass over a close call when backtracking.
> open("/opt/filepro/lock/indexfile.C", O_RDWR) = 8
> 
> 
> I am by no means an expert at this and welcome alternate views.  I have opened the index in a hex editor and reviewed the section it was reading directly before the segmentation fault, but I don't see anything out of the ordinary. Without a full understanding of how the index file is laid out, I can only make assumptions.
> 
> In the end, all related errors have stopped after rebuilding this index.  I do have a backup of the faulty index in case the same index becomes corrupt again so that I can compare the two.
> 
> Thank you for all your help!
> 
> Sent with ProtonMail Secure Email.
> 
> ------- Original Message -------
> On Friday, September 4, 2020 3:47 PM, Fairlight via Filepro-list <filepro-list at lists.celestial.com> wrote:
> 
> > Rebuild the index in question with dxmaint.
> >
> > I don't suppose you'd share your methodology for tracing them back to a
> > specific index? ;)
> >
> > m->
> >
> > On Fri, Sep 04, 2020 at 08:29:37PM +0000, Seijyaku thus spoke:
> >
> > > I have analyzed ten stack traces of clerk segmentation faults and tracked each of them back to a read on the same index file which is related to a lock table. This would seem to confirm your suspicion that the issue was related to a corrupted index.
> > > What are my options for either validating or rebuilding the index?
> > > Sent with ProtonMail Secure Email.
> > > ????????????????????? Original Message ?????????????????????
> > > On Thursday, September 3, 2020 4:52 PM, Fairlight via Filepro-list filepro-list at lists.celestial.com wrote:
> > >
> > > > To me, it seems unpredictable, but not completely random. I've had
> > > > bigger fish to fry, and this wasn't noticeably affecting customers, or
> > > > it would have been a higher priority. I'd love it if our Apache logs
> > > > would stop getting polluted, though.
> > > > Good luck getting help from fP Tech, since they refuse to lift a finger
> > > > to support 5.0.14 any longer, no matter how much you've already paid
> > > > them. One of the 'lovely' things about that company. (Richard, save
> > > > it. Nothing will sway me on this.)
> > > > m->
> > > > On Thu, Sep 03, 2020 at 09:33:00PM +0000, Seijyaku thus spoke:
> > > >
> > > > > The server is running on VMware.
> > > > > Do you know if this is limited to specific clerk calls or is it completely random? I'm still working to pin down what calls result in errors on my side.
> > > > > Sent with ProtonMail Secure Email.
> > > > > ????????????????????? Original Message ?????????????????????
> > > > > On Thursday, September 3, 2020 3:48 PM, Fairlight via Filepro-list filepro-list at lists.celestial.com wrote:
> > > > >
> > > > > > For what it's worth, I see this regularly with CentOS 7.8 (and when it
> > > > > > was any 7.x starting around .4) on VMware 5.5.
> > > > > > > I originally thought it was some bad indexes, because it went away
> > > > > > > at first when we rebuilt a problem index. It no longer seems to be
> > > > > > > that.
> > > > > > I literally do not know what causes it, but I would greatly benefit from
> > > > > > hearing the explanation. The servers in question are never short on
> > > > > > RAM when it's transpiring, and also only -some- of the hundreds of calls
> > > > > > > per minute to clerk will do it, not all of them.
> > > > > > It's not just you, though.
> > > > > > You on VMware, or physical hardware? That would potentially rule out
> > > > > > VMware if you're on physical hardware only.
> > > > > > m->
> > > > > > On Thu, Sep 03, 2020 at 07:04:26PM +0000, Seijyaku via Filepro-list thus spoke:
> > > > > >
> > > > > > > CentOS 7.8.
> > > > > > > Sent with ProtonMail Secure Email.
> > > > > > > ????????????????????? Original Message ?????????????????????
> > > > > > > On Thursday, September 3, 2020 2:02 PM, Ross Salas ross.salas at gmail.com wrote:
> > > > > > >
> > > > > > > > Which operating system/version you running?
> > > > > > > > On Thu, Sep 3, 2020 at 11:36 AM Seijyaku seijyaku+filepro at protonmail.com wrote:
> > > > > > > >
> > > > > > > > > Below is the output from the Linux free command. I assume this is what you're asking for?
> > > > > > > > > >               total        used        free      shared  buff/cache   available
> > > > > > > > > > Mem:       32936704     2835928     2408340     1759048    27692436    27864508
> > > > > > > > > > Swap:             0           0           0
> > > > > > > > > 32GB installed, about 3GB used, the majority as buffer/cache.
> > > > > > > > > Sent with ProtonMail Secure Email.
> > > > > > > > > ????????????????????? Original Message ?????????????????????
> > > > > > > > > On Thursday, September 3, 2020 1:23 PM, Ross Salas via Filepro-list filepro-list at lists.celestial.com wrote:
> > > > > > > > >
> > > > > > > > > > what's the memory (RAM) status of that server?
> > > > > > > > > > On Thu, Sep 3, 2020 at 10:58 AM Seijyaku via Filepro-list <
> > > > > > > > > > filepro-list at lists.celestial.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > I'm working on a server running dclerk version 5.0.14D4 that is
> > > > > > > > > > > occasionally showing the error below in some of the logs. Any input on what
> > > > > > > > > > > might be causing this or how to resolve it?
> > > > > > > > > > > \x07,
> > > > > > > > > > > *** Error in `dclerk': malloc(): memory corruption (fast): 0x08b0c2c8 ***,
> > > > > > > > > > > ======= Backtrace: =========,
> > > > > > > > > > > /lib/libc.so.6(+0x77dfc)[0xf7ddfdfc],
> > > > > > > > > > > /lib/libc.so.6(+0x7ae98)[0xf7de2e98],
> > > > > > > > > > > /lib/libc.so.6(__libc_malloc+0x9a)[0xf7de496a],
> > > > > > > > > > > dclerk[0x80b1e4e],
> > > > > > > > > > > dclerk[0x80b18e6],
> > > > > > > > > > > dclerk[0x80b15f1],
> > > > > > > > > > > dclerk[0x80b1e02],
> > > > > > > > > > > dclerk[0x80ae0af],
> > > > > > > > > > > dclerk[0x80af2f4],
> > > > > > > > > > > dclerk[0x80aec39],
> > > > > > > > > > > dclerk[0x80a95ad],
> > > > > > > > > > > [0xf7f8bec0],
> > > > > > > > > > > dclerk[0x80daa1d],
> > > > > > > > > > > dclerk[0x80daf33],
> > > > > > > > > > > dclerk[0x80d90d7],
> > > > > > > > > > > dclerk[0x80d301b],
> > > > > > > > > > > dclerk[0x80d66cd],
> > > > > > > > > > > dclerk[0x80d61ef],
> > > > > > > > > > > dclerk[0x80d6178],
> > > > > > > > > > > dclerk[0x8070610],
> > > > > > > > > > > dclerk[0x8061017],
> > > > > > > > > > > dclerk[0x805fbb2],
> > > > > > > > > > > dclerk[0x805e65f],
> > > > > > > > > > > dclerk[0x805e907],
> > > > > > > > > > > dclerk[0x8071bc2],
> > > > > > > > > > > dclerk[0x804f2e7],
> > > > > > > > > > > dclerk[0x804e230],
> > > > > > > > > > > /lib/libc.so.6(__libc_start_main+0xf3)[0xf7d822a3],
> > > > > > > > > > > ======= Memory map: ========,
> > > > > > > > > > > 08048000-080f2000 r-xp 00000000 08:02 1614140597 /opt/filepro/dclerk,
> > > > > > > > > > > 080f2000-08104000 rwxp 000aa000 08:02 1614140597 /opt/filepro/dclerk,
> > > > > > > > > > > 08104000-08113000 rwxp 00000000 00:00 0 ,
> > > > > > > > > > > 08aa0000-08b2a000 rwxp 00000000 00:00 0 [heap],
> > > > > > > > > > > f7b00000-f7b21000 rwxp 00000000 00:00 0 ,
> > > > > > > > > > > f7b21000-f7c00000 ---p 00000000 00:00 0 ,
> > > > > > > > > > > > f7cae000-f7cc7000 r-xp 00000000 08:02 1611440773 /usr/lib/libgcc_s-4.8.5-20150702.so.1,
> > > > > > > > > > > > f7cc7000-f7cc8000 r-xp 00018000 08:02 1611440773 /usr/lib/libgcc_s-4.8.5-20150702.so.1,
> > > > > > > > > > > > f7cc8000-f7cc9000 rwxp 00019000 08:02 1611440773 /usr/lib/libgcc_s-4.8.5-20150702.so.1,
> > > > > > > > > > > f7cd5000-f7d43000 rwxp 00000000 00:00 0 ,
> > > > > > > > > > > f7d43000-f7d53000 rwxs 00000000 00:01 2 /SYSVf102e0ed (deleted),
> > > > > > > > > > > > f7d53000-f7d5e000 r-xp 00000000 08:02 1611347286 /usr/lib/libnss_files-2.17.so,
> > > > > > > > > > > > f7d5e000-f7d5f000 r-xp 0000a000 08:02 1611347286 /usr/lib/libnss_files-2.17.so,
> > > > > > > > > > > > f7d5f000-f7d60000 rwxp 0000b000 08:02 1611347286 /usr/lib/libnss_files-2.17.so,
> > > > > > > > > > > f7d60000-f7d68000 rwxp 00000000 00:00 0 ,
> > > > > > > > > > > f7d68000-f7f2c000 r-xp 00000000 08:02 1611317456 /usr/lib/libc-2.17.so,
> > > > > > > > > > > f7f2c000-f7f2d000 ---p 001c4000 08:02 1611317456 /usr/lib/libc-2.17.so,
> > > > > > > > > > > f7f2d000-f7f2f000 r-xp 001c4000 08:02 1611317456 /usr/lib/libc-2.17.so,
> > > > > > > > > > > f7f2f000-f7f30000 rwxp 001c6000 08:02 1611317456 /usr/lib/libc-2.17.so,
> > > > > > > > > > > f7f30000-f7f33000 rwxp 00000000 00:00 0 ,
> > > > > > > > > > > f7f33000-f7f73000 r-xp 00000000 08:02 1611317464 /usr/lib/libm-2.17.so,
> > > > > > > > > > > f7f73000-f7f74000 r-xp 0003f000 08:02 1611317464 /usr/lib/libm-2.17.so,
> > > > > > > > > > > f7f74000-f7f75000 rwxp 00040000 08:02 1611317464 /usr/lib/libm-2.17.so,
> > > > > > > > > > > > f7f75000-f7f78000 r-xp 00000000 08:02 1610785783 /usr/lib/libtermcap.so.2.0.8,
> > > > > > > > > > > > f7f78000-f7f79000 r-xp 00002000 08:02 1610785783 /usr/lib/libtermcap.so.2.0.8,
> > > > > > > > > > > > f7f79000-f7f7a000 rwxp 00003000 08:02 1610785783 /usr/lib/libtermcap.so.2.0.8,
> > > > > > > > > > > f7f85000-f7f87000 rwxp 00000000 00:00 0 ,
> > > > > > > > > > > f7f87000-f7f8b000 r--p 00000000 00:00 0 [vvar],
> > > > > > > > > > > f7f8b000-f7f8d000 r-xp 00000000 00:00 0 [vdso],
> > > > > > > > > > > f7f8d000-f7faf000 r-xp 00000000 08:02 1610615685 /usr/lib/ld-2.17.so,
> > > > > > > > > > > f7faf000-f7fb0000 r-xp 00021000 08:02 1610615685 /usr/lib/ld-2.17.so,
> > > > > > > > > > > f7fb0000-f7fb1000 rwxp 00022000 08:02 1610615685 /usr/lib/ld-2.17.so,
> > > > > > > > > > > ffa8e000-ffab0000 rwxp 00000000 00:00 0 [stack],
> > > > > > > > > > > -------------- next part --------------
> > > > > > > > > > > An HTML attachment was scrubbed...
> > > > > > > > > > > > URL: <http://mailman.celestial.com/pipermail/filepro-list/attachments/20200828/4fed3c7e/attachment.html>
> > > > > > > > > > >
> > > > > > > > > > > Filepro-list mailing list
> > > > > > > > > > > Filepro-list at lists.celestial.com
> > > > > > > > > > > Subscribe/Unsubscribe/Subscription Changes
> > > > > > > > > > > http://mailman.celestial.com/mailman/listinfo/filepro-list
> > > > > > > > > >
> > > > > > > > > > -------------- next part --------------
> > > > > > > > > > An HTML attachment was scrubbed...
> > > > > > > > > > URL: http://mailman.celestial.com/pipermail/filepro-list/attachments/20200903/1b6e9be8/attachment.html
> > > > > > > > > > Filepro-list mailing list
> > > > > > > > > > Filepro-list at lists.celestial.com
> > > > > > > > > > Subscribe/Unsubscribe/Subscription Changes
> > > > > > > > > > http://mailman.celestial.com/mailman/listinfo/filepro-list
> > > > > > > > > > -------------- next part --------------
> > > > > > > > > > An HTML attachment was scrubbed...
> > > > > > > > > > URL: http://mailman.celestial.com/pipermail/filepro-list/attachments/20200903/8727da64/attachment.html
> > > > > > >
> > > > > > > Filepro-list mailing list
> > > > > > > Filepro-list at lists.celestial.com
> > > > > > > Subscribe/Unsubscribe/Subscription Changes
> > > > > > > http://mailman.celestial.com/mailman/listinfo/filepro-list
> > > > > >
> > > > > > --
> > > > > > Audio panton, cogito singularis.
> > > > > > Filepro-list mailing list
> > > > > > Filepro-list at lists.celestial.com
> > > > > > Subscribe/Unsubscribe/Subscription Changes
> > > > > > http://mailman.celestial.com/mailman/listinfo/filepro-list
> > > >
> > > > --
> > > > Audio panton, cogito singularis.
> > > > Filepro-list mailing list
> > > > Filepro-list at lists.celestial.com
> > > > Subscribe/Unsubscribe/Subscription Changes
> > > > http://mailman.celestial.com/mailman/listinfo/filepro-list
> >
> > --
> >
> > Audio panton, cogito singularis.
> >
> > Filepro-list mailing list
> > Filepro-list at lists.celestial.com
> > Subscribe/Unsubscribe/Subscription Changes
> > http://mailman.celestial.com/mailman/listinfo/filepro-list
> 
> 
> 

-- 
Audio panton, cogito singularis.

