[NCLUG] discrepancy between reiserfs and ext3?

Robie Lutsey robiel at tgstech.com
Fri May 23 16:14:34 MDT 2003


> -----Original Message-----
> From: nclug-admin at nclug.org [mailto:nclug-admin at nclug.org] On Behalf Of jbass at dmsd.com
> Sent: Friday, May 23, 2003 2:32 PM
> To: nclug at nclug.org
> Subject: Re: [NCLUG] discrepancy between reiserfs and ext3?
> 
> Hi Robie,
> 
> Tough problem: a lot of software and interactions are in play, and
> together they can produce the kind of problem you describe.
> 
> First, can the application be run on the same machine as the
> filesystems to remove NFS from the equation? There is a good
> chance the problem is with NFS over reiserfs, and not the
> reiserfs itself.


[Robie ] Unfortunately, no.  The software suite does not run on Linux.

> 
> This is especially true if multiple applications/processes are
> accessing the database concurrently, as local caching and filesystem
> operation ordering are not as rigid once NFS is added to the
> equation, and locking protocols are relaxed via the NFS lock manager.


[Robie ] To see whether it was NFS + reiserfs, I did a local copy of the
data from reiserfs to ext3 (no NFS involved), and the data still fails
the test when stored on ext3.  It appears that once data has been
written to the reiserfs volume by any means (NFS or local copy), it
will fail even if moved off the reiserfs volume by any means (NFS or
local copy).
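
Because the original report says results come back in a different order
from the reiserfs partition, one cheap thing to check is whether the two
filesystems hand back directory entries in a different order, and
whether a copy carries that order along.  The following is only a sketch
in Python: it assumes the table lives in a directory of files (not
confirmed anywhere in this thread), and the function names are made up
for illustration.

```python
import os


def raw_listing(directory):
    """Return entry names in the order the OS hands them back (unsorted)."""
    with os.scandir(directory) as entries:
        return [entry.name for entry in entries]


def compare_listings(names_a, names_b):
    """Summarise whether two listings hold the same names, and in the same order."""
    return {
        "same_names": sorted(names_a) == sorted(names_b),
        "same_order": list(names_a) == list(names_b),
    }
```

Running raw_listing on the table directory on each partition and passing
both results to compare_listings would show whether the contents match
and only the ordering differs.  The directory paths are whatever your
own mount points are.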

> 
> As a first stage in the debugging, you might want to work hard to
> find the minimal failure sequence, then trace the NFS requests
> that generate the failure and compare the sequence and ordering
> with the same request stream using ext3. If they appear effectively
> the same, then the next step is a bit harder - debugging nfs to the
> filesystems. Binary comparison of the good and failed databases under
> the same workload stream might prove useful.
> 
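
For the binary comparison suggested above, a rough sketch in Python
could look like the following.  It reports the first few byte offsets at
which two copies of a table file differ.  The function name and the
default limit are assumptions for illustration, not part of any ArcInfo
tooling.

```python
def diff_offsets(path_a, path_b, limit=16, chunk_size=65536):
    """Return up to `limit` (offset, byte_a, byte_b) tuples where the files differ.

    A byte is reported as None when one file is shorter than the other.
    """
    diffs = []
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while len(diffs) < limit:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a and not b:
                break  # both files exhausted
            span = max(len(a), len(b))
            for i in range(span):
                byte_a = a[i] if i < len(a) else None
                byte_b = b[i] if i < len(b) else None
                if byte_a != byte_b:
                    diffs.append((offset + i, byte_a, byte_b))
                    if len(diffs) >= limit:
                        break
            offset += span
    return diffs
```

An empty result means the two files are byte-for-byte identical, in
which case the difference is more likely in how the files are presented
(names, ordering, attributes) than in their contents.
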
> Instrumenting the kernel nfs server to show the sequence of operations
> to the filesystem is a bit tougher, and then comparing that to the
> generated disk I/O stream is a bit more work ... and you will learn
> a lot about kernel filesystems and disk I/O in the process.
> 
> If you are on a Red Hat based system you could punt and simply file a
> bug report, but without a Linux-to-Linux job stream to replicate the
> error and provide testing for debugging, it's not likely this will get
> debugged and fixed in the near term.


[Robie ] Wouldn't that be nice :)  Speaking of being on a Red Hat system,
are you aware of any kernel patches between 2.4.18-3 and current that
might address this problem?
> 
> If this is really important you may need to hire a consultant to do
> the debugging, if the skills do not exist in house.  You might well be
> looking at 3-7 man days to isolate the problem and maybe another
> week or two to develop a fix if the problem is very complex.
> 
> Have fun,
> John Bass
> 
> 
> robiel <robiel at tgstech.com> writes:
> > I have a redhat9 box where the root partition is on ext3, and a second
> > drive is reiserfs.
> >
> > The problem:  an ArcInfo table (a small binary database table) stored on
> > the ext3 partition returns success after a particular testing script is
> > run on it.  The same ArcInfo table fails the same test when stored on the
> > reiserfs partition.
> >
> > If I copy the ArcInfo table from the reiserfs part. to the ext3 part., it
> > still fails.  Basically, once the info table has been written to the
> > reiserfs part., it will fail the test.
> >
> > The data does not appear to be corrupt, however.  The info table can be
> > read successfully and returns the expected results with other tools
> > whether stored on ext3 or reiserfs.
> >
> > The problem appears to be that results from the reiserfs part. are
> > returned in a different order than from the ext3 (or hfs) partitions.
> >
> > As a last note, both partitions are accessed via NFS from an hp-ux11 box.
> >
> > Does anyone have any ideas why this might be happening and/or ways to
> > correct it?
> >
> > Thanks, Robie Lutsey.
> _______________________________________________
> NCLUG mailing list       NCLUG at nclug.org
> 
> To unsubscribe, subscribe, or modify your settings, go to:
> http://www.nclug.org/mailman/listinfo/nclug




