[NCLUG] A *nix DFS alternative?

DJ Eshelman djsbignews at gmail.com
Tue Feb 16 14:57:51 MST 2010


No problem- this is sort of hard to explain.

Basically, when you're dealing with several hundred (eventually
thousands) of files that are upwards of 16 MB each, the most workable
solution is a local server that syncs any changes on the back end.
That way, if I'm at one office making changes and my wife is at home
making changes, she doesn't have to download each file to her PC, save
it and re-upload it.  Rsync doesn't seem like a reliable enough
solution for this because the traffic it would generate would be
immense, too much to run during the day.  The benefit of Rsync, if it
could be invoked in a smart way, is that I could have a Linux server on
one end and a Windows server on the other if I really wanted to.  I just
question whether running Rsync from a cron job would be efficient once
you get up to a few terabytes of data; you'd be running a glorified
backup.  The way to do this efficiently is to sync only when changes
occur, and I haven't yet found a way to do that with Rsync.
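
To make "sync only when changes occur" concrete, here is a rough Python
sketch, not a tested solution.  It compares two snapshots of a directory
tree and hands rsync only the files that are new or modified, so rsync
is not pointed at the whole tree on every run.  The function names, the
paths and the remote target are all made up for illustration, and the
snapshot scan is just a stand-in for a real change notification
mechanism (on Linux that would probably be something inotify-based).

```python
import os
import subprocess
from pathlib import Path


def snapshot(root):
    """Map each file path (relative to root) to its (size, mtime_ns)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = Path(dirpath) / name
            st = full.stat()
            rel = full.relative_to(root).as_posix()
            state[rel] = (st.st_size, st.st_mtime_ns)
    return state


def changed_files(before, after):
    """Return sorted relative paths that are new or modified since `before`."""
    return sorted(p for p, sig in after.items() if before.get(p) != sig)


def rsync_command(root, remote):
    """Build an rsync call that reads its file list from standard input.

    -a preserves metadata, -z compresses, and --files-from=- limits the
    transfer to the listed paths (relative to `root`).
    """
    return ["rsync", "-az", "--files-from=-", f"{root}/", remote]


def sync_once(root, remote, before):
    """Push only what changed since `before`; return the new snapshot."""
    after = snapshot(root)
    paths = changed_files(before, after)
    if paths:  # nothing changed means no network traffic at all
        subprocess.run(rsync_command(root, remote),
                       input="\n".join(paths) + "\n",
                       text=True, check=True)
    return after
```

Something would still have to call sync_once(), e.g. with a hypothetical
root of "/srv/photos" and remote of "backup@otherhost:/srv/photos", in a
loop or from a file-change event.  Deletions are not handled here, and
the scan itself gets slow on a very large tree, which is the part a real
notification mechanism would replace.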

I'm not really sure Subversion et al. would work for this, because
this is really just file storage, not version builds.  I'll look into it
and see if it would work, but honestly I don't know if Subversion syncs
changes on the fly; I would assume not, based on what it's designed for.
Having multiple versions of a multi-terabyte file store seems like it
would be...  dangerous.  It's been 11 years since I've touched any kind
of CVS program, so I really wouldn't know any more :)

I appreciate the feedback- it's a tough issue and I really don't want to 
give Microsoft money if I can avoid it.

-DJ

On 2/16/2010 1:45 PM, Chad Perrin wrote:
> On Tue, Feb 16, 2010 at 11:38:15AM -0700, DJ Eshelman wrote:
>    
>> So I need:
>> 1)  Bit-Level sync (delta- changes only, not the whole file every single
>> time)
>> 2)  Automatic sync (the files arrive or change and immediately begin to
>> synchronize)
>> 3)  Very low overhead AND scalable (we're talking about storage that
>> could grow to several terabytes in a matter of a few years if we're
>> successful in this)
>>      
> It's possible I misunderstand your needs somehow, of course, but what you
> describe sounds to me like it might be a decent fit for a version control
> system (plus a touch of glue code).  Have you considered Mercurial,
> Subversion, or Git?  Their command line interfaces tend to be very easy
> to script, their incremental rollback capabilities are incredibly smooth
> and fine-grained, and they're generally pretty good at minimizing storage
> size for a repository.  I imagine there's not really any need for much
> "merge" capability for what you need, and if there is no need for local
> independent commits I don't see any reason that raw Subversion wouldn't
> work for you.  If you *do* need them for some reason, Mercurial and Git
> have got your back, and SVK offers similar functionality as a front-end
> to Subversion.
>
> Regarding local commits, I wonder if that's exactly what you need when
> you talk about not wanting to send backups across the network at the
> wrong time of day.  With local independent commits, you can commit
> changes to the local repo then, at your leisure, send the updates to the
> repo across the network to another computer with everything necessary to
> roll back to any point set during the period when you weren't sending
> anything across the network included.  Cron scripts or bandwidth
> monitoring to trigger pushing or committing at the best times can
> automate the process of sending outside of peak usage.
>
> If a version control system suits your needs here, it will probably
> actually provide more, and more flexible, functionality in a smaller
> package (in terms of storage, memory, and CPU resources) than from an MS
> Windows based solution.  Note that I'm speaking out of my fourth point of
> contact here and could easily be mistaken, but that's my guess anyway.
>
> Does this come anywhere near helping with your problem, or am I way off
> base?
>


