[NCLUG] A *nix DFS alternative?

Kasey Erickson kasey.erickson at gmail.com
Wed Feb 17 14:27:19 MST 2010


>>> The benefit of Rsync if it could be invoked in a smart way...

I've used inotifywait to monitor a directory for file changes and then
fire off various shell commands based on those changes.  It's really
slick.

In your case, once a change occurs, run rsync and the changed file will
quickly be updated.  This replaces cron polling with an event-driven
approach, which saves bandwidth, CPU and disk.  The constraint will be the
number of files the kernel can monitor (the inotify watch limit).
man inotify or LKML might help there.
Inotify limit manipulation:
http://monodevelop.com/Inotify_Watches_Limit
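
A minimal sketch of what I mean, assuming inotifywait (from inotify-tools)
and rsync are installed.  The SRC and DEST values are only borrowed from the
rsync example quoted below, and the function names are made up for
illustration, so treat this as a starting point rather than a tested setup.

```shell
#!/bin/sh
# Sketch only: SRC and DEST are assumed values borrowed from the rsync
# example quoted further down the thread; adjust for your hosts and paths.
SRC="${SRC:-/home/www/photostore/}"
DEST="${DEST:-bob@otheroffice:/home/www/photostore/}"

# Read changed paths on stdin (one per line) and run the given command
# once per event.  The command's stdin comes from /dev/null so that
# rsync/ssh cannot swallow the remaining event lines.
sync_on_events() {
    while IFS= read -r changed; do
        echo "changed: $changed"
        "$@" < /dev/null
    done
}

# Print the current per-user inotify watch limit (Linux only).
# The file path can be overridden, which is handy for testing.
watch_limit() {
    cat "${1:-/proc/sys/fs/inotify/max_user_watches}"
}

# Watch SRC recursively and sync after every relevant change.
# Not called automatically; run "watch_and_sync" to start it.
watch_and_sync() {
    inotifywait -m -r -q -e close_write,create,delete,move \
        --format '%w%f' "$SRC" |
        sync_on_events rsync -au "$SRC" "$DEST"
}
```

Source the file and call watch_and_sync to start watching; watch_limit shows
the current limit.  As far as I know a recursive watch uses one watch per
directory, so the limit matters mostly for deep trees.  If it is too low,
root can raise it with "sysctl -w fs.inotify.max_user_watches=N" for
whatever N suits the tree.  This runs one rsync per event, so a burst of
changes will trigger several runs; batching events would be a reasonable
refinement.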

Kasey


On Wed, Feb 17, 2010 at 10:34 AM, Chris Funk <chris at us-reports.com> wrote:
> My first thought (though I've never done it) was DRBD. After some quick searches, it sounds like latency is a problem, but there were several mentions of DRBD Proxy.  Lots of "it should work" comments, but I didn't see anyone actually doing it.
>
> Chris
>
>
> -----Original Message-----
> From: nclug-bounces at lists.nclug.org [mailto:nclug-bounces at lists.nclug.org] On Behalf Of DJ Eshelman
> Sent: Wednesday, February 17, 2010 9:41 AM
> To: nclug at lists.nclug.org
> Subject: Re: [NCLUG] A *nix DFS alternative?
>
> DFS does a similar thing: in the case of file conflicts it will (up to a
> specified limit) store a copy and run a somewhat intelligent
> algorithm to determine which file is current, then only change those
> bits.  My problem with rsync the last time I tried it was exactly what
> you mention: if a file was open it wouldn't read it and would try to
> overwrite with the one that was open (granted, that was on two Windows
> servers over a VPN; that issue may not exist on Linux servers).  Your
> thought about compression over rsync got me thinking: there was a
> project going that was using 7-zip compression end-to-end, but it's gone
> stale.
>
> I may end up trying rsync initially for just our files (kind of an
> active backup) and see how that goes.  My fear is really just losing
> data more than keeping a realtime service going, I suppose- I just want
> to build it right from the start instead of having to constantly tinker
> with it while it's a production system.  At first we may only have a few
> files each week that change, for which rsync would be great.
>
> Eventually I want to be able to offer clients an alternative to
> RDB/IronMountain, etc but it's really hard to find user-friendly
> automatic backups, especially with bit-level deltas and easy recovery.
> BackupPC is what I'm currently considering for that end of things, but
> I'm also targeting a heavy Mac audience (grumble grumble), so whatever I
> choose will have to be as good on the front end as on the back.  I can
> see why the existing solutions charge so much, but I'm thinking of
> selling it as a value add instead of a standalone solution.
>
> Keep the thoughts coming though, I really appreciate it!
>
> On 2/16/2010 3:12 PM, grant at amadensor.com wrote:
>>> No problem- this is sort of hard to explain.
>>>
>>> Basically when you're dealing with several hundred (eventually
>>> thousands) of files that are upwards of 16 MB each, the most workable
>>> solution is to have a local server that is syncing on the backend any
>>> changes made- that way if I'm at one office making changes, and my wife
>>> is at home making changes, she's not having to download each file to her
>>> PC, save it and re-upload it.  Rsync doesn't seem a reliable enough
>>> solution for this because the traffic it would generate would be
>>> immense, too much to run during the day.  The benefit of Rsync if it
>>> could be invoked in a smart way, is that I could have a Linux server on
>>> one end and a Windows server on the other if I really wanted to.  I just
>>> question if running Rsync on a cron job would be efficient once you get
>>> up to a few terabytes of data- you'd be running a glorified backup; the
>>> way to do this efficiently is to sync only when changes occur and I
>>> haven't yet found a way to do that with Rsync.
>>>
>>>
>>
>> rsync is actually pretty good about efficiently handling big loads.  It
>> works with block-level compares, so it only sends the data it needs
>> to.  If you are sending across SSH, which is easy with rsync, you can get
>> compression as well (ssh -C, or rsync's own -z), although lossless
>> compression and photos really don't play well, so you won't get much
>> compression.  The other good thing is that if it is in a crontab, it will
>> pick up any new files, or any partial transfers where it left off, and
>> finish the job.
>>
>> rsync -auv /home/www/photostore/ bob@otheroffice:/home/www/photostore/
>>
>> You will need to be careful if syncing both ways not to hit the same file
>> from both ends.  Test it with small dummy text files to make sure you
>> have your settings correct.  Likely problems:
>>
>> - Every sync makes a new subdirectory, which gets copied the other way
>>   one level deeper, leading to a never-ending depth of path until you run
>>   out of disk or the pathname is too long.
>> - Overwriting a good file with a bad one.  For instance, I put the -u on
>>   there to make sure that it keeps the newer file, so that if office2
>>   updates a file, the old one from office1 does not end up replacing it.
>>
>>
>> _______________________________________________
>> NCLUG mailing list       NCLUG at lists.nclug.org
>>
>> To unsubscribe, subscribe, or modify
>> your settings, go to:
>> http://lists.nclug.org/mailman/listinfo/nclug
>>


