[NCLUG] A *nix DFS alternative?

Chris Funk chris at us-reports.com
Wed Feb 17 10:34:32 MST 2010


My first thought (though I've never done it) was DRBD. After some quick searches, it sounds like latency is a problem, but there were several mentions of DRBD Proxy.  There were a lot of "it should work" comments, but I didn't see anyone actually doing it.
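
For reference - again, only from reading about it, not from running it - a
DRBD resource is just a stanza in drbd.conf.  Something like the following
(hostnames, devices, and addresses invented) would replicate a block device
asynchronously, which seems to be the mode people suggest for higher-latency
links:

  resource r0 {
    protocol A;            # asynchronous: writes are acknowledged locally
    device    /dev/drbd0;
    disk      /dev/sdb1;   # backing partition on each node
    meta-disk internal;
    on office1 {
      address 10.0.1.5:7789;
    }
    on office2 {
      address 10.0.2.5:7789;
    }
  }

DRBD Proxy, as I understand it, sits in that replication path to buffer and
compress the stream over the WAN, but it's a commercial add-on from LINBIT,
so I can't speak to it firsthand.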

Chris


-----Original Message-----
From: nclug-bounces at lists.nclug.org [mailto:nclug-bounces at lists.nclug.org] On Behalf Of DJ Eshelman
Sent: Wednesday, February 17, 2010 9:41 AM
To: nclug at lists.nclug.org
Subject: Re: [NCLUG] A *nix DFS alternative?

DFS does a similar thing: in the case of file conflicts it will (up to a
specified limit) store a copy, run a somewhat intelligent algorithm to
determine which file is current, and then change only those bits.  My
problem with rsync the last time I tried it was exactly what you mention -
if a file was open it wouldn't read, and it would try to overwrite with
the one that was open (granted, that was on two Windows servers over a
VPN; that issue may not exist on Linux servers).  Your thought about
compression over rsync got me thinking - there was a project that used
7-zip compression end-to-end, but it has gone stale.
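
For what it's worth, rsync can compress in transit on its own (the -z
flag), so something like this might cover that piece without a separate
compression layer - untested here, paths made up:

  rsync -auvz --partial /data/ user@otheroffice:/data/

-z compresses the stream and --partial keeps interrupted transfers around
so a later run can pick up roughly where it left off.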

I may end up trying rsync initially for just our files (kind of an
active backup) and see how that goes.  My fear is really losing data
more than keeping a realtime service going, I suppose - I just want to
build it right from the start instead of having to constantly tinker
with it while it's a production system.  At first we may only have a few
files each week that change, for which rsync would be great.
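
If I go that route it would probably start as a plain nightly cron job,
something along these lines (paths and hostname invented, not tested):

  # push the project share to the backup box every night at 1:30
  30 1 * * *  rsync -au /srv/projects/ backup@colo:/srv/projects/ >> /var/log/project-sync.log 2>&1

rsync also has --backup/--backup-dir options that stash the old copy of
anything it overwrites, which might take some of the edge off the
losing-data worry.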

Eventually I want to be able to offer clients an alternative to
RDB/IronMountain, etc., but it's really hard to find user-friendly
automatic backups, especially with bit-level deltas and easy recovery.
BackupPC is what I'm currently considering for that end of things, but
I'm also targeting a heavy Mac audience (grumble grumble), so whatever I
choose will have to be as good on the front end as on the back end.  I
can see why the existing solutions charge so much, but I'm thinking of
selling it as a value add instead of a standalone solution.
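
On the recovery side, one trick that keeps coming up is rotating
hardlinked snapshots with rsync's --link-dest, which is roughly what the
fancier tools automate.  A rough, untested sketch (paths invented):

  today=$(date +%F)
  rsync -a --link-dest=/backups/latest /srv/projects/ /backups/$today/
  ln -sfn /backups/$today /backups/latest

Unchanged files come through as hardlinks into the previous snapshot, so
each day's directory looks like a full copy but only changed files take
new space.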

Keep the thoughts coming, though - I really appreciate it!
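
One more thought on the "sync only when changes occur" point in my note
quoted below: inotify-tools might cover it.  Something along these lines
(untested; a recursive watch over a few terabytes would probably need the
inotify watch limit raised):

  #!/bin/sh
  # re-run rsync whenever anything under the share changes
  while inotifywait -r -e modify,create,delete,move /srv/projects; do
      rsync -au /srv/projects/ backup@colo:/srv/projects/
  done

inotifywait blocks until the kernel reports an event, so nothing runs
while the tree is idle.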

On 2/16/2010 3:12 PM, grant at amadensor.com wrote:
>> No problem- this is sort of hard to explain.
>>
>> Basically when you're dealing with several hundred (eventually
>> thousands) of files that are upwards of 16 MB each, the most workable
>> solution is to have a local server that is syncing on the backend any
>> changes made- that way if I'm at one office making changes, and my wife
>> is at home making changes, she's not having to download each file to her
>> PC, save it and re-upload it.  Rsync doesn't seem a reliable enough
>> solution for this because the traffic it would generate would be
>> immense, too much to run during the day.  The benefit of Rsync if it
>> could be invoked in a smart way, is that I could have a Linux server on
>> one end and a Windows server on the other if I really wanted to.  I just
>> question if running Rsync on a cron job would be efficient once you get
>> up to a few terabytes of data - you'd be running a glorified backup; the
>> way to do this efficiently is to sync only when changes occur and I
>> haven't yet found a way to do that with Rsync.
>>
>>
>
> rsync is actually pretty good about handling big loads efficiently.  It
> works with block-level compares, so it only sends the data it needs to.
> If you are sending across SSH, which is easy with rsync, you can get
> compression in transit as well, although lossless compression and photos
> really don't play well together, so you won't get much compression.  The
> other good thing is that if it is in a crontab, it will pick up any new
> files, or any partial transfers where it left off, and finish the job.
>
> rsync -auv /home/www/photostore/ bob@otheroffice:/home/www/photostore/
>
> You will need to be careful when syncing both ways not to hit the same
> file from both ends.  Test it with small dummy text files to make sure
> you have your settings correct.  Likely problems: every sync makes a new
> subdirectory, which gets copied back the other way one level deeper,
> leading to a never-ending path depth until you run out of disk or the
> pathname gets too long; or a good file gets overwritten with a bad one.
> That is why I put the -u in there - it makes rsync keep the newer file,
> so that if office2 updates a file, the old one from office1 does not end
> up replacing it.
>
>





