Foo: I use Backblaze which works well for me. Gave Crashplan a trial but stuck with Backblaze.
Why did you choose to stay with Backblaze?
Any Carbonite users out there?
IcI:
Foo: I use Backblaze which works well for me. Gave Crashplan a trial but stuck with Backblaze.
Why did you choose to stay with Backblaze?
Any Carbonite users out there?
Previously known as psycik
Home Assistant: Gigabyte AMD A8 Brix, Home Assistant with Aeotech ZWave Controller, Raspberry PI, Wemos D1 Mini, Zwave, Shelly Humidity and Temperature sensors
Media:Chromecast v2, ATV4 4k, ATV4, HDHomeRun Dual
Server Host Plex Server 3x3TB, 4x4TB using MergerFS, Samsung 850 evo 512 GB SSD, Proxmox Server with 1xW10, 2xUbuntu 22.04 LTS, Backblaze Backups, usenetprime.com fastmail.com Sharesies Trakt.TV Sharesight
Does anyone roll their own cloud backups? It's probably not worth it given the number of "unlimited backup" providers for negligible money.
I'll play with CrashPlan backup sets some time soon. I'll get my offsite backups back into the house and see how well it works with drives that are mostly disconnected.
And B2 cloud storage from BackBlaze.
Only works for unixy systems though
I've seen the Backblaze B2 file storage, which is S3-compatible. Having a Unix client is handy. Backing up EC2 / VM instance data from AWS to somewhere outside the AWS ecosystem isn't as easy as it should be.
CrashPlan doesn't seem to deduplicate between backup sets.
I moved my backups into a few sets, so I can vary the destinations. For example I want RAW files in the cloud and on an external disk, but not on my internal mirror drive. The RAW files are in two backup sets, one fully uploaded, yet they're being uploaded again. Seems like an edge case they may not have considered that will increase their bandwidth requirements. Perhaps they'll de-duplicate on the server rather than on the client, to reduce total storage requirements.
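Client-side de-duplication of the kind discussed here can be sketched in a few lines. This is hypothetical code, not CrashPlan's actual implementation, and the 4 MiB block size is an assumption: blocks are hashed, and a hash index shared across backup sets means the same data is only ever sent once.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MiB blocks (an assumed block size)

def dedup_upload(path, seen_hashes, upload):
    """Upload only blocks not already stored, whatever backup set they came from."""
    uploaded = skipped = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(BLOCK_SIZE)
            if not block:
                break
            digest = hashlib.sha256(block).hexdigest()
            if digest in seen_hashes:
                skipped += 1           # already on the server: no bandwidth used
            else:
                upload(digest, block)  # new data: send it once
                seen_hashes.add(digest)
                uploaded += 1
    return uploaded, skipped
```

With a `seen_hashes` index shared across sets, adding the same RAW files to a second backup set would skip every block; a per-set index would reproduce the re-upload behaviour described above.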
timmmay:
Does anyone roll their own cloud backups? It's probably not worth it given the number of "unlimited backup" providers for negligible money.
I'll play with CrashPlan backup sets some time soon. I'll get my offsite backups back into the house and see how well it works with drives that are mostly disconnected.
I run Owncloud off a Raspberry Pi with a TB of space on an external HD. I'll be updating this to 2 or 3 TB as I have run out of space. Owncloud supports versioning of files, so the extra space will allow more versions to be kept. The same Pi also acts as a file server for media files used by a separate media server.
The Owncloud client is used to back up laptops and an Android phone.
A backup copy of the external hard drive is made periodically. Automating that backup and performing it off-site is on the to-do list, but fibre is still a year away for us. Owncloud can sync with Google Drive, Dropbox, etc., but I haven't tried that out yet.
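That periodic copy could be automated with a short script run from cron. A crude one-way mirror in Python (a sketch with hypothetical paths; it copies only files that are missing or newer at the destination):

```python
import os
import shutil

def mirror(src, dst):
    """One-way sync: copy files from src to dst when missing or newer."""
    copied = 0
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target_dir = os.path.join(dst, rel)
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            s = os.path.join(root, name)
            t = os.path.join(target_dir, name)
            # copy if the destination copy is missing or older than the source
            if not os.path.exists(t) or os.path.getmtime(s) > os.path.getmtime(t):
                shutil.copy2(s, t)  # copy2 preserves timestamps
                copied += 1
    return copied
```

rsync over SSH does the same job with less code, and would cover the off-site case once fibre arrives.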
timmmay:
CrashPlan doesn't seem to deduplicate between backup sets.
I moved my backups into a few sets, so I can vary the destinations. For example I want RAW files in the cloud and on an external disk, but not on my internal mirror drive. The RAW files are in two backup sets, one fully uploaded, yet they're being uploaded again. Seems like an edge case they may not have considered that will increase their bandwidth requirements. Perhaps they'll de-duplicate on the server rather than on the client, to reduce total storage requirements.
Does it actually upload the second copy? I got the impression they were pretty hot on dedupe, and when it rips through analysing the files, that's when it figures out what's already there.
You could ask them; I wouldn't think they'd have missed that.
So are the RAW files, for you, from the same path twice in two backup sets, or are they two different copies (i.e. two different paths) of the same data?
I tend to add large datasets to one of my backup sets that doesn't back up as frequently, then add them to a more permanent set later on and remove them from the first, and I've not noticed it having to restart, which would be similar behaviour to backing them up twice in two different sets.
davidcole:
Does it actually upload the second copy? I got the impression they were pretty hot on dedupe, and when it rips through analysing the files, that's when it figures out what's already there.
You could ask them; I wouldn't think they'd have missed that.
So are the RAW files, for you, from the same path twice in two backup sets, or are they two different copies (i.e. two different paths) of the same data?
I tend to add large datasets to one of my backup sets that doesn't back up as frequently, then add them to a more permanent set later on and remove them from the first, and I've not noticed it having to restart, which would be similar behaviour to backing them up twice in two different sets.
Same files in two different backup sets. I don't keep duplicate information on the same hard drive. I do have a second copy of some files on another drive, but that's not backed up with CrashPlan.
It's using all my bandwidth (because I told it to), so yeah, I think it is. I think I created the second backup set including the files, then I may have taken them out of the first backup set - I may have been better off waiting until the first backup had run.
Not that it particularly matters; it's an unlimited backup service and my fibre is unlimited. I prefer to be efficient when I can, though.
I was describing my backup system to a friend, who thought it might be a little over the top. I do run a few small businesses though, and have terabytes of images, both mine and customers'. Not everything goes to each destination - I don't back up customer wedding photos to the cloud, as 2TB is too large, so I keep them on hard drives in three locations.
timmmay:
davidcole:
Does it actually upload the second copy? I got the impression they were pretty hot on dedupe, and when it rips through analysing the files, that's when it figures out what's already there.
You could ask them; I wouldn't think they'd have missed that.
So are the RAW files, for you, from the same path twice in two backup sets, or are they two different copies (i.e. two different paths) of the same data?
I tend to add large datasets to one of my backup sets that doesn't back up as frequently, then add them to a more permanent set later on and remove them from the first, and I've not noticed it having to restart, which would be similar behaviour to backing them up twice in two different sets.
Same files in two different backup sets. I don't keep duplicate information on the same hard drive. I do have a second copy of some files on another drive, but that's not backed up with CrashPlan.
It's using all my bandwidth (because I told it to), so yeah, I think it is. I think I created the second backup set including the files, then I may have taken them out of the first backup set - I may have been better off waiting until the first backup had run.
Not that it particularly matters; it's an unlimited backup service and my fibre is unlimited. I prefer to be efficient when I can, though.
I was describing my backup system to a friend, who thought it might be a little over the top. I do run a few small businesses though, and have terabytes of images, both mine and customers'. Not everything goes to each destination - I don't back up customer wedding photos to the cloud, as 2TB is too large, so I keep them on hard drives in three locations.
- RAID mirror of my main data disk
- CrashPlan mirroring data to another drive
- CrashPlan uploading data to the cloud service
- One hard drive in my shed (first line backups)
- Two hard drives at work (main backups)
- One hard drive with a friend (mostly wedding backups)
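With that many destinations it's easy to lose track of which files live where. A hypothetical checker (paths and the minimum copy count of 3 are assumptions, not anything CrashPlan provides) counts how many destination roots hold each relative path and flags anything under-replicated:

```python
import os
from collections import defaultdict

def copy_counts(dest_roots):
    """Count how many destination roots contain each relative path."""
    counts = defaultdict(int)
    for root in dest_roots:
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                counts[rel] += 1
    return counts

def under_replicated(dest_roots, minimum=3):
    """Relative paths present in fewer than `minimum` destinations."""
    return sorted(p for p, n in copy_counts(dest_roots).items() if n < minimum)
```

Run occasionally over the local destinations (the off-site drives obviously need to be plugged in), it gives a quick sanity check that the 3-2-1 layout above actually holds.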
I don't quite have all the extra drives, but as you can see my setup is similar:
CrashPlan to cloud
CrashPlan to another drive in the same machine
CrashPlan to another machine at home
Folder replication from one drive (where files are "mirrored" - I use Drive Bender, not proper drive mirroring) to another drive where files are also mirrored - things like VM exports
Folder replication from one machine to the other that also has CrashPlan - VM exports and system backups.
It's 100% automatic; I don't do anything manually.
Fully automatic would be nice. I could back up to my wife's laptop I guess, but it has a 120GB SSD and I have 6TB of storage ;) Because of the data volumes I find offsite disks essential for now, which involves scripting. I'm hoping to use CrashPlan to automate it more though.
I've read that the more you store, the more memory and resources it takes. That makes sense - deduplication keeps block hashes in RAM. Not a problem at my volume though.
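A back-of-envelope estimate supports that. Assuming 4 MiB blocks and 32-byte SHA-256 digests with some hash-table overhead (all assumed figures, not CrashPlan's actual numbers), the index for 6 TB stays modest:

```python
def dedup_index_ram(data_bytes, block_size=4 * 2**20, digest_bytes=32, overhead=4):
    """Rough RAM for an in-memory block-hash index.
    `overhead` multiplies the raw digest size to allow for hash-table
    pointers and slack (an assumed factor)."""
    blocks = data_bytes // block_size
    return blocks * digest_bytes * overhead

# ~6 TB of data comes out around 175 MiB of index RAM under these assumptions
print(round(dedup_index_ram(6 * 10**12) / 2**20), "MiB")
```

The estimate scales linearly with stored data, which is why the resource usage grows with volume, and why it's negligible at a few TB.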
I asked CrashPlan about de-duplication between sets and got a reply in about 20 minutes. Impressive!
--
We do de-duplicate between backup sets! It's likely that you saw a file verification scan. The scan puts ALL of your data in the To Do list. Then, as the backup after the scan runs, it analyzes your files to determine whether they have been backed up before and can be skipped, or whether they're new/changed and need to be backed up. The percentage complete that you see is CrashPlan's progress through the current To Do list, not the overall backup.
The scan is an important and normal part of CrashPlan's scheduled activities, but it can also be triggered by several events. To read more, click on the link below:
The explanation for the lengthy time estimation relates to how CrashPlan prioritizes files for backup. CrashPlan backs up your most recent changes first. When the scan finds new files for backup, these files go straight to the top of CrashPlan's “to do” list. This impacts the estimated time to complete backup because the estimate is based on the type of files the scan is reviewing (i.e., new or existing) and your current network speed.
As these new or newly modified files complete backup, CrashPlan moves on to your already-backed-up files. At that point, the effective transfer rate rises dramatically because CrashPlan sends significantly less data to your backup destinations for previously-backed up files. This is data de-duplication in action.
Please let me know if there's anything else I can do for you—I'm happy to help.
timmmay:
Fully automatic would be nice. I could back up to my wife's laptop I guess, but it has a 120GB SSD and I have 6TB of storage ;) Because of the data volumes I find offsite disks essential for now, which involves scripting. I'm hoping to use CrashPlan to automate it more though.
I've read that the more you store, the more memory and resources it takes. That makes sense - deduplication keeps block hashes in RAM. Not a problem at my volume though.
For the wife's laptop, since "don't play with it" rules apply, I just Acronis it every day back to my main server, then the images are copied to the mirrored drive and to the other machine.
I also have a big robocopy script that runs over her profile directory and copies it to a directory on the server that is covered by the three CrashPlan destinations.
Re the dedupe, yeah, it's the difference between the client saying backing up (normally at around 1-3Mbps) and analysing, which you see running in the 50-100Mbps range, where I think it's just hashing the file and checking against a stored hash.
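That matches a simple model of the scan: hash each file locally and compare against a stored manifest, so unchanged files run at disk speed and never touch the network. A hypothetical sketch (the manifest dict stands in for whatever the real client actually stores):

```python
import hashlib

def scan(paths, manifest):
    """Split files into to-upload and skip lists by content hash.
    `manifest` maps path -> last-known hex digest (a simplification)."""
    to_upload, skip = [], []
    for path in paths:
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        digest = h.hexdigest()
        if manifest.get(path) == digest:
            skip.append(path)       # unchanged: scan speed, nothing sent
        else:
            to_upload.append(path)  # new or changed: needs network transfer
            manifest[path] = digest
    return to_upload, skip
```

Hashing a file locally is far faster than uploading it, which would explain the 50-100Mbps "analysing" rate versus the 1-3Mbps "backing up" rate.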
My wife doesn't keep anything on her laptop; I mapped drives to my computer (aka "the server"), where all important data is stored. If she loses her laptop it doesn't matter.
I can see CrashPlan running at 15-20Mbps, and my upload is 20Mbps. Some of the data is new, some duplicate, so maybe it'll speed up. Doesn't matter either way.