Friday, September 12, 2008

Backups

Okay, I can't stress this enough: an off-site backup is essential to the operation of any website, especially a tracker. Database backups are the most important, but backing up your files from time to time is necessary as well.

For example, if your thumb-fingered database maintenance *cough* results in a few thousand damaged or deleted rows (done that at least twice) or in the accidental deletion of your privileged user from the database (done that, too), you need a backup to fall back on. It's nice and convenient to have a local backup for that, but you also need to upload that backup to another server (or your home computer) in case the server runs into technical problems or you're suddenly taken down.

I was pretty lazy with backups while running on a shared web host, so I don't have a lot of wisdom to offer there. My host informed me that he ran nightly backups of all databases, so there was no need for me to maintain my own. (My PHP-based backup script had become pretty heavy for his server.) I think this is pretty normal behavior for shared hosts, although I don't know that all of them would give you access to those backups on demand. The point for them is protection from pissed-off customers in case they muck something up on their end or the server crashes or some such. So, when in doubt, back up.

The crux of the matter is that I'm lazy, and you probably are too. If it were left up to me, there'd be thorough backups for a few days, followed by a span of weeks or even months without backup. If you're lucky, that'll be all you need. How lucky do you feel, punk?

Therefore, you need to automate two things: the backup itself, and the process of uploading the backup to an external server. I'm not going to walk you through the setup process since that sort of info is pretty easy to find online, but I'll run through the general concept.

For the backup process, I set up cron jobs with two shell scripts: daily.sh and weekly.sh. When they're set to run is pretty self-evident. daily backs up the database (mysqldump -uroot -prootdatabasepassword --all-databases > backup.sql) and the uploaded metainfo (.torrent) files, while weekly backs up the entire web directory, all the PHP sources and so forth. weekly doesn't have to touch the database, since daily will also be running on that day. Each script then runs tar -czf backup.tar.gz files, with whatever files you intend to compress.
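Here's a rough sketch of what daily.sh could look like. The paths, the function names, and the cron times in the comments are all made up for illustration, and it assumes the MySQL credentials live in ~/.my.cnf instead of being typed on the command line as in the example above.

```shell
#!/bin/sh
# daily.sh (sketch): dump the databases, then archive the dump together
# with the uploaded .torrent files. All paths here are hypothetical.
#
# Example crontab entries (daily at 03:30, weekly on Sundays at 04:30):
#   30 3 * * *  /usr/local/bin/daily.sh
#   30 4 * * 0  /usr/local/bin/weekly.sh

BACKUP_DIR="${BACKUP_DIR:-/var/backup}"
TORRENT_DIR="${TORRENT_DIR:-/var/www/torrents}"

# Dump every database into the file named by the first argument.
# Assumes the user and password are stored in ~/.my.cnf.
dump_databases() {
    mysqldump --all-databases > "$1"
}

# Pack files and directories into a gzip-compressed tarball.
# Usage: make_archive output.tar.gz path [path ...]
make_archive() {
    out="$1"
    shift
    tar -czf "$out" "$@"
}

# Run the dump, and only build the archive if the dump succeeded.
daily_backup() {
    dump_databases "$BACKUP_DIR/backup.sql" &&
    make_archive "$BACKUP_DIR/backup.tar.gz" "$BACKUP_DIR/backup.sql" "$TORRENT_DIR"
}

# Uncomment once the paths are right and the script is installed:
# daily_backup
```

weekly.sh would be the same idea with make_archive pointed at the whole web directory and no database dump.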

That's all it takes to ensure that you're safeguarded against data corruption. Naturally, there are other pitfalls that you can encounter. Hardware failure, hacking, inept administration, and of course the ever-present DMCA message—all can threaten your data and your backups. For that reason, it's a good idea to maintain off-site backups as well. You can download them manually, of course, but that's not going to be any easier to remember to do on a regular basis than to make the backups in the first place. Instead, I have my backup script upload to an external server via scp. I use that command instead of ftp because it's more secure and easier to automate.

However, scp does require an SSH server on the receiving end, which in practice usually means a UNIX box. (Well, there are Windows ssh clients, so maybe there are scp servers as well. Stranger things have happened.) Since I run Mac OS X on my home computer, it's simple to automate the upload to there, or to any other server that I operate. If your shared host gives you rudimentary shell access, it's even okay to use that account as the upload target.
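For scp to run unattended, it can't stop and ask for a password, so the usual approach is an SSH key that the receiving machine already trusts. A minimal sketch, assuming such a key exists; the key path, the user@host:/dir target, and the function name are placeholders of mine:

```shell
#!/bin/sh
# Hypothetical values: substitute your own key, account and directory.
SCP="${SCP:-scp}"                      # can be overridden for a dry run
KEY="${KEY:-$HOME/.ssh/backup_key}"
REMOTE="${REMOTE:-user@host:/dir}"

# Copy one file into the remote directory, keeping its file name.
# -B is scp's batch mode: it fails instead of prompting for a password,
# which is the behavior you want when cron is running the script.
upload_backup() {
    $SCP -B -i "$KEY" "$1" "$REMOTE/$(basename "$1")"
}
```

Calling upload_backup /var/backup/backup.tar.gz from the end of the backup script would then push the archive off-site right after it's made.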

I'm sorry for being so lazy with the tutorial-writing in this one, but this blog entry is already getting rather lengthy and hopefully you can follow me anyway. I just have a few more points to make before I close.

Where one backup is good, an archive of backups is better. If you set your script up to name its output files with the creation date, they won't overwrite one another and you'll have a nice collection in case you need to roll back a few days. I have plenty of hard drive space, so I still have backups from 2007. Who knows when you might need them?
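The dated naming can be sketched with a couple of small helpers. The directory, the backup- prefix, and the function names are assumptions for the sake of the example:

```shell
#!/bin/sh
BACKUP_DIR="${BACKUP_DIR:-/var/backup}"   # hypothetical location

# Print a dated file name such as /var/backup/backup-2008-09-12.sql.gz.
# With no argument it uses today's date.
dated_name() {
    echo "$BACKUP_DIR/backup-${1:-$(date +%Y-%m-%d)}.sql.gz"
}

# List the N most recent backups (default 5), newest first.
# YYYY-MM-DD dates sort chronologically as plain text, so a reverse
# sort on the file names is enough.
recent_backups() {
    ls "$BACKUP_DIR"/backup-*.sql.gz 2>/dev/null | sort -r | head -n "${1:-5}"
}
```

Writing the year first is what makes this work: a name built from a day-first or month-first date would not sort in date order.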

Putting all this together, a simple shell script might look something like this (lines are double-spaced to avoid confusion in the case of line breaks):

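#!/bin/sh

# Today's date as YYYY-MM-DD (same output as GNU date -I, but portable)

date=$(date +%Y-%m-%d)

# Dump and compress all databases (assumes credentials are in ~/.my.cnf)

mysqldump --all-databases | gzip > "/var/backup/backup-$date.sql.gz"

# Copy the dated backup to the off-site server

scp "/var/backup/backup-$date.sql.gz" "user@host:/dir/backup-$date.sql.gz"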

Okay, that's all for now, folks. I meant to throw in some horror stories about maintaining/not maintaining backups, but we're out of time. Maybe I'll get to that later on. Rest assured, such stories do exist, and don't forget: always maintain an off-site backup!

13 comments:

Tomer said...

Gmail can also serve as storage space, with software such as Backup to Email: http://backup2e.com

CurlyFries said...

Interesting, I didn't know there were automated services for that. I'm not sure how it works with cron, but it does apparently support it.

However, given Google's track record with privacy, I'm not sure how far I'd trust them... :-\

OnionRings said...

I would also be concerned with the maximum size of backup attachments over email. I'm not exactly sure what Gmail's max size is, but I'm sure that our site's backups would exceed that limit.

HackedServer said...

Found this thanks to Torrent Freak. I saw that you didn't have too many comments so I thought I'd let you know that people are reading your post.

It's quite interesting to read what it's like on the admin end of a tracker. I'm glad that someone finally shared the story.

Looking forward to more posts!

Brent - HackedServer

Giorgi said...

Have read all the posts, guys. Well done. I would like to hear/read your opinion on two points:

1. How did you manage to maintain seeders for files in the early stages? Did you seed all the files yourself, or maybe use seedboxes or something?

2. Designing and writing good ToS - how to?

Thanks a lot.

David F. Bello said...

Great blog!
I'm interested in hearing more about your personal or business philosophy regarding copyright, corporate injustice, filesharing, etc. If you could talk about political, social, or economic views as well as the practical aspects of your work with torrents, I would love to learn your perspective as an actual leader in this field. Thank you for the site, and I appreciate what you do.

CurlyFries said...

Giorgi and David, those are some interesting subjects and I'll be sure to cover them in future posts. :)

Ketchup said...

Yea, I'll make sure to remind Curly to cover 'em if he forgets. >.>'

CurlyFries said...

You should just be amazed that I've kept up with the 5-day schedule so far...

OnionRings said...

/me impressed

Jocke said...

Thanks for this blog. It's great to hear from other admins, especially on topics like adding staff and backups. Those hit the target =)

Anyway just wanted to let you know you got a new reader here.

idrinkmusic said...

Here you have another person who reads all the posts no matter their length.

Keep them coming.

demolischa said...

Yep, keep them coming! Reading everything and looking forward to some more.
