Using s3cmd to make interaction with Amazon S3 easier, including simple backups

We use Amazon Web Services quite a bit here, not only to host most of our clients’ applications but also for backups.  We like to use S3 to store our backups as it is reliable, secure and very cheap.  S3 is Amazon’s Simple Storage Service, and it is more or less a limitless place to store data.  You can mount S3 as a network hard drive, but its main use is to store objects, or data, that you can retrieve at a low cost.  It is designed for 99.999999999% durability, so you most likely won’t lose anything, but even so, we produce multiple backups of every object.

One thing we’ve noticed is that some people have issues interacting with S3, so here are a few things to help you out.  First, if you are just looking to browse your S3 you can do so via the AWS Console or with S3Fox, which I like to use.  However, when you want to write scripts or access S3 from the command line, it can be difficult without some pre-built tools.  The best one we’ve found is s3cmd.

s3cmd allows you to list, update, create and delete objects and buckets in your S3.  It’s really easy to install.  Depending on your distribution of Linux you can most likely get it from your package manager.  Once you’ve done that you can configure it easily via ‘s3cmd --configure’.  You’ll just need access credentials from your AWS account.  Once you’ve set it up, let’s go through some useful commands.

To list your available buckets:
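
A minimal sketch, wrapped in a small shell function so it can be reused in scripts. The function name ‘list_buckets’ is our own and not part of s3cmd.

```shell
# Direct form: s3cmd ls
list_buckets() {
    # With no arguments, "ls" prints every bucket in the configured account.
    s3cmd ls
}
```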

To create a bucket:
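
A sketch along the same lines. The bucket name ‘my-example-bucket’ and the function name ‘create_bucket’ are placeholders; bucket names are shared across all S3 users, so you will need to pick your own.

```shell
# Direct form: s3cmd mb s3://my-example-bucket
create_bucket() {
    # "mb" (make bucket) creates the bucket named in the first argument.
    s3cmd mb "s3://$1"
}
```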

To list the contents of a bucket:
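
Again as a sketch, with ‘my-example-bucket’ standing in for your own bucket and ‘list_bucket’ being our own helper name:

```shell
# Direct form: s3cmd ls s3://my-example-bucket
list_bucket() {
    # Passing a bucket URI to "ls" lists the objects inside that bucket.
    s3cmd ls "s3://$1"
}
```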

Putting a file in a bucket is just as easy.  For example, to upload tester-1.jpg to the bucket (the local copy stays where it is):
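
A sketch of the upload, assuming a bucket called ‘my-example-bucket’ (a placeholder) and a helper name, ‘upload_file’, of our own choosing. The object keeps the local file’s name.

```shell
# Direct form: s3cmd put tester-1.jpg s3://my-example-bucket/tester-1.jpg
upload_file() {
    # $1 is the local file, $2 is the bucket name.
    s3cmd put "$1" "s3://$2/$(basename "$1")"
}
```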

To delete the file you can run:
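
As a sketch, with the same placeholder bucket name and our own helper name ‘delete_file’:

```shell
# Direct form: s3cmd del s3://my-example-bucket/tester-1.jpg
delete_file() {
    # $1 is the bucket name, $2 is the object to remove from it.
    s3cmd del "s3://$1/$2"
}
```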

These are the basics. The most common use that we see is backing up data from a server to S3. An example of a bash script for this is as follows:
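
What follows is a sketch of such a script rather than a drop-in tool. The database name, bucket name and backup directory are placeholders, ‘--run’ is a flag we made up for this script, and we assume mysqldump can read its credentials from ~/.my.cnf.

```shell
#!/bin/bash
# Sketch of a daily MySQL-to-S3 backup. All three values below are
# placeholders; replace them with your own.
DB_NAME="exampledb"
BUCKET="my-example-backups"
BACKUP_DIR="${TMPDIR:-/tmp}"
# One file per day; add %H to the date format to back up more than once a day.
SQL_FILE="${DB_NAME}-$(date +%Y-%m-%d).sql"

backup_database() {
    # Dump the database to a local file.
    if ! mysqldump "$DB_NAME" > "${BACKUP_DIR}/${SQL_FILE}"; then
        echo "ERROR: mysqldump of ${DB_NAME} failed"
        return 1
    fi
    # Compress the dump (gzip replaces the .sql file with a .sql.gz file).
    if ! gzip -f "${BACKUP_DIR}/${SQL_FILE}"; then
        echo "ERROR: could not compress ${SQL_FILE}"
        return 1
    fi
    # Upload the compressed dump to S3.
    if ! s3cmd put "${BACKUP_DIR}/${SQL_FILE}.gz" "s3://${BUCKET}/${SQL_FILE}.gz"; then
        echo "ERROR: upload of ${SQL_FILE}.gz to ${BUCKET} failed"
        return 1
    fi
    # Remove the local copy once it is safely in S3.
    rm -f "${BACKUP_DIR}/${SQL_FILE}.gz"
}

# Run the backup only when asked to, e.g. "backup.sh --run" from cron.
if [ "${1:-}" = "--run" ]; then
    backup_database
fi
```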

This script just prints any errors to the console. As you are most likely not running it by hand every day, you’d want to change the “echo” statements to mail commands or another way of alerting administrators to a backup error. If you want to back up more than once a day, all you need to change is the way the SQL_FILE variable is named, for example to include the hour.

This is a very simple backup script for MySQL. One thing that it doesn’t do is remove old files, and there is no reason for the script to do so. Amazon now has object lifecycles, which allow you to automatically expire files in a bucket once they are older than, for example, 60 days.
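
Lifecycle rules can be set in the AWS Console, and reasonably recent s3cmd releases can also set a simple one through the ‘expire’ subcommand. A sketch, assuming your s3cmd version includes that subcommand; the bucket name is a placeholder and ‘expire_after_days’ is our own helper name:

```shell
# Direct form: s3cmd expire s3://my-example-backups --expiry-days=60
expire_after_days() {
    # $1 is the bucket name, $2 is the age in days after which objects expire.
    s3cmd expire "s3://$1" --expiry-days="$2"
}
```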

One thing that many people forget to do when they are making backups is to make sure that they actually work. We highly suggest that once a month you run a script that checks that whatever you are backing up is valid. If you are backing up a database, this means checking that it will reimport and that the data is valid (i.e. a row that should always exist does). The worst thing is finding out, when you need a backup, that your backups started failing ages ago and you have no valid ones.
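
One way such a check could look for a gzipped MySQL dump is sketched below. The function name, the scratch database and the check query are all placeholders, and we assume the mysql client can read its credentials from ~/.my.cnf.

```shell
# Sketch of a restore check for a gzipped SQL dump.
verify_backup() {
    backup_file="$1"   # local path to a downloaded .sql.gz dump
    scratch_db="$2"    # throwaway database used only for this test
    check_sql="$3"     # query that must return at least one row

    # Reject truncated or corrupt archives before touching MySQL.
    if ! gzip -t "$backup_file"; then
        echo "ERROR: ${backup_file} is not a valid gzip file"
        return 1
    fi
    # Start from an empty scratch database.
    if ! mysql -e "DROP DATABASE IF EXISTS ${scratch_db}; CREATE DATABASE ${scratch_db};"; then
        echo "ERROR: could not create ${scratch_db}"
        return 1
    fi
    # Re-import the dump into the scratch database.
    if ! gunzip -c "$backup_file" | mysql "$scratch_db"; then
        echo "ERROR: import of ${backup_file} failed"
        return 1
    fi
    # Confirm that a row which should always exist is present.
    rows="$(mysql -N -B -e "$check_sql" "$scratch_db")"
    if [ -z "$rows" ]; then
        echo "ERROR: check query returned no rows"
        return 1
    fi
    echo "OK: ${backup_file} restored and passed the check"
}
```

To use it, first fetch a dump from your bucket with ‘s3cmd get’, then call the function with the downloaded file, a scratch database name and a query against a table you know should never be empty.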

Make sure that your backups are not deleted more quickly than it would take you to discover a problem. For example, if you only check your blog once a week, don’t have your backups expire after 5 days, as you may discover a problem too late, when every remaining backup also contains it. Storage is cheap, so keep backups for a long time.

We hope s3cmd makes your life easier.  If you have any questions, leave us a comment below!