Using s3cmd to make interaction with Amazon S3 easier, including simple backups

We use Amazon Web Services quite a bit here. We not only use it to host most of our clients’ applications, but also for backups. We like to use S3 to store our backups as it is reliable, secure and very cheap. S3 stands for Amazon’s Simple Storage Service, and it is more or less a limitless place to store data. You can mount S3 as a network hard drive, but its main use is to store objects, or data, that you can retrieve at a low cost. It is designed for 99.999999999% durability, so you most likely won’t lose anything, but in case you do, we produce multiple backups of every object.

One thing we’ve noticed is that some people have issues interacting with S3, so here are a few things to help you out. First, if you are just looking to browse your S3, you can do so via your AWS Console, or with S3Fox, which I like to use. However, when you want to write scripts or access S3 from the command line, it can be difficult without some pre-built tools. The best one we’ve found is s3cmd.

s3cmd allows you to list, create, update and delete objects and buckets in your S3. It’s really easy to install: depending on your distribution of Linux, you can most likely get it from your package manager. Once you’ve done that, you can configure it easily via ‘s3cmd --configure’. You’ll just need access credentials from your AWS account. Once you’ve set it up, let’s go through some useful commands.

To list your available buckets:
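Assuming s3cmd is already installed and configured, the ls command with no arguments should do it:

```shell
# Prints one line per bucket the configured credentials can see
s3cmd ls
```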

To create a bucket:
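Bucket names are shared across all of S3, so my-example-bucket below is only a placeholder and you will need to pick your own:

```shell
# mb = "make bucket"; the bucket name is a placeholder
s3cmd mb s3://my-example-bucket
```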

To list the contents of a bucket:
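Here ls takes the bucket URI as its argument (again with a placeholder bucket name):

```shell
# Lists the objects stored in the bucket
s3cmd ls s3://my-example-bucket
```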

Putting a file in a bucket is very easy. To upload tester-1.jpg to the bucket, for example, just run:
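A likely form of the command, assuming tester-1.jpg sits in the current directory and using a placeholder bucket name:

```shell
# Uploads the file; the local copy is left in place
s3cmd put tester-1.jpg s3://my-example-bucket/tester-1.jpg
```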

To delete the file you can run:
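The del command takes the full URI of the object (placeholder bucket name again):

```shell
# Removes the object from the bucket; this cannot be undone unless versioning is enabled
s3cmd del s3://my-example-bucket/tester-1.jpg
```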

These are the basics. Probably the most common use that we see is backing up data from a server to S3. An example of a bash script for this is as follows:
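What follows is a minimal sketch of what such a script could look like, not a drop-in tool. It assumes mysqldump can read its credentials from ~/.my.cnf; the database name, the bucket path and the run_backup function are placeholders of ours.

```shell
#!/usr/bin/env bash
# Sketch of a daily MySQL backup to S3. Every name below is a placeholder.
set -o pipefail   # make a failing mysqldump fail the whole pipeline

DB_NAME="example_db"
BUCKET="s3://my-example-bucket/backups"
BACKUP_DIR="${TMPDIR:-/tmp}"
# One file per day; add %H to the date format to keep hourly dumps apart
SQL_FILE="${DB_NAME}-$(date +%Y-%m-%d).sql.gz"

run_backup() {
    # Dump and compress the database
    if ! mysqldump "$DB_NAME" | gzip > "$BACKUP_DIR/$SQL_FILE"; then
        echo "ERROR: mysqldump of $DB_NAME failed"
        rm -f "$BACKUP_DIR/$SQL_FILE"
        return 1
    fi

    # Upload the dump to S3
    if ! s3cmd put "$BACKUP_DIR/$SQL_FILE" "$BUCKET/$SQL_FILE"; then
        echo "ERROR: upload of $SQL_FILE to $BUCKET failed"
        return 1
    fi

    # The copy in S3 is the backup, so the local file can go
    rm -f "$BACKUP_DIR/$SQL_FILE"
}

run_backup || echo "Backup did not complete"
```

Run from cron, a script like this stays silent on success and prints a line only when the dump or the upload fails.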

This script just outputs any errors to the console. As you are most likely not running this by hand every day, you’d want to change the “echo” statements to mail commands or another way to alert administrators of an error in the backup. If you want to back up more than once a day, all you need to change is the way the SQL_FILE variable is named, for example to include the hour.

This is a very simple backup script for MySQL. One thing that it doesn’t do is remove old files, because there is no reason for the script to do so. Amazon now has object lifecycle rules, which allow you to automatically expire files in a bucket once they are older than, for example, 60 days.
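If your s3cmd release ships the expire command, a rule like that can be set from the command line as well as from the AWS Console. A sketch, with a placeholder bucket and a backups/ prefix that we made up for illustration:

```shell
# Expire objects under backups/ once they are 60 days old
s3cmd expire s3://my-example-bucket --expiry-days=60 --expiry-prefix=backups/
```

Older s3cmd versions may not have this command, in which case the rule has to be created in the console instead.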

One thing that many people forget to do when they are making backups is to make sure that they actually work. We highly suggest running a script once a month that checks that whatever you are backing up is valid. If you are backing up a database, this means checking that the database will reimport and that the data is valid (i.e. a row that should always exist does). The worst thing is finding out, when you need a backup, that your backups started failing ages ago and you have no valid ones.
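As a rough sketch of such a check, the script below downloads one dump, reimports it into a throwaway database and looks for a row that should always exist. Everything specific in it is an assumption on our part: the bucket path, the file naming, the backup_check scratch database, the users table with a row id = 1, and credentials being read from ~/.my.cnf.

```shell
#!/usr/bin/env bash
# Sketch of a monthly restore check. All names are placeholders.
set -o pipefail

BUCKET="s3://my-example-bucket/backups"
SQL_FILE="example_db-$(date +%Y-%m-%d).sql.gz"   # the dump to verify
SCRATCH_DB="backup_check"                         # throwaway database
WORK_DIR="${TMPDIR:-/tmp}"

verify_backup() {
    local dump="$WORK_DIR/$SQL_FILE" rows

    # Fetch the dump from S3, overwriting any stale local copy
    if ! s3cmd get --force "$BUCKET/$SQL_FILE" "$dump"; then
        echo "ERROR: could not download $SQL_FILE"
        return 1
    fi

    # Recreate the scratch database and reimport the dump into it
    if ! mysql -e "DROP DATABASE IF EXISTS $SCRATCH_DB; CREATE DATABASE $SCRATCH_DB"; then
        echo "ERROR: could not create $SCRATCH_DB"
        return 1
    fi
    if ! gunzip -c "$dump" | mysql "$SCRATCH_DB"; then
        echo "ERROR: $SQL_FILE did not reimport cleanly"
        return 1
    fi

    # Check a row that should always exist (hypothetical users table)
    rows=$(mysql -N -B -e "SELECT COUNT(*) FROM users WHERE id = 1" "$SCRATCH_DB")
    if [ "$rows" != "1" ]; then
        echo "ERROR: expected row is missing from the restored data"
        return 1
    fi

    rm -f "$dump"
    echo "OK: $SQL_FILE restored and verified"
}

verify_backup || echo "Backup verification FAILED"
```

As with the backup itself, the echo lines are the place to hook in mail or whatever alerting you already use.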

Make sure that your backups are not deleted sooner than it would take you to discover a problem. For example, if you only check your blog once a week, don’t have your backups delete after 5 days, as you may discover a problem too late and your remaining backups will also have the problem. Storage is cheap, so keep backups for a long time.

Hope s3cmd makes your life easier and if you have any questions leave us a comment below!

Recruiters – Before You Call, Do a Little Research

As some of you may know, we are currently hiring a mid-level engineer for our team. In the past few weeks we’ve noticed quite the influx of recruiters calling us trying to fill the position. As a company we’ve never used a recruiter in the past. It’s not that we’ve been closed-minded about it; it’s just that we have never had a good experience with one, for multiple reasons. We’d be paying the recruiter a fee for finding us these great people, so they should be doing a little work on their end too.

With a recruiter, we expect that the applicant has been pre-screened so that they roughly match what we’re looking for. Half the time when someone calls us, they don’t even know what type of company we are. Come on, at least visit our webpage. I don’t want to have to explain that we are a PHP shop with a heavy Symfony influence; you should already know that. Of course, once we mention PHP and that we’re looking for a mid-level person, the recruiter always has someone that we need to talk to, who is supposedly the best fit for us.

This brings me to my second pet peeve: non-technical recruiters doing technical recruiting. Now the recruiter knows we want PHP developers, so they filter their resumes by PHP. Often the next question is, “Oh, are you using Apache? Tomcat? IIS? Node?” For the most part, what does this have to do with it? But no, we aren’t primarily using Java or a JavaScript web server. Often it is clear that the recruiter who insists they’ve personally screened the person has no clue what they are talking about; they are just trying to match keywords to a resume.

Third, stop pushing me to come to your office to interview candidates I know nothing about. Often on these calls, after they’ve learned who we are and what we want, they want me to jump on a call or come into their office to do interviews with their perfect-match candidates. Everyone is busy. I want to see some resumes before going into these first-round interviews; otherwise they could be a total waste of both our time.

Lastly, we’re a consulting firm, which means we have clients. I can’t tell you how many times a recruiter doesn’t look at our client list and then proceeds to give us people who still work for our clients. A heads up: most of our contracts do not allow us to hire directly from a client while we are engaged with them (some even for a period thereafter). Regardless, if a client ever thought we were stealing or aggressively recruiting their employees, we could kiss that relationship goodbye.

What do I want from a recruiter? First, I want you to have some technical knowledge: at least know what groups of technologies go together and that LAMP is not a word but an acronym. Second, take 5-10 minutes to look at our website, projects, blog, and clients, and make sure that whoever you are telling us is a great fit actually has a good chance of being a good fit. Third, send me a resume (remove all the contact information if you’re worried about us going direct to them) before trying to push me to either jump on a phone interview or come to your office.

Finally, if I’ve said “no thank you, we’re fine for now,” do not continue to email and call me saying that you have a better candidate.

This may come off as a bit of a rant, but really I hope some recruiters read this and understand that we would be happy to look at your candidates if you’ve put a little effort into making sure they are actually a good fit.

Tips: Why you should build tools to empower your sales team

Earlier this week, I was catching up with a buddy of mine and we started talking about some of the custom workflow tools we’d built for his sales team. From an engineering perspective, these tools are fairly straightforward – things like scrapers, browser extensions, and simple crawlers. What was interesting was listening to how much of an impact simple tools like these were having on my friend’s sales team. From decreasing “grunt work” to helping them surface high quality leads, it started to become clear that the tools were generating high business value relative to their development costs. As we were chatting, I started to synthesize what seemed like the three key reasons the sales guys loved the tools.

Cut down on “grunt work”

Most people hate grunt work, but salespeople have a special disdain for it. They would much rather be on the phone selling than scraping data off a website, entering it in Excel, and then seeing if some formula decided they had a worthwhile prospect. Apart from being a waste of time, repetitive menial work kills morale and artificially limits sales bandwidth. By building tools to automate these processes, we had unknowingly helped keep the team motivated while also letting them do more selling and less grunt work.

Help reduce information asymmetry

As a naive observer, my impression is that in many instances high volume salespeople enter the sales process with imperfect or incomplete information. Oftentimes, the information they don’t readily have is public, but they lack the tools to make the data easily available. From straightforward examples like knowing an employee headcount to more technical ones like knowing what hosting provider a prospect uses, these data points can often help shape the pitch to ultimately win the sale. After chatting with my friend, it seems that custom tools effectively fill this space since they offer unique insights compared to off-the-shelf solutions.

Nurture experimentation

Stagnation is a common problem across every job function. Once processes are set, it becomes the same daily grind and it’s difficult to justify any need for change. However, in my experience, introducing new tools gives the organization the impetus to try out new things and hopefully pushes the envelope. There is clear precedent for this in software development and I’d argue the same holds true for both sales and marketing. Given new tools and room to explore, motivated people will buck the norm and try something new.

At a high level, I think developing custom tools for any job function is a worthwhile investment, and it sounds like some of our sales tools have had a measurable impact, which is awesome. I’m now off down the Quora rabbit hole in search of additional insights and anecdotes…

Gadgets: 5 gadgets for your summer wishlist

Over the weekend, Fred Wilson posted an awesome video of the unboxing and flight of a Parrot AR drone, along with a note that he was planning to grab one and develop some custom node.js code for it. After seeing the video, and with spring finally here, I started brainstorming about what gadgets I’d want to play with over the summer.

Parrot AR Drone

Shown in the video linked above, the Parrot AR Drone is a remote controlled four-rotor helicopter that is controlled via an iOS or Android device. What sets the Parrot apart from other similar devices is that there is a node.js library for simplifying development of custom functionality on the Parrot platform.

Not exactly sure what we’d be looking to build with an AR drone but the Red Bull Air Race comes to mind.

Sphero Robotic Ball

Built by Boulder, CO-based Orbotix, the Sphero robotic ball is a gyroscopically stabilized ball that can be controlled using an iOS or Android device. The Sphero has an SDK, and there’s also an active app store to download pre-built apps that work with your Sphero.

Just brainstorming, but something awesome to build with a Sphero would be an app to draw out large drawings using the Sphero to actually draw the lines. Imagine drawing a 50′ x 50′ line art graphic by uploading some art and then letting the Sphero roll around the canvas.

Pebble watch

Born on Kickstarter, the Pebble watch is an indie entrant into the “smartwatch” space. Sporting iOS and Android integration via Bluetooth along with a scriptable watch face, the Pebble is shaping up to be an interesting player in a developing market.

As far as development goes, writing custom faces to visualize information differently or pull data off a smartphone seems pretty exciting. It still seems a bit early to get a sense of how the Pebble will fare long term as a platform though.

Jawbone UP

Although Jawbone is primarily known for its speaker systems and Bluetooth headsets, the Jawbone UP is a personal activity monitor that helps users track their physical activity, sleep cycles, and eating habits. The UP fits into the trending theme of the quantified self, where users track KPIs about their daily life in an effort to iterate and improve. Pulling data off the UP is relatively easy, and it also plugs in to RunKeeper.

The “quantified self” concept sounds like it would be interesting to experiment with and using the UP to try it out seems like an obvious choice. Leveraging the UP would also make it easy to “compete” with anyone else looking to jump into activity tracking.

Raspberry Pi

Released last year after intense anticipation, the Raspberry Pi is basically a six square inch board with a fully featured computer, including video output and USB ports. Coming in at $25 or $35, the Raspberry Pi is cheap enough to experiment with, hack on, and, if it comes to that, break. With full Linux support, the Raspberry Pi is also robust enough to handle “serious business”.

Looking at the list of Raspberry Pi Hacks, there’s definitely some awesome inspiration to build something cool. Using a Pi to power a TV screen with real time interactive content seems like it might be an early winner though – we’ll see where that goes.

Anyway, that’s my list. Unfortunately, I’m not sure what I’ll actually get around to hacking on this summer. Would love to hear about any other cool gadgets or hacks.

PHP: Dispatch tables, an alternative to switch hell

Earlier this week I was putting together a block of code which ended up turning into a switch statement with a tangled mess of long case blocks and complicated fall-throughs, and which ultimately became impossible to follow. Being a stand up guy, I decided to refactor the block using a technique where the case blocks are converted into anonymous functions, indexed into an associative array, and then the correct function is called depending on the value. I haven’t seen this show up too often in PHP code so I thought I’d share.

So what’s the problem?

The life of a switch statement usually starts out relatively benign. You have a few simple conditions and each block is relatively compact:

But then the switch grows, and each case becomes complicated enough that the entire block becomes mostly unreadable (ellipses for effect):

At this point, it’s hard to reason about what’s going to happen, because each case statement has presumably gotten so large, and different conditions are “falling” through, so the side effects are difficult to trace.

An alternative

An alternative to using a normal switch statement is to use a dispatch table, which is basically an array of functions indexed by whatever variable you’d normally be “switching” on. The primary benefit to structuring the code this way is that you can easily reason about side effects, since the only variables that can be changed are the one that captures the return value and anything passed by reference. In addition, since every case is a separate function, it’s a bit easier to edit the code. So what does this look like? It’s actually pretty straightforward:

Extending from there, you could also call the function with arguments, potentially by reference, and even have all the functions be closures which capture the variables to avoid having to call with arguments.

Anyway, questions or comments always welcome.