AWS: What are the key Amazon Web Services components?

Over the last couple of years, the popularity of “cloud computing” has grown dramatically, and along with it so has the dominance of Amazon Web Services (AWS) in the market. Unfortunately, AWS doesn’t do a great job of explaining exactly what AWS is, how its pieces work together, or what typical use cases for its components may be. This post is an effort to address this by providing a whip-around overview of the key AWS components and how they can be effectively used.

Great, so what is AWS? Generally speaking, Amazon Web Services is a loosely coupled collection of “cloud” infrastructure services that allows customers to “rent” computing resources. What this means is that using AWS, you as the client are able to flexibly provision various computing resources on a “pay as you go” pricing model. Expecting a huge traffic spike? AWS has you covered. Need to flexibly store between 1 GB and 100 GB of photos? AWS has you covered. Additionally, each of the components that make up AWS is loosely coupled, meaning they can work independently or in concert with other AWS resources.

Since AWS components are loosely coupled, you can mix and match only what you need. Here is an overview of the key services.

Route53

What is it? Route53 is a highly available, scalable, and feature-rich Domain Name System (DNS) web service. What a DNS service does is translate a domain name like “setfive.com” into an IP address like 64.22.80.79, which allows a client’s computer to “find” the correct server for a given domain name. In addition, Route53 also has several advanced features normally only available in pricey enterprise DNS solutions. Route53 would typically replace the DNS service provided by your registrar, like GoDaddy or Register.com.
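
To make that “translation” step concrete, here is a minimal sketch of what a lookup looks like from PHP using the built-in dns_get_record function (the IP you actually get back depends on whatever the domain’s records currently are):

    <?php
    // Resolve the A (address) records for a host name. This is the lookup a
    // DNS service like Route53 answers on behalf of your domain.
    $records = dns_get_record("setfive.com", DNS_A);

    foreach ($records as $record) {
        // Each A record includes the resolved IP under the "ip" key
        echo $record["host"] . " -> " . $record["ip"] . "\n";
    }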

Should you use it? Definitely. Although it isn’t free, after last year’s prolonged GoDaddy outage it’s clear that DNS is a critical component, and using a company that treats it as such is important.

Simple Email Service

What is it? Simple Email Service (SES) is a hosted transactional email service. It allows you to easily send highly deliverable emails using a RESTful API call or via regular SMTP without running your own email infrastructure.
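
As a rough sketch of the SMTP route (assuming SwiftMailer is installed via Composer and you’ve generated SES SMTP credentials in the AWS console; the endpoint below is the us-east-1 one and the addresses are placeholders):

    <?php
    require 'vendor/autoload.php';

    // SES SMTP credentials are separate from your normal AWS keys
    $transport = Swift_SmtpTransport::newInstance('email-smtp.us-east-1.amazonaws.com', 587, 'tls')
        ->setUsername('YOUR_SES_SMTP_USERNAME')
        ->setPassword('YOUR_SES_SMTP_PASSWORD');

    $mailer = Swift_Mailer::newInstance($transport);

    $message = Swift_Message::newInstance('Forgot your password?')
        ->setFrom(array('no-reply@example.com' => 'Example App'))
        ->setTo(array('user@example.com'))
        ->setBody('Click the link below to reset your password...');

    $mailer->send($message);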

Should you use it? Maybe. SES is comparable to services like SendGrid in that it offers a highly deliverable email service. Although it is missing some of the features that you’ll find on SendGrid, its pricing is attractive and the integration is straightforward. We normally use SES for application emails (think “Forgot your password”) but then use MailChimp or SendGrid for marketing blasts and that seems to work pretty well.

Identity and Access Management

What is it? Identity and Access Management (IAM) provides enhanced security and identity management for your AWS account. In addition, it allows you to enable “multi-factor” authentication to enhance the security of your AWS account.

Should you use it? Definitely. If you have more than one person accessing your AWS account, IAM will allow everyone to get a separate account with fine-grained permissions. Multi-factor authentication is also critically important since a compromise at the infrastructure level would be catastrophic for most businesses. You can read more about IAM in the AWS documentation.

Simple Storage Service

What is it? Simple Storage Service (S3) is a flexible, scalable, and highly available storage web service. Think of S3 as an infinitely large hard drive where you can store files which are then accessible via a unique URL. S3 also supports access control, expiration times, and several other useful features. Additionally, the payment model for S3 is “pay as you go”, so you’ll only be billed for the amount of data you store and how much bandwidth you use to transfer it in and out.
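
To give a feel for the “store a file, get a URL back” model, here is a minimal sketch using the AWS SDK for PHP (v2-style API; the credentials, bucket, and key below are placeholders):

    <?php
    require 'vendor/autoload.php';

    use Aws\S3\S3Client;

    $client = S3Client::factory(array(
        'key'    => 'YOUR_AWS_ACCESS_KEY_ID',
        'secret' => 'YOUR_AWS_SECRET_KEY',
    ));

    // Upload a local file; you're only billed for the storage and transfer you use
    $client->putObject(array(
        'Bucket'     => 'my-bucket',
        'Key'        => 'backups/db-2013-04-01.sql.gz',
        'SourceFile' => '/tmp/db-2013-04-01.sql.gz',
    ));

    // The object is then addressable at a predictable URL, e.g.
    // https://my-bucket.s3.amazonaws.com/backups/db-2013-04-01.sql.gz
    // (assuming the ACL or bucket policy allows public reads)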

Should you use it? Definitely. S3 is probably the most widely used AWS service because of its attractive pricing and ease of use. If you’re running a site with lots of static assets (images, CSS assets, etc.), you’ll probably get a “free” performance boost by hosting those assets on S3. Additionally, S3 is an ideal solution for incremental backups, both data and code. We use S3 extensively, usually for hosting static files, frequently backing up MySQL databases, and backing up git repositories. The new AWS S3 Console also makes administering S3 and using it non-programmatically much easier.

Elastic Compute Cloud

What is it? Elastic Compute Cloud (EC2) is the central piece of the AWS ecosystem. EC2 provides flexible, on-demand computing resources with a “pay as you go” pricing model. Concretely, this means that you can “rent” computing resources for as long as you need them and process any workload on the machines you’ve provisioned. Because of its flexibility, EC2 is an attractive alternative to buying traditional servers for unpredictable workloads.
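
As a sketch of what “renting” a machine looks like through the API (again using the AWS SDK for PHP, v2-style; the AMI id and instance type are placeholders):

    <?php
    require 'vendor/autoload.php';

    use Aws\Ec2\Ec2Client;

    $ec2 = Ec2Client::factory(array(
        'key'    => 'YOUR_AWS_ACCESS_KEY_ID',
        'secret' => 'YOUR_AWS_SECRET_KEY',
        'region' => 'us-east-1',
    ));

    // Provision a single instance for as long as the workload needs it...
    $ec2->runInstances(array(
        'ImageId'      => 'ami-xxxxxxxx', // placeholder AMI id
        'MinCount'     => 1,
        'MaxCount'     => 1,
        'InstanceType' => 't1.micro',
    ));

    // ...process the workload, then terminate the instance to stop paying for it.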

Should you use it? Maybe. Whether or not to use EC2 is always a controversial discussion because the complexity it introduces doesn’t always justify its benefits. As a rule of thumb, if you have unpredictable workloads like sporadic traffic, using EC2 to run your infrastructure is probably a worthwhile investment. However, if you’re confident that you can predict the resources you’ll need, you might be better served by a “normal” VPS solution like Linode.

Elastic Block Store

What is it? Elastic Block Store (EBS) provides persistent storage volumes that attach to EC2 instances, allowing you to persist data past the lifespan of a single EC2 instance. Due to the architecture of Elastic Compute Cloud, all the storage systems on an instance are ephemeral. This means that when an instance is terminated, all the data stored on that instance is lost. EBS addresses this issue by providing persistent storage that appears on instances as a regular hard drive.

Should you use it? Maybe. If you’re using EC2, you’ll have to weigh the choice between using only ephemeral instance storage and using EBS to persist data. Beyond that, EBS has well-documented performance issues, so you’ll have to be cognizant of that while designing your infrastructure.

CloudWatch

What is it? CloudWatch provides monitoring for AWS resources including EC2 and EBS. CloudWatch enables administrators to view and collect key metrics and also set a series of alarms to be notified in case of trouble. In addition, CloudWatch can aggregate metrics across EC2 instances which provides useful insight into how your entire stack is operating.

Should you use it? Probably. CloudWatch is significantly easier to set up and use than tools like Nagios, but it’s also less feature-rich. We’ve had some success coupling CloudWatch with PagerDuty to provide alerts in case of critical service interruptions. You’ll probably need additional monitoring on top of CloudWatch, but it’s certainly a good baseline to start with.

Anyway, the AWS ecosystem includes several additional services but these are the ones that I felt are key to getting started on AWS. We haven’t had a chance to use it yet but Redshift looks like it’s an exciting addition which will probably make this list soon. As always, comments and feedback welcome.

Tips: A Framework for Communicating Your Ideas to Engineers

One of the hardest things to do when building a software product is effectively communicating your ideas to your development team, and these issues are usually magnified when you’re early in the product process. Generally, communication issues between stakeholders and developers fall into “macro details”, “expected behavior”, or “user interface concerns”.

Despite being an oxymoron, “macro details” captures the types of errors that occur at the architecture level. For example, imagine you were building a calendaring system for intramural sports teams where each team manages their games for the season using the calendar. A “macro detail” would be that each “team” owns a single “calendar”; a potential error then would be the development team coming back with a solution where every “player” on a “team” owns a “calendar”. Following from that, the “expected behaviors” of the system would capture issues like “what happens if two teams try to schedule a game for the same day?” or “can a player be on more than one team?”. A defect then would be that the stakeholders expected the system to not allow multiple games on the same day but the system allowed it. Finally, “user interface” concerns would typically encompass how the various components of the application are organized and what the different screens are.

Great, so how do you avoid these problems? What follows below is a loose framework for mitigating these issues. It certainly isn’t original but rather a synthesis of techniques that I’ve seen help power effective communication.

Oh hey, from 10,000 ft.

For a totally new product, getting started is often the hardest part. Usually, the entrepreneur or executive who is building the product has been frantically planning, talking about, and “living” the product for some time before the project actually kicks off. The result of this is that the stakeholder intimately knows the ins and outs of the project, but the rest of the team won’t necessarily be aligned with the primary stakeholder. Because of this, starting by writing a 1 – 2 sentence “10,000 ft. view” is usually a great first step. Using the sports calendaring example from above, you might end up with:

I want to build a calendaring application to allow intramural (think old man kickball) sports teams to collaboratively plan their games for the season.

Who, What, Where?

With the “big idea” down on paper, the next step is to start defining the primary objects in the system and how they are all related. At this stage, some people develop “user personas”, which describe typical users of the system, and then elaborate on how each persona will use the system. In my experience, this approach is effective, but inexperienced stakeholders tend to get caught up in the details and ultimately have a difficult time reaching the right level of abstraction. Looking at the calendaring app, some of the objects you’d end up listing would include:

  • Teams
    • Composed of players
    • Own a single calendar
    • Linked to a single league
  • Players
    • Primary actor in the system
    • Each user will be represented by a player and own a single profile
    • Linked to a single team
  • Calendars
    • Owned by a single team
    • Contain games on specific days
  • Leagues
    • Comprised of multiple teams

Anyone who is familiar with the Unified Modeling Language will notice the format is similar to how UML uses nouns and verbs, but it’s a bit different. I’ve always had better luck using natural language to write out the relationships between entities and then converting that into an entity relationship model and then RDBMS tables.
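
To give a flavor of where that conversion ends up, here is a rough Doctrine-style sketch of two of the entities above (the class and property names are purely illustrative, not a finished schema):

    <?php
    use Doctrine\ORM\Mapping as ORM;

    /** @ORM\Entity */
    class Team
    {
        /** @ORM\Id @ORM\Column(type="integer") @ORM\GeneratedValue */
        protected $id;

        /** @ORM\OneToMany(targetEntity="Player", mappedBy="team") */
        protected $players;

        /** @ORM\OneToOne(targetEntity="Calendar") */
        protected $calendar;

        /** @ORM\ManyToOne(targetEntity="League", inversedBy="teams") */
        protected $league;
    }

    /** @ORM\Entity */
    class Player
    {
        /** @ORM\Id @ORM\Column(type="integer") @ORM\GeneratedValue */
        protected $id;

        /** @ORM\ManyToOne(targetEntity="Team", inversedBy="players") */
        protected $team;
    }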

Rules, Rules, Rules

Once the primary entities are defined, the next step is to write out the business logic to minimize the confusion around how the system should work. There are formal structures for this as well, including behavior-driven development (BDD), but in my experience it’s easier to kickstart the conversation with natural language specifications and then introduce BDD later in the development cycle. Anyway, running with the example, you’d end up writing out something like:

  • Users (Players)
    • Can create accounts
    • Can create games on a calendar
    • Can join “Teams”
    • Can’t join two Teams
  • Teams
    • Created by “Players”
    • Can’t be deleted after there’s at least one member
  • Calendars
    • Automatically created when a team is created
    • Won’t allow two games on the same day
    • Can’t be deleted
    • Will only allow users to edit their own events/games

Like with the entity relationships, I usually start with natural language and keep the definition of “business logic” pretty loose so that I can capture as much information from the stakeholder as possible and then formalize from there.
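
Even though the rules start out as natural language, each one eventually becomes a concrete check somewhere in the code. As a purely illustrative sketch, the “won’t allow two games on the same day” rule might end up as a guard like:

    <?php
    class Calendar
    {
        private $games = array(); // date string => game

        public function addGame(\DateTime $date, $game)
        {
            $key = $date->format('Y-m-d');

            // Business rule: the calendar won't allow two games on the same day
            if (isset($this->games[$key])) {
                throw new \DomainException('A game is already scheduled for ' . $key);
            }

            $this->games[$key] = $game;
        }
    }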

Draw me a map

Finally, the last piece is usually communicating what the UI of the application should be. Depending on the makeup of the team, the amount of input that business stakeholders have in this respect might be minimal. However, if there isn’t a dedicated UI/UX resource available, UI mockups will definitely be a valuable asset in helping communicate the vision of the application.

There are several tools for developing UI mockups, including Balsamiq, OmniGraffle, and Mockflow. Generally, any tool you’re comfortable with will work fine, but the key concern should be not taking the designs too far toward a finished look, so that people can concentrate on the ideas rather than the specific implementation. I’ve always been partial to Balsamiq since it’s attractively priced, easy to use, and the final mockups are more like sketches than a finished design.

Anyway, sticking with the calendaring project you might end up making a screen that looks like this:

Like everything else, the definition of UI mockups is generally flexible. Some people like to produce fairly high fidelity designs and other people like to leave them at “pen and paper” type sketches. At the end of the day, it’ll be a project-by-project decision.

Wrapping up

Well, that’s a crash course, with examples, of how I generally work with key stakeholders to sketch out projects. Ultimately, figuring out the best way to communicate your vision to your engineering team is going to be a personal decision. You’ll probably end up having to mix together several techniques that take advantage of the skills on your team and match the requirements of your project.

Symfony2: Configuring VichUploaderBundle and Gaufrette to use AmazonS3

Last week, I was looking to install the VichUploaderBundle into a Symfony2 project to automatically handle file uploads. As I was looking through the Vich documentation, I ran across a section describing how to use Gaufrette to skip the local filesystem and push files directly to Amazon S3. Since we’d eventually need to load balance the app and push uploaded files to S3 anyway, I decided to set it up out of the gate. Unfortunately, the documentation for setting up Vich with Gaufrette is a bit opaque, so here’s a step-by-step guide to getting it going.

Install Everything

The first thing you’ll want to do is install all the required packages. If you’re using Composer, the following will work:
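
Something along these lines in your composer.json should do it (the exact package names and version constraints may have shifted since this was written, so double check Packagist; at the time this meant the Vich bundle, the KnpGaufretteBundle, and an AWS SDK for the S3 adapter):

    {
        "require": {
            "vich/uploader-bundle": "*",
            "knplabs/knp-gaufrette-bundle": "*",
            "amazonwebservices/aws-sdk-for-php": "*"
        }
    }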

Once all the packages are installed, you’ll need to configure *both* Gaufrette and Vich. This is where the documentation broke down a bit for me. You’ll need your Amazon AWS “Access Key ID” and “Secret Key” which are both available at https://portal.aws.amazon.com/gp/aws/securityCredentials if you’re logged into AWS.

Configure It
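
Here is roughly what the YAML ends up looking like. Treat the exact keys as assumptions based on the bundle documentation of the time and adjust to whatever versions you’re on; the important parts are defining a service for the S3 client, wiring an S3-backed Gaufrette filesystem to it, and then pointing a Vich mapping at that filesystem.

    # app/config/config.yml (sketch -- key names may differ between bundle versions)
    services:
        acme.amazon_s3:
            class: AmazonS3   # S3 client class from the AWS SDK for PHP
            arguments:
                - { key: "%amazon_s3.key%", secret: "%amazon_s3.secret%" }

    knp_gaufrette:
        adapters:
            logo_adapter:
                amazon_s3:
                    amazon_s3_id: acme.amazon_s3
                    bucket_name: my-bucket-name
        filesystems:
            logo_fs:
                adapter: logo_adapter

    vich_uploader:
        db_driver: orm
        gaufrette: true
        storage: vich_uploader.storage.gaufrette
        mappings:
            logo:
                uri_prefix: https://my-bucket-name.s3.amazonaws.com
                upload_destination: logo_fs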

Once everything is configured at the YAML level, the final step is adding the Vich annotations to your entities.
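
As a sketch (the entity and property names are just examples; the mapping and file name property values are the ones referenced below):

    <?php
    use Doctrine\ORM\Mapping as ORM;
    use Symfony\Component\HttpFoundation\File\File;
    use Vich\UploaderBundle\Mapping\Annotation as Vich;

    /**
     * @ORM\Entity
     * @Vich\Uploadable
     */
    class Company
    {
        /**
         * @Vich\UploadableField(mapping="logo", fileNameProperty="logo")
         * @var File
         */
        protected $logoFile;

        /**
         * @ORM\Column(type="string", length=255, nullable=true)
         * @var string
         */
        protected $logo;

        /**
         * @ORM\Column(type="datetime", nullable=true)
         * @var \DateTime
         */
        protected $updatedAt;
    }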

Make sure you add the “@Vich\Uploadable” annotation to your Entity or Vich will fail silently.

The “mapping” specified in @Vich\UploadableField(mapping="logo", fileNameProperty="logo") needs to match the value under “vich_uploader.mappings” which you defined in config.yml.

Finally, one last “gotcha” to be cognizant of is this bug – https://github.com/dustin10/VichUploaderBundle/issues/123. Since Vich uses Doctrine lifecycle callbacks to manage files, if no Doctrine fields are changed then the Vich code isn’t executed. The easiest way to get around this (and what we used), is just to manually update the “updated_at” column every time a form is submitted to ensure that the upload handling code is executed.
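
In practice that can be as simple as touching the timestamp whenever the form is submitted, for example in the controller handling the form (a sketch; $entity and setUpdatedAt are whatever your entity actually uses):

    <?php
    // Inside the controller action handling the form submission
    if ($form->isValid()) {
        // Force a Doctrine change so the Vich lifecycle callbacks actually run
        $entity->setUpdatedAt(new \DateTime());

        $em->persist($entity);
        $em->flush();
    }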

Anyway, as always, questions and comments are welcome.

Tips: Five guidelines for exchanging services for equity

Being a services business, we’ve always been lured by the forbidden fruit of developing a SaaS product, selling it to a bunch of users, and then sitting back and watching the revenue pour in every month. Over the last few years, we’ve been approached a couple of times by people looking to have us serve as the engineering team of a company in exchange for equity which would eventually begin generating revenue. Despite this situation being relatively common, there are few guidelines on how to approach receiving equity in exchange for work as a services company. With that in mind, here are some guidelines we use in these situations. These suggestions are probably equally applicable if you’re a services company looking to develop a product in house.

Treat the project or company like a billable client

An issue that we’ve repeatedly noticed is that equity or internal projects are often treated as “second class” citizens within service companies. Since they aren’t billable, companies end up doing things like leaving them off the planning schedule, pushing the work off until Fridays, or only assigning interns to the project.

One strategy to help combat these issues is to on-board the project like you would any billable project. Depending on your process, this might mean doing things like creating a Basecamp project, adding the project to your time tracking software, and holding a “kickoff meeting”. Whatever your process may be, the takeaway is to treat the non-billable project like a normal one throughout your organization.

Set a budget and stick to it

Budgets are important for any project, but they’re especially important when equity is involved or it’s an internal project. When dealing with a non-billable project, you’ll want to avoid “equity guilt”, where you’re guilted into “doing just a little more” since you own a measurable percentage of the company, and you’ll also eventually want to be able to calculate an ROI for your investment. A proper budget will help you do both, since you’ll be able to know when you’ve contributed your fair share and hopefully one day calculate your ROI against the budget.

Developing a budget in these situations is relatively easy: just take how you’d normally bill and decide on a number for version one of your buildout. If you bill hourly, pick a number of hours; if you bill by the week, pick a number of weeks. Since the project is already set up as a “first class” project, you should be able to just add a budget against it.

Someone needs to “own it”

One of the problems that arises in equity deals and internal projects is that there isn’t really a client, and subsequently there often isn’t a single person responsible for making key decisions. Having a single person that ultimately “owns” the project will mitigate “design by committee” issues and also help keep the momentum of the project going.

Fair warning though, “owning” a failed project carries a heavy emotional and psychological toll, so as an executive you’ll have to be ready to support an employee that accepts this responsibility.

Clearly define responsibilities

Clearly spelling out who is responsible for what is important when a services company is being brought in for equity. Detailing responsibilities will help you avoid a situation where your team starts off as the marketing team but then ends up fielding support issues and ultimately resenting the project.

Agree upon KPIs and know when to quit

Knowing what you want to track and how you’ll measure success is important when money isn’t being exchanged because it helps keep everyone accountable and prevents the project from slowly stagnating. Having a good handle on your KPIs also helps motivate the team if things are going well and they’ll ultimately be a primary factor in deciding if you should continue the project.

Unfortunately, one of the hardest aspects of any project is knowing when to quit. This decision is usually harder in equity only or internal projects since there’s no pressure of burning capital and there won’t be the finality of the money running out. Despite this, knowing when to pull the plug will help you facilitate an orderly shutdown of the project and also give everyone involved the chance to debrief.

So should you do an equity only project? Well, it depends. Given the reality of startups, chances are you’re not going to enjoy an eight figure exit and retire a millionaire. But chances are, if your team gets involved with an interesting project, they’ll get a chance to learn a lot, experiment outside their comfort zone, and maybe even leverage the project into new business. So my answer would be a “maybe” depending on where your business is and what sorts of opportunities you’re being presented with.

PHP: Some thoughts on using array_* with closures

The other day, I was hacking away on the PHP backend for the “Startup Institute” visualization and I realized it was going to need a good deal of array manipulation. Figuring it was as good a time as any, I decided to try to leverage the new closures in PHP 5.3+ along with the array_* functions to manipulate the arrays. I’m not well versed in functional programming, but I’ve used Underscore.js’s array/collection functions, so this is mostly in comparison to that.

The Array

The entire shebang is on GitHub, but here is the gist of what we’re interested in:

There is a CSV file that looks like ssdata.csv.sample (except with more entries) that is read into a list ($data) where every entry has keys corresponding to the values in the header. Thinking in JSON, the array ends up looking like:
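
The field names below are placeholders (the real keys come straight from the CSV header), but structurally $data ends up looking like:

    [
        {"firstName": "Jane", "lastName": "Doe", "track": "web development", "selected": "1"},
        {"firstName": "John", "lastName": "Smith", "track": "marketing", "selected": "0"}
    ]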

Ok great, but now what can we do with it?

Sorting:

Using the usort function is particularly natural with closures. Compare the following:
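
Something along these lines (the data and key names come from the placeholder structure above, so treat them as illustrative):

    <?php
    // "Static" version: the sort key has to be smuggled in via a static property
    class RowSorter
    {
        public static $sortKey = 'firstName';

        public static function compare($a, $b)
        {
            return strcmp($a[self::$sortKey], $b[self::$sortKey]);
        }
    }

    RowSorter::$sortKey = 'lastName';
    usort($data, array('RowSorter', 'compare'));

    // Closure version: the local $sortKey is simply captured with "use"
    $sortKey = 'lastName';
    usort($data, function ($a, $b) use ($sortKey) {
        return strcmp($a[$sortKey], $b[$sortKey]);
    });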

It’s pretty clear the version with closures is much shorter, more concise, and ultimately easier to follow. Being able to “capture” the local $sortKey variable is also a key feature of the closure version, since with the static version there’s no easy way to introduce variables into the sorting function.

Mapping:

In the linked example, I used array_map to basically convert an array of characters into an array of ASCII values for those characters.
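
Roughly, the two versions look like this:

    <?php
    $chars = array('a', 'b', 'c');

    // for(...) loop version
    $ascii = array();
    for ($i = 0; $i < count($chars); $i++) {
        $ascii[$i] = ord($chars[$i]);
    }

    // array_map + closure version
    $ascii = array_map(function ($char) {
        return ord($char);
    }, $chars);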

With such a small map function, it’s hard to see or appreciate the benefits of using the closure along with array_map. With the closure though, you’ll get a couple of benefits including isolated scope so that you won’t inadvertently rely on the value of a variable that isn’t directly related to transforming the array values.

Using the closure would also “look” much cleaner if the array had non-numeric keys, since without being able to use integer indexes the for(…) loop would be more confusing.

Filter it:

This isn’t used but it could have been to return only the elements that were selected.
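
Assuming each row carries a “selected” flag (a placeholder name, like the rest of the sample structure), the two versions would look something like:

    <?php
    // foreach version: build a new array and "skip" the rows we don't want
    $selected = array();
    foreach ($data as $row) {
        if ($row['selected'] != 1) {
            continue;
        }
        $selected[] = $row;
    }

    // array_filter + closure version
    $selected = array_filter($data, function ($row) {
        return $row['selected'] == 1;
    });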

Looking at the version with the closure, it’s a bit easier to follow, and since it’ll enforce scope isolation, if the “truth test” were a bit more complicated you’d only have to debug what’s actually inside the closure. Also, not having to “skip” some elements leaves the code with a nicer feel, and overall I’d argue it’s just better looking.

Overall Thoughts:

Overall, using closures with the array_* functions will definitely lead to cleaner, more concise, and easier to follow code. Unfortunately, there are a few rough spots. Like with most of the standard library, the argument order is inconsistent, which is a constant irritation. For example, for no apparent reason array_map is “callback, array” but array_filter is “array, callback”. Also, another irritation is that the “index” isn’t available inside several of the callbacks, like array_reduce or array_map.

Personally though, the biggest limitation is that none of the array_* functions will work with classes that implement the Traversable or Iterator interfaces. That means if you have a Doctrine_Collection and you want to reduce down to a single result you’re still stuck with a foreach(…).

Anyway, as always I’d love to hear other opinions in the comments.