Startups: You should be building a minimum viable business

Published in 2011, Eric Ries’s book The Lean Startup became the blueprint that dozens of “web 2.0” companies were built against. At a high level, the book promotes an “agile”-like approach to developing a new company. Loosely speaking, the “happy path” to building a new startup involves synthesizing ideas into a minimum viable product (MVP), receiving feedback, rapidly iterating, and finally reaching product-market fit. Due to its simplicity and wide applicability, the “lean startup methodology” has become wildly popular, especially among first-time entrepreneurs. With this popularity, the concept of an “MVP” has basically become a buzzword rallying cry to justify whatever people happen to be building.

Unfortunately, a common problem we’ve noticed is that people focus solely on the MVP and end up neglecting the other parts of the business, notably sales and marketing. To combat this, you should really be thinking of building a minimum viable business (MVB) instead of just an MVP. OK great, so what are the components of an MVB?

Product (MVP)

The product that you’re actually building is ultimately going to be one of the most important parts of your new business. Surprisingly, I’d argue that the exact features that make it into the MVP aren’t terribly important. What is key is that your users experience an “aha!” moment while using the product, which will help you convey its value. Another key takeaway is that less is more. Start small and grow the product to avoid leaving users overwhelmed, confused, and discouraged.

User Acquisition

Awesome. So you’ve built a fantastic product; now how are you going to get it in front of people? First-time tech founders often overlook building an effective user acquisition strategy, and it ends up being a major risk factor for them. Hallmarks of an effective strategy are that it’s replicable, measurable, and generally affordable. Because of this, “getting great PR”, “going viral”, or “buying a Super Bowl commercial” generally don’t qualify as tractable strategies. Instead, things like paid search, affiliate marketing, and native content ads would be more reasonable strategies to consider.

Community

Community is another area that is often overlooked. Even though it’s usually associated with B2C companies, it’s still important for B2B companies. On the B2C side, the majority of users are less likely to engage with “empty” social sites since no one wants to be the only person at the party. You’ll need a strategy to seed any social features your site has and also keep the heartbeat there once you launch. From a B2B perspective, you’ll still need to think about things like answering support issues, writing newsletters, and generating blog content. Although they seem trivial, having an actionable plan for these “community” issues helps establish user trust and win brand champions.

Metrics

The last piece of the “minimum viable business” is the set of KPIs and metrics that you’re looking to track. Tracking key indicators is important because they provide a yardstick to let you know if you’re moving in the right direction. A key point is to make sure you’re tracking useful numbers. Vanity metrics like “# of followers” or “page views” aren’t really going to help you determine the health of your business. You’ll need numbers like “customer acquisition cost” or “lifetime value”, which help you distill how your company is doing.
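As a quick illustration of why those two numbers matter (the formulas are the standard simple versions and the figures below are made up, not from any real company):

```php
<?php
// Hypothetical worked example: these formulas and figures are
// illustrative only.

// Customer acquisition cost: total sales & marketing spend
// divided by customers acquired over the same period.
function customerAcquisitionCost($spend, $customersAcquired)
{
    return $spend / $customersAcquired;
}

// Simple lifetime value estimate: monthly revenue per customer
// times gross margin times expected lifetime in months.
function lifetimeValue($monthlyRevenue, $grossMargin, $lifetimeMonths)
{
    return $monthlyRevenue * $grossMargin * $lifetimeMonths;
}

// $10k of marketing that lands 200 customers costs $50 per customer...
echo customerAcquisitionCost(10000, 200) . "\n"; // 50
// ...which looks healthy against a $504 lifetime value.
echo lifetimeValue(30, 0.7, 24) . "\n"; // 504
```

If LTV comfortably exceeds CAC, each new customer is worth acquiring; if it doesn’t, no amount of “# of followers” will save the business.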

That’s a wrap

Wrapping up, building an MVP is just one of the components of building a successful startup. You’ll need to consider several other important aspects which will help you build, measure, and iterate along your path to building a successful company.

As with all advice, just remember that 90% of all advice is bullshit.

Tech: The 3 mistakes that doomed the Facebook Platform

Yesterday afternoon, PandoDaily’s Hamish McKenzie published a post titled Move fast, break things: The sad story of Platform, Facebook’s gigantic missed opportunity. The post outlined the lofty expectations and ultimate failures of the Facebook Platform. Central to Hamish’s piece was the thesis that a series of missteps by Facebook alienated developers and eventually pushed the platform into obscurity.

With the benefit of hindsight, I’d argue there were actually only three major mistakes that ended up dooming the Facebook Platform.

Lack of payments

Hamish mentions this, but I think the lack of payments across the platform was the source of many of its problems. With no seamless way to charge users for either “installs” themselves or “in-app purchases”, developers were forced to play the eyeball game and, as a consequence, were left clinging to the “viral loop”. Facebook Credits ended up being a non-starter, and as the Zynga spat demonstrated, the 30% haircut was untenable. In a world where Facebook had launched “card on file” style micropayments with the Platform, maybe we’d be exchanging “Facebook Credits” at Christmas.

No sponsored feed placements

Without on-platform payments, developers were essentially left chasing Facebook’s “viral loop” to drive new users, eyeballs, and hopefully, eventually, revenues. Developers started gaming the system, generating what users perceived as spam, and ultimately forcing Facebook to change notifications. I’d argue that had developers originally had some way to pay for sponsored feed placements, they would have been less likely to chase virality. Along with the functionality to sponsor feed posts, Facebook undoubtedly would have ended up building rate limits and other spam-fighting measures to protect the “sponsored post” product, which ultimately would have helped the platform.

Everything tied to Connect

Even today, one of the most popular components of the Facebook Platform is the Connect single sign-on piece. The problem was, and to some extent still is today, that everything was tied to Connect. Even if you were just logging into a site with Connect, the site still had access to your entire Facebook account. Facebook eventually fixed this, but it opened the floodgates of every site posting unwanted updates, breaching user trust, and hurting the credibility of the entire platform.

The PandoDaily piece has a deeper exploration of what drove the decline of the Facebook Platform but I think lack of payments, sponsored feed posts, and the tie in with Connect put the platform in a difficult position from day one.

PHP: Does “big-o” complexity really matter?

Last week, a client of ours asked us to look at some code that was running particularly slowly. The code was powering an autocompleter that searched a list of high schools in the US and returned the schools that matched along with an identifying code. We took a look at the code, and it turns out the original developers had implemented a naive solution that was choking now that the list had grown to ~45k elements; I imagine they had only tested with a dozen or so. During the process of implementing a slicker solution, we decided to benchmark a couple of different approaches to see how much the differences in “big-o” complexity really mattered.

The Problem

What we were looking at was the following:

– There is a CSV file that looks something like:

ID,SCHOOL NAME,STATE
2,NMSC DEPT OF ED & SVCS,IL
3,MY SCHOOL IS NOT LISTED DOMEST,NY
4,MY SCHOOL IS NOT LISTED-INTRNT,NY
8,DISTRICT COUNCIL 37 AFSCME,NY
20,AMERICAN SAMOA CMTY COLLEGE,AS
81,LANDMARK COLLEGE,VT

With data for about 45k schools.

  • On the frontend, there was a vanilla jQuery UI autocompleter that passed a state as well as “school name part” to the backend to retrieve autocomplete results.
  • The endpoint basically takes the state and school part, parses the available data, and returns the results as a JSON array.
  • So as an example, the function accepts something like {state: “MA”, query: “New”} and returns:
[
  {name: "New School", code: 1234},
  {name: "Newton South", code: 1234},
  {name: "Newtown High", code: 1234}
]

The Solutions

In the name of science, we put together a couple of solutions, benchmarked them by running them 1000 times and calculating the min/max/average times, and those values are graphed below. Each of the solutions is briefly described below along with how they’re referenced in the graph.

The initial solution that our client had been running read the entire CSV into a PHP array, then searched the PHP array for schools that matched the query. (readMemoryScan)
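A minimal sketch of what that initial approach might have looked like (the function name and the exact column handling are assumptions on my part):

```php
<?php
// readMemoryScan sketch: load the entire CSV into memory, then
// linearly filter every row on STATE and the school-name prefix.
function readMemoryScan($csvPath, $state, $query)
{
    $lines = file($csvPath, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    $results = array();

    foreach ($lines as $line) {
        list($code, $name, $rowState) = str_getcsv($line);

        // Keep rows in the requested state whose name starts with the query
        if ($rowState === $state && stripos($name, $query) === 0) {
            $results[] = array('name' => $name, 'code' => $code);
        }
    }

    return $results;
}
```

Every keystroke pays for reading all ~45k rows into memory and then touching every one of them.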

A slightly better approach is doing the search “in-place” without actually reading the entire file into memory. (unsortedTableScan)

But can we take advantage of how the data is structured? Turns out we can. Since we’re looking for schools in a specific state whose name’s start with a search string we can sort the file by STATE then SCHOOL NAME which will let us abort the search early. (sortedTableScan)
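A sketch of that early-abort scan, assuming the file has been pre-sorted by STATE then SCHOOL NAME (the function name is mine):

```php
<?php
// sortedTableScan sketch: stream rows with fgetcsv() and stop as
// soon as we've moved past the block of possible matches.
function sortedTableScan($csvPath, $state, $query)
{
    $results = array();
    $seenState = false;
    $fh = fopen($csvPath, 'r');

    while (($row = fgetcsv($fh)) !== false) {
        list($code, $name, $rowState) = $row;

        if ($rowState !== $state) {
            if ($seenState) {
                break;              // past this state's block: abort early
            }
            continue;               // haven't reached the state's block yet
        }
        $seenState = true;

        $cmp = strcasecmp(substr($name, 0, strlen($query)), $query);
        if ($cmp === 0) {
            $results[] = array('name' => $name, 'code' => $code);
        } elseif ($cmp > 0) {
            break;                  // names are sorted: no more matches ahead
        }
    }

    fclose($fh);
    return $results;
}
```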

Since we’re always searching by STATE and SCHOOL NAME can we exploit this to cut down on the number of elements that need to be searched even further?

Turns out we can by transforming the CSV file into a PHP array indexed by state and then writing that out as a serialized PHP object. Another detail we can exploit is that the autocompleter has a minimum search length of 3 characters so we can actually build sub-arrays inside the list of schools keyed on the first 3 letters of their name (serializednFileScan).

So the data structure we’d end up creating looks something like:

{
...
  "MA": {
  ...
   "AME": [...list of schools in MA starting with AME...],
   "NEW": [...list of schools in MA starting with NEW...],
  ...
  },
  "NJ": {
  ...
   "AME": [...list of schools in NJ starting with AME...],
   "NEW": [...list of schools in NJ starting with NEW...],
  ...
  },
  "CA": {
  ...
   "AME": [...list of schools in CA starting with AME...],
   "NEW": [...list of schools in CA starting with NEW...],
  ...
  },
...
}
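The lookup against that structure might look something like this (the function name and the exact array shape inside each bucket are assumptions):

```php
<?php
// serializedFileScan sketch: load the pre-built state/prefix index
// and only scan the one small bucket the query maps to.
function serializedFileScan($indexPath, $state, $query)
{
    $index = unserialize(file_get_contents($indexPath));

    // The autocompleter guarantees at least 3 characters, so the
    // first 3 letters of the query select the bucket inside the state.
    $prefix = strtoupper(substr($query, 0, 3));
    $bucket = isset($index[$state][$prefix]) ? $index[$state][$prefix] : array();

    $results = array();
    foreach ($bucket as $school) {
        if (stripos($school['name'], $query) === 0) {
            $results[] = $school;
        }
    }

    return $results;
}
```

Instead of scanning ~45k rows, each request unserializes the index and scans only the handful of schools sharing the query’s state and 3-letter prefix.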

The results

Running each function 1000 times, recording the elapsed time between results, and calculating the min / max / and average times we ended up with these numbers:

test_name            min (sec.)   max (sec.)   average (sec.)
readMemoryScan       .662         .690         .673
unsortedTableScan    .532         .547         .536
sortedTableScan      .260         .276         .264
serializednFileScan  .149         .171         .154

And then graphing the averages gets you a graphic that looks like:

The most interesting metric is how the different autocompleters actually “feel” when you use them. We set up a demo at http://symf.setfive.com/autocomplete_test/ so you can try them yourself. Turns out, a few hundred milliseconds makes a huge difference.

The conclusion

Looking at our numbers, even with relatively small data sets (<100k elements), the complexity of your algorithms matters. Even though the actual differences in the numbers are small, the responsiveness of the autocompleter varies dramatically across the implementations. Anyway, so long story short? Pay attention in algorithms class.

Symfony2: Using kernel events like preExecute to log requests

A couple of days ago, one of our developers mentioned wanting to log all the requests that hit a specific Symfony2 controller. Back in Symfony 1.2, you’d be able to easily accomplish this with a “preExecute” function in the specific controller that you want to log. We’d actually set something similar to this up and the code would end up looking like:
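For reference, a sketch of how that might have looked in symfony 1.2 (the module and action names are made up, and this obviously won’t run outside a symfony project):

```php
<?php
// symfony 1.2 sketch: preExecute() runs before every action in the
// module, so the logging lives in one place.
class schoolActions extends sfActions
{
  public function preExecute()
  {
    // Log the method and URI of every request hitting this module
    $this->logMessage(sprintf(
      'Request: %s %s',
      $this->getRequest()->getMethod(),
      $this->getRequest()->getUri()
    ), 'info');
  }

  public function executeIndex(sfWebRequest $request)
  {
    // normal action code...
  }
}
```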


Symfony2 doesn’t have a “preExecute” hook in the same fashion as 1.2, but using the event system you can accomplish the same thing. What you’ll basically end up doing is configuring an event listener on the “kernel.controller” event, injecting the EntityManager (or kernel), and then logging the request.

The pertinent service configuration in YAML looks like:
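Something along these lines (the service name, class, and bundle namespace are all placeholders):

```yaml
services:
    acme.controller_listener:
        class: Acme\DemoBundle\EventListener\ControllerListener
        arguments: ["@doctrine.orm.entity_manager"]
        tags:
            - { name: kernel.event_listener, event: kernel.controller, method: onKernelController }
```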

And then the corresponding class looks something like:
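A sketch of the listener itself (the RequestLog entity and its setters are hypothetical; swap in whatever you actually want to persist):

```php
<?php
// kernel.controller fires before the controller runs, which is the
// closest Symfony2 analogue to symfony 1.2's preExecute().

namespace Acme\DemoBundle\EventListener;

use Doctrine\ORM\EntityManager;
use Symfony\Component\HttpKernel\Event\FilterControllerEvent;

class ControllerListener
{
    private $em;

    public function __construct(EntityManager $em)
    {
        $this->em = $em;
    }

    public function onKernelController(FilterControllerEvent $event)
    {
        $request = $event->getRequest();

        // Persist a (hypothetical) RequestLog entity for each request
        $log = new \Acme\DemoBundle\Entity\RequestLog();
        $log->setUri($request->getUri());
        $log->setCreatedAt(new \DateTime());

        $this->em->persist($log);
        $this->em->flush();
    }
}
```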

And that’s about it.