Tech: The 3 mistakes that doomed the Facebook Platform

Yesterday afternoon, PandoDaily’s Hamish McKenzie published a post titled Move fast, break things: The sad story of Platform, Facebook’s gigantic missed opportunity. The post outlined the lofty expectations and ultimate failures of the Facebook Platform. Central to Hamish’s piece was the thesis that a series of missteps by Facebook alienated developers and eventually pushed the platform into obscurity.

With the benefit of hindsight, I’d argue there were actually only three major mistakes that ended up dooming the Facebook Platform.

Lack of payments

Hamish mentions this, but I think the lack of payments across the platform was the source of many of its problems. With no seamless way to charge users for either “installs” themselves or “in-app purchases”, developers were forced to play the eyeball game and, as a consequence, were left clinging to the “viral loop”. Facebook Credits ended up being a non-starter and, as the Zynga spat demonstrated, the 30% haircut was unworkable. In a world where Facebook had launched “card on file” style micropayments with the Platform, maybe we’d be exchanging Facebook Credits at Christmas.

No sponsored feed placements

Without on-platform payments, developers were essentially left chasing Facebook’s “viral loop” to drive new users, eyeballs, and, hopefully, eventual revenue. Developers soon started gaming the system, generating what users perceived as spam and ultimately forcing Facebook to rein in notifications. I’d argue that had developers originally had some way to pay for sponsored feed placements, they would have been less likely to chase virality. And to protect a “sponsored post” product, Facebook would undoubtedly have built rate limits and other spam-fighting measures, which would ultimately have helped the platform.

Everything tied to Connect

Even today, one of the most popular components of the Facebook Platform is Connect, the single sign-on piece. The problem was, and to some extent still is, that everything was tied to Connect. Even if you were just logging into a site with Connect, that site still had access to your entire Facebook account. Facebook eventually fixed this, but not before the floodgates opened: sites posted unwanted updates, breached user trust, and hurt the credibility of the entire platform.

The PandoDaily piece has a deeper exploration of what drove the decline of the Facebook Platform, but I think the lack of payments, the absence of sponsored feed posts, and the tie-in with Connect put the platform in a difficult position from day one.

PHP: Does “big-O” complexity really matter?

Last week, a client of ours asked us to look at some code that was running particularly slowly. The code powered an autocompleter that searched a list of high schools in the US and returned the matching schools along with an identifying code. We took a look, and it turned out the original developers had implemented a naive solution that started choking once the list grew to ~45k elements; I imagine they had only tested with a dozen or so. While implementing a slicker solution, we decided to benchmark a couple of different approaches to see how much the differences in “big-O” complexity really mattered.

The Problem

What we were looking at was the following:

  • There is a CSV file, with data for about 45k schools, that looks something like:

ID,SCHOOL NAME,STATE
2,NMSC DEPT OF ED & SVCS,IL
3,MY SCHOOL IS NOT LISTED DOMEST,NY
4,MY SCHOOL IS NOT LISTED-INTRNT,NY
8,DISTRICT COUNCIL 37 AFSCME,NY
20,AMERICAN SAMOA CMTY COLLEGE,AS
81,LANDMARK COLLEGE,VT

  • On the frontend, there was a vanilla jQuery UI autocompleter that passed a state as well as a “school name part” to the backend to retrieve autocomplete results.
  • The endpoint basically takes the state and school part, parses the available data, and returns the results as a JSON array.
  • So as an example, the endpoint accepts something like {state: "MA", query: "New"} and returns:

[
  {name: "New School", code: 1234},
  {name: "Newton South", code: 1234},
  {name: "Newtown High", code: 1234}
]
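
A minimal sketch of that endpoint might look like the following (the request parameter names are assumptions on my part, and the search is delegated to one of the implementations sketched below):

<?php
// Minimal endpoint sketch: pull the state and query from the request,
// delegate to a search function, and return the matches as JSON.
header('Content-Type: application/json');
$state = isset($_GET['state']) ? $_GET['state'] : '';
$query = isset($_GET['query']) ? $_GET['query'] : '';
echo json_encode(unsortedTableScan($state, $query));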

The Solutions

In the name of science, we put together a couple of solutions, benchmarked each one by running it 1000 times and calculating the min/max/average times, and graphed those values below. Each solution is briefly described below along with how it’s referenced in the graph.
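
As a rough sketch, a harness along these lines (simplified, with illustrative function and parameter names) captures the numbers we were after:

<?php
// Illustrative benchmark harness: runs a search function 1000 times
// and reports the min/max/average elapsed times in seconds.
function benchmark($fn, $state, $query, $iterations = 1000) {
    $times = array();
    for ($i = 0; $i < $iterations; $i++) {
        $start = microtime(true);
        call_user_func($fn, $state, $query);
        $times[] = microtime(true) - $start;
    }
    return array(
        'min'     => min($times),
        'max'     => max($times),
        'average' => array_sum($times) / count($times),
    );
}

print_r(benchmark('sortedTableScan', 'MA', 'New'));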

The initial solution our client had been running read the entire CSV into a PHP array and then searched that array for schools matching the query. (readMemoryScan)
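
Sketched out (the file name and column handling here are illustrative, not the client’s actual code), that approach looks something like:

<?php
// Sketch of the original approach: buffer the entire CSV into memory,
// then linearly scan every row for matches.
function readMemoryScan($state, $query) {
    $rows = array();
    $fh = fopen('schools.csv', 'r');
    while (($row = fgetcsv($fh)) !== false) {
        $rows[] = $row; // [id, school name, state]
    }
    fclose($fh);

    $results = array();
    foreach ($rows as $row) {
        // Keep rows in the requested state whose name starts with the query
        if ($row[2] === $state && stripos($row[1], $query) === 0) {
            $results[] = array('name' => $row[1], 'code' => (int) $row[0]);
        }
    }
    return $results;
}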

A slightly better approach is doing the search “in-place” without actually reading the entire file into memory. (unsortedTableScan)
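
With the same illustrative file layout, streaming row by row looks like:

<?php
// Sketch: stream the CSV a row at a time instead of buffering it all,
// checking each row as it's read.
function unsortedTableScan($state, $query) {
    $results = array();
    $fh = fopen('schools.csv', 'r');
    while (($row = fgetcsv($fh)) !== false) {
        if ($row[2] === $state && stripos($row[1], $query) === 0) {
            $results[] = array('name' => $row[1], 'code' => (int) $row[0]);
        }
    }
    fclose($fh);
    return $results;
}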

But can we take advantage of how the data is structured? Turns out we can. Since we’re looking for schools in a specific state whose names start with a search string, we can sort the file by STATE then SCHOOL NAME, which lets us abort the search early. (sortedTableScan)
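
A sketch of the early-abort version, assuming the file has been pre-sorted (the sorted file name is an assumption):

<?php
// Sketch: schools_sorted.csv is assumed to be pre-sorted by STATE,
// then SCHOOL NAME, so we can stop reading as soon as no further
// matches are possible.
function sortedTableScan($state, $query) {
    $results = array();
    $fh = fopen('schools_sorted.csv', 'r');
    $inState = false;
    while (($row = fgetcsv($fh)) !== false) {
        if ($row[2] === $state) {
            $inState = true;
            $cmp = strcasecmp(substr($row[1], 0, strlen($query)), $query);
            if ($cmp === 0) {
                $results[] = array('name' => $row[1], 'code' => (int) $row[0]);
            } elseif ($cmp > 0) {
                break; // names are sorted, so nothing later can match
            }
        } elseif ($inState) {
            break; // we've read past the target state entirely
        }
    }
    fclose($fh);
    return $results;
}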

Since we’re always searching by STATE and SCHOOL NAME, can we exploit this to cut down the number of elements that need to be searched even further?

It turns out we can, by transforming the CSV file into a PHP array indexed by state and then writing that out as a serialized PHP array. Another detail we can exploit is that the autocompleter has a minimum search length of 3 characters, so we can build sub-arrays inside each state’s list of schools keyed on the first 3 letters of the school name. (serializedFileScan)

So the data structure we’d end up creating looks something like:

{
...
  "MA": {
  ...
   "AME": [...list of schools in MA starting with AME...],
   "NEW": [...list of schools in MA starting with NEW...],
  ...
  },
  "NJ": {
  ...
   "AME": [...list of schools in NJ starting with AME...],
   "NEW": [...list of schools in NJ starting with NEW...],
  ...
  },
  "CA": {
  ...
   "AME": [...list of schools in CA starting with AME...],
   "NEW": [...list of schools in CAA starting with NEW...],
  ...
  },
...
}
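
A sketch of building and querying that structure (file names are illustrative; the build step runs once, offline):

<?php
// One-time build step: index the CSV by state, then by the first
// three letters of the school name, and serialize the result to disk.
function buildIndex($csvPath, $indexPath) {
    $index = array();
    $fh = fopen($csvPath, 'r');
    while (($row = fgetcsv($fh)) !== false) {
        $prefix = strtoupper(substr($row[1], 0, 3));
        $index[$row[2]][$prefix][] = array('name' => $row[1], 'code' => (int) $row[0]);
    }
    fclose($fh);
    file_put_contents($indexPath, serialize($index));
}

// Lookup: load the index and scan only the one small bucket that
// could contain matches (the autocompleter guarantees 3+ characters).
function serializedFileScan($state, $query) {
    $index = unserialize(file_get_contents('schools_index.ser'));
    $prefix = strtoupper(substr($query, 0, 3));
    if (!isset($index[$state][$prefix])) {
        return array();
    }
    $results = array();
    foreach ($index[$state][$prefix] as $school) {
        if (stripos($school['name'], $query) === 0) {
            $results[] = $school;
        }
    }
    return $results;
}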

The Results

Running each function 1000 times, recording the elapsed time of each run, and calculating the min/max/average times, we ended up with these numbers:

test_name            min (sec.)   max (sec.)   average (sec.)
readMemoryScan       0.662        0.690        0.673
unsortedTableScan    0.532        0.547        0.536
sortedTableScan      0.260        0.276        0.264
serializedFileScan   0.149        0.171        0.154

And then graphing the averages gets you a graphic that looks like:

The most interesting metric is how the different autocompleters actually “feel” when you use them. We set up a demo at http://symf.setfive.com/autocomplete_test/ and it turns out a few hundred milliseconds makes a huge difference.

The Conclusion

Looking at our numbers, even with relatively small data sets (<100k elements), the complexity of your algorithms matters. Even though the absolute differences are small, the responsiveness of the autocompleter varies dramatically across the implementations. So, long story short? Pay attention in algorithms class.

Should you be using a CSS preprocessor? Probably.

We’ve been using CSS preprocessors for some time now but it wasn’t until recently that the reasons for using them really started coalescing for me. CSS preprocessors, like LESS or Sass, basically allow you to write CSS in a more powerful intermediate language which is then compiled down to normal CSS. Like a lot of developers, when I first started using LESS I had some reservations about introducing another layer of abstraction to our development stack. However, after developing a few reasonably sized projects with LESS I’m convinced that using a preprocessor is probably a great idea for many types of projects.

Avoiding the LESS vs. Sass discussion and looking at preprocessors as a class of tools, the clearest benefits are more expressiveness and better reusability.

More expressiveness

As a language, CSS is amazingly straightforward: it basically consists of selectors and rules, written in flat, plain-text files, which combine to style your HTML. Since there are no constructs for variables, conditionals, or functions, writing CSS is simple. Just fire up an editor and start making changes. The ease of writing and understanding CSS is certainly a benefit, but it comes at the price of how expressive the language can be.

Using selector specificity as an example, the benefits of the preprocessed files are clear.

Using regular CSS, you might end up with rules that look something like this (the selectors below are invented for illustration):
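
/* Plain CSS: the parent selector has to be repeated for every nested rule */
.sidebar { width: 300px; }
.sidebar .header { font-weight: bold; }
.sidebar .header a { color: #333; }
.sidebar .body { font-size: 12px; }
.sidebar .body a { color: #0066cc; }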

Versus the corresponding LESS:
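
/* LESS: nesting mirrors the structure of the HTML being styled */
.sidebar {
  width: 300px;
  .header {
    font-weight: bold;
    a { color: #333; }
  }
  .body {
    font-size: 12px;
    a { color: #0066cc; }
  }
}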

Looking at the two examples, the LESS uses nesting to express how the rules relate to one another and, because of this, has a higher information density than the regular CSS.

Another common issue where increased expressiveness helps is writing semantically meaningful CSS class names. With regular CSS, the tendency is to end up with rules that look something like this (again, invented for illustration):
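
/* Class names describe how elements look, not what they mean */
.small-red-text { color: red; font-size: 10px; }
.big-blue-button { background: #0066cc; color: white; padding: 10px 14px; }
.grey-left-box { float: left; border: 1px solid #999; }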

The trouble, of course, is that the class names are tied to their physical appearance as opposed to what they actually mean in your app. Although it’s possible to write semantic class names in vanilla CSS, the difficulty arises when you’re trying to uniformly apply things like colors, padding, and borders across a range of elements. Without variables and mixins it becomes significantly harder to manage or change these semantically named classes. Looking at how Twitter Bootstrap defines buttons, it’s clear how much more expressive the declarations are by keeping the colors in easily changed variables:
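
Paraphrasing from Bootstrap 2.x’s buttons.less (exact variable and mixin names vary between versions), each button variant is just a semantic class plus a mixin call fed by color variables:

// Semantic class names; the colors live in variables.less
.btn-primary {
  .buttonBackground(@btnPrimaryBackground, @btnPrimaryBackgroundHighlight);
}
.btn-warning {
  .buttonBackground(@btnWarningBackground, @btnWarningBackgroundHighlight);
}
.btn-danger {
  .buttonBackground(@btnDangerBackground, @btnDangerBackgroundHighlight);
}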

Better reusability

Another benefit preprocessors introduce is better code reusability. Although CSS has imports, the amount of code you can functionally reuse is pretty limited since existing rules can’t be included in new ones. The best you can really hope for is reusing a common stylesheet across projects. Comparatively, preprocessed CSS offers mixins and functions, both of which foster more reuse.

Some of the best real world examples of this are in the Twitter Bootstrap mixins.less file which includes mixins used throughout the framework. Additionally, projects that build upon the Bootstrap framework would also be able to leverage any functions or mixins Bootstrap defines further increasing reuse.

Anyway, weighing the benefits against the overhead of involving a preprocessor, I think for any reasonably sized project you’d almost always be better off developing with one. It obviously comes down to the specific project, but I’d be interested in hearing everyone else’s opinion.

Brainstorming: Three opportunities in the online dating space

I was hanging out with a friend of mine a couple of days ago and we started chatting about the online dating space. As we were talking, we both found ourselves nodding in agreement that, from the outside, the space seems relatively interesting for a couple of reasons:

  1. Unlike the vast majority of the consumer web, people are actually conditioned to pay for online dating and there is an established SaaS model.
  2. Users seem to be open to trying new apps and sites – in the last few years several new entrants have become popular, like Tinder, HowAboutWe, and Grindr.
  3. There’s a high degree of fragmentation with dozens of sites targeting specific user segments – Christianmingle.com, Farmersonly.com, and Bbbwdating.com

Obviously, life isn’t all rosy starting or running an online dating site, since there’s an obvious “chicken/egg” problem, a significant user churn rate, and of course strong competition. Anyway, since Spark Networks (which owns JDate and Christian Mingle, among others) is a publicly traded company, I decided to skim through their annual report in search of interesting tidbits. Here are a couple of the most interesting things:

From The Spark Networks Annual Report

  • On most of our Web sites, the ability to initiate most communication with other members requires the payment of monthly subscription fees, which represents our primary source of revenue.
  • We hold two United States patents for our Click! technology, the first of which expires January 24, 2017, that pertain to an automated process for confidentially determining whether people feel mutual attraction or have mutual interests.
  • Click! is important to our business in that it is a method and apparatus for detection of reciprocal interests or feelings and subsequent notification of such results. The patents describe the method and apparatus for the identification of a person’s level of attraction and the subsequent notification when the feeling or attraction is mutual.
  • For the year ended December 31, 2012, we had 259,244 average paying subscribers, representing an increase of 32.0% from the year ended December 31, 2011.
  • Revenue for the year ended December 31, 2012 increased 27.3% to $61.7 million from $48.5 million in 2011.
  • Net (loss) Income and Net (loss) Income Per Share. Net loss was $15.0 million, or $0.72 per share, for the year ended December 31, 2012, compared to a net loss of $1.6 million or $0.08 per share in 2011.
  • For 2011, the CEO of Spark took home a total of $990,000 between cash and equity.

Some interesting stuff there. Looking at their revenue number, a little back-of-the-envelope math suggests they’re averaging around $20/month in subscription revenue, since 260k subscribers x $20/month x 12 months is roughly $62 million.

Ok great, but where are the opportunities? Looking at AngelList, the majority of startups seem to be focusing on building traditional dating websites with a unique hook, like “date friends of friends”, “ivy league dating”, or “gamified dating”. Unfortunately, given the intense competition, the chicken/egg problem, and the capital that companies like Spark and IAC are spending on marketing, I don’t think starting a new dating site is currently the best play. I think “selling the tools to the miners” is the best bet right now, potentially around a couple of themes.

Concierge Service

Think a virtual assistant like FancyHands but aimed specifically at online dating. Need a couple of great date ideas? The concierge has you covered. Need some help writing or polishing up your profile? Covered. From a business perspective, you’d generate monthly subscription revenue with the opportunity to generate additional revenue via referrals, while costs would generally increase linearly. There are a couple of companies doing this already but no one has serious traction.

Meta Search Engine

It’s 2013 and users are still effectively stuck searching one dating website at a time looking for “Mr. Right”. You’d fix this by building a “meta search engine” that lets users search all the sites they’re a member of at the same time, from a single interface. There are obviously technological as well as legal constraints to this, but I think the potential value add is huge. Running this as a business would be tough since you’d always be at the mercy of the individual sites, who would be well within their rights to shut you down the way Craigslist shut down PadMapper. But who knows, maybe you could negotiate favorable terms in exchange for pushing users to register for new sites.

Browser Extension Power Tools

The goal would be to build a handful of “power tools” to improve the overall experience of online dating. Example tools might be a Bayesian spam filter that learns what users flag as “spammy” and automatically blocks similar messages, or an “expert advisor” that analyzes the messages you send and recommends changes to improve your response rate. The business model would be simple: sell the extension in the Chrome Web Store and charge a monthly subscription fee.

Anyway, just some off the cuff ideas – would love to hear any feedback or other ideas.