Tech: If I were Yahoo!, what would I build?

Yahoo’s acquisition of RockMelt last week kicked off another round of armchair quarterbacking about the wisdom of acquiring so many startups purely for talent. Only time will validate the strategy, but I think the more interesting question is this: given Yahoo!’s position, what would you build with the influx of new talent?

Build a low cost, gaming focused tablet

With the popularity and penetration of tablets accelerating, now would be a great time for Yahoo! to enter the space. In addition to consumer interest, Android 4.3 is rock solid, OEMs are comfortable building “decent” tablets, and Amazon has proved that alternate Android app stores work. So what if Yahoo! built a low cost, gaming focused tablet, with an alternative app store and developer friendly terms? I think they’d be able to successfully capture the low to mid market and then primarily drive revenue via app and in-app purchases.

Double down on Fantasy Sports

One of the few Yahoo! properties that I actually see people visit is its Fantasy Sports offering. Given that, I think it makes sense for Yahoo! to double down on Fantasy and make it an absolutely killer product. Things like an open API, support for playing with real money, and Nate Silver style statistics would set Yahoo! Fantasy apart and ultimately restore the “cool” around the Yahoo! brand.

A killer second screen experience

Remember that tablet Yahoo! just built? Well why not leverage it to put relevant “pop culture” and “celeb gossip” content in front of Yahoo!’s users? It’s not sexy to talk about, but a lot of Yahoo!’s traffic is driven by this type of content and if they can monetize it more effectively than display advertising it should be an easy win. In addition, with the “new” Fantasy Sports available Yahoo! would be able to provide relevant content during live sports and monetize those users as well.

Anyway, whatever Yahoo! ends up building, I’m sure it’ll be a bold departure from its old path. They have the cash, the talent, and hopefully the vision to turn a once great Internet giant back into a real contender.

PHP: Does “big-o” complexity really matter?

Last week, a client of ours asked us to look at some code that was running particularly slowly. The code was powering an autocompleter that searched a list of high schools in the US and returned the schools that matched along with an identifying code. We took a look at the code, and it turns out the original developers had implemented a naive solution that was choking now that the list had grown to ~45k elements; I imagine they had only tested with a dozen or so. While implementing a slicker solution, we decided to benchmark a couple of different approaches to see how much the differences in “big-o” complexity really mattered.

The Problem

What we were looking at was the following:

– There is a CSV file that looks something like:

ID, SCHOOL NAME, STATE
2,NMSC DEPT OF ED & SVCS,IL
3,MY SCHOOL IS NOT LISTED DOMEST,NY
4,MY SCHOOL IS NOT LISTED-INTRNT,NY
8,DISTRICT COUNCIL 37 AFSCME,NY
20,AMERICAN SAMOA CMTY COLLEGE,AS
81,LANDMARK COLLEGE,VT

With data for about 45k schools.

  • On the frontend, there was a vanilla jQuery UI autocompleter that passed a state as well as “school name part” to the backend to retrieve autocomplete results.
  • The endpoint basically takes the state and school part, parses the available data, and returns the results as a JSON array.
  • So as an example, the function accepts something like {state: “MA”, query: “New”} and returns:
[
  {name: "New School", code: 1234},
  {name: "Newton South", code: 1234},
  {name: "Newtown High", code: 1234},
]

The Solutions

In the name of science, we put together a couple of solutions and benchmarked each one by running it 1000 times and calculating the min, max, and average times. Each solution is briefly described below, along with the name used for it in the results.

The initial solution that our client had been running read the entire CSV into a PHP array, then searched the PHP array for schools that matched the query. (readMemoryScan)
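
As a rough illustration of this approach, here is a minimal sketch. The original endpoint was PHP; Python is used here purely as a stand-in, and the function name, the sample rows, and the `id, school name, state` column order (taken from the sample rows above) are all assumptions rather than the client's actual code.

```python
import csv
import io

# Hypothetical sample in the row layout shown earlier: id, school name, state.
SAMPLE_CSV = """2,NMSC DEPT OF ED & SVCS,IL
3,MY SCHOOL IS NOT LISTED DOMEST,NY
4,MY SCHOOL IS NOT LISTED-INTRNT,NY
8,DISTRICT COUNCIL 37 AFSCME,NY
81,LANDMARK COLLEGE,VT
"""


def read_memory_scan(csv_file, state, query):
    """Load every row into a list, then scan the whole list for matches."""
    rows = list(csv.reader(csv_file))  # the entire file is held in memory
    prefix = query.upper()
    return [
        {"name": name, "code": int(code)}
        for code, name, row_state in rows
        if row_state == state and name.startswith(prefix)
    ]


results = read_memory_scan(io.StringIO(SAMPLE_CSV), "NY", "my")
```

Every request pays to parse and hold all ~45k rows, even when only a handful match.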

A slightly better approach is doing the search “in-place” without actually reading the entire file into memory. (unsortedTableScan)
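
A minimal sketch of the streaming variant, again in Python rather than the original PHP. The function name, sample rows, and `id, school name, state` column order are assumptions for illustration.

```python
import csv
import io

# Hypothetical sample in the row layout shown earlier: id, school name, state.
SAMPLE_CSV = """2,NMSC DEPT OF ED & SVCS,IL
3,MY SCHOOL IS NOT LISTED DOMEST,NY
4,MY SCHOOL IS NOT LISTED-INTRNT,NY
8,DISTRICT COUNCIL 37 AFSCME,NY
81,LANDMARK COLLEGE,VT
"""


def unsorted_table_scan(csv_file, state, query):
    """Test each row as it is read instead of materializing the file first."""
    prefix = query.upper()
    matches = []
    for code, name, row_state in csv.reader(csv_file):  # one row at a time
        if row_state == state and name.startswith(prefix):
            matches.append({"name": name, "code": int(code)})
    return matches


results = unsorted_table_scan(io.StringIO(SAMPLE_CSV), "NY", "district")
```

The scan is still linear in the size of the file; the saving comes from not building the intermediate array.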

But can we take advantage of how the data is structured? Turns out we can. Since we’re looking for schools in a specific state whose names start with a search string, we can sort the file by STATE then SCHOOL NAME, which lets us abort the search as soon as we pass the last possible match. (sortedTableScan)
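
The early abort can be sketched as follows, in Python rather than the original PHP. The names and sample rows are hypothetical, and the sketch assumes the file was sorted by (state, name) using the same plain character ordering that Python uses when comparing strings.

```python
import csv
import io

# Hypothetical sample, pre-sorted by state and then school name.
# Row layout: id, school name, state.
SORTED_CSV = """20,AMERICAN SAMOA CMTY COLLEGE,AS
2,NMSC DEPT OF ED & SVCS,IL
8,DISTRICT COUNCIL 37 AFSCME,NY
3,MY SCHOOL IS NOT LISTED DOMEST,NY
4,MY SCHOOL IS NOT LISTED-INTRNT,NY
81,LANDMARK COLLEGE,VT
"""


def sorted_table_scan(csv_file, state, query):
    """Scan a (state, name)-sorted file and stop once no later row can match."""
    prefix = query.upper()
    matches = []
    for code, name, row_state in csv.reader(csv_file):
        if row_state == state and name.startswith(prefix):
            matches.append({"name": name, "code": int(code)})
        elif (row_state, name) > (state, prefix):
            # We are past the block of matching rows, so abort early.
            break
    return matches


results = sorted_table_scan(io.StringIO(SORTED_CSV), "NY", "my")
```

Rows that sort before the target are still read and skipped, so this remains a linear scan in the worst case; it just stops sooner on average.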

Since we’re always searching by STATE and SCHOOL NAME can we exploit this to cut down on the number of elements that need to be searched even further?

Turns out we can, by transforming the CSV file into a PHP array indexed by state and then writing that out as a serialized PHP object. Another detail we can exploit is that the autocompleter has a minimum search length of 3 characters, so we can build sub-arrays inside each state’s list of schools keyed on the first 3 letters of the school name. (serializednFileScan)

So the data structure we’d end up creating looks something like:

{
...
  "MA": {
  ...
   "AME": [...list of schools in MA starting with AME...],
   "NEW": [...list of schools in MA starting with NEW...],
  ...
  },
  "NJ": {
  ...
   "AME": [...list of schools in NJ starting with AME...],
   "NEW": [...list of schools in NJ starting with NEW...],
  ...
  },
  "CA": {
  ...
   "AME": [...list of schools in CA starting with AME...],
   "NEW": [...list of schools in CA starting with NEW...],
  ...
  },
...
}
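
A minimal sketch of building and querying this structure. It is written in Python rather than PHP, and JSON stands in for PHP's serialized format; the function names and sample rows are hypothetical.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical sample in the row layout shown earlier: id, school name, state.
SAMPLE_CSV = """2,NMSC DEPT OF ED & SVCS,IL
3,MY SCHOOL IS NOT LISTED DOMEST,NY
4,MY SCHOOL IS NOT LISTED-INTRNT,NY
8,DISTRICT COUNCIL 37 AFSCME,NY
81,LANDMARK COLLEGE,VT
"""


def build_index(csv_file):
    """Group schools by state, then by the first 3 characters of the name."""
    index = defaultdict(lambda: defaultdict(list))
    for code, name, state in csv.reader(csv_file):
        index[state][name[:3]].append({"name": name, "code": int(code)})
    return index


def indexed_lookup(index, state, query):
    """Jump straight to one small bucket, then filter it by the full prefix."""
    prefix = query.upper()
    bucket = index.get(state, {}).get(prefix[:3], [])
    return [school for school in bucket if school["name"].startswith(prefix)]


# Build once (offline), serialize, and load the serialized form per request.
serialized = json.dumps(build_index(io.StringIO(SAMPLE_CSV)))
index = json.loads(serialized)
results = indexed_lookup(index, "NY", "my school")
```

The lookup relies on the 3-character minimum: a shorter query would not correspond to any bucket key and returns nothing in this sketch.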

The results

Running each function 1000 times, recording the elapsed time of each run, and calculating the min, max, and average times, we ended up with these numbers:

test_name            min (sec.)  max (sec.)  average (sec.)
readMemoryScan       .662        .690        .673
unsortedTableScan    .532        .547        .536
sortedTableScan      .260        .276        .264
serializednFileScan  .149        .171        .154
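
For reference, a timing harness along these lines could look like the sketch below. This is a Python illustration rather than the PHP we actually ran, and the `benchmark` helper and the placeholder workload are hypothetical.

```python
import time


def benchmark(func, runs=1000):
    """Call func `runs` times and summarize the per-call wall-clock time."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return {
        "min": min(timings),
        "max": max(timings),
        "average": sum(timings) / len(timings),
    }


# Placeholder workload; in practice this would wrap one of the scan functions.
stats = benchmark(lambda: sum(range(1000)), runs=1000)
```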

And then graphing the averages gets you a graphic that looks like:

The most interesting metric is how the different autocompleters actually “feel” when you use them. We set up a demo at http://symf.setfive.com/autocomplete_test/ and it turns out a few hundred milliseconds makes a huge difference.

The conclusion

Looking at our numbers, even with relatively small data sets (<100k elements), the complexity of your algorithms matters. Even though the absolute differences are small, the responsiveness of the autocompleter varies dramatically across the implementations. Long story short? Pay attention in algorithms class.

Brainstorming: Three opportunities in the online dating space

I was hanging out with a friend of mine a couple of days ago and we started chatting about the online dating space. As we talked, we found ourselves agreeing that, to an outsider, the space seems interesting for a couple of reasons:

  1. Unlike the vast majority of the consumer web, people are actually conditioned to pay for online dating and there is an established SaaS model.
  2. Users seem to be open to trying new apps and sites – in the last few years several new ones have become popular, like Tinder, HowAboutWe, and Grindr.
  3. There’s a high degree of fragmentation with dozens of sites targeting specific user segments – Christianmingle.com, Farmersonly.com, and Bbbwdating.com

Obviously, life isn’t all rosy when starting or running an online dating site: there’s a “chicken/egg” problem, a significant user churn rate, and of course strong competition. Anyway, since Spark Networks (which owns JDate and Christian Mingle, among others) is a publicly traded company, I decided to skim through their annual report in search of interesting tidbits. Here are a few of the most interesting ones:

From The Spark Networks Annual Report

  • On most of our Web sites, the ability to initiate most communication with other members requires the payment of monthly subscription fees, which represents our primary source of revenue.
  • We hold two United States patents for our Click! technology, the first of which expires January 24, 2017, that pertain to an automated process for confidentially determining whether people feel mutual attraction or have mutual interests.
  • Click! is important to our business in that it is a method and apparatus for detection of reciprocal interests or feelings and subsequent notification of such results. The patents describe the method and apparatus for the identification of a person’s level of attraction and the subsequent notification when the feeling or attraction is mutual.
  • For the year ended December 31, 2012, we had 259,244 average paying subscribers, representing an increase of 32.0% from the year ended December 31, 2011.
  • Revenue for the year ended December 31, 2012 increased 27.3% to $61.7 million from $48.5 million in 2011.
  • Net (loss) Income and Net (loss) Income Per Share. Net loss was $15.0 million, or $0.72 per share, for the year ended December 31, 2012, compared to a net loss of $1.6 million or $0.08 per share in 2011.
  • For 2011, the CEO of Spark took home a total of $990,000 between cash and equity.

Some interesting stuff there. Looking at their revenue number, a little back of the envelope math suggests they’re averaging around $20/month per subscriber, since 260k users x $20/month x 12 months is roughly $62 million.
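
Working backwards from the exact figures quoted above gives nearly the same answer. The short Python check below assumes, as the estimate does, that all reported revenue comes from subscriptions; the variable names are just for illustration.

```python
# Figures quoted from the annual report for the year ended December 31, 2012.
revenue_2012 = 61_700_000        # dollars
avg_paying_subscribers = 259_244

# Assumption: treat all revenue as subscription revenue.
revenue_per_subscriber_per_month = revenue_2012 / avg_paying_subscribers / 12
print(f"${revenue_per_subscriber_per_month:.2f} per subscriber per month")
```

That works out to a little under $20 per subscriber per month.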

Ok great, but where are the opportunities? Looking at AngelList, the majority of startups seem to be focusing on building traditional dating websites with a unique hook like “date friends of friends”, “ivy league dating”, or “gamified dating”. Unfortunately, given the intense competition, the chicken/egg problem, and the capital that companies like Spark and IAC are spending on marketing, I don’t think starting a new dating site is currently the best play. I think “selling the tools to the miners” is the best bet right now, potentially around a couple of themes.

Concierge Service

Think a virtual assistant like FancyHands but specifically aimed at online dating. Need a couple of great date ideas? The concierge has you covered. Need some help writing or polishing up your profile? Covered. From a business perspective, you’d generate monthly subscription revenue, with the opportunity for additional revenue via referrals, while costs would increase roughly linearly with the number of customers. There are a couple of companies doing this already but no one has serious traction.

Meta Search Engine

It’s 2013 and users are still effectively stuck searching one dating website at a time looking for “Mr. Right”. You would fix this by building a “meta search engine” that lets users search all the sites they’re a member of at the same time from a single interface. There are obviously technological as well as legal constraints, but I think the potential value add is huge. Running this as a business would be tough since you’d always be at the mercy of the individual sites, who would be well within their rights to shut you down, as Craigslist did with PadMapper. But who knows, maybe you could negotiate favorable terms in exchange for pushing users to register for new sites.

Browser Extension Power Tools

The goal would be to build a handful of “power tools” to improve the overall experience of online dating. Example tools might be a Bayesian spam filter that learns what users flag as “spammy” and automatically blocks similar messages, or an “expert advisor” that analyzes the messages you send and recommends changes to improve your response rate. The business model would be simple: sell the extension in the Chrome Web Store and charge a monthly subscription fee.
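
To make the spam filter idea a bit more concrete, here is a minimal naive Bayes sketch in Python. A real browser extension would be written in JavaScript and would need far more training data and better tokenization; the class name, labels, and training messages here are all made up for illustration.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Tiny word-count naive Bayes classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ok": Counter()}
        self.message_counts = Counter()

    def train(self, text, label):
        """Record one message the user flagged as 'spam' or left as 'ok'."""
        self.message_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def _log_score(self, words, label, vocab_size):
        counts = self.word_counts[label]
        total_words = sum(counts.values())
        total_messages = sum(self.message_counts.values())
        # Smoothed log prior for the label.
        score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
        # Smoothed log likelihood of each word given the label.
        for word in words:
            score += math.log((counts[word] + 1) / (total_words + vocab_size))
        return score

    def is_spam(self, text):
        words = text.lower().split()
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ok"])
        vocab_size = len(vocab) + 1  # +1 reserves probability for unseen words
        spam_score = self._log_score(words, "spam", vocab_size)
        ok_score = self._log_score(words, "ok", vocab_size)
        return spam_score > ok_score


spam_filter = NaiveBayesSpamFilter()
spam_filter.train("click here for free pics", "spam")
spam_filter.train("free cam show click now", "spam")
spam_filter.train("hey loved your hiking photos", "ok")
spam_filter.train("want to grab coffee this weekend", "ok")

flagged = spam_filter.is_spam("click for free show")
allowed = not spam_filter.is_spam("coffee and hiking this weekend")
```

Each time a user flags or keeps a message, the extension would call `train`, so the filter adapts to what that particular user considers spammy.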

Anyway, just some off the cuff ideas – would love to hear any feedback or other ideas.

Musings: 3 reasons if Google Wallet owns the pipes it’ll be a win

Last week at Google I/O, Google announced a set of sweeping changes to their Wallet product. CNET has a decent rundown of what they announced, but it basically boils down to the ability to “send money with Gmail”, Wallet integration into Chrome to decrease payment friction, and “instant buy” with Google+. All in all, the announcements are interesting, but I think what’s more exciting is the potential for Google to truly innovate in the payments space.

In the last few years, companies like Square, Dwolla, and Stripe have been innovating in the payments space, but they’ve all been reliant on existing credit card infrastructure. With the exception of using Dwolla as a replacement for a check, each of these companies still relies on charging a user’s credit card to complete the transaction. I think this infrastructure piece is the linchpin for Google Wallet. If Google can sidestep the existing payments infrastructure for Wallet, like they did with the telcos for Fiber, they’ll end up redefining how digital payments work.

Ok, so say they own the infrastructure. Now what can they do?

Better risk analysis, lower costs

As far as processing payments go, cost is ultimately one of the most important factors used in picking a processor. The pricing is so opaque that FeeFighters basically built and sold a business simply by explaining in straightforward terms which processor was the best for your business. If Google had the freedom of controlling the pipes, they’d be able to lower their pricing below everyone else by introducing better risk analysis tools into their payment solutions.

Looking at how the APIs from companies like Authorize.net work, they basically only accept the minimum information required to charge a credit card and nothing more. Google would be able to modernize this by incorporating additional “verifying details” about a user to reduce the risk on a transaction. For example, a charge originating from a 2-factor authenticated Google Wallet user that is at their “home” computer is obviously much less of a risk than an anonymous user using a credit card for the first time. By segmenting risk by user, device, as well as transaction type Google would be able to offer the best rates for “normal” transactions and also accept “high risk” transactions.

Give NFC payments some teeth

Google tried to push out the NFC powered version of Google Wallet in 2011/12, but it was immediately blocked by major American carriers because it competed directly with their ISIS solution. It shouldn’t come as a surprise that the telcos didn’t want to get relegated to “dumb pipes” for payments as well, but it’s also not like ISIS has garnered any real traction.

If Google controlled the entire stack and could successfully convert Android users to Wallet users, they’d be able to essentially pay the carriers “blood money” to lift the Wallet ban to drive adoption and then hopefully reach a more permanent deal.

Ultimately, true mobile payments need to be freed from the existing credit card restrictions and Google could be poised to deliver just that.

Micropayments that work

People have been talking about “easy” micropayments on the Internet for several years, but they haven’t really panned out. Even today, charging someone $1 for something is a huge PITA and really isn’t practical. Between fees and long payment forms, micropayments still aren’t economically feasible.

With Wallet integrated into Chrome and the infrastructure under their control, Google would be able to tackle this head on by reducing the friction to completing a payment and offering different pricing models for micropayments. Think 2 click checkouts for transactions under $5 and a monthly fee of $5 for merchant accounts in good standing instead of transaction fees.

Despite some reservations, I’m excited to see what Google ends up doing with Wallet and how it ultimately influences the payments space. Another big question is what Facebook is going to do. Revamp Facebook Credits? Start offering co-branded Facebook credit cards?

Anyway, thoughts or comments welcome.

Gadgets: 5 gadgets for your summer wishlist

Over the weekend, Fred Wilson posted an awesome video of the unboxing and flight of a Parrot AR drone, along with a note that he was planning to grab one and develop some custom node.js code for it. After seeing the video, and with spring finally here, I started brainstorming about what gadgets I’d want to play with over the summer.

Parrot AR Drone

Shown in the video linked above, the Parrot AR Drone is a remote controlled 4 rotor helicopter that is controlled via an iOS or Android device. What sets the Parrot apart from similar devices is that there is a node.js library that simplifies developing custom functionality on the Parrot platform.

Not exactly sure what we’d be looking to build with an AR drone but the Red Bull Air Race comes to mind.

Sphero Robotic Ball

Built by Boulder, CO based Orbotix, the Sphero robotic ball is a gyroscopically stabilized ball that can be controlled using an iOS or Android device. The Sphero has an SDK, and there’s also an active app store for downloading pre-built apps that work with your Sphero.

Just brainstorming, but something awesome to build with a Sphero would be an app to draw out large drawings using the Sphero to actually draw the lines. Imagine drawing a 50’x50′ line art graphic by uploading some art and then letting the Sphero roll around the canvas.

Pebble watch

Born on Kickstarter, the Pebble watch is an indie entrant into the “smartwatch” space. Sporting iOS and Android integration via Bluetooth along with a scriptable watch face, the Pebble is shaping up to be an interesting player in a developing market.

As far as development, writing custom faces to visualize information differently or pull data off a smartphone seems to be pretty exciting. It still seems a bit early to get a sense of how the Pebble will fare long term as a platform though.

Jawbone UP

Jawbone is primarily known for its speaker systems and Bluetooth headsets, but the Jawbone UP is a personal activity monitor that helps users track their physical activity, sleep cycles, and eating habits. The UP fits into the trending theme of the quantified self, where users track KPIs about their daily life in an effort to iterate and improve. Pulling data off the UP is relatively easy and it also plugs into RunKeeper.

The “quantified self” concept sounds like it would be interesting to experiment with and using the UP to try it out seems like an obvious choice. Leveraging the UP would also make it easy to “compete” with anyone else looking to jump into activity tracking.

Raspberry Pi

Released last year after intense anticipation, the Raspberry Pi is basically a six square inch board with a fully featured computer, including video output and USB ports. Coming in at $25 or $35, the Raspberry Pi is cheap enough to experiment with, hack on, and, if it comes to it, break. With full Linux support, the Raspberry Pi is also robust enough to handle “serious business”.

Looking at the list of Raspberry Pi Hacks, there’s definitely some awesome inspiration to build something cool. Using a Pi to power a TV screen with real time interactive content seems like it might be an early winner though – we’ll see where that goes.

Anyway, that’s my list. Unfortunately, I’m not sure what I’ll actually get around to hacking on this summer. Would love to hear about any other cool gadgets or hacks.