End of Movember,
Just Under 2k Raised

November 30th marked the end of Movember for us. We raised a total of $1,848 for men’s health! I think those of us who participated, and our significant others, have also learned over the month that we just aren’t built for mustaches. Here are some before and after shots of each of us from the month. Thanks for donating!

Ashish

Jared

Daum

Let us know if you have any suggestions for our next run at it! Until next Movember, stay healthy.

Big Data: Black Friday & Twitter Streaming API

It’s that time of year again. Lines form outside the most popular retailers, filled with turkey-gorged shoppers eagerly awaiting this year’s biggest Black Friday deals. In an effort to curb their boredom, these shoppers take to Twitter to pass the time in line and share their shopping experiences. Since we’re not big shoppers ourselves, and certainly not fans of waiting in lines, we took a different approach to participating in Black Friday.

We decided to flex our big data muscles and hook into Twitter’s streaming API sample, which returns a random sampling of Twitter’s roughly 400 million tweets per day, and record every tweet mentioning Black Friday. To handle the streaming data, we set up a Storm cluster that processed close to 1 million Black Friday-related tweets and saved the data in a MySQL database we spun up on AWS.

For those of you not familiar, Storm is an open source distributed real-time computation system that can be used to reliably process unbounded streams of data. If you’re interested in the technical details, stay tuned: we’ll be putting out a separate blog post that walks you through what we did. Also, if you’d like a copy of the MySQL table with the tweet data, you can download it here.

We put together the infographic below based on the data we collected over the 24-hour period from Thursday at 8pm EST to Friday at 8pm EST. We hope you enjoy it.

Black Friday infographic (Setfive Consulting)

Doctrine2: Using ResultSetMapping and MySQL temporary tables

Note: I haven’t actually tried this in production, it’s probably a terrible idea.

We’ve been using MySQL temporary tables to run some analytics lately, and it got me wondering: how difficult would it be to hydrate Doctrine2 objects from these tables? We’ve primarily been using MySQL temporary tables to break apart complicated SQL queries, cache intermediate steps, and generally make debugging analytics a bit easier. Given that use case, this is a bit of a contrived example, but it’s still an interesting look inside Doctrine.

For argument’s sake, let’s say we’re using the FOSUserBundle and we have a table called “be_user” that looks something like:
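Something along these lines, assuming stock FOSUserBundle fields (the exact columns here are a guess, and the real bundle adds several more):

```sql
-- A plausible subset of the FOSUserBundle user table.
-- Stock FOSUserBundle also adds salt, roles, confirmation_token,
-- and several timestamp columns, omitted here for brevity.
CREATE TABLE be_user (
    id INT AUTO_INCREMENT NOT NULL,
    username VARCHAR(255) NOT NULL,
    email VARCHAR(255) NOT NULL,
    enabled TINYINT(1) NOT NULL,
    password VARCHAR(255) NOT NULL,
    last_login DATETIME DEFAULT NULL,
    PRIMARY KEY (id)
) ENGINE = InnoDB;
```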

Now, for some reason we’re going to end up creating a separate MySQL table (temporary or otherwise) with a subset of this data but identical columns:
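MySQL makes the “identical columns” part easy with `CREATE TABLE ... LIKE`. A minimal sketch (the table name and the `WHERE` clause are made up for illustration):

```sql
-- Clone the structure of be_user, then copy over a subset of its rows.
CREATE TEMPORARY TABLE be_user_subset LIKE be_user;

INSERT INTO be_user_subset
SELECT * FROM be_user WHERE enabled = 1;
```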

So how do we load data from this secondary table into Doctrine2 entities? It turns out it’s relatively straightforward. By using Doctrine’s createNativeQuery along with a ResultSetMapping, you can pull data out of the alternative table and get back regular User entities. One key point is that by using the DisconnectedClassMetadataFactory it’s actually possible to introspect your Doctrine entities at runtime, so you can add the ResultSetMapping fields dynamically.

Anyway, my code inside a Command to test this out ended up looking like:
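A sketch of that Command, reconstructed from the description above; the bundle namespace, entity class, command name, and the `be_user_subset` table are all assumptions:

```php
<?php

namespace Acme\DemoBundle\Command;

use Doctrine\ORM\Query\ResultSetMapping;
use Doctrine\ORM\Tools\DisconnectedClassMetadataFactory;
use Symfony\Bundle\FrameworkBundle\Command\ContainerAwareCommand;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;

class HydrateFromTempTableCommand extends ContainerAwareCommand
{
    protected function configure()
    {
        $this->setName('demo:hydrate-temp-table')
             ->setDescription('Hydrate User entities from a secondary table');
    }

    protected function execute(InputInterface $input, OutputInterface $output)
    {
        $em = $this->getContainer()->get('doctrine')->getManager();

        // Introspect the entity's metadata at runtime so we don't have to
        // hard-code the column list into the ResultSetMapping.
        $cmf = new DisconnectedClassMetadataFactory();
        $cmf->setEntityManager($em);
        $metadata = $cmf->getMetadataFor('Acme\DemoBundle\Entity\User');

        $rsm = new ResultSetMapping();
        $rsm->addEntityResult('Acme\DemoBundle\Entity\User', 'u');
        foreach ($metadata->fieldMappings as $fieldName => $mapping) {
            $rsm->addFieldResult('u', $mapping['columnName'], $fieldName);
        }

        // Query the secondary table, but hydrate regular User entities.
        $query = $em->createNativeQuery('SELECT * FROM be_user_subset', $rsm);

        foreach ($query->getResult() as $user) {
            $output->writeln($user->getUsername());
        }
    }
}
```

Note that entities hydrated this way are managed by the EntityManager like any others, so be careful about flushing changes back if the rows originated from a temporary table.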

Boston Tech Startup Spotlight: Tomorrowish

This is the first of a series of short blog posts highlighting interesting local Boston tech startups – up to bat is Tomorrowish, whose company tagline is “The First Social Media DVR”.

Website:  http://tomorrowish.com
Twitter:  @tomorrowish
Headquarters:  Cambridge, MA

Initially founded in 2011 under the name “TweePlayer”, the company was re-branded as “Tomorrowish” in 2012 and is currently headquartered in Cambridge, MA. The privately held company is now looking to raise $1.5 million in funding with a focus on the US market.

Tomorrowish’s tools and services are targeted at creators and viewers of digital media. For creators, the platform captures, curates, and streams social media commentary about their broadcast. That conversation is archived and synchronized in time with the media, so when the content is viewed at a later date, the audience can engage in both the current conversation and what others have said about particular moments during the show. Tomorrowish supplies creators with APIs and white-labeled, customizable widgets and services to stream the content.

Viewers have access to content from the Hulu library as well as from other media players such as YouTube and Vimeo. They can access content on http://www.tomorrowish.tv/ or through a similar feed set up on their content provider’s website.

The “brains” behind these services is what they call Tomorrowish Machine Curation (TMC), an algorithmic system that filters through the thousands of social media comments about a show and chooses the most interesting ones to display. It also uses standard metrics such as popularity and language, along with customized black-listing and white-listing rules, to further filter content. Additional manual filtering can be applied if the content provider wants to make sure a certain phrase, person, or keyword is included (or excluded).

Here’s a link to a YouTube video and accompanying SlideShare deck posted by Mick Darling, CEO of Tomorrowish, from a presentation given at Turner Broadcasting’s Media Camp demo day on September 12th, 2013.

http://www.youtube.com/watch?v=3hjV0Q-Sl1o&feature=youtu.be

http://www.slideshare.net/mickdarling/tomorrowish-pitchdeck

You can read more about how it works at http://tomorrowish.com/.

Stay tuned for the next Boston tech start-up spotlight!