NodeJS: Running code in parallel with child_process

One of the nice things about nodejs is that, since the majority of its libraries are asynchronous, it has strong support for running IO heavy workloads concurrently. Even though node is single threaded, the event loop is able to make progress on separate operations at once because of the asynchronous style of its libraries. A canonical example would be something like fetching 10 web pages and extracting all the links from the fetched HTML. This fits into node’s computational model nicely since the most time consuming part of an HTTP request is waiting around for the network, during which node can use the CPU for something else. For the sake of discussion, let’s consider this sample implementation:
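As a rough stand-in for that kind of script, here is a minimal sketch of the idea. It is an illustration under stated assumptions, not the post’s original listing: the helper names `extractLinks` and `fetchAllLinks` are hypothetical, the page fetcher is passed in as a function so the example runs without network access, and links are pulled out with a simple regex rather than a real HTML parser.

```javascript
// Hypothetical sketch: fetch several pages concurrently and collect their links.

// Pull href values out of anchor tags with a simple regex. Good enough for a
// demo; a real crawler would use an HTML parser instead.
function extractLinks(html) {
  const links = [];
  const anchor = /<a\s[^>]*href\s*=\s*["']([^"']+)["']/gi;
  let match;
  while ((match = anchor.exec(html)) !== null) {
    links.push(match[1]);
  }
  return links;
}

// Start every request before awaiting any of them, so the network waits
// overlap instead of happening one after another.
// `fetchPage` is an assumed function: (url) => Promise<string of HTML>.
async function fetchAllLinks(urls, fetchPage) {
  const pages = await Promise.all(urls.map((url) => fetchPage(url)));
  return urls.map((url, i) => ({ url, links: extractLinks(pages[i]) }));
}
```

The key point is that `urls.map(...)` kicks off all of the requests up front, and `Promise.all` only waits for them to finish, which is what lets the event loop interleave the slow network waits.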

Request debugging is enabled so you’ll see that node starts fetching all the URLs at the same time and then the various events fire at different times for each URL:

So we’ve demonstrated that node can make progress on several things concurrently, but what happens if computationally intensive code is tying up the event loop? As a concrete example, imagine doing something like compressing the results of the HTTP request. For our purposes we’ll just throw in a while(1) so it’s easier to see what’s going on:
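To make the effect concrete, here is a small sketch of a blocking loop. The names `busyWait` and `processSequentially` are hypothetical and the durations are placeholders; the point is only that CPU-bound work on the single thread adds up rather than overlapping.

```javascript
// Spin the CPU for `ms` milliseconds. Nothing else on the event loop can run
// until this returns. Hypothetical stand-in for real work such as compression.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // intentionally empty: burn CPU until the deadline passes
  }
}

// Handle `count` "responses" on the single thread and return the total
// elapsed time in milliseconds.
function processSequentially(count, msEach) {
  const start = Date.now();
  for (let i = 0; i < count; i++) {
    busyWait(msEach);
  }
  return Date.now() - start;
}
```

With 10 responses and a 5 second spin each, the total would be at least 50 seconds, no matter how quickly the network requests themselves came back.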

If you run the script you’ll notice it takes much longer to finish since we’ve now introduced a while() loop that causes each URL to take at least 5 seconds to be processed:

And now back to the original problem: how can we fetch the URLs in parallel so that our script completes in around 5 seconds? It turns out it’s possible to do this in node with the child_process module. child_process basically lets you fire up a second nodejs instance and use IPC to pass messages between the parent and its child. We’ll need to move a couple of things around to get this to work and the implementation ends up looking like:
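The general shape can be sketched as follows. This is not the post’s original code: to keep the example in a single file, the child’s source is passed to a new node process with `-e` and the IPC channel is requested through the `stdio` option of `spawn`, whereas a real project would more likely keep the child in its own module and use `fork`. The names `processInChild`, `processAll` and `CHILD_SOURCE` are hypothetical, and the child simply spins for a while instead of fetching and parsing a page.

```javascript
const { spawn } = require('child_process');

// Source for the child process, run with `node -e`.
// It waits for one message, simulates CPU-heavy work, and replies over IPC.
const CHILD_SOURCE = `
  process.on('message', (msg) => {
    const end = Date.now() + msg.busyMs;
    while (Date.now() < end) {}
    process.send({ url: msg.url, pid: process.pid });
    process.disconnect();
  });
`;

// Launch one child for a URL, send it the URL over IPC, and resolve with
// whatever the child sends back.
function processInChild(url, busyMs) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-e', CHILD_SOURCE], {
      // The fourth entry asks node to set up an IPC channel with the child.
      stdio: ['ignore', 'inherit', 'inherit', 'ipc'],
    });
    child.once('message', resolve);
    child.once('error', reject);
    child.once('exit', (code) => {
      // No effect if the promise already resolved with the reply.
      if (code !== 0) reject(new Error('child exited with code ' + code));
    });
    child.send({ url, busyMs });
  });
}

// One child per URL, all started up front so the heavy work runs in parallel.
function processAll(urls, busyMs) {
  return Promise.all(urls.map((url) => processInChild(url, busyMs)));
}
```

Because each child is a separate operating system process, the busy loops run on different cores when they are available, so the total time is roughly one busy period plus process startup rather than the sum of all of them.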

What’s happening now is that we’re launching a child process for each URL we want to process, passing it a message with the target URL, and then passing the links back to the parent. Running it with a timer results in:

It isn’t exactly 5 seconds since there’s a non-trivial amount of time required to start each of the child processes but it’s around what you’d expect. So there you have it, we’ve successfully demonstrated how you can achieve parallelism with nodejs.

React Native: 48 hrs with React

Here at Setfive, when not helping our clients with their technology woes, we love experimenting with fun new technology and continuously growing our professional tool kit. As of late we have been throwing around some potential ideas for a Setfive iPhone app (get Chicken Pad Thai delivered no matter where you are in the world) and have been looking at a couple of tools to turn this lofty dream into a reality.

Since no one in the office has any significant experience with Objective-C or Swift, we decided that rather than bang our heads against the wall trying to learn the nuances of yet another programming language, we would stick with one that we already know and dip our toes into the JavaScript driven React Native ecosystem.

In the past we have taken our chances with other cross-platform tools (PhoneGap in particular) that allow you to wrap a web app in a web view. However, these methods fall short: if the looks of your app don’t bother you, then the serious performance hit that comes from interfacing directly with native objects will.

If you are unfamiliar with React Native, it’s an open source JavaScript framework, built by Facebook, that allows the production of iOS and Android applications using a syntax similar to HTML. What separates React from other frameworks is that it runs a separate JavaScript thread to control the UI of your application so it can utilize native mobile components. The idea is that this should lead to a seamless user experience that feels both polished and native. After some hands on experience we can say that React did not disappoint.

What we built

In order to test React a little bit further than the simple hello world example, we decided to build a simple application on top of our preexisting Rotorobot API. This app allows users to see the available players for a daily fantasy sports contest on that night.

To get started, we needed an index page that would show up when the app was loaded. Just to keep things simple we incorporated a minimalist layout with a UIButton that responds when pressed.

Upon pressing the button a new scene is added to the storyboard of our application. This scene is a ListView Component that has a row for every available slate of games that will be played that night. In addition, each one of these rows is also wrapped in a TouchableHighlight Component, which allows them to respond to touches.

Any row that you touch results in an AJAX request being made to the Rotorobot API and the available players in that slate are displayed, sorted in descending order based on salary.
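The sorting step can be sketched in a couple of lines of JavaScript. The player shape used here (`name` and `salary` fields) and the function name `sortBySalaryDesc` are assumptions for illustration, since the actual Rotorobot API response is not shown in this post.

```javascript
// Return a new array of players ordered from highest to lowest salary.
// Assumed shape of each player: { name: string, salary: number }.
function sortBySalaryDesc(players) {
  // Copy first so the original response array is left untouched.
  return [...players].sort((a, b) => b.salary - a.salary);
}
```

Copying before sorting keeps the data that came back from the request unchanged, which makes it easier to re-sort or filter later without refetching.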

Our Experience

Getting something up and running with React Native was definitely a lot easier than expected. It was honestly as simple as using npm to install react-native-cli and then creating a new React Native project using react-native init. The init command provides everything that you need to run a React Native application.

After your project has been set up, it’s time for all of you Xcode naysayers to either bite the bullet and download Xcode or look for guidance in the form of another SaaS solution.

In order to help you out we’ve provided links to a couple:

If you’re old fashioned like us and opted for the traditional route, then you installed Xcode. The react-native init command creates an Xcode project file inside the ios folder in your new project’s directory. Simply open this up using Xcode and you are off to the races and ready to build/run your project at will.

There was only one minor gotcha that reared its ugly head while trying to get our application up and running. The React Native Packager runs on node and needs a port to listen on. The default port is 8081, and if another process is already using that port your application will not be able to run. So before you try to run your Xcode project for the first time, it is worth doing a quick check to make sure that port 8081 is free using:

sudo lsof -i :8081

Other than this minor inconvenience you should be all set for development!

After an hour or two of playing with React Native and building a pretty simple app, the power and simplicity of this framework became clear to us. First and foremost, it was very refreshing to only have to run one or two npm commands and then be writing code minutes afterwards. Setup was quick and painless, which is always appreciated. During development we immediately noticed that developing our app felt just like developing for the web. Laying out the application was done using CSS flexbox, and was both quick and intuitive. Additionally, and probably more importantly, the framework just works. The UI components are native UIViews so naturally they look, feel, and behave the same as normal native components. We would definitely consider using React in the future and look forward to seeing how it improves and progresses from here.

Scala: Building with Eclipse and Maven

We’ve been writing a bit of Scala lately (more on that later) and one of the “gotchas” we ran into was adding a Maven project in the Scala IDE (Eclipse). We wanted to use Maven because we needed to manage some Java dependencies, are generally more familiar with it, and didn’t want to deal with figuring out sbt. It turns out, there’s an existing Maven archetype for building Scala projects but it takes a bit of finagling to get it to work in Eclipse.

From Eclipse

The first thing you’ll need to do is add a “Remote Catalog” to your list of available Maven archetypes. To do this, click through Window > Preferences and then, in the tree on the left, navigate through Maven > Archetypes and click Add Remote Catalog. From there, set the catalog file to http://repo1.maven.org/maven2/archetype-catalog.xml.

Once this is done, you’ll be able to File > New > Other and select Maven > Maven Project. On the archetype selection screen you’ll now be able to search for “net.alchim31.maven” which is what you’ll want to select.

When I tested this, there were a couple of problems with the project that the archetype created. To solve these issues I had to do the following:

  • The pom.xml was generated with a placeholder for my Scala version so I had to replace all the instances of “${scala.version}” in the pom with “2.11.7”. You’ll want to match this with the version of Scala you have installed.
  • junit wasn’t properly importing so the classes in test/ were throwing a compile error. I didn’t have any immediate testing needs so I deleted the entire test/ directory and removed the test related dependencies: junit, org.specs2, org.scalatest
  • The pom passes an invalid “-make:transitive” option to scalac which I just removed. It’s around line 51 inside the “args” block for scala-maven-plugin
  • The archetype also sets the compiler version to 1.6 which I bumped to 1.8

Creating a runnable JAR

Another common “gotcha” with Scala and Maven is creating a runnable JAR, so basically something you can run with “java -jar yourjar.jar”. This is a bit tricky with Scala since you have to package in the scala library along with your dependencies. And then on the Maven side, it seems like there’s a dozen ways to accomplish this successfully. I ended up using the maven-assembly-plugin with the following configuration:

And then you can compile and run like any other Maven project:

A working pom.xml

Copied below is the pom.xml file in all of its glory. Let me know if you run into any issues.

Apps: Five unique branded mobile apps

Unless you’ve been living under a rock for the last couple of years, it’s clear that smartphones are kind of a big deal. Today there are close to 2.6 billion subscriptions globally, and this number stands to grow rapidly as less developed markets turn into substantial electronic consumers.

With new applications and games hitting app stores daily, it’s no surprise that people are spending more and more time with their eyes glued to the glass screens of these compact, universal media agents. Phones have single-handedly changed the way that people live, becoming pivotal for people to communicate, go online, and access and share information.

The ubiquitous nature of the smartphone has opened the floodgates of opportunity, creating new markets in the process and forcing pre-existing ones to modernize or be strangled at the hands of innovation. One market in particular that has been significantly impacted by the mobile revolution is the advertising and marketing industry.

The rise of the Internet era has led to a rapid decrease in the effectiveness of traditional forms of advertising media, and has forced the hand of the industry to take the plunge into more digital forms. Marketing companies have to continuously find new and improved ways of reaching their targets in a world where TV and print simply will not do. One unique strategy that companies have used to connect with the masses and promote brand awareness is branded apps. Smartphone users spend close to 90% of their time on devices using apps, so they offer an incredible opportunity to connect with consumers.

Recently here at Setfive we have taken a look at building some brandable tools for mobile, and looked for the most original branded apps out there for some inspiration. Here are some of the most interesting apps that we found during our quest for inspiration.

Pop Secret Perfect Pop App

This app uses your phone’s microphone to listen to the pops coming from the bag of popcorn in your microwave, and tells you the precise moment when your bag of popcorn is at peak popped-ness.

Charmin SitOrSquat App

This app uses a map to display available public restrooms in your area and lets you know how clean they are (hence the name SitOrSquat). Additionally, this app utilizes crowdsourcing, letting users rate and write reviews about the public bathrooms that they use.

The Snow Report by North Face

So what has The North Face been up to besides making incredibly epic TV commercials? That’s easy, they’ve been building a location based mobile app that helps users check the powder conditions before they head to the slopes. Check the status of runs around you or the 10 top slopes globally.

Tide Stain Brain

When stains happen, StainBrain gives you simple solutions on the spot. Blot or soak? Cold water or hot? Get the scoop on how to rid yourself of more than 85 different stains with on-the-go tips and easy, step-by-step washing instructions from the laundry pros at Tide.

Bosch Unit Converter: Professional converter for over 50 units

The Bosch Professional Unit Converter turns a smartphone into a universal unit converter. The ideal app for quick conversions on the building site or in the workshop. Completely free of charge and ads.

ML: Taking AWS machine learning for a spin

I’ll preface this by saying that I know just enough about machine learning to be dangerous and get myself into trouble. That said, if anything is inaccurate or misleading let me know in the comments and I’ll update it. Last April Amazon announced Amazon Machine Learning, a new AWS service aimed at developers to help them build and deploy machine learning solutions. We’ve been excited to experiment with AWS ML since it launched but haven’t had a chance until just now.

A bit of background

So what is “machine learning”? Looking at Wikipedia’s definition, machine learning ‘is a subfield of computer science that evolved from the study of pattern recognition and computational learning theory in artificial intelligence. In 1959, Arthur Samuel defined machine learning as a “Field of study that gives computers the ability to learn without being explicitly programmed”.’ That definition in turn translates to using a computer to solve problems like regression or classification. Machine learning powers dozens of the products that internet users interact with every day, from spam filtering to product recommendations to Siri and Google Now.

Looking at the Wikipedia article, ML as a field has existed since the late 1980s, so what’s been driving its recent growth in popularity? I’d argue the key driving factor has been compute resources getting cheaper, especially storage, which has allowed companies to store orders of magnitude more data than they could 5 or 10 years ago. This data, along with elastic public cloud resources and the increasing maturity of open source packages, has made ML accessible and worthwhile for an increasingly large number of companies. Additionally, there’s been an explosion of venture capital funding into ML focussed startups, which has certainly also helped boost its popularity.

Kicking the tires

The first thing we needed to do before testing out Amazon ML was to pick a good machine learning problem to tackle. Unfortunately, we didn’t have any internal data to test with, so I headed over to Kaggle to find one. After some exploring I settled on Digit Recognizer since it’s a “known problem”, the Kaggle challenge had benchmark solutions, and no additional data transformations would be necessary. The goal of the Digit Recognizer problem is to accept bitmap representations of handwritten numerals and then correctly output what number was written.

The dataset is a modified version of the MNIST (Modified National Institute of Standards and Technology) database, which is a well known dataset often used for training image processing systems. Unlike the original MNIST images, the Kaggle dataset has already been converted to a grayscale bitmap array so individual pixels are represented by an integer from 0-255. In ML parlance, the “Digit Recognizer” challenge would fall under the umbrella of a classification problem since the goal would be to correctly “classify” unknown inputs with a label, in this case a 0-9 digit. Another interesting feature of the MNIST dataset is that Wikipedia provides benchmark performance for a variety of approaches so we can have a sense of how AWS ML stacks up.

At a high level, the big steps we’re going to take are to train our model using “train.csv”, evaluate it against a subset of known data, and then predict labels for the rows in “test.csv”. Amazon ML makes this whole process pretty easy using the AWS Console UI so there’s not really any magic. One thing worth noting is that Amazon doesn’t let you select which algorithm will be used in the model you build, it selects it automatically based on the type of ML problem. After around 30 minutes your model should be built and you’ll be able to explore the model’s performance. This is actually a really interesting feature of Amazon ML since you wouldn’t get these insights with visualizations “out of the box” from most open source packages.

Performance

With the model built the last step is to use it to predict unknown values from the “test.csv” dataset. Similar to generating the model, running a “batch prediction” is pretty straightforward on the AWS ML UI. After the prediction finishes you’ll end up with a results file in your specified S3 bucket that looks similar to:

Because there are several possible classifications of a digit the ML model generates a probability per classification with the largest number being the most likely. Individual probabilities are great but what we really want is a single digit per input sample. Running the input through the following PHP will produce that along with a header for Kaggle:
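The selection step itself is a plain argmax over each row. Here is a minimal JavaScript sketch of that idea (JavaScript rather than the PHP the post used), under some loudly stated assumptions: the input is taken to be CSV text whose header row lists the class labels and whose remaining rows hold one score per label, which may not match the exact layout of the AWS ML results file, and the `ImageId,Label` header is assumed to be what Kaggle expects for this challenge. The names `mostLikelyLabel` and `toSubmission` are hypothetical.

```javascript
// Return the label whose score is largest (argmax). Ties go to the first label.
function mostLikelyLabel(labels, scores) {
  let best = 0;
  for (let i = 1; i < scores.length; i++) {
    if (scores[i] > scores[best]) best = i;
  }
  return labels[best];
}

// Convert per-class scores into one predicted label per row.
// Assumed input: header row of labels, then one row of scores per sample.
// Assumed output: "ImageId,Label" header, with ImageId counting from 1.
function toSubmission(csvText) {
  const lines = csvText.trim().split(/\r?\n/);
  const labels = lines[0].split(',').map((s) => s.trim());
  const out = ['ImageId,Label'];
  lines.slice(1).forEach((line, i) => {
    const scores = line.split(',').map(Number);
    out.push(`${i + 1},${mostLikelyLabel(labels, scores)}`);
  });
  return out.join('\n');
}
```

If the real results file carries extra columns, such as a true label or a row identifier, they would need to be dropped before the argmax so they are not mistaken for scores.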

And finally, the last step of the evaluation is uploading our results file to Kaggle to see how our model stacks up. Uploading my results produced a score of 0.91671, so right around 92% accuracy. Interestingly, looking at the Wikipedia entry for MNIST, an 8% error rate is right around what was academically achieved using a linear classifier. So overall, not a bad showing!

Takeaways

Comparing the model’s performance to the Kaggle leaderboard and Wikipedia benchmarks, AWS ML performed decently well, especially considering we took the defaults and didn’t pre-process the data. One of the downsides of AWS ML is the lack of visibility into what algorithms are being used, along with not being able to select specific algorithms. In my experience, solutions that mask complexity like this work great for “typical” use cases but then quickly break down for more complicated tasks. Another downside of AWS ML is that it can currently only process text data that’s formatted into CSVs with one record per row. The result of this is that you’ll have to do any data transformations with your own code running on your own compute infrastructure or AWS EC2.

Anyway, all in all I think Amazon’s Machine Learning product is definitely an interesting addition to the AWS suite. At the very least, I can see it being a powerful tool for quickly testing out ML hypotheses, which can then be implemented and refined using an open source package like scikit-learn or Apache Mahout.