A feature request we get fairly frequently is the ability to convert an HTML document to a PDF. Maybe it’s a report of some sort or a group of charts but the goal is the same – faithfully replicate a HTML document as a PDF. If you try Google, you’ll get a bunch of options from the open source wkhtmltopdf to the commercial (and pricey) Prince PDF. We’ve tried those two as well as a couple of others and never been thrilled with the results. Simple documents with limited CSS styles work fine but as the documents get more complicated the solutions fail, often miserably. One conversion method that has consistently generated accurate results has been using Chrome’s “Print to PDF” functionality. One of the reasons for this is that Chrome uses its rendering engine, Blink, to create the PDF files.

So then the question is how can we run Chrome in a way to facilitate programmatically creating PDFs? Enter, Electron. Electron is a framework for building cross platform GUI applications and it provides this by basically being a programmable minimal Chrome browser running nodejs. With Electron, you’ll have access to Chrome’s rendering engine as well as the ability to use nodejs packages. Since Electron can leverage nodejs modules, we’ll use Gearman to facilitate communicating between our Electron app and clients that need HTML converted to PDFs.

The code as well as a PHP example are below:

As you can see it’s pretty straightforward. And you can start the Electron app by running “./node_modules/electron/dist/electron .” after running “npm install”.

One caveat is you’ll still need a X windows display available for Electron to connect to and use. Luckily, you can use Xvfb, which is a virtual framebuffer, on a server since you obviously wont have a physical display. If you’re on Ubuntu you can run the following to grab all dependencies and setup the display:

sudo apt-get install chromium-browser libgconf-2-4 xvfb
Xvfb :19 -screen 0 1024x768x16 &
export DISPLAY=:19

After that, you can launch your Electron app normally and it’ll use a virtual display.

Anyway, as always let me know if you have any questions or feedback!

Posted In: Javascript

Tags: ,

We recently started a new project and decided to use TypeScript along with Angular 1.5. Angular 1.5 introduces a new abstraction called a “component” which closely resembles Angular 2’s component based approach. Surprisingly, there isn’t a lot of simple TypeScript sample code available for Angular 1.5 so I decided to throw something together in case anyone else is looking. The code is available at https://github.com/Setfive/ng_typescript_starter and a live demo of it is running at http://code.setfive.com/ng_typescript_starter.

So what are some standouts with TypeScript and Angular 1.5?

  • The 1 way bindings components introduce are easier to reason about but having to explicitly add functions for “outputs” does add some verbosity
  • Related to that, there’s a fair amount of boilerplate to create a single component since you have to define 2 classes
  • Dropping $scope in favor of automatically binding the controller object to $ctrl in templates is great – especially with TypeScript classes
  • Related to that, without $scope for events it’s unclear when it’s appropriate to use $rootScope for an event bus
  • You can write typesafe code for almost all of your controller business logic
  • It’s really unfortunate the TypeScript compiler can’t typecheck your Angular templates
  • Using the $inject annotation with component classes looks “right” versus the “array like” syntax
  • You need to be somewhat cognizant of matching your @types annotations with the correct version of the library you’re using
  • Using components with ui-router makes it fairly difficult to communicate between sibling views

Anyway, beyond fighting with build tools to convert a TypeScript project into usable JavaScript the language part has been great to work with. We ended up using Browserify with tsify but it was pretty frustrating to get it working. I might of missed something but it seems like I needed tsify available in a separate node_modules directory from the project source. The demo app is setup this way for that reason.

As always, questions and comments are welcome!

Posted In: AngularJS, Javascript

Tags: ,

Over the past few day we’ve been evaluating using Angular 1.x vs. Angular 2 for a new project on which in the past we would have used Angular 1.x without much debate. However, with release of Angular 2 around the corner we decided to evaluate what starting a project with Angular 2 would involve. As we started digging in it became clear that using Angular 2 without programming in TypeScript would be technically possible but painful to put it lightly. Because of the tight timeline of the project we decided that was too large of a technical risk so we decided to move forward with 1.x. But I decided to spend some time looking at TypeScript anyway, for science. I didn’t have anything substantial to write but needed to hammer out a quick HTML scraper so I decided to whip it up in TypeScript.

Getting started with TypeScript is easy you just use npm to install the tranpiler and you’re off to the races. As I started experimenting, I fired up PhpStorm 10+ and was thrilled to learn it has good TypeScript support out of the box (thanks JetBrains!). The scraper I was writing is pretty simple – make a series of HTTP requests, extract some elements out of the HTML via CSS selectors, and write the results out to a CSV. Coming from a JavaScript background, jumping right into TypeScript was easy enough since TypeScripts’ syntax is basically ES2015 with additional Java or C# like type declarations. The scraper is less than 100 lines so I didn’t get a great sense of what programming with TypeScript would be like but here are some initial takeaways.

It’s easy to end up missing out on the benefits. Since TypeScript is a superset of JavaScript you’re free to ignore all the type features and write TypeScript that is basically ES2015. Combine that with the fact that the tsc transpiler will produce JavaScript even with type errors and you can quickly find yourself not enjoying any of the benefits TypeScripts introduces. This issue isn’t unique to TypeScript since you can famously write You Can Write FORTRAN in any Language but I think since its a superset of an existing, popular langue the temptation is much stronger.

Discovering functionality in modules is easier. In order to properly interface with nodejs modules you’ll need to grab type definitions from somewhere like DefinitelyTyped. The definition files are similar to “.h” files from C++, code stubs that just provide function type signatures to TypeScript. An awesome benefit of this is that it’s much easier to “discover” the functionality of nodejs modules by looking at how the functions transform data between types. It also makes it much easier to figure out the parameters of a callback without having to dig into docs or code.

Typed generics will unequivocally reduce bugs. I’d bet a beer or two that most web developers spend the majority of their day writing code that deals with lists. Creating them, filtering them, transforming them, etc. Unfortunately, most of the popular scripting languages don’t have support for typed generics and specifically enforcing uniform types within arrays. Specifically with JavaScript, it’s pretty easy to end up at a point where you’re unsure of what’s contained in a list and moreover if the objects within it share any of the same properties. Because of this, I think TypeScript’s typed generics will cut down on bugs almost immediately.

TypeScript is definitely interesting and it’s tight coupling to Angular 2 only bolsters how useful it’ll be in the future. Next up, I’d be interested in building something more substantial with both a client and server component and hopefully share some of the same code on both.

As always, questions and comments are more than welcome!

Posted In: Javascript

Tags: , ,

Note: This is a bad idea™. Don’t do it unless you know why you’re doing it.

AngularJS’s form control to Javascript object binding is pretty core to how the framework works but at some point you might find yourself wishing you could just serialize a form. Maybe you’re getting the HTML for the form from a 3rd party source so it would be hard to bind to an object. Or maybe the form you have is huge and you don’t really care about validating it. So how can you serialize a form in AngularJS like you would in jQuery?

What you need to do is create a custom directive to render you form, use the directive’s “link” function to grab the form element, and then use jQuery’s serialize() function to generate a string that you could shoot off in a POST request.

Here’s a sample implementation:

Again, you should really only do this if you have a valid reason since you’re really fighting the framework by manipulating data this way.

Posted In: AngularJS


Following up on our previous post, after evaluating Flume we decided it was a good fit and chose to move forward with it. In our specific use case the data we are gathering is ephemeral so we didn’t need to enforce any deliverability or durability guarantees. For us, missing messages or double delivery is fine as long as the business logic throughput on the application side wasn’t affected. Concretely, our application is high volume, low latency HTTP message broker and we’re looking to record the request URLs via Flume into S3.

One of the compelling aspects of Flume is that it ships with several ways to ingest and syndicate your data via sources and sinks. Since we’re targeting S3 we’d settled on using the default HDFS sink but we have some options on the source. For a general case with complex events the Avro source would be the natural choice but since we’re just logging lines of text the NetCat source looked like a better fit. One of the issues we had with the NetCat source is that it’s TCP based so on the application side we’d need to implement timeouts and connection management on the application side. In addition to that, looking at the code of the NetCat source you’ll notice it’s implemented using traditional Java NIO sockets but if you check out the Avro source it’s built using Netty NIO which can leverage libevent on Linux.

Given those issues and our relaxed durability requirements we started looking at the available UDP sources. The Syslog UDP source looked the promising but it actually validates the format of the inbound messages so we wouldn’t be able to send messages with just the URLs. The code for the Syslog UDP source looked pretty straightforward so at this point we decided to build a custom source based on the existing Syslog UDP source. Our final code ended up looking like:

The big changes were in the implementation of messageReceived and the creation of the new extractEvent method. Including your new source in Flume is straightforward, you just need to build a JAR and drop that into Flume’s “lib/” folder. The easiest way to do this is with javac and jar to package it up. You’ll just need a binary copy of Flume so that you can reference its JARs. Build it with:

And then, you can test this out by creating a file named “agent1.conf” in your Flume directory containing:

Finally, you need to launch Flume by running:

ashish@ashish:~/Downloads/apache-flume-1.6.0-bin$ bin/flume-ng agent --conf conf --conf-file agent1.conf --name a1 -Dflume.root.logger=INFO,console

And then to test it you can use “netcat” to fire off some UDP packets with:

ashish@ashish:~/Downloads$ echo "hi flume" | nc -4u -w1 localhost 44444

Which you should see come across your console that’s running Flume. Be aware, the Flume logger truncates messages so if you send a longer string you won’t see it in the logger.

And that’s it. Non-durable, UDP source built and deployed. Anyway, we’re still pretty new to Flume so any feedback or comments would be appreciated!

Posted In: Big Data, Java

Tags: ,