PHP: How fast is HipHop PHP?

In the last post, we walked through how to install HipHop PHP on a Ubuntu 13.04 EC2. Well thats great but it now leads us to the question of how fast HipHop PHP actually is. The problem with “toy benchmarks” is they tend to not really capture the real performance characteristics of whatever you’re benchmarking. This is why comparing the performance of a “Hello World” app across various languages and frameworks is generally a waste of time, since its not capturing a real world scenario. Luckily, I actually have some “real world”‘ish benchmarks from my PHP: Does “big-o” complexity really matter? post a couple of months ago.

Ok so great, lets checkout the repository, run the benchmark with HipHop and Zend PHP, and then marvel at how HipHop blows Zend PHP out of the water.

Wtf?

Well so that is weird, in 3 out of the 4 tests HipHop is an order of magnitude slower than Zend PHP. Clearly, something is definitely not right. I double checked the commands and everything is being run correctly. I started debugging the readMemoryScan function on HipHop specifically and it turns out that the problem function is actually str_getcsv. I decided to remove that function as well as the array_maps() since I wasn’t sure if HipHop would be able to optimize given the anonymous function being passed in. The new algorithms file is algorithms_hiphop.php which has str_getcsv replaced with an explode and array_map replaced with a loop.

Running the same benchmarks again except with the new algorithms file gives you:

Wow. So the HipHop implementation is clearly faster but what’s even more surprising is that the Zend PHP implementation gains a significant speedup just by removing str_getcsv and array_map.

Anyway, as expected, HipHop is a faster implementation most likely due to its JIT compilation and additional optimizations that it’s able to add along the way.

Despite the speedup though, Facebook has made it clear that HipHop will only support a subset of the PHP language, notably that the dynamic features will never be implemented. At the end of the day, its not clear if HipHop will gain any mainstream penetration but hopefully it’ll push Zend to keep improving their interpreter and potentially incorporate some of HipHop’s JIT features.

Per Dan’s comment below, HHVM currently supports almost all of PHP 5.5s features.

Why is str_getcsv so slow?

Well benchmarks are all fine and well but I was curious why str_getcsv was so slow on both Zend and HipHop. Digging around, the HipHop implementation looks like:

So basically just a wrapper around fgetcsv that works by writing the string to a temporary file. I’d expect file operations to be slow but I’m still surprised they’re that slow.

Anyway, looking at the Zend implementation it’s a native C function that calls into php_fgetcsv but doesn’t use temporary files.

Looking at the actual implementation of php_fgetcsv though its not surprising its significantly slower compared to explode().

PHP: Installing HipHop PHP on Ubuntu

A couple of weeks ago, a blog post came across /r/php titled Wow HHVM is fast…too bad it doesn’t run my code. The post is pretty interesting, it takes a look at what the test pass % is for a variety of PHP frameworks and applications. This post was actually the first time I’d heard an update about HipHop in awhile so I was naturally curious to see how the project had evolved in the last year or so.

Turns out, the project has undergone a major overhaul and is well on its way to achieving production feature parity against the Zend PHP implementation. Anyway, I decided to give installing HipHop a shot and unfortunately their installation guide seems to be a bit out of date so here’s how you do it.

Quickstart

To keep things simple, I used a 64-bit Ubuntu 13.04 AMI (https://console.aws.amazon.com/ec2/home?region=us-east-1#launchAmi=ami-e1357b88) on a small EC2. One thing to note is that HipHop will only work on 64-bit machines right now.

Once you have the EC2 running, Facebook’s instructions on GitHub are mostly accurate except that you’ll need to manually install libunwind.

After that, you can test it out by running:

Awesome, you have HipHop running. Now just how fast is it?

Well you’ll have to check back for part 2…

PHP: Is PHP losing popularity?

I was on Quora earlier this week and ran across a question asking Why is PHP losing popularity? along with the elaboration being:

I’ve been keeping my eye on job boards and tech media, and it seems that the new trend is up with Node.js and down with PHP, Ruby staying about the same.
I know every tool has its purpose, but most web applications could be built in any language and framework and fare all the same. PHP is surely scalable, fast enough, safe enough, and heavily supported. Facebook has been happy enough to keep it around, and they have a lot to own up to.

The top two answers basically declare that PHP is on its way out because developers have more options (Ruby, Python NodeJS, etc.) and because the PHP ecosystem is “standing still”. It’s pretty clear that both of these answers are incomplete, if not outright wrong but then where does the truth lie?

Since this isn’t high school there isn’t a universal metric for how to evaluate the popularity of a given programming language. People tend to use things like search trends, job salaries, or the number of questions tagged on StackOverflow. These KPIs are fine for vanity comparisons but they don’t really reveal anything about the velocity, evolution, or enthusiasm for a language or framework. Since we’re concerned specifically with PHP, it’s easier to pick out specific examples that demonstrate its continuing popularity.

Drupal and WordPress

Although treated with disdain by developers, both applications power hundreds of millions of websites. According to Wikipedia, “WordPress is used by more than 18.9% of the top 10 million websites as of August 2013.” and Drupal “is used as a back-end system for at least 2.1% of all websites worldwide”. Which, to put in perspective, are both absolutely staggering numbers. On top of that, Automatic and Acquia, the companies commercially backing WordPress and Drupal, have been raising money and growing at a phenomenal pace.

So what does that mean for PHP? Velocity. With two well funded commercial companies actively developing, selling, and supporting PHP software there will be an increasing number of PHP sites coming online every day.

Symfony, Doctrine, and Zend

In the last few years, all 3 projects have completely overhauled their architectures and rewritten their code bases to incorporate from their respective “version 1s”. A total rewrite is a heroic feat for any project, let alone a popular open source project with thousands of users. All three rewrites have had a positive effect on the PHP community as a whole, including powering new frameworks (Symfony2, Laravel, etc.), enabling consolidation (Drupal 8 powered by Symfony Components), and of course pushing developers to evaluate PHP for new projects.

For PHP, this certainly signals a continuing evolution and a willingness of the community to learn, adapt, and evolve as the web changes.

The Language / PHP Internals

Unfortunately, this is one facet that strongly negatively affects the perception of the PHP ecosystem. Looking at PHP the language, plenty has been written bemoaning the inconsistencies, “wtfs”, and generally bizarre paradigms that the language constructs introduce. Although things have certainly gotten better, a lot of the same issues people were complaining about in PHP4 still exist today with no roadmap for them to be resolved.

Related to the language, is the PHP internals mailing list where core devs discuss language changes and generally how PHP will evolve. Most recently, core dev Anthony Ferrara outlined the major problems with the internals mailing list and generally why he feels the project is in trouble.

The issues with the PHP the language and its apparent lack of real evolution clearly affect the enthusiasm for the ecosystem. Outsiders look in and ask why we’re dealing with a “shitty” language while insiders are stuck defending PHP by pointing to features it got in 5.4 while meekly dodging the fact that there’s still no real unicode support.

So is PHP becoming less popular? Almost certainly not. In the last few years, dozens of interesting new tools and frameworks have been built and at least two VC funded companies have built successful business on software powered by PHP. Unfortunately, the perception of PHP the language as a ghetto is still persistent and the internals team seems to have no plans to change it.

isset(), empty(), is_null() – What’s the difference?

I came across an article on phpro.org about the difference between isset, empty and isnull methods that I found it informative so I’m going to summarize and re-post it here.

There are often times where you need to check for empty or null values or if a variable is set.  It’s pointed out that in many circumstances the wrong function is used to make these assertions. The code may end up working; however, in some cases using the wrong function returns a value that programmer didn’t expect and leads to errors.

Actions speak much louder than words so I’ll cut to the example. The example script tests the following functions and operators:

  • isset()
  • empty()
  • is_null()
  • ==
  • ===

against the following values:

  • no value set
  • null
  • zero
  • false
  • numeric value
  • empty string

and builds out a comparison table (see below) of the results. The notice above the table is because the isset() function is trying to check a variable that has not been initialized(Not Set).

I’ll be honest, I never knew passing a zero into the empty() function returns a true!

Anyways, the chart below ends up becoming a useful reference guide as well.  The code used to produce the table is at the bottom of this post.

The Code

WordPress: Modify Native Calendar Widget for Event Post Types

Recently I was working on WordPress website  and the client’s theme had an existing “Event” custom post type. The client wanted a way to calendarize the event start date and display with a widget.

There is a calendar widget built into WordPress; however, it has a couple shortcomings for what the client was trying to do. First, it calendarises all posts regardless of type (and in this case all we want is event posts). Secondly, it also calendarises posts based on date posted rather than a field such as event date.  As an added constraint, the non-profit client did not have a budget for building a custom plugin from scratch to fill the need.

So I went digging through the PHP script that powers the native calendar widget and found a way to modify the plugin while only having to bill about a half hour of custom development. It actually took me longer to write this post!

It’s not the most ideal or elegant solution but it saved the client a bunch of money by spending almost no time on this.  Couple drawbacks without further modifications 1) the widget won’t display multi-day events properly 2) The calendar widget now only works with event post types (i.e., can’t calendarize post dates of normal blog posts).

Here’s what it ends up looking like:

The Code

PHP

The file I had to modify is located in the WordPress 3.5 installation in wordpressroot>wp-includes>general-template.php

In order to know which days on the calendar to highlight as links to posts, the WordPress devlopers built a query that pulls the day numbers of the posts that were posted in the current month.

There’s a couple issues here as it relates to getting this to work for events as opposed to regular posts. First, the query is pulling post_type equal to “post” so we’ll need to change that “event” so only events are being pulled. Secondly, the original query is selecting posts based on post date in the current month.   For events, when we’re looking at a calendar, we don’t care when it was posted, but rather the start date of the event.

To deal with these issues, I had to rewrite the query. Unfortunately, the event start date field was one of the custom fields added for the event post type so it resides in the `wp_postmeta` table as opposed to `wp_posts so you have to join the `wp_posts` and `wp_postmeta` tables to get access to all the required fields.  The query returns the event start day of the month (dom) , post id (post_id) , url of the event (guid) and event title (post_title).  You’ll see soon why I pulled more fields than just the day numbers like before. Also, I changed the return value to an array of objects instead of numerical array per my own preferences. The new query looks something like this:

The next block of the original code (see gist below) executes a second query almost identical to the first, but instead of pulling just day numbers like before, it pulls the post id, post titles and day numbers. The block following the query goes through each post and puts the titles into an associative array indexed by the day of the month. The array of post titles ($ak_titles_for_day) will used to display the post titles on their corresponding day cell in the calendar when it builds out the calendar table.

Use of two queries seemed a bit duplicative to me since they are so similar so I got rid of the second one and pulled the additional fields in the first query.  Next I modified the foreach to record both the event title and event page url for every event post (this is so we can display a list of event links when a user hovers over a calendar day). I also consolidated the foreach code with the loop that follows the first query.  Based on the results form the first query, two arrays are built.  The first array, $dayswithpost, is simply an array containing all the day numbers of the month that have events to be used when building out the calendar table. The $ak_titles_for_day array, as mentioned above, is an associative array indexed by day numbers. It stores the event title and event url under the corresponding day number index and will be used to display to the user when they hover over a day number on the calendar that has event starting dates.

Lastly, the  code to build out the table/calendar had to be modified.  The original block of code compares the day number of the calendar cell being generated to the $dayswithposts array to determine if any posts are listed for that day.  If the day number has corresponding posts in the $dayswithposts array, an anchor tag is wrapped around the day number and href generated using the built in WordPress function get_day_link(). This function returns the url of an archive page for all posts on a given day. This works nicely for days with multiple posts because you can see all the posts by navigating to that daily archive url which is a built in WordPress template. The post title(s) for days with posts are also inserted into the title attribute of the day number link from the values stored in the $ak_titles_for_day array so the user will see the post titles when they hover over a calendar day link.

For events, a single anchor wrapped around the calendar day number would not work because clicking on the day would just take you to the archive page for events posted on that day not the start date of actual event day.  Therefore the code needs to be modified so that a user is able to click on each individual event title and have it link to the events page url instead of an archive page.

Instead of wrapping the day number in an anchor tag, I switched it to a span so it can be selected later by css/js.  Also, I created a <div> containing event titles and links for all the events for a given day and inserted it after the calendar number </span>.  In the CSS, I set the <div> to display:none, and position:absolute and positioned it relative to the <td> it resides in.  Then using jQuery I set a hover on both the <span> and the <div> so when a user hovers over a day with posts, you can display the <div> with the event links and titles.

JS and CSS