We’re always on the hunt for talented LAMP developers, and as a consequence we end up evaluating a decent amount of fairly diverse PHP code. We always ask potential employees for a code sample so that we can get a sense of their style and generally make sure they have their heads screwed on right. Because of this, we’ve been evaluating PHP samples ranging from Drupal modules to batch processing scripts and even “hardware hacks”.
During this process, one of the issues we’ve had is coming up with an objective rubric to evaluate the relative skill of a PHP developer. Although there are several broad criteria for evaluating code, I’ve been interested in coming up with PHP-centric benchmarks since they’re more directly applicable. Here’s a list of criteria that we’ve been working on to help us identify how familiar an engineer is with PHP.
Unfortunately, it’s sometimes easier to spot negative signals, so here are a few PHP-specific “code smells” we’ve identified.
Often, inexperienced engineers won’t search for standard functions that’ll do exactly what they’re looking for. Not always a bad thing but it’s a sign of inexperience. An example would be:
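As a rough sketch of what this smell can look like (the data and the `containsValue` helper are made up for illustration), here is a hand-rolled membership check next to the built-in `in_array` that already does the job:

```php
<?php
// Hypothetical sample data, for illustration only.
$languages = array('php', 'python', 'ruby');

// Hand-rolled membership check: works, but reinvents a built-in.
function containsValue(array $haystack, $needle)
{
    foreach ($haystack as $value) {
        if ($value === $needle) {
            return true;
        }
    }
    return false;
}

$handRolled = containsValue($languages, 'ruby');

// The standard library already covers this
// (the third argument enables strict comparison).
$builtIn = in_array('ruby', $languages, true);
```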
This one isn’t really PHP-specific but we’ve noticed it a couple of times anyway. We’ll often see code that looks something like:
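A minimal sketch of the pattern, assuming a hypothetical `fetchAllUsers()` helper that stands in for whatever ORM or SQL call returns every row:

```php
<?php
// Hypothetical stand-in for an ORM or SQL call that returns every row,
// e.g. the result of "SELECT * FROM users".
function fetchAllUsers()
{
    return array(
        array('id' => 1, 'name' => 'alice', 'active' => true),
        array('id' => 2, 'name' => 'bob', 'active' => false),
        array('id' => 3, 'name' => 'carol', 'active' => true),
    );
}

// The smell: load everything, then apply the selection criteria in PHP.
$activeUsers = array();
foreach (fetchAllUsers() as $user) {
    if ($user['active']) {
        $activeUsers[] = $user;
    }
}

// The fix is to push the criteria into the query instead, for example
// "SELECT * FROM users WHERE active = 1", so only matching rows come back.
```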
As you can see, the code returns all the results and then evaluates the criteria as opposed to passing the selection criteria to the ORM or SQL.
Due to PHP’s global keyword it’s unfortunately really easy to throw encapsulation to the wind and just make everything a global. Because of that, we’ve seen code that looks like:
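Something along these lines, where the variable and function names are invented for the example. The second function does the same calculation with its dependency passed in explicitly:

```php
<?php
// Hypothetical configuration and counter shared through global state.
$config = array('taxRate' => 0.07);
$orderCount = 0;

// The smell: the function reaches into global scope for its inputs and
// silently mutates a global as a side effect.
function priceWithTaxGlobal($price)
{
    global $config, $orderCount;
    $orderCount++;
    return $price * (1 + $config['taxRate']);
}

// The same calculation with its dependency passed in explicitly.
function priceWithTax($price, $taxRate)
{
    return $price * (1 + $taxRate);
}

$viaGlobal = priceWithTaxGlobal(100);
$viaArgument = priceWithTax(100, $config['taxRate']);
```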
What we look for next is usually positive signals that an engineer is generally familiar with PHP. These are generally things you’d pick up after you’ve written a fair amount of PHP.
I know this one will be controversial, but my sense is that if an engineer implements __toString() somewhere in PHP they probably have a decent familiarity with the language since it’s something you have to “seek out”. A canonical example would be something like:
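For instance, a small value object might implement it like this (the `Money` class and its properties are hypothetical, purely for illustration):

```php
<?php
// Hypothetical value object; the class and property names are illustrative.
class Money
{
    private $amount;
    private $currency;

    public function __construct($amount, $currency)
    {
        $this->amount = $amount;
        $this->currency = $currency;
    }

    // Called automatically whenever the object is used in a string context.
    public function __toString()
    {
        return number_format($this->amount, 2) . ' ' . $this->currency;
    }
}

$price = new Money(19.5, 'USD');
$label = 'Total: ' . $price;
```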
Output buffering is a bit exotic but it’s indispensable when building web apps without a framework. It also certainly demonstrates a level of familiarity with PHP. A good example would be capturing output from a template and returning it inside some JSON:
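A minimal sketch of that idea, assuming a made-up `renderGreeting()` function whose inline `echo` stands in for including a real template file:

```php
<?php
// Hypothetical template renderer: the "template" here is just inline output,
// where a real app would likely include a template file instead.
function renderGreeting($name)
{
    ob_start(); // start capturing anything echoed from here on
    echo '<p>Hello, ' . htmlspecialchars($name) . '!</p>';
    return ob_get_clean(); // return the captured output and stop buffering
}

$payload = json_encode(array(
    'status' => 'ok',
    'html' => renderGreeting('World'),
));
```

Nothing is sent to the browser while the buffer is active, so the markup can travel inside the JSON response instead.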
Finally, the last couple of things we’ve been looking for are “exotic” techniques that really demonstrate that someone “gets” PHP. Granted, some (most?) of these are bad ideas in production, but they do convey a certain level of understanding.
Since PHP relies so heavily on arrays, building classes that implement the ArrayAccess interface makes them work more naturally in PHP and definitely demonstrates a strong level of familiarity with the language. An example would be:
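Here is one way it could look. The `Settings` class is hypothetical, and the signatures use `mixed`, so this sketch assumes PHP 8.0 or newer:

```php
<?php
// Hypothetical settings container (PHP 8.0+ signatures, because of "mixed").
class Settings implements ArrayAccess
{
    private array $values = [];

    public function offsetExists(mixed $offset): bool
    {
        return isset($this->values[$offset]);
    }

    public function offsetGet(mixed $offset): mixed
    {
        return $this->values[$offset] ?? null;
    }

    public function offsetSet(mixed $offset, mixed $value): void
    {
        if ($offset === null) {
            $this->values[] = $value; // supports $settings[] = 'x'
        } else {
            $this->values[$offset] = $value;
        }
    }

    public function offsetUnset(mixed $offset): void
    {
        unset($this->values[$offset]);
    }
}

$settings = new Settings();
$settings['debug'] = true;
$hasDebug = isset($settings['debug']);
$debug = $settings['debug'];
unset($settings['debug']);
$stillSet = isset($settings['debug']);
```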
Although using any of these in production is questionable at best, they undeniably do convey a sense that whoever used them knows their way around PHP. An example (from the documentation) would be:
Anyway, like everything, I’d take this list with a big grain of salt. We’d also love any input or feedback.
Posted In: PHP
Last week, I was working on a Symfony2 app where I wanted to generate optgroup tags inside the select box of an Entity form type. After poking around, I ran across a StackOverflow answer explaining how to do it. Basically, it turns out what you have to do is manually return a “choices” array from a class that has access to the Entity Manager. I ended up adding a method to my custom repository and passing that repository into my form:
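The shape of that “choices” array is the interesting part. As a framework-free sketch (the rows and the `buildGroupedChoices` function are hypothetical, and plain arrays stand in for entities), the outer keys would become the optgroup labels and each inner array maps an option value to its label:

```php
<?php
// Hypothetical rows, standing in for entities loaded by a custom repository.
$products = array(
    array('id' => 1, 'name' => 'Apple', 'category' => 'Fruit'),
    array('id' => 2, 'name' => 'Carrot', 'category' => 'Vegetable'),
    array('id' => 3, 'name' => 'Banana', 'category' => 'Fruit'),
);

// Build a nested array keyed by group label; each inner array maps
// an option value (the id) to its display label.
function buildGroupedChoices(array $rows)
{
    $choices = array();
    foreach ($rows as $row) {
        $choices[$row['category']][$row['id']] = $row['name'];
    }
    return $choices;
}

$choices = buildGroupedChoices($products);
```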
It’s a bit messy, and I’m surprised there isn’t an option on the Entity Type to pass in a callback with access to the Entity Manager to generate a choice list. Looking at the source of DoctrineType, it looks like you could potentially create a custom type to extend the Entity type and then access the em from your custom function. Even that, though, seems like overkill to accomplish something that is reasonably common.
Posted In: Symfony
Earlier this week I was putting together a block of code which ended up turning into a switch statement with a tangled mess of long case blocks and complicated fall throughs, and ultimately became impossible to follow. Being a stand up guy, I decided to refactor the block using a technique where the case blocks are converted into anonymous functions, indexed into an associative array, and then the correct function is called depending on the value. I haven’t seen this show up too often in PHP code so I thought I’d share.
The life of a switch statement usually starts out relatively benign: you have a few simple conditions and each block is relatively compact:
But then the switch grows, and each case becomes complicated enough that the entire block becomes mostly unreadable (ellipsis for effect):
At this point, it’s hard to reason about what’s going to happen because each case statement has presumably gotten so large and different conditions are “falling” through, so the side effects are difficult to trace through.
An alternative to using a normal switch statement is to use a dispatch table, which is basically an array of functions indexed by whatever variable you’d normally be “switching” on. The primary benefit to structuring the code this way is that you can easily reason about side effects since the only variables that can be changed are what captures the return value or anything passed by reference. In addition, since every case is a separate function it’s a bit easier to edit the code. So what does this look like? It’s actually pretty straightforward:
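A hypothetical example of a switch at this stage, mapping an HTTP verb to an action name (the names are invented for illustration):

```php
<?php
// Hypothetical example: map an HTTP verb to a controller action name.
function actionFor($method)
{
    switch ($method) {
        case 'GET':
            $action = 'show';
            break;
        case 'POST':
            $action = 'create';
            break;
        case 'DELETE':
            $action = 'destroy';
            break;
        default:
            $action = 'notAllowed';
    }
    return $action;
}

$action = actionFor('POST');
```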
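A minimal sketch of a dispatch table, with made-up handler names and a hypothetical `dispatch()` helper that plays the role of the switch's `default` branch:

```php
<?php
// Hypothetical dispatch table: each former case block becomes an anonymous
// function, indexed by the value we would otherwise switch on.
$handlers = array(
    'GET' => function () {
        return 'show';
    },
    'POST' => function () {
        return 'create';
    },
    'DELETE' => function () {
        return 'destroy';
    },
);

function dispatch(array $handlers, $method)
{
    // Stand-in for the switch's "default" branch.
    if (!isset($handlers[$method])) {
        return 'notAllowed';
    }
    return call_user_func($handlers[$method]);
}

$action = dispatch($handlers, 'POST');
$fallback = dispatch($handlers, 'PATCH');
```

One thing a dispatch table can't express directly is fall through. If two keys need the same behavior, they can point at the same function.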
Extending from there, you could also call the function with arguments, potentially by reference, and even have all the functions be closures which capture the variables to avoid having to call with arguments.
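To sketch both ideas at once (all names here are hypothetical), the handlers below take one argument by reference and capture `$prefix` from the surrounding scope with `use`:

```php
<?php
$log = array();
$prefix = 'order'; // captured by value via "use"

$handlers = array(
    // Takes its second argument by reference so it can record a side effect.
    'create' => function ($id, array &$log) use ($prefix) {
        $log[] = "created $prefix $id";
        return true;
    },
    'cancel' => function ($id, array &$log) use ($prefix) {
        $log[] = "cancelled $prefix $id";
        return false;
    },
);

$event = 'create';
// Calling the closure directly (rather than through call_user_func)
// keeps the by-reference argument intact.
$handler = $handlers[$event];
$result = $handler(42, $log);
```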
Anyway, questions or comments always welcome.
Posted In: PHP
The other day, I was hacking away on the PHP backend for the “Startup Institute” visualization and I realized it was going to need a good deal of array manipulation. Figuring it was as good a time as any, I decided to try and leverage PHP 5.3+’s new closures along with the array_* functions to manipulate the arrays. I’m not well versed with functional programming but I’ve used Underscore.js’s array/collection functions so this is mostly in comparison to that.
The entire shebang is on GitHub but here is the gist of what we’re interested in:
There is a CSV file that looks like ssdata.csv.sample, except with more entries, that is read into a list ($data) where every object has keys corresponding to the values in the header. Thinking in JSON, the array ends up looking like:
Ok great, but now what can we do with it?
Using the usort function is particularly natural with closures. Compare the following:
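A sketch of the two styles side by side. The rows and key names are invented for illustration and are not taken from the real CSV:

```php
<?php
// Hypothetical rows; the keys are illustrative, not taken from the real CSV.
$data = array(
    array('name' => 'Carol', 'year' => 2012),
    array('name' => 'Alice', 'year' => 2011),
    array('name' => 'Bob', 'year' => 2013),
);

// Without closures: a named comparison function with the sort key hard-coded.
function compareByName($a, $b)
{
    return strcmp($a['name'], $b['name']);
}

$byName = $data;
usort($byName, 'compareByName');

// With a closure: the sort key is captured from the local scope.
$sortKey = 'year';
$byYear = $data;
usort($byYear, function ($a, $b) use ($sortKey) {
    if ($a[$sortKey] == $b[$sortKey]) {
        return 0;
    }
    return $a[$sortKey] < $b[$sortKey] ? -1 : 1;
});
```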
It’s pretty clear the version with closures is much shorter, more concise, and ultimately easier to follow. Being able to “capture” the local $sortKey variable is also a key feature of the closure version, since with the static version there’s no easy way to introduce variables into the sorting function.
In the linked example, I used array_map to basically convert an array of characters into an array of ASCII values for those characters.
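Roughly like this, with a plain loop shown for comparison (the input array is a made-up example):

```php
<?php
$characters = array('P', 'H', 'P');

// Plain loop version.
$codesLoop = array();
for ($i = 0; $i < count($characters); $i++) {
    $codesLoop[] = ord($characters[$i]);
}

// array_map version with a closure.
$codesMap = array_map(function ($character) {
    return ord($character);
}, $characters);
```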
With such a small map function, it’s hard to see or appreciate the benefits of using the closure along with array_map. With the closure though, you’ll get a couple of benefits including isolated scope so that you won’t inadvertently rely on the value of a variable that isn’t directly related to transforming the array values.
Using the closure would also “look” much cleaner if the array had non-numeric keys, since without being able to use integer indexes the for(…) loop would be more confusing.
This isn’t used but it could have been to return only the elements that were selected.
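A sketch of what that could have looked like with array_filter, assuming each row carries a hypothetical `selected` flag:

```php
<?php
// Hypothetical rows with a "selected" flag.
$data = array(
    array('name' => 'Alice', 'selected' => true),
    array('name' => 'Bob', 'selected' => false),
    array('name' => 'Carol', 'selected' => true),
);

// Loop version: explicitly skip the elements that fail the test.
$selectedLoop = array();
foreach ($data as $row) {
    if (!$row['selected']) {
        continue;
    }
    $selectedLoop[] = $row;
}

// array_filter version: the closure is the "truth test".
// array_filter preserves the original keys, so array_values() reindexes.
$selectedFilter = array_values(array_filter($data, function ($row) {
    return $row['selected'];
}));
```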
Looking at the version with the closure, it’s a bit easier to follow, and since it’ll enforce scope isolation, if the “truth test” were a bit more complicated you’d only have to debug what’s actually inside the closure. Also, not having to “skip” some elements leaves the code with a nicer feel and overall I’d argue it’s just better looking.
Overall, using closures with the array_* functions will definitely lead to cleaner, more concise, and easier to follow code. Unfortunately, there are a few rough spots. Like with most of the standard library, the argument order is inconsistent, which is a constant irritation. For example, for no apparent reason array_map is “callback, array” but array_filter is “array, callback”. Another irritation is that the “index” isn’t available inside several of the callbacks, like on array_reduce or array_map.
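To make both rough spots concrete, here is a small sketch with made-up data. The last call shows one possible workaround for the missing index, which is to pass the keys in as a second array:

```php
<?php
$prices = array('apple' => 2, 'pear' => 3);

// array_map takes the callback first...
$doubled = array_map(function ($price) {
    return $price * 2;
}, $prices);

// ...while array_filter takes the array first.
$expensive = array_filter($prices, function ($price) {
    return $price > 2;
});

// One workaround for the missing index: pass the keys in as a second array.
// Note that with more than one array, array_map returns a reindexed list.
$labels = array_map(function ($name, $price) {
    return "$name: $price";
}, array_keys($prices), $prices);
```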
Personally though, the biggest limitation is that none of the array_* functions will work with classes that implement the Traversable or Iterator interfaces. That means if you have a Doctrine_Collection and you want to reduce down to a single result you’re still stuck with a foreach(…).
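One partial escape hatch is `iterator_to_array()`, which copies a Traversable into a plain array at the cost of an extra copy. In this sketch an `ArrayIterator` stands in for the collection, so it is an assumption that a given collection class behaves the same way:

```php
<?php
// ArrayIterator stands in for any Traversable collection here
// (it is not a Doctrine_Collection).
$collection = new ArrayIterator(array(1, 2, 3, 4));

// iterator_to_array() copies the elements into a plain array,
// which the array_* functions will then accept.
$total = array_reduce(
    iterator_to_array($collection),
    function ($carry, $item) {
        return $carry + $item;
    },
    0
);
```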
Anyway, as always I’d love to hear other opinions in the comments.
A few days ago I ran across 2012: A Year in PHP which is a blog post highlighting what changed in PHP during 2012 and what upcoming changes we can expect in 2013. The post sparked a lively discussion on Hacker News which unfortunately basically devolved into a mix of anti-PHP rants and some “meta” commentary. Anyway, as someone that uses PHP daily I started thinking about what irks me about PHP and what would fix it. Thinking through the issues, it feels like fixing PHP’s type system by making primitives real objects would significantly improve the readability, consistency, and attractiveness of the language.
This one is subjective but I think one of the reasons that PHP code looks so ugly is because the procedural array_* and str* functions look jarring mixed in with object oriented code.
Check out this snippet from the Doctrine ORM framework. Even though the code is “object oriented” and nicely spaced, the array_* and str* functions are a serious eyesore. In addition to looking “off”, the procedural functions have inconsistent argument ordering which leads to “needle or haystack?” bugs.
So what would I switch to? How about a fluent interface replacement for the array and string functions, one that lets strings and arrays be used as if they were real objects?
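As a userland approximation of the idea, here is a hypothetical `Str` wrapper class. Nothing like it ships with PHP, and the method names are invented for this sketch:

```php
<?php
// Hypothetical wrapper class; nothing like this ships with PHP.
class Str
{
    private $value;

    public function __construct($value)
    {
        $this->value = (string) $value;
    }

    public function trim()
    {
        return new self(trim($this->value));
    }

    public function lower()
    {
        return new self(strtolower($this->value));
    }

    public function replace($search, $replacement)
    {
        return new self(str_replace($search, $replacement, $this->value));
    }

    public function __toString()
    {
        return $this->value;
    }
}

// Procedural style reads inside-out...
$procedural = str_replace(' ', '-', strtolower(trim('  Hello World  ')));

// ...while the fluent style reads left to right.
$fluent = (string) (new Str('  Hello World  '))->trim()->lower()->replace(' ', '-');
```

A wrapper like this still has to be constructed by hand, which is exactly the friction that real object primitives would remove.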
PHP arrays are in a funny place in terms of how they interact with the standard library and the syntax of PHP. Arrays in PHP are a primitive type and they are arguably the de-facto data structure for most PHP applications. Like strings though, arrays aren’t objects so programmers are stuck using the procedural array_* functions to manipulate arrays. Similar to above, if they were actually objects we could do away with the procedural functions and manipulate arrays with object oriented style functions.
Compounding the “foreach” issues is the existence of the Iterator interface, which allows PHP classes to specify that they can be traversed using a foreach loop. This introduces a frustrating limitation in the sense that you can make an object “look” like an array but since the array_* functions only operate on the primitive array type, you can’t leverage any of them on iterable objects. If arrays were actually objects, additional interfaces could be specified to allow some subset of the array_* functions to work on a given class.
In true PHP fashion, arrays as objects actually “sort of” exist within the Standard PHP Library (SPL) Datastructures extension. SplFixedArray provides a fixed-length, integer-only array data structure that is actually a PHP object. The problem is that you can’t easily “switch” between using an array and one of the SPL data structures: they aren’t subsets or supersets of regular primitive arrays but PHP classes, which makes converting between them awkward.
Unfortunately, I don’t know anything about how PHP’s primitive types work internally so I can’t speak to how difficult it would be to implement these changes. From a compatibility standpoint, it would naively seem like these changes could be made without seriously breaking backwards compatibility while slowly phasing out the old primitive types. On the whole, as long as we don’t end up with Java’s type boxing issues I think we’ll be in a much better place with PHP as a language.
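A short sketch of that friction, using made-up values. Every hop between the two worlds needs an explicit conversion:

```php
<?php
// SplFixedArray is an object with a fixed size and integer-only indexes.
$fixed = new SplFixedArray(3);
$fixed[0] = 'a';
$fixed[1] = 'b';
$fixed[2] = 'c';

$isArray = is_array($fixed); // false: it is an object, not the array primitive

// To use the array_* functions, convert explicitly to a plain array first...
$upper = array_map('strtoupper', $fixed->toArray());

// ...and convert back again if the fixed-size structure is still needed.
$fixedAgain = SplFixedArray::fromArray($upper);
```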
Posted In: PHP