PHP: Quick and dirty CLI tasks

Something that comes up every so often in a sufficiently large PHP project is having to write helper scripts that run on the command line to complete various tasks. It might be periodically processing some images, updating cached analytics, etc. If the project is a Symfony project, it’s usually easy enough to add a Symfony task and be able to leverage the Symfony infrastructure to manage the individual “scripts” as tasks. This is equally true with Drupal, using Drush tasks to manage the individual scripts works well and lets you have a single, central spot for all your “helpers”. But what if its a vanilla PHP project or WordPress?

A technique I’ve started using is to create a class and then add each of the tasks as static functions. This allows you to keep all the tasks in one place, reuse code and configurations, and generally mimic how Symfony tasks and Drush work. From there, the file pulls off $argv to figure out what function to call and just passes $argv in as an argument as well.

Here’s a stub of a class to set something like this up:

Use Exim and pipes to replicate MailGun routes

Over the last month or so I’ve been working with a friend of mine on a little side project (more on that later). During the course of the project, we ended up using Mailgun to receive email which was then passed to a PHP script using Mailgun routes.

Mailgun was pretty easy to setup and things were great but unfortunately in the last day or so our free app started creeping over the 200 message/day limit on the Mailgun free tier. Since the app is currently free, we couldn’t really justify paying the $19/mon minimum for the first Mailgun tier so I started exploring self-hosted options.

After a big of Googling, the first thing that caught my eye was the LamsonProject. From my understanding, Lamson basically provides a SMTP server which binds to a MVC application framework. After checking out the docs and the code it seemed like Lamson would allow you to build a RESTful application that responded to SMTP messages instead of HTTP requests. It certainly seems like a cool project but seemed like a bit overkill for what I needed to do.

Going further down the rabbit hole, I found out that the Exim allows messages to be delivered to a regular *nix process via a pipe and that the configuration was supposedly relatively straightforward under Ubuntu. Given that, I decided to give it a shot, it turned out to be relatively easy to setup but here is a quick rundown.

Step 1: Setup Exim

By default, Ubuntu systems are configured with sendmail and only setup to receive local mail. You’ll need to remove sendmail, install Exim, and then configure it.

To do this you’ll roughly need to run the following:

sudo apt-get remove sendmail
# for some reason the dependencies aren't being installed correctly
# see http://ubuntuforums.org/showthread.php?p=10411454
sudo apt-get install exim4-base exim4-config
sudo apt-get install exim4
sudo dpkg-reconfigure exim4-config

Next, follow the guide on https://help.ubuntu.com/8.04/installation-guide/hppa/mail-setup.html to configure Exim.

Step 2: Configure Pipes and Aliases

Per https://answers.launchpad.net/ubuntu/+source/mailman/+question/120447 it turns out something isn’t enabled in the default Ubuntu configuration of Exim. To solve this, I added the following

.ifndef SYSTEM_ALIASES_PIPE_TRANSPORT
SYSTEM_ALIASES_PIPE_TRANSPORT = address_pipe
SYSTEM_ALIASES_USER = Debian-exim
SYSTEM_ALIASES_GROUP = daemon
.endif

Into the bottom of /etc/exim4/conf.d/transport/30_exim4-config_address_pipe

After you edit an Exim setting, you’ll need to always run the following for the settings to take effect:

sudo update-exim4.conf.template -r
sudo update-exim4.conf
udo service exim4 restart

Next, you’ll need to enable the catch-all router configuration. This was surprisingly difficult to track down how to do:

Per, http://pontus.ullgren.com/view/Setting_SMTP_server_development_test_server_running_Ubuntu you need to create a file at /etc/exim4/conf.d/router/950_exim4-config_catchall containing

catch_all:
   debug_print = "R: catch_all for $local_part@$domain"
   driver = redirect
   data = ${lookup{*}lsearch{/etc/aliases}}
   # NOTE: I added this line for the catch all pipes to work
   pipe_transport = address_pipe

Finally, add a catch-all alias that pipes to a script to your /etc/aliases file. Mine looks like:

*: "|/home/ubuntu/exim4.php"

Step 2: The PHP script

I did this with PHP since my original Mailgun script was in PHP but anything should work. Key notes, the script needs to be accessible AND executable by the Exim user. Also, you’ll need the appropriate shebang to make the script work without a named interpreter.

Your script is going to receive raw SMTP email so you’ll need to parse that out to do anything meaningful with it. Thankfully there are plenty of libraries to do this. I ended up using MailParse along with PHP MimeMail Parse to abstract the nitty gritty into some cleaner object oriented code.

The part of my script that accepts and parses the email looks like:

From there, you’d be free to do anything you wanted with $msgInfo, just like on the receiving end of a Mailgun route.

Anyway, as always let me know if you have any feedback, questions, or comments.

Gmail Reporting Tons of Used Space? This may help!

Recently I had to upgrade my Gmail account for additional storage. I was nearing the 8 free gigs of data they give you and didn’t want to keep seeing the big red “Buy more storage!”. I bought the $5/year 20 gig plan. A few weeks later I noticed that my Gmail was now reporting i was already using ~18 gigs of my total 30 gigs of data. I couldn’t believe it, how did I manage to more than double my used Gmail space within 2 weeks? I had used Gmail for 8 years to get to 7 gigs of data.

After looking around there didn’t seem to be anyone who could report the problem, nevertheless have a fix. I then tried to empty my Trash, which at the time said ~200 messages. As soon as I emptied it my used space dropped to 7.3 gigs, which is what I expected.

Long story short, it would appear that Gmail has a bug in reporting the number of actual messages in the trash, or doesn’t truly ’empty’ it unless click it. If you think you are using much less space than it is reporting, try emptying your trash manually. It worked for me and a number of other guys in the company.

AJAX Request Slow With PHP? Here’s Why

Recently I was working on a project where we had a page which loads tons of data from numerous sources. I decided after a while that we wanted to AJAX each section of data so that the page would load a bit quicker. After splitting up the requests and sending them asyncronously, there was little improvement. I thought at first it may be due to the fact we were pinging a single API for most of the data multiple times, that wasn’t it. Maybe it was a browser limit? Nope was still far below the 6 requests most allow. I setup xdebug and kcachegrind and to my surprise it was the session_start() that was taking the most time on the requests.

I looked around the web for a while trying to figure out what in the world was going on. It turns out that PHP’s default session_start will block future session_starts for the same session until the session is closed. This is because the default method uses a file on the filesystem which it locks until you close it. If you want more information on this and how to close it you can read a bit more here.

We switched over to database based sessions and it fixed it. In symfony 1.4 the default session storage uses the file system, however switching over to sfPDOSessionStorage is very easy and quick.

Deleting files older than specified time with s3cmd and bash

Update: Amazon has now made it so you can set expiration times on objects in S3, see more here: https://forums.aws.amazon.com/ann.jspa?annID=1303

Recently I was working on a project where we upload backups to Amazon S3.  I wanted to keep the files around for a certain duration and remove any files that were older than a month.  We use the s3cmd utility for most of our command line based calls to S3, however it doesn’t have a built in “delete any file in this bucket that is older than 30 days” function.  After googling around a bit, we found some python based scripts, however there wasn’t any that was a simple bash script that would do what I was looking for.  I whipped this one up real quick, it may not be the best looking but it gets the job done: