On a few of our projects we have a few different needs to either queue items to be processed in the background or we need a single request to be able to process something in parallel. Generally we use Gearman and the GearmanBundle. Let me explain a few different situations where we’ve found it handy to have Gearman around.
Often we’ll need to do something which takes a bit more time to process such as sending out a couple thousand push notifications to resizing several images. For this example lets use sending push notifications. You could have a person sit around as each notification is sent out and hope the page doesn’t timeout, however after a certain number of notifications, not to mention a terrible user experience, this approach will fail. Enter Gearman. With Gearman you are able to basically queue the event that a user has triggered a bunch of notifications that need to be processed and sent.
What we’ve done above is sent to the Gearman server a job to be processed in the background which means we don’t have to wait for it to finish. At this point all we’ve done is queued a job on the Gearman server, Gearman itself doesn’t know how to run the actual job. For that we create a ‘worker’ which reads jobs and processes them:
The worker will consume the job and then process it as it sees fit. In this case we just loop over each user ID and send them a notification.
One one of our applications users can associate their account with multiple databases. From there we go through each database and create different reports. On some of the application screens we let users poll each of their databases and we aggregate the data and create a real time report. The problem with doing this synchronously is that you have to go to each database one by one, meaning if you have 10 databases and each one takes 1 seconds to get the data from, you have at least ten seconds the user is waiting around; this doesn’t go well when you have 20 databases and so on. Instead, we use Gearman to farm out the task of going to each database and pull the data. From there, we have the request process total up all the aggregated data and display it. Now instead of waiting 10 seconds for each database, we farm out the work to 10 workers, wait 1 second and then can do any final processing and show it to the user. In the example below for brevity we’ve just done the totaling in a controller.
What we’ve done here is created a job for each connection. This time we add them as tasks, which means we’ll wait until they’ve completed. On the worker side it is similar to except you return some data, ie `return json_encode(array(‘total’=>50000));` at the end of the the function.
What this allows us to do is to farm out the work in parallel to all the databases. Each worker runs queries on the database, computes some local data and passes it back. From there you can add it all together (if you want) and then display it to the user. With the job running in parallel the number of databases you can process is no longer limited on your request, but more on how many workers you have running in the background. The beauty with Gearman is that the workers don’t need to live on the same machine, so you could have a cluster of machines acting as ‘workers’ and be able to process more database connections in this scenario.
Anyways, Gearman has really made parallel processing and farming out work much easier. As the workers are also written in PHP, it is very easy to reuse code between the frontend and the workers. Often, we’ll start a new report without Gearman; getting logic/fixing bugs in a single request without the worker is easier. After we’re happy with how the code works, we’ll move the code we wrote into the worker and have it just return the final result.
Good luck! Feel free to drop us a line if you need any help.
Recently on a project I had a situation where I was using the Symfony2 forms component without an entity. In addition to each field’s constraints, I needed to something similar to symfony 1.4’s conditional validator so that I could make sure that the form on the whole was valid. There are a bunch of docs out there on how to use callback functions on an entity to do this, however I didn’t see much on how to get the entire form that has no entity to do a callback. After reading some of the source code, found that you can set up some ‘form level’ constraints in the setDefaultOptions method. So it will look something like this:
You pass the Callback constraint an array methods which it can call. If you pass one of those methods is an array it is parsed as class::method. In my case by passing $this it uses the currently instantiated form, rather than trying to call the method statically.
From there you can do something like this:
The first parameter is the form’s data fields. From there you can add global level errors to the form, such as if a combination of fields are not valid.
Good luck out there.
Recently we were getting ready to deploy a new project which functions only over SSL. The project is deployed on AWS using the Elastic Load Balancers (ELB). We have the ELB doing the SSL termination to reduce the load on the server and to help simply management of the SSL certs. Anyways the the point of this short post. One of the developers noticed that on some of the internal links she kept getting a link something like “https://dev.app.com:80/….”, it was properly generating the link to HTTPS but then specify port 80. Of course your browser really does not like that as its conflicting calls of port 80 and 443. After a quick look into the project we found that we had yet to enable the proxy headers and specify the proxy(s), it was we had to turn on `trust_proxy_headers`. However, doing this did not fix the issue. You must in addition to enable the headers specify which ones you trust. This can be easily done via the following:
Here is a very simple example of how you could specify them. You just let it know the IP’s of the proxy(s) and it will then properly generate your links.
You can read up on this more in the Symfony documentation on trusting proxies.
Anyways just wanted to put throw this out there incase you see this and realize you forgot to configure the proxy in your app!
Recently I was working on a project in which it admins were able to impersonate other users. It’s a fairly easy task to add to Symfony2, merely adding a switch_user reference to your firewall can make it possible, consult the Symfony docs for more on that. One thing I noticed was that every now and then when testing I would get weird errors after switching between multiple users, however it didn’t always happen. After some digging around, it turns out when you switch user it does not clear that sessions attributes, ie if you set attribute ‘hello’ to value ‘world’ it would persist after you’ve impersonated another user. This caused a few issues as on this application we used the session to store a few things like which set of database connections you currently use.
After looking at the SecurityBundle configuration setup it was clear that there wasn’t any options to have it clear all session attributes on switch user. At this point it was clear I needed to use an event listener as the firewall dispatched the SwitchUserEvent when a user successfully switched user. Below is an excerpt from my services.yml
This makes it so that it will call the following code on a successful impersonation of a user:
It’s as simple as that, you can get the actual user by calling $event->getTargetUser(). Long story short, the session can have some tainted values when using switch user as all attributes are not cleared.
A couple of months ago we started building out a Symfony 1.4 project for a client that involved allowing a “super admin” to add Doctrine models and columns at runtime. I know, I know, crazy/terrible/stupid idea but it mapped so well in to the problem space that we decided that a little “grossness” to benefit the readability of the rest of the project was worth it. Since users were adding models and columns at runtime we had to subsequently perform model rebuilds as things were added. Things worked fine for awhile, but eventually there were so many tables and columns that a single model rebuild was taking ~1.5 minutes on an EC2 large.
Initially, we decided to move the rebuild to a background cron process but even that began to take a prohibitively long time and made load balancing the app impossible. Then I started wondering is it possible to partially rebuild a Doctrine model for only the pieces that have changed?
Turns out it is possible. Looking at sfDoctrineBuildModelTask it looked like you could reasonably just copy the execute() method out and update a few lines.
Then, the next piece was just building the forms for the corresponding models. Again, looking at sfDoctrineBuildFormsTask it looked like it would be possible to extract and update the execute() method.
Anyway, without further ado here is the class I whipped up:
Using it is pretty straightforward, just call FastModelRebuild::doRebuild( array(“sfGuardUser”, “sfGuardUserProfile”) ); and thats it!
Anyway, fair warning I’d only do something like this if you “Know what you are doing” ™
As always, questions and comments are welcome.