Blog

PHP: Adding variables in the current scope

February 6th, 2010 by Ashish Datta

So earlier today I was working on a Facebook App and wanted to use “partials” the way Symfony does. At that point, I realized I had no idea how Symfony placed variables into the current execution scope when you do things like:

include_partial("somePartial", array("foo" => $foo, "bar" => $bar));

A bit of digging led me to the extract() function in PHP.

From the documentation, “extract — Import variables into the current symbol table from an array”.

Pretty neat.
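Here is a minimal sketch of how a partial helper could be built on top of extract(). It is not Symfony's actual implementation; the function name include_partial_sketch and the temporary template file are assumptions made up for the example.

```php
<?php
// Hypothetical helper: renders a template file with the given variables in scope.
function include_partial_sketch(string $templateFile, array $vars): string
{
    // Import each array key as a local variable. EXTR_SKIP keeps keys such as
    // "templateFile" or "vars" from overwriting the function's own variables.
    extract($vars, EXTR_SKIP);

    // Capture whatever the template prints and hand it back as a string.
    ob_start();
    include $templateFile;
    return ob_get_clean();
}

// Write a throwaway template to disk so the example is self-contained.
$file = tempnam(sys_get_temp_dir(), 'partial');
file_put_contents($file, 'Hello <?php echo $foo; ?> and <?php echo $bar; ?>!');

$out = include_partial_sketch($file, array("foo" => "Foo", "bar" => "Bar"));
unlink($file);

echo $out, PHP_EOL; // Hello Foo and Bar!
```

Because extract() runs inside the function, the imported variables only exist in that function's scope, which is the scope the included template sees.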


Cool kid stuff: Sizzle for PHP!?

January 6th, 2010 by Ashish Datta

Every now and then, I’ll end up having to scrape HTML pages for some content. I know, I know, there’s like a bazillion different ways to do this, but I *really* like doing it in PHP so I can jack right into Symfony. Usually, I just get down and dirty with the PHP DOM and use XPath to select nodes within the document. The problem with this is that XPath sucks and the PHP implementation sucks… a lot.
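For reference, this is roughly what the DOM plus XPath approach looks like. It is only a sketch: the HTML snippet and the "item" class are invented for the example, and it assumes PHP's DOM extension is enabled.

```php
<?php
// Made-up markup standing in for a scraped page.
$html = '<html><body><ul>'
      . '<li class="item">One</li>'
      . '<li class="item">Two</li>'
      . '<li>Three</li>'
      . '</ul></body></html>';

// Real-world HTML is rarely valid, so collect libxml warnings instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Select every <li> whose class attribute is exactly "item".
$items = array();
foreach ($xpath->query('//li[@class="item"]') as $node) {
    $items[] = trim($node->textContent);
}

print_r($items); // One, Two
```

With a CSS selector engine the query would simply be "li.item", which is the whole appeal.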

But hold on, we know a selector engine that doesn’t suck! The jQuery selector engine, called Sizzle, is probably one of the best CSS/DOM selector engines to use. Turns out there is a PHP port! Enter phpQuery.

At its root, phpQuery is a port of the jQuery selector syntax to PHP. Additionally, phpQuery includes dozens of the jQuery traversal methods like next(), prev(), find(), and so on. It also implements jQuery-style filters like :first, :last, :eq, etc.

Anyway, if you’re tired of suffering through the PHP XPath implementation and dig jQuery then you should definitely give phpQuery a whirl.


Happy Holidays!

December 25th, 2009 by Ashish Datta

Happy holidays everyone! Hope everyone had an awesome Christmas and is getting excited for a fun New Year’s Eve and then a great 2010.

Anyway, since the sun never sets on the Setfive empire I was actually doing some coding earlier when I ran across an interesting little problem. What I was looking to do was “match” a string input against a set of acceptable strings. The caveat was that the inputs might have spelling mistakes or typos. For example, an input might be “onnlinee ad” matching against ["online ad", "video", "news", "online"] with the goal of matching “online ad”.

Unfortunately, you can’t simply iterate over the two strings matching letters because a single wrong letter will cause you to miss all of the rest. Remembering back to some old engineering courses I found my way over to the Hamming distance article on Wikipedia. From there, I made my way over to the Levenshtein distance article which proved extremely useful.

So, at this point I figured I wanted to minimize the Levenshtein distance, and the candidate with the smallest distance would be my matching string. Fortunately enough, PHP has a built-in function to calculate Levenshtein distances: levenshtein(). The Levenshtein distance works pretty well for what I was looking to do. In addition, PHP has another built-in function, similar_text(), for comparing two strings. similar_text() returns the number of matching characters in the two input strings.
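A minimal sketch of the matching step, using the example input from above. The function name closest_match_sketch is made up for illustration; only levenshtein() is a real PHP built-in.

```php
<?php
// Return the candidate with the smallest Levenshtein distance to $input,
// or null if there are no candidates.
function closest_match_sketch(string $input, array $candidates): ?string
{
    $best = null;
    $bestDistance = PHP_INT_MAX;

    foreach ($candidates as $candidate) {
        $distance = levenshtein($input, $candidate);
        if ($distance < $bestDistance) {
            $bestDistance = $distance;
            $best = $candidate;
        }
    }

    return $best;
}

$acceptable = array("online ad", "video", "news", "online");
echo closest_match_sketch("onnlinee ad", $acceptable), PHP_EOL; // online ad
```

In practice you would probably also want a maximum distance cutoff, so that complete garbage doesn't get matched to whichever candidate happens to be least far away.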

Anyway, the only thing to be aware of is that both of these functions have pretty bad running times. similar_text() clocks in at O(n^3), where n is the length of the longer string, and levenshtein() runs at O(m*n), where m and n are the lengths of the input strings.

Well that’s it for now. Happy string comparing.


Retrieve session timeout in Symfony

December 1st, 2009 by Ashish Datta

We were recently working on an application that required users to enter a significant amount of complex data, which often meant that they had to look things up in between saves. Users kept running into the problem that their sfGuard sessions would time out before they were able to click “Save” on the form, which in turn caused them to lose all of their hard work. Obviously, this is lame, so we decided to add a popup warning users that their session had expired and prompting them to log in again before saving their data.

We decided to implement this by using setTimeout in Javascript to pop up a window once the user’s session had expired.

Setting the session length for a Symfony user is easy enough. Open up app/config/factories.yml and add the following:

all:
  user:
    class: myUser
    param:
      timeout: 1800 # this is the default but you can change it at will (it's in seconds)

As it turns out, the tricky part is accessing this value inside the application. Uncharacteristically, I couldn’t find anything in the Symfony documentation about how to access these variables. For whatever reason, sfConfig::get() doesn’t provide access to the variables in factories.yml.

In order to get that timeout value I used (inside a template):

  $userOptions = $sf_user->getOptions();
  $timeout = $userOptions["timeout"];

Anyway, once I figured that out the rest is pretty straightforward.

After $timeout seconds, a Javascript function opens a jQuery UI Dialog box informing the user that their session has expired and presents the standard sfGuard sign in form. I override the onSubmit of this form to perform the request via AJAX in the background (so the user doesn’t lose their data), and if the credentials are valid the dialog closes and the user can go on their way. If the credentials are invalid, the form re-populates with any errors and the user can correct them to log back in to the app.
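One way to wire the PHP value into the page is to have the template print the setTimeout call. This is only a sketch: session_timeout_script_sketch and the Javascript function name showSessionExpiredDialog are hypothetical, and the one thing to get right is that the Symfony timeout is in seconds while setTimeout wants milliseconds.

```php
<?php
// Hypothetical helper: builds a <script> tag that calls a Javascript function
// once the session timeout (given in seconds) has elapsed.
function session_timeout_script_sketch(int $timeoutSeconds, string $jsCallback = 'showSessionExpiredDialog'): string
{
    // setTimeout() takes milliseconds, factories.yml stores seconds.
    $milliseconds = $timeoutSeconds * 1000;

    return sprintf(
        '<script type="text/javascript">window.setTimeout(%s, %d);</script>',
        $jsCallback,
        $milliseconds
    );
}

// In a template this would be the $timeout value read from $sf_user->getOptions().
echo session_timeout_script_sketch(1800), PHP_EOL;
```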

Hope everyone had a good Thanksgiving!


jQuery UI $.dialog – on the fly HTML

November 13th, 2009 by Ashish Datta

Wow, it’s been a while!

We’ve been insanely busy over the last month or so. We launched Setfive Ventures and are anxiously anticipating the launch of both WeGov and OmniStrat in the immediate future. There are also a handful of internal projects that should be rolling out before Christmas. Get excited.

Anyway, the jQuery UI Dialog class is pretty sweet. Basically, it provides a class to display a modal dialog box from a regular old DOM element (a div, span, or whatever.)

One of the things that isn’t explained well (or at all?) in the documentation is that you can create a dialog with on the fly HTML! I found this out after posting on the Google Group asking why this feature didn’t exist (it does. Ashish fail.)

So if you want to create a dialog with on the fly HTML all you need to do is:

$("<p>Hello World!</p>").dialog();

Pretty sweet.


Adding ORDER BY FIELD to Propel Criterias

October 13th, 2009 by Ashish Datta

Every now and then, we use Sphinx to provide full text searching in MySQL InnoDB tables. Sphinx is pretty solid. It’s easy to set up, pretty fast, and easy to deploy.

My one big issue with Sphinx has always been making it play nice with Symfony, specifically Propel. Sphinx returns a result set as an ordered list of [id, weight] for each document it matched. As outlined here, the idea is to then hit your MySQL server to return the actual documents and use “ORDER BY FIELD(id, [id list])” to keep them in the same order as the list you received.

The problem is, Propel Criteria objects provide no mechanism to set an ORDER BY FIELD. This is an issue because if you drop Criterias you lose Propel Pagers, which generally leads to a lot of duplicated code and is honestly just not very elegant.

Anyway, after some thought I came up with this solution.

If you read through the definition of “Criteria::addDescendingOrderByColumn()”:

	/**
	 * Add order by column name, explicitly specifying descending.
	 *
	 * @param      string $name The name of the column to order by.
	 * @return     Criteria Modified Criteria object (for fluent API)
	 */
	public function addDescendingOrderByColumn($name)
	{
		$this->orderByColumns[] = $name . ' ' . self::DESC;
		return $this;
	}

All it really does is add the second part of the ORDER BY clause to an array which then gets joined up to build the final SQL. Because of this, you can actually just add an element onto the orderByColumns array which will cause Propel to execute an ORDER BY FIELD SQL statement.

To make the magic happen, I sub-classed Criteria and then added an addOrderByField() function to let me add a field to order by as well as a list to order by.


class sfCriteria extends Criteria {

  private $myOrderByColumns = array();

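  /**
   * Add an ORDER BY FIELD clause.
   *
   * @param string $name The field to order by.
   * @param array $elements The list of values that defines the ordering.
   * @return sfCriteria Modified Criteria object (for fluent API)
   */
  public function addOrderByField($name, $elements)
  {
    // $elements is concatenated straight into the SQL, so only pass trusted values (e.g. integer ids).
    $this->myOrderByColumns[] = ' FIELD(' . $name . ', ' . join(", ", $elements) . ')';
    return $this;
  }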

  public function getOrderByColumns(){
    return array_merge( $this->myOrderByColumns, parent::getOrderByColumns() );
  }
}

To use it, do something like this:

$ids = array(1, 3, 7);
$c = new sfCriteria();
$c->add( SomeModelPeer::ID, $ids, Criteria::IN);
$c->addOrderByField( SomeModelPeer::ID, $ids);
$results = SomeModelPeer::doSelect( $c );

And that’s about it. Since sfCriteria is a sub-class of Criteria the code works seamlessly with existing PropelPagers and anything else that expects a Propel Criteria.
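To see what ends up in the query, here is a standalone sketch that builds the same FIELD() fragment without Propel. The function name, table name, and column name are made up for illustration, and unlike the class above it casts the ids to integers first, since the list is concatenated directly into SQL.

```php
<?php
// Hypothetical helper: builds the FIELD(...) expression for an ORDER BY clause.
function order_by_field_clause_sketch(string $column, array $ids): string
{
    // Cast every id to int so nothing untrusted reaches the SQL string.
    $safeIds = array_map('intval', $ids);

    return 'FIELD(' . $column . ', ' . implode(', ', $safeIds) . ')';
}

// Pretend Sphinx returned the documents in this order of relevance.
$ids = array(7, 1, 3);

$sql = 'SELECT * FROM some_model WHERE id IN (' . implode(', ', $ids) . ')'
     . ' ORDER BY ' . order_by_field_clause_sketch('some_model.id', $ids);

echo $sql, PHP_EOL;
// SELECT * FROM some_model WHERE id IN (7, 1, 3) ORDER BY FIELD(some_model.id, 7, 1, 3)
```

MySQL's FIELD() returns the position of the first argument in the rest of the list, so sorting by it hands the rows back in the order of the list.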


Regex To Extract URLs From Plain Text

October 7th, 2009 by Matt Daum

Recently, for a project, we had the problem that it pulled data from numerous APIs and sometimes the data would contain URLs that were not HTML links (i.e. they were just http://www.mysite.com instead of <a href="http://www.mysite.com">http://mysite.com</a>). I searched around the web for a while and had no luck finding a regex that would extract only URLs that are not already wrapped inside of an HTML tag. I came up with the following regex:

/(?<![\>https?:\/\/|href=\"'])(?<http>(https?:[\/][\/]|www\.)([a-z]|[A-Z]|[0-9]|[\/.]|[~])*)/

Parts of it are taken from other examples of URL extractors. However, none of the examples I found had lookarounds to make sure the URL isn’t already linked. I am not a master of regex, so there may be a better expression than the one I wrote. The above expression is written to be compatible with PHP’s preg_replace function. A more generic one is as follows:

(?<![\>https?://|href="'])(?<http>(https?:[/][/]|www.)([a-z]|[A-Z]|[0-9]|[/.]|[~])*)

This expression will match http://www.mysite.com and www.mysite.com and any subdomains of a website. The first matched group is the URL. One thing to note if you are using this: check whether the matched URL has an http:// on the front of it. If it does not, prepend one; otherwise the link will be relative and resolve to something like http://www.mysite.com/www.mysite.com .
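Here is a sketch of how the PHP version of the expression could be used to turn bare URLs into links, including the http:// check described above. The constant and function names are made up for the example; the pattern itself is the one from this post, with the single quote escaped so it fits in a PHP string.

```php
<?php
// The PHP-flavoured expression from this post, unchanged apart from string escaping.
const URL_PATTERN_SKETCH = '/(?<![\>https?:\/\/|href=\"\'])(?<http>(https?:[\/][\/]|www\.)([a-z]|[A-Z]|[0-9]|[\/.]|[~])*)/';

// Hypothetical helper: wraps every bare URL in $text in an <a> tag.
function linkify_sketch(string $text): string
{
    return preg_replace_callback(URL_PATTERN_SKETCH, function (array $m): string {
        $url = $m['http'];

        // Matches that start with "www." need a scheme, otherwise the browser
        // treats the href as a relative link.
        $href = preg_match('#^https?://#', $url) ? $url : 'http://' . $url;

        return '<a href="' . $href . '">' . $url . '</a>';
    }, $text);
}

echo linkify_sketch('Visit www.mysite.com today'), PHP_EOL;
// Visit <a href="http://www.mysite.com">www.mysite.com</a> today

// URLs that are already linked are left alone.
echo linkify_sketch('<a href="http://www.mysite.com">http://www.mysite.com</a>'), PHP_EOL;
```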

One tool that was very helpful in making this was http://gskinner.com/RegExr . It gives you a visual representation, in real time as you create your expression, of what it will match.

Note: You will lose the battle in trying to extract URLs using regex. For example, the above expression will fail on a style="background:url(http://mysite.com/image.jpg)". For a more robust solution it may be worthwhile to parse the DOM and then run the regex per element.


Random acts of madness: JS+Flex+Rhino – WebWorkers for IE

October 6th, 2009 by Ashish Datta

Preface: This is a bad idea with untested code. If you deploy it on a production server bad things will happen.

A few weeks ago I was trolling the Internet and ran across an interesting piece over at John Resig’s blog about Javascript WebWorkers. Basically, WebWorkers are a draft recommendation that allows you to run Javascript functions on a background (non-UI) thread. Essentially, they would allow you to do long running computations without hanging the browser’s UI. Pretty neat. Problem is that they are currently only available in Firefox 3.5+, Safari 4, and Chrome 3.0.

In my never ending quest to use every buzzword at least once I decided to try and implement a compatibility layer to bring support for WebWorkers to other browsers. The plan was to use Java 6’s new embedded Javascript interpreter (it’s just Rhino) to execute the WebWorker code server side and then pipe the output back to the client. Again, this is really just a proof of concept.

There are three parts to the rig: the client Javascript library, a Flex/AS3 application for streaming client to server communication, and a Java application that uses Rhino to execute the Javascript.

Client Javascript

The client Javascript detects the user’s browser and then will define a “Worker” object if the user’s browser doesn’t support WebWorkers. The new “Worker” object uses the Flex application to pass messages back and forth to the server and calls the user’s onmessage function when data arrives from the server.

I sniped the browser detection code from Quirksmode and it seems to work fairly well. The rest of the code is below:

BrowserDetect.init();

var sfWebWorkers = new Array();
var SF_WORKER_SERVER = "192.168.1.102";
var SF_WORKER_PORT = "9999";
var sfWwConduitIsLoaded = false;

function sfWebWorkersRecieveData(msg){
  var obj = $.evalJSON( msg );
  var e = new Object();
  e.data = obj.data;
  sfWebWorkers[ obj.sfWebWorkerId ].onmessage( e );
}

function sfWebWorkersSWFReady(isReady){
  sfWwConduitIsLoaded = true;
}

if(!((BrowserDetect.browser == "Firefox" && BrowserDetect.version == "3.5")
	    || (BrowserDetect.browser == "Safari" && BrowserDetect.version == "4")) ){

  $(document).ready( function(){
    var params = "{\"allowscriptaccess\": \"always\"}";
    var vars = "{\"server\": \"" + SF_WORKER_SERVER + "\""
                + ", \"port\": \"" + SF_WORKER_PORT + "\"}";

    // The markup strings passed to append() are empty in this listing.
    $("body").append( "" );
    $("body").append( "" );
  });

  var Worker = function(fileName){
    this.messages = new Array();
    this.fileName = fileName;
    this.id = sfWebWorkers.length;
    this.isLoaded = false;
    sfWebWorkers.push( this );

    var pathToFile = "http://" + window.location.hostname + ":" + window.location.port + "/" + fileName;
    var myId = this.id;

    var loadWorker = function(){
      if( sfWwConduitIsLoaded ){
        sfWebWorkers[ myId ].isLoaded = true;
        getFlashMovie("sfWebWorker").sendDataToFlash( $.toJSON( { message_type: 1, id: myId, message: pathToFile } ) );
      }else{
        window.setTimeout( function(){ loadWorker(); }, 500 );
      }
    };

    loadWorker();
  };

  Worker.prototype.postMessage = function(data){
    var myId = this.id;
    var isLoaded = this.isLoaded;

    var sendData = function(data){
      if( sfWwConduitIsLoaded ){
        var e = new Object();
        e.data = data;
        getFlashMovie("sfWebWorker").sendDataToFlash( $.toJSON( { message_type: 2, id: myId, message: $.toJSON(e) } ) );
      }else{
        window.setTimeout( function(){ sendData(data); }, 500 );
      }
    };

    sendData(data);
  };
}

function getFlashMovie(movieName) {
  var isIE = navigator.appName.indexOf("Microsoft") != -1;
  return (isIE) ? window[movieName] : document[movieName];
}

Flex/AS3 Application

The Flex application is basically a dumb conduit between the server and the client. All it really does is pass messages between the Java on the server and the Javascript on the client.

The trickiest part of getting this to work was Adobe’s insane rules for allowing their Socket classes to connect to servers. In order for the client to successfully connect to the server you need to serve an XML policy file from port 843. Additionally, this file can’t be served by an HTTP server; it must come from a custom server that only spits back the file followed by a null byte. A detailed description of this abortion is here: http://www.adobe.com/devnet/flashplayer/articles/socket_policy_files.html

This really poses two problems. One, you need to be running some random “policy file server” for Flex sockets to be of any use. And two, since 843 is a privileged port, this server can’t be started by a regular user.
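To give an idea of how little such a “policy file server” has to do, here is a rough sketch in PHP. It is not the server used in this project; the function names are made up, and the wide-open domain="*" policy is an assumption you would want to tighten.

```php
<?php
// Build the reply Flash expects: the policy XML followed by a single null byte.
function policy_response_sketch(int $allowedPort): string
{
    $xml = '<?xml version="1.0"?>'
         . '<cross-domain-policy>'
         . '<allow-access-from domain="*" to-ports="' . $allowedPort . '"/>'
         . '</cross-domain-policy>';

    return $xml . "\0";
}

// Accept connections forever and answer each one with the policy.
function serve_policy_sketch(int $listenPort, int $allowedPort): void
{
    $server = stream_socket_server("tcp://0.0.0.0:$listenPort", $errno, $errstr);
    if ($server === false) {
        throw new RuntimeException("Could not bind to port $listenPort: $errstr ($errno)");
    }

    while (true) {
        $client = @stream_socket_accept($server, -1);
        if ($client === false) {
            continue;
        }
        fread($client, 64); // Flash sends "<policy-file-request/>" plus a null byte
        fwrite($client, policy_response_sketch($allowedPort));
        fclose($client);
    }
}

// Binding to port 843 requires root, which is exactly the annoyance described above.
// serve_policy_sketch(843, 9999);
```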

The most interesting parts of the ActionScript are probably the snippets that call out to Javascript functions:

ExternalInterface.call("sfWebWorkersSWFReady", true);

Java Server

The most complicated part of this whole thing is probably the Java application that actually executes the WebWorker Javascript. All the communication between the Flex and Java is done entirely with JSON. The server basically does the following:

  1. Listen for connections from the Flex and accept them when they come in.
  2. When a message comes in – it can either be a request to create a new web worker or a postMessage() event containing some data for an existing worker.
  3. If it’s a request for a new worker, the server will download the Javascript file and then execute it inside a Rhino container.
  4. Otherwise if Flex passed a postMessage() message the server will forward that data to the running web worker.
  5. The other event that happens is that a web worker can send messages back to the Flex.
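The dispatch in steps 2 through 4 can be sketched like this. The real server is Java and runs the worker in Rhino; this PHP stand-in only shows the routing. The message_type values 1 and 2 and the id/message keys come from the client Javascript above, while the function name, the return values, and the in-memory $workers array are made up for illustration.

```php
<?php
const MSG_CREATE_WORKER = 1; // sent by the Worker constructor on the client
const MSG_POST_MESSAGE  = 2; // sent by Worker.postMessage() on the client

// Hypothetical router: decides what to do with one JSON message from Flex.
function dispatch_message_sketch(string $json, array &$workers): string
{
    $msg = json_decode($json, true);
    if (!is_array($msg) || !isset($msg['message_type'], $msg['id'], $msg['message'])) {
        return 'ignored';
    }

    if ($msg['message_type'] === MSG_CREATE_WORKER) {
        // The real server would download the script at this URL and start it in Rhino.
        $workers[$msg['id']] = array('script' => $msg['message'], 'inbox' => array());
        return 'created';
    }

    if ($msg['message_type'] === MSG_POST_MESSAGE && isset($workers[$msg['id']])) {
        // The real server would hand this payload to the running worker's onmessage.
        $workers[$msg['id']]['inbox'][] = $msg['message'];
        return 'forwarded';
    }

    return 'ignored';
}

$workers = array();
echo dispatch_message_sketch('{"message_type":1,"id":0,"message":"http://localhost/worker.js"}', $workers), PHP_EOL;
echo dispatch_message_sketch('{"message_type":2,"id":0,"message":"{\"data\":\"hello\"}"}', $workers), PHP_EOL;
```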

Anyway, I tested this on IE7+ and it seemed to work decently well. Per the warning up top, I don’t want to leave this running on a live server anywhere.

If you want to get it to actually run, do the following:

  1. Download the zip of all the sources here.
  2. Start the JAR in WebWorkerServer/WebWorkerServer.jar with java -jar WebWorkerServer.jar 9999
  3. At the top of web/sfwwcompat.js change the IP address of the server to where your Java server is located (localhost if you want)
  4. Open web/wwsha1.html in IE or Chrome 2.0 and you should see stuff happen.

What’s in the box:

  • web/ contains the Javascript and a demo.
  • WebWorkerConduit/ contains the Flex application.
  • WebWorkerServer/ contains the Java server.

Credits: I borrowed the WebWorker SHA1 implementation from John Resig who adapted it from Ray C Morgan.

So here is another crazy idea. Instead of executing the WebWorker code on the server, would it be possible to dynamically make the WebWorker code re-entrant using setTimeout() on the client where loop structures exist?


FanFeedr Widgets Are Live!

September 30th, 2009 by Ashish Datta

Over the past few weeks we had the opportunity to work with FanFeedr to put together some widgets for their sports news platform. Previously, FanFeedr had been using Sprout to build their widgets but this required someone to hand build a Flash widget for every “resource” on FanFeedr (there are a lot). In addition, since the Sprout widgets are Flash they aren’t easily crawled by search engines.

Our widgets are different. They allow FanFeedr to generate widgets on the fly for any of their pages and allow users to customize the color schemes. Check out a widget builder for the NY Yankees here.

Basically, our widget builder works by allowing users to customize the size and colors used in the widget. This data is serialized as a JSON object and then base64 encoded so that it can be sent to the “generator” on the server. Then, the server unpacks the payload and builds a widget according to the data specified in the JSON object. In addition, our embed code includes a noscript tag so that search engines pick up the links in the widget as well.
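The pack and unpack steps look roughly like this. It is a sketch rather than the FanFeedr code: the function names and option keys (width, height, background) are invented for the example.

```php
<?php
// Serialize the widget options as JSON, then base64 encode them for transport.
function pack_widget_options_sketch(array $options): string
{
    return base64_encode(json_encode($options));
}

// Reverse the process on the server; returns null if the payload is malformed.
function unpack_widget_options_sketch(string $payload): ?array
{
    $json = base64_decode($payload, true); // strict mode rejects invalid characters
    if ($json === false) {
        return null;
    }

    $options = json_decode($json, true);
    return is_array($options) ? $options : null;
}

$payload = pack_widget_options_sketch(array(
    "width" => 300,
    "height" => 250,
    "background" => "#003366",
));

print_r(unpack_widget_options_sketch($payload));
```

If the payload travels in a query string it still needs urlencode(), since base64 output can contain +, / and = characters.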

Anyway, working with FanFeedr was a great experience and we hope to continue our relationship moving forward. Go build yourself a widget!


The Redline Challenge

September 27th, 2009 by Ashish Datta

For one reason or another we decided to sponsor a pub crawl this weekend. The plan was hatched over some beers at Underbones on Thursday night for a Saturday morning go time. We knew we basically needed three things: a list of bars, some swag (T-shirt?), and obviously a website. We decided that the route of the crawl should follow the MBTA Redline so that we could start downtown and then finish in Somerville. This made picking bars pretty simple, gave us some branding, and of course we registered REDLINECHALLENGE.COM.

We wanted the website to have some useful information, live location updates, and of course pictures of the debauchery. The biggest problem was that neither Daum nor I have location aware phones. To get around this, we decided to update Twitter with our current location along with a “#loc” hashtag and then have the site update based on that. Since we were already using Twitter, we decided to use Twitpic to allow us to post pictures to Twitter on the fly. Additionally, we took advantage of Verizon Wireless’s email to SMS service and allowed people to contact us via the website. All told, we built the site in about 3 hours and it proved to be pretty useful. People used it to find us on the crawl and to contact us while we were out. Everyone also got a kick out of seeing a live photo stream.
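The “#loc” trick boils down to something like the sketch below. The function name is made up, and it assumes that everything in the tweet other than the hashtag is the location, which is a guess at the format rather than what the site actually did.

```php
<?php
// Hypothetical helper: returns the location text from a tweet tagged with #loc,
// or null if the tweet isn't a location update.
function location_from_tweet_sketch(string $tweet): ?string
{
    // Match "#loc" as a whole hashtag so that "#location" doesn't count.
    $pattern = '/(?<!\S)#loc\b/i';
    if (!preg_match($pattern, $tweet)) {
        return null;
    }

    // Drop the hashtag and tidy up the leftover whitespace.
    $location = preg_replace($pattern, '', $tweet);
    $location = trim(preg_replace('/\s+/', ' ', $location));

    return $location === '' ? null : $location;
}

echo location_from_tweet_sketch("Grendel's Den, Harvard Square #loc"), PHP_EOL;
// Grendel's Den, Harvard Square
```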

What’s next? Clearly, The Greenline Challenge.