Internet Explorer Extension Quickstart and Skeleton

Recently, one of our clients was looking to build a prototype/proof of concept browser extension for Firefox, Chrome, and Internet Explorer. We were basically looking to inject a script, run it in user space, and modify some of the page’s DOM – like a trimmed down Greasemonkey script.

Doing this in Chrome and Firefox is pretty straightforward since their extensions are built in JavaScript, and that is how I built the prototype extensions. Unsurprisingly, the “odd man out” is Internet Explorer. This was my first time looking into writing an IE extension and the experience was pretty jarring, so hopefully this synopsis can save you some time and frustration.

The first sign of trouble was that there doesn’t seem to be an official Microsoft guide on writing IE extensions. There’s just a bunch of ad-hoc tutorials, some MSDN articles, and then code samples built against every possible combination of language and library.

As it turns out, Internet Explorer has supported extensions since IE4 through a technology called Browser Helper Objects (BHOs), and still supports them in IE9. In contrast to Chrome and Firefox’s JavaScript-based extensions, a BHO is a Windows DLL and consequently must be written and compiled in a Win32-compatible programming language of your choice. From the discussions I saw, the most popular choice seems to be C++/ATL to create a COM DLL. Since I’m deathly afraid of C++ and this approach was described as “COM DLL hell”, I decided to see what else was possible.

After a bit more poking around I found out that it’s possible to use C# and .NET’s Interop libraries to scaffold enough to get the DLL loading into IE. This Code Project article walks through the process but it has several typos and the download containing the files seems to have gone missing. I fixed the typos and built it successfully – you can grab the files from GitHub here.

From my extremely rough understanding of C#, what the code does is create an interface from C# managed code to the unmanaged COM code that IE uses to communicate with extensions. Then, the code registers an event handler to be called once the DOM has finished rendering.

In order to actually build the project, you’ll need to do the following:

  • Install a copy of Visual Studio – I used VS Express 2010 which is free.
  • You’ll then need to import my VS project and build your DLL. If it complains about references to SHDocVw or IHTMLDocument2 you’ll just need to make sure that references to the two DLLs in Greyhound/ exist in your VS project.
  • Once the DLL is built, you’ll need to register it with RegAsm – this is a bit tricky since you need to use the correct version of RegAsm available on your system. This article explains where it should be located. Once you locate it, run the following:
C:\[path to your .NET library]\RegAsm.exe /codebase bin\Release\Greyhound.dll

That’s it. Now start Internet Explorer, and once the DOM on a page finishes loading you should see an alert box generated by the JavaScript that your DLL is injecting.

The magic is all happening in the following function:

public void OnDocumentComplete(object pDisp, ref object URL)
{
   // Append a <script> element to the body of the page that just finished loading.
   IHTMLDocument2 doc = (IHTMLDocument2)webBrowser.Document;
   doc.parentWindow.execScript("var d=window.document,s=d.createElement('script'),h=d.getElementsByTagName('body')[0];s.src='';h.appendChild(s);");
}

You’ll just need to set s.src in that string to the URL of your own JavaScript.
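
Unpacked, the one-line string passed to execScript does roughly the following. This is only a sketch for readability: the injectScript name and the example.com URL are hypothetical placeholders, not part of the project, and the original string leaves s.src empty.

```javascript
// Readable equivalent of the one-line string passed to execScript.
// Plain ES3 syntax (var, no arrow functions) so it also parses in old IE.
function injectScript(doc, scriptUrl) {
  // Create a <script> element owned by the page's document.
  var script = doc.createElement('script');
  // The original appends to <body>, which exists by the time the page has loaded.
  var host = doc.getElementsByTagName('body')[0];
  // Setting src makes the browser fetch and run the external file once attached.
  script.src = scriptUrl;
  host.appendChild(script);
  return script;
}

// In the page itself this would be called as (hypothetical URL):
// injectScript(window.document, 'https://example.com/my-extension.js');
```

Taking the document as a parameter keeps the function easy to try outside IE; the string handed to execScript can stay a one-liner like the original, with only the URL changed.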

Anyway, pretty gnarly stuff. It’s also extremely frightening that IE extensions are basically full-fledged programs that have free rein over your entire system. No sandbox, no permission limitations, just a fully integrated program that people “casually” download off the Internet.