<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: crowbar</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/crowbar.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2009-01-24T23:52:55+00:00</updated><author><name>Simon Willison</name></author><entry><title>Crowbar</title><link href="https://simonwillison.net/2009/Jan/24/crowbar/#atom-tag" rel="alternate"/><published>2009-01-24T23:52:55+00:00</published><updated>2009-01-24T23:52:55+00:00</updated><id>https://simonwillison.net/2009/Jan/24/crowbar/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://simile.mit.edu/wiki/Crowbar"&gt;Crowbar&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Headless Gecko/XULRunner application that exposes a web service API for screen scraping using a real browser DOM. Pass it the URL of a page and the URL of a screen-scraping JavaScript file (a bit like a Greasemonkey user script) and get back RDF/XML.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/crowbar"&gt;crowbar&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/dom"&gt;dom&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gecko"&gt;gecko&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/greasemonkey"&gt;greasemonkey&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mozilla"&gt;mozilla&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rdf"&gt;rdf&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/scraping"&gt;scraping&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/webservice"&gt;webservice&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/xml"&gt;xml&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/xulrunner"&gt;xulrunner&lt;/a&gt;&lt;/p&gt;



</summary><category term="crowbar"/><category term="dom"/><category term="gecko"/><category term="greasemonkey"/><category term="mozilla"/><category term="rdf"/><category term="scraping"/><category term="webservice"/><category term="xml"/><category term="xulrunner"/></entry></feed>