<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: sphinx-search</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/sphinx-search.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2012-02-11T18:39:00+00:00</updated><author><name>Simon Willison</name></author><entry><title>How can you build a search engine for a website built in PHP/MySQL?</title><link href="https://simonwillison.net/2012/Feb/11/how-can-you-build/#atom-tag" rel="alternate"/><published>2012-02-11T18:39:00+00:00</published><updated>2012-02-11T18:39:00+00:00</updated><id>https://simonwillison.net/2012/Feb/11/how-can-you-build/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;em&gt;My answer to &lt;a href="https://www.quora.com/How-can-you-build-a-search-engine-for-a-website-built-in-PHP-MySQL/answer/Simon-Willison"&gt;How can you build a search engine for a website built in PHP/MySQL?&lt;/a&gt; on Quora&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;There are a bunch of options.&lt;/p&gt;

&lt;p&gt;The easiest option to implement is to build search on top of MySQL LIKE queries. Performance will be pretty terrible (a pattern with a leading wildcard, such as LIKE '%term%', can't use an index, so every search requires a full table scan), but provided your tables only have a few thousand records in them and your site doesn't have to cope with more than a dozen or so hits a second it should work fine.&lt;/p&gt;

&lt;p&gt;Next easiest: use MySQL's built-in full text indexing feature. It's not particularly good, and it requires you to use MyISAM tables (InnoDB is much more reliable, but doesn't support full text indexing) - but it will do the job. You could always keep your main site data in InnoDB and denormalise into a MyISAM table just for search - or you could use the trick Flickr used to use, which is to set up MySQL replication and run MyISAM on one of the slaves purely to support fulltext search.&lt;/p&gt;

&lt;p&gt;Past that, you're looking at adding another component to the stack. Sphinx can integrate directly with MySQL and lets you run SQL-style queries against a proper full text index. Personally I'm a big fan of Solr, which runs as a separate (Java) server and requires you to index documents over HTTP. The great thing about Solr is that you can talk to it from any language that has an HTTP client library.&lt;/p&gt;

&lt;p&gt;The last option is to go for a hosted solution. Google Custom Search is free, but not particularly flexible. IndexTank was a good option here but they were acquired by LinkedIn and are shutting down the hosted service - they've since open sourced their software and other companies such as &lt;span&gt;&lt;a href="http://www.searchify.com/"&gt;http://www.searchify.com/&lt;/a&gt;&lt;/span&gt; are starting to offer it as a hosted solution.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/mysql"&gt;mysql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/php"&gt;php&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/search-engines"&gt;search-engines&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sphinx-search"&gt;sphinx-search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/quora"&gt;quora&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="mysql"/><category term="php"/><category term="search-engines"/><category term="sphinx-search"/><category term="quora"/></entry><entry><title>Who are major competitors to Solr?</title><link href="https://simonwillison.net/2010/Sep/2/who-are-major-competitors/#atom-tag" rel="alternate"/><published>2010-09-02T18:01:00+00:00</published><updated>2010-09-02T18:01:00+00:00</updated><id>https://simonwillison.net/2010/Sep/2/who-are-major-competitors/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;em&gt;My answer to &lt;a href="https://www.quora.com/Who-are-major-competitors-to-Solr/answer/Simon-Willison"&gt;Who are major competitors to Solr?&lt;/a&gt; on Quora&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;ElasticSearch is a really interesting one - it's the same underlying search library (Lucene) and the same integration model (an HTTP interface) but takes quite a different approach. It hasn't been around for a long time but it looks very impressive: &lt;span&gt;&lt;a href="http://www.elasticsearch.com/"&gt;http://www.elasticsearch.com/&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;Other than that, popular open source search engines include Sphinx and Xapian. I'm a big fan of talking to a search engine via HTTP, so I've been keeping an eye on the &lt;span&gt;&lt;a href="http://www.flax.co.uk/"&gt;http://www.flax.co.uk/&lt;/a&gt;&lt;/span&gt; project which does that for Xapian.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apache"&gt;apache&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/lucene"&gt;lucene&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/search"&gt;search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/search-engines"&gt;search-engines&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/solr"&gt;solr&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sphinx-search"&gt;sphinx-search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/xapian"&gt;xapian&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/quora"&gt;quora&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="apache"/><category term="lucene"/><category term="search"/><category term="search-engines"/><category term="solr"/><category term="sphinx-search"/><category term="xapian"/><category term="quora"/></entry><entry><title>How do Solr, Lucene, Sphinx and Searchify compare?</title><link href="https://simonwillison.net/2010/Aug/26/how-do-solr-lucene/#atom-tag" rel="alternate"/><published>2010-08-26T14:14:00+00:00</published><updated>2010-08-26T14:14:00+00:00</updated><id>https://simonwillison.net/2010/Aug/26/how-do-solr-lucene/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;em&gt;My answer to &lt;a href="https://www.quora.com/How-do-Solr-Lucene-Sphinx-and-Searchify-compare/answer/Simon-Willison"&gt;How do Solr, Lucene, Sphinx and Searchify compare?&lt;/a&gt; on Quora&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Lucene is a Java library for creating and searching through a full text index. If you want to make use of it, you'll need to write your own Java code that integrates with it.&lt;/p&gt;

&lt;p&gt;Solr is a web service that is built on top of the Lucene library. You can talk to it over HTTP from any programming language - so you can take advantage of the power of Lucene without having to write any Java code at all. Solr also adds a number of features that Lucene leaves out such as sharding and replication.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/databases"&gt;databases&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/lucene"&gt;lucene&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/search"&gt;search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/search-engines"&gt;search-engines&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/solr"&gt;solr&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sphinx-search"&gt;sphinx-search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/web-development"&gt;web-development&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/quora"&gt;quora&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="databases"/><category term="lucene"/><category term="search"/><category term="search-engines"/><category term="solr"/><category term="sphinx-search"/><category term="web-development"/><category term="quora"/></entry><entry><title>Ravelry</title><link href="https://simonwillison.net/2009/Sep/3/ravelry/#atom-tag" rel="alternate"/><published>2009-09-03T18:50:20+00:00</published><updated>2009-09-03T18:50:20+00:00</updated><id>https://simonwillison.net/2009/Sep/3/ravelry/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.tbray.org/ongoing/When/200x/2009/09/02/Ravelry"&gt;Ravelry&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Tim Bray interviews Casey Forbes, the single engineer behind Ravelry, the knitting community that serves 10 million Rails requests a day using just seven physical servers, MySQL, Sphinx, memcached, nginx, haproxy, passenger and Tokyo Cabinet.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/caseyforbes"&gt;caseyforbes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/haproxy"&gt;haproxy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mysql"&gt;mysql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nginx"&gt;nginx&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/passenger"&gt;passenger&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rails"&gt;rails&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ravelry"&gt;ravelry&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/scaling"&gt;scaling&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sphinx-search"&gt;sphinx-search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tim-bray"&gt;tim-bray&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tokyocabinet"&gt;tokyocabinet&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tokyotyrant"&gt;tokyotyrant&lt;/a&gt;&lt;/p&gt;



</summary><category term="caseyforbes"/><category term="haproxy"/><category term="memcached"/><category term="mysql"/><category term="nginx"/><category term="passenger"/><category term="rails"/><category term="ravelry"/><category term="scaling"/><category term="sphinx-search"/><category term="tim-bray"/><category term="tokyocabinet"/><category term="tokyotyrant"/></entry><entry><title>Sphinx 0.9.9-rc2 is out</title><link href="https://simonwillison.net/2009/Apr/8/sphinx/#atom-tag" rel="alternate"/><published>2009-04-08T13:59:26+00:00</published><updated>2009-04-08T13:59:26+00:00</updated><id>https://simonwillison.net/2009/Apr/8/sphinx/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://sphinxsearch.com/news/37.html"&gt;Sphinx 0.9.9-rc2 is out&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Interesting new feature: the Sphinx search server now supports the MySQL binary protocol, so you can talk to it using a regular MySQL client library and fire off search queries using SELECT syntax and the new SphinxQL query language.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/full-text-search"&gt;full-text-search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mysql"&gt;mysql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/search"&gt;search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sphinx-search"&gt;sphinx-search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sql"&gt;sql&lt;/a&gt;&lt;/p&gt;



</summary><category term="full-text-search"/><category term="mysql"/><category term="search"/><category term="sphinx-search"/><category term="sql"/></entry><entry><title>In-Depth django-sphinx Tutorial</title><link href="https://simonwillison.net/2008/Mar/5/indepth/#atom-tag" rel="alternate"/><published>2008-03-05T00:03:45+00:00</published><updated>2008-03-05T00:03:45+00:00</updated><id>https://simonwillison.net/2008/Mar/5/indepth/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.davidcramer.net/code/79/in-depth-django-sphinx-tutorial.html"&gt;In-Depth django-sphinx Tutorial&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Another neat Django extension from the guys at Curse: easy integration with the Sphinx full text search engine.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/curse"&gt;curse&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/david-cramer"&gt;david-cramer&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/search"&gt;search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sphinx-search"&gt;sphinx-search&lt;/a&gt;&lt;/p&gt;



</summary><category term="curse"/><category term="david-cramer"/><category term="django"/><category term="python"/><category term="search"/><category term="sphinx-search"/></entry><entry><title>django-sphinx</title><link href="https://simonwillison.net/2007/Sep/9/djangosphinx/#atom-tag" rel="alternate"/><published>2007-09-09T00:35:19+00:00</published><updated>2007-09-09T00:35:19+00:00</updated><id>https://simonwillison.net/2007/Sep/9/djangosphinx/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://code.google.com/p/django-sphinx/"&gt;django-sphinx&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
More code from Curse Gaming; this time a really nice API for adding Sphinx full-text search to a Django model.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="http://www.davidcramer.net/code/54/mediawiki-markup-and-sphinxsearch-for-django.html"&gt;David Cramer&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cursegaming"&gt;cursegaming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/david-cramer"&gt;david-cramer&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/full-text-search"&gt;full-text-search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/search"&gt;search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sphinx-search"&gt;sphinx-search&lt;/a&gt;&lt;/p&gt;



</summary><category term="cursegaming"/><category term="david-cramer"/><category term="django"/><category term="full-text-search"/><category term="orm"/><category term="python"/><category term="search"/><category term="sphinx-search"/></entry></feed>