<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: rdbms</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/rdbms.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2013-10-30T16:27:00+00:00</updated><author><name>Simon Willison</name></author><entry><title>How was FriendFeed's schema less db faster than pure MySQL?</title><link href="https://simonwillison.net/2013/Oct/30/how-was-friendfeeds-schema/#atom-tag" rel="alternate"/><published>2013-10-30T16:27:00+00:00</published><updated>2013-10-30T16:27:00+00:00</updated><id>https://simonwillison.net/2013/Oct/30/how-was-friendfeeds-schema/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;em&gt;My answer to &lt;a href="https://www.quora.com/How-was-FriendFeeds-schema-less-db-faster-than-pure-MySQL?no_redirect=1"&gt;How was FriendFeed&amp;#39;s schema less db faster than pure MySQL?&lt;/a&gt; on Quora&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The principle reason they switched to a schemaless DB was to work around the challenges of having to make schemes changes in MySQL, which can lock the table and take hours if bit days to complete in large tables.&lt;/p&gt;

&lt;p&gt;The performance improvement shown in the graph is almost certainly because they almost entirely eliminated joins and complex queries when they switched to the new mechanism. This meant that all it their database traffic was now simple queries, which have much more predictable performance characteristics. MySQL (in fact all databases) are extremely fast at primary key lookups and index scans.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/databases"&gt;databases&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/friendfeed"&gt;friendfeed&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mysql"&gt;mysql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nosql"&gt;nosql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/quora"&gt;quora&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rdbms"&gt;rdbms&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="databases"/><category term="friendfeed"/><category term="mysql"/><category term="nosql"/><category term="quora"/><category term="rdbms"/></entry><entry><title>What tools and techniques are used for relational database version control (structure and data)?</title><link href="https://simonwillison.net/2012/Dec/24/what-tools-and-techniques/#atom-tag" rel="alternate"/><published>2012-12-24T12:29:00+00:00</published><updated>2012-12-24T12:29:00+00:00</updated><id>https://simonwillison.net/2012/Dec/24/what-tools-and-techniques/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;em&gt;My answer to &lt;a href="https://www.quora.com/What-tools-and-techniques-are-used-for-relational-database-version-control-structure-and-data/answer/Simon-Willison"&gt;What tools and techniques are used for relational database version control (structure and data)?&lt;/a&gt; on Quora&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The term you are looking for is database migrations (sometimes called database change scripts).&lt;/p&gt;

&lt;p&gt;The basic concept is pretty straight forward: you set up a table in the database that records which change scripts have already been applied. When you need to make a change (adding a column, adding a table, denormalising some data for performance reasons, adding an index etc) you write a change script that applies the change - in raw SQL or in another programming language, depending on how your migration system is set up.&lt;/p&gt;

&lt;p&gt;These change scripts (let's call them migrations from here) are numbered so they can be applied in the correct order. Then you run a command which checks for scripts that have not yet been applied and runs them in the correct order - then records that they have been run to the relevant database table.&lt;/p&gt;

&lt;p&gt;The setup I've described above is a pretty good start. Some systems let you have reversible migrations: each migration includes instructions for reversing its effect (removing the index that was added, moving data back to its old location) which lets you run a command to revert back to a previous database state. In practise this is a nice-to-have but not essential: many migrations are by their nature irreversible, but it can make development faster if you can easily try out and then revert a database structure change within your development environment.&lt;/p&gt;

&lt;p&gt;Really clever migration systems can even introspect your database, figure out what has changed and attempt to generate the migration scripts automatically! South, the most popular migration system for Django, does this with surprisingly good results for many cases.&lt;/p&gt;

&lt;p&gt;If you're interested in learning more, it's worth reading through the South documentation: &lt;span&gt;&lt;a href="http://south.readthedocs.org/en/latest/"&gt;http://south.readthedocs.org/en/...&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/databases"&gt;databases&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mysql"&gt;mysql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/oracle"&gt;oracle&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/postgresql"&gt;postgresql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sql"&gt;sql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/quora"&gt;quora&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rdbms"&gt;rdbms&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="databases"/><category term="mysql"/><category term="oracle"/><category term="postgresql"/><category term="sql"/><category term="quora"/><category term="rdbms"/></entry><entry><title>Any source available to download sample data (in 10+ GB) for testing?</title><link href="https://simonwillison.net/2012/Oct/15/any-source-available-to/#atom-tag" rel="alternate"/><published>2012-10-15T13:21:00+00:00</published><updated>2012-10-15T13:21:00+00:00</updated><id>https://simonwillison.net/2012/Oct/15/any-source-available-to/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;em&gt;My answer to &lt;a href="https://www.quora.com/Any-source-available-to-download-sample-data-in-10+-GB-for-testing/answer/Simon-Willison"&gt;Any source available to download sample data (in 10+ GB) for testing?&lt;/a&gt; on Quora&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Wikipedia has some pretty interesting dumps, in both XML and SQL format: &lt;span&gt;&lt;a href="http://meta.wikimedia.org/wiki/Importing_a_Wikipedia_database_dump_into_MediaWiki"&gt;http://meta.wikimedia.org/wiki/I...&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;It's pretty easy to generate 10GB of random data for testing though, which may be a better option as you could better approximate the kind of data your application will be dealing with. There's a neat Ruby module for doing this called Faker (itself a port of the Perl module of the same name): &lt;span&gt;&lt;a href="http://faker.rubyforge.org/"&gt;http://faker.rubyforge.org/&lt;/a&gt;&lt;/span&gt; - and here's a Python port of the Ruby one: &lt;span&gt;&lt;a href="https://github.com/threadsafelabs/python-faker"&gt;https://github.com/threadsafelab...&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/mysql"&gt;mysql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nosql"&gt;nosql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/programming"&gt;programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/web-development"&gt;web-development&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/quora"&gt;quora&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rdbms"&gt;rdbms&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="mysql"/><category term="nosql"/><category term="programming"/><category term="web-development"/><category term="quora"/><category term="rdbms"/></entry><entry><title>Is a relational database with many-to-many relationships difficult to develop into a web app?</title><link href="https://simonwillison.net/2011/Feb/8/is-a-relational-database/#atom-tag" rel="alternate"/><published>2011-02-08T18:28:00+00:00</published><updated>2011-02-08T18:28:00+00:00</updated><id>https://simonwillison.net/2011/Feb/8/is-a-relational-database/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;em&gt;My answer to &lt;a href="https://www.quora.com/Is-a-relational-database-with-many-to-many-relationships-difficult-to-develop-into-a-web-app/answer/Simon-Willison"&gt;Is a relational database with many-to-many relationships difficult to develop into a web app?&lt;/a&gt; on Quora&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Many to Many tables can be a bit of a pain to deal with using regular SQL, but a good ORM can abstract away any potential complexity almost entirely. I find using the Django ORM means I'm much less likely to shy away from a design that involves a many-to-many relationship because I know it won't increase the complexity of the application. I imagine the Rails ORM has the same effect.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/databases"&gt;databases&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/google"&gt;google&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mysql"&gt;mysql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/webapps"&gt;webapps&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/quora"&gt;quora&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rdbms"&gt;rdbms&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="databases"/><category term="google"/><category term="mysql"/><category term="webapps"/><category term="quora"/><category term="rdbms"/></entry></feed>