Learned one fascinating fact tonight: when creating indices for a table in SQLite the order in which the columns are indexed seems to have a huge impact on performance. On the table in question, which has 20.2 million rows, one lookup of the following form:
SELECT a,b,c,d FROM tbl WHERE x=n EXCEPT SELECT a,b,c,d FROM tbl WHERE x=m;
where n and m are values to be chosen by the caller of the query, took 12 to 13 seconds (yielding between 0 and approximately 23000 rows) with an index like:
CREATE INDEX idx ON tbl(a,b,c,d,x);
The first attempt was to drop that index and separate it into five indices, one per column. This sped it up to approximately 8.5 seconds. However, after finding and reading The SQLite Query Optimizer Overview I dropped the five indices and created a single new one like this:
CREATE INDEX idx ON tbl(x,a,b,c,d);
Voila. After this the same query was down to 5.5 seconds. This is a considerable improvement, since I have to run z² queries (for
SELECT COUNT(*) AS z FROM (SELECT DISTINCT x FROM tbl);), where z = 16 in the most common case.
I’m not a database expert by any means, but this discovery was very enlightening indeed.
PS: In the process of learning more details about SQL in general and SQLite in particular I found the book “The Definitive Guide to SQLite” by Mike Owens very useful.