The Importance of Database Indexing
Published on Feb 27, 2009 by Jamie MunroIf you are not familiar with database indexes or even if you are, but not sure when to use them, this article will be perfect for you.
In this article, I will show you an example of a simple query without an index and then that same query again with the index. I am certain that you will be shocked by the difference.
Let's start by creating a very basic users table.
CREATE TABLE `users` ( `id` INT UNSIGNED NOT NULL AUTO_INCREMENT , `name` VARCHAR( 50 ) NOT NULL , `email` VARCHAR( 50 ) NOT NULL , PRIMARY KEY ( `id` ) ) ENGINE = MYISAM
As you can see, we simply have an id, name, and email. The id field is set to our primary key and is indexed by default. This means on a regular basis MySQL will re-build the index when new data is inserted or deleted. By building the index it allows MySQL to know exactly where each record is if we query by the id.
The next step is to create some test data. We will start by creating 10 records:
INSERT INTO users (name, email) VALUES ('test 1', 'firstname.lastname@example.org'); INSERT INTO users (name, email) VALUES ('test 2', 'email@example.com'); INSERT INTO users (name, email) VALUES ('test 3', 'firstname.lastname@example.org'); INSERT INTO users (name, email) VALUES ('test 4', 'email@example.com'); INSERT INTO users (name, email) VALUES ('test 5', 'firstname.lastname@example.org'); INSERT INTO users (name, email) VALUES ('test 6', 'email@example.com'); INSERT INTO users (name, email) VALUES ('test 7', 'firstname.lastname@example.org'); INSERT INTO users (name, email) VALUES ('test 8', 'email@example.com'); INSERT INTO users (name, email) VALUES ('test 9', 'firstname.lastname@example.org'); INSERT INTO users (name, email) VALUES ('test 10', 'email@example.com');
Now that we have some data in our table. Let's begin by running a basic query to find out the name for user with the email firstname.lastname@example.org:
SELECT name FROM users WHERE email = 'email@example.com'
If you run this query it returns test 4. The total time taken to run the query is: 0.0004 sec. So far this is pretty quick. This is where problems often occur. During development, it's quite normal to not have very many records in the database so you will never notice an issue.
To more clearly show the potential issue of not having an index, let's populate 10,000 records into the users table (I won't output the insert statements as you would have to scroll down a long, long way). The same query as above with 10,000 records in the users table now took: 0.0101 sec.
The difference might seem minimal to you, but imagine every query you run and every page taking longer and longer to load and as your site grows it will only take longer.
Let's continue this example by inserting a total of 100,000 records into the database. Here are the results of the exact same query being run with 100,000 records in the users table: 1.1338 sec.
Incredible, it is now taking over 1 second to run that exact same query. Now let's make the email field an index in our table:
ALTER TABLE `users` ADD INDEX ( `email` )
Perfect, now let's, for the last time, re-run the basic select query: 0.0005 sec. Did your jaw just drop? I know the first time I saw this mine did. If you look back to the original query when we only had 10 users in the database, with 100,000 users it is only 0.0001 second slower!
The general rule of thumb for indexes are: items that are contained in your WHERE clause are ideal to be indexed. It's important to analyze all queries that are run against each table. The reason for this is MySQL only uses ONE index per query and if it doesn't use the correct one, you may receive slow results still.
Things to look for:
- What fields are contained in the WHERE clause
- Do multiple fields often occur together in the WHERE clause
- What fields are contained in the ORDER BY clause
- Do multiple fields often occur together in the ORDER BY clause
If you find multiple fields together often, it's important to create ONE index for the multiple fields. It is also a good idea to include fields in the ORDER BY clause as indexes as well. By adding indexes for your ORDER BY clauses it will prevent MySQL from having to re-sort your data each time before it is returned to you.
Please note: There are down falls to having too many indexes. The more indexes you have, the longer each insert statement you perform will take because MySQL needs to update the indexes to include the new data.
Thanks and enjoy faster database querying!