(SQL Tutorials for Citizen Data Scientist)
SQL CREATE INDEX Statement
In this tutorial you will learn how to create indexes on tables to improve database performance.
What is an Index?
An index is a data structure associated with a table that provides fast access to rows in a table based on the values in one or more columns (the index key).
Suppose you have a customers table in your database and you want to find all the customers whose names begin with the letter A, using the following statement.
SELECT cust_id, cust_name, address FROM customers
WHERE cust_name LIKE 'A%';
To find such customers without an index, the server must scan the customers table row by row and inspect the contents of the cust_name column. That works fine for a table with only a few rows, but imagine how long the query might take if the table contains millions of rows. In such situations you can speed things up by adding indexes to the table.
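One way to check how the server intends to run the query above is to prefix it with EXPLAIN. The following is a minimal sketch that assumes MySQL and the customers table from this tutorial; the exact output columns vary between versions.

```sql
-- Ask MySQL for the execution plan instead of running the query.
-- Before any index exists on cust_name, the plan typically reports
-- a full table scan (type: ALL) with no key in use.
EXPLAIN
SELECT cust_id, cust_name, address FROM customers
WHERE cust_name LIKE 'A%';
```

Running the same EXPLAIN again after an index on cust_name has been created lets you compare the two plans and see whether the optimizer chooses the index.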
Creating an Index
You can create indexes with the CREATE INDEX statement, which names the index, the table, and the column or columns to index.
For example, to create an index on the cust_name column in the customers table, you could use:
CREATE INDEX cust_name_idx ON customers (cust_name);
By default, the index will allow duplicate entries and sort the entries in ascending order. To require unique index entries, add the keyword UNIQUE after CREATE, like this:
CREATE UNIQUE INDEX cust_name_idx
ON customers (cust_name);
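To illustrate what a unique index enforces, here is a small sketch. It assumes that cust_id, cust_name and address are the only columns that need values, and the inserted values are made up for the example.

```sql
-- With the unique index on cust_name in place, the first insert succeeds.
INSERT INTO customers (cust_id, cust_name, address)
VALUES (101, 'Alice Smith', '12 Elm Street');

-- The second insert repeats the same cust_name, so the server rejects it
-- with a duplicate-key error instead of adding the row.
INSERT INTO customers (cust_id, cust_name, address)
VALUES (102, 'Alice Smith', '98 Oak Avenue');
```

For the same reason, creating a unique index fails if the column already contains duplicate values.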
In MySQL, you can list the indexes defined on a specific table with the SHOW INDEXES statement.
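A minimal example, assuming the customers table used throughout this tutorial and the mysql command-line client:

```sql
-- List every index on the customers table, one field per line.
-- The \G terminator replaces the usual semicolon in the mysql client.
SHOW INDEXES FROM customers\G
```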
Tip: In the mysql command-line client, terminate a SQL statement with \G instead of ; to display the result vertically rather than in the normal tabular format when the rows are too wide for the current window.
Creating Multi-column Indexes
You can also build indexes that span multiple columns. For example, suppose you have a table named users with the columns first_name and last_name, and you frequently look up user records using both columns. In that case you can build a single index on the two columns together to improve performance.
Tip: You can think of a database index as the index section of a book, which helps you quickly locate a specific topic within the book.
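A multi-column index on the users table described above might look like the following sketch. The index name user_name_idx and the sample values are chosen only for illustration.

```sql
-- Build one index that covers both name columns, in this order.
CREATE INDEX user_name_idx ON users (first_name, last_name);

-- This lookup filters on both indexed columns, so it can use the index.
SELECT * FROM users
WHERE first_name = 'John' AND last_name = 'Carter';

-- In MySQL, a lookup on the leftmost column alone can also use it.
SELECT * FROM users
WHERE first_name = 'John';
```

Column order matters: in MySQL a query that filters only on last_name generally cannot use this index efficiently, because last_name is not the leftmost column.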
The Downside of Indexes
Indexes should be created with care, because every time a row is added, updated or removed, the indexes on that table must be updated as well. Therefore, the more indexes you have, the more work the server needs to do on each write, which slows down inserts, updates and deletes.
Here are some basic guidelines that you can follow when creating indexes:
- Index columns that you frequently use to retrieve the data.
- Don’t create indexes for columns that you never use as retrieval keys.
- Index columns that are used for joins to improve join performance.
- Avoid indexing columns that contain too many NULL values.
Also, small tables do not require indexes, because it is usually faster for the server to scan a small table than to look at the index first.
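As a sketch of the join guideline above, suppose there is also an orders table with a cust_id column that refers to customers. The orders table, its order_id column and the index name are hypothetical and used only for this example.

```sql
-- Index the column that the join condition uses on the orders side.
CREATE INDEX orders_cust_id_idx ON orders (cust_id);

-- The join can now find each customer's orders through the index
-- rather than scanning the whole orders table for every customer.
SELECT c.cust_name, o.order_id
FROM customers c
JOIN orders o ON o.cust_id = c.cust_id;
```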
Note: Most database systems, such as MySQL and SQL Server, automatically create indexes for PRIMARY KEY and UNIQUE columns when the table is created.
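The note above can be checked with a short experiment. This sketch assumes MySQL; the products table and its columns are hypothetical.

```sql
-- No CREATE INDEX statement is issued for this table.
CREATE TABLE products (
    product_id   INT PRIMARY KEY,
    sku          VARCHAR(20) UNIQUE,
    product_name VARCHAR(100)
);

-- The listing is expected to show an index named PRIMARY on product_id
-- and a unique index on sku, both created by the constraints above.
SHOW INDEXES FROM products;
```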
You can drop indexes that are no longer required with the DROP INDEX statement.
In MySQL, the following statement will drop the index cust_name_idx from the customers table.
DROP INDEX cust_name_idx ON customers;
Moreover, if you drop a table then all associated indexes are also dropped.
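The syntax for dropping an index differs between database systems. The sketch below shows two alternatives to the statement above for the same cust_name_idx index; they are alternatives, not steps to run one after the other.

```sql
-- MySQL also accepts the ALTER TABLE form.
ALTER TABLE customers DROP INDEX cust_name_idx;

-- PostgreSQL omits the table name, because index names there
-- are unique within a schema rather than within a table.
DROP INDEX cust_name_idx;
```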