Clustered indexes & Non-Clustered indexes - All You Need to Know
Clustered indexes sort and store the data rows in the table based on their key values. There can only be one clustered index per table, because the data rows themselves can only be sorted in one order.
In SQL Server, indexes are organized as B-trees. Each page in an index B-tree is called an index node. The top node of the B-tree is called the root node. The bottom level of nodes in the index is called the leaf nodes. Any index levels between the root and the leaf nodes are collectively known as intermediate levels. In a clustered index, the leaf nodes contain the data pages of the underlying table. The root and intermediate level nodes contain index pages holding index rows. Each index row contains a key value and a pointer to either an intermediate level page in the B-tree, or a data row in the leaf level of the index. The pages in each level of the index are linked in a doubly-linked list. [Source]
Nonclustered indexes have the same B-tree structure as clustered indexes, except for the following significant differences:
The data rows of the underlying table are not sorted and stored in order based on their nonclustered keys.
The leaf layer of a nonclustered index is made up of index pages instead of data pages.
Nonclustered indexes can be defined on a table or view with a clustered index or a heap. Each index row in the nonclustered index contains the nonclustered key value and a row locator. This locator points to the data row in the clustered index or heap having the key value. [Source]
Difference between Clustered and Non clustered index
The concept behind indexes is to change the order of the data (clustered index) or to add metadata (non-clustered index) for improving the performance of queries.
1. Physically stored in order (ascending or descending)
2. Only one per table
3. When a primary key is created a clustered index is automatically created as well.
4. If the table is under heavy data modifications or the primary key is used for searches, a clustered index on the primary key is recommended.
5. Columns with values that will not change at all or very seldom, are the best choices.
6. For use on columns that are frequently searched for ranges of data
7. For use on columns with low selectivity
1. Up to 249 nonclustered indexes are possible for each table or indexed view.
2. The clustered index keys are used for searching therefore clustered index keys should be chosen with a minimal length.
3. Covered queries (all the columns used for joining, sorting or filtering are indexed) should be non-clustered.
4. Foreign keys should be non-clustered.
5. If the table is under heavy data retrieval from fields other than the primary key, one clustered index and/or one or more non-clustered indexes should be created for the column(s) used to retrieve the data.
6. For use on columns that are searched for single values
7. For use on columns with high selectivity [Source]
What are the differences(pros/cons) between clustered and non-clustered indexes
Nonclustered index is preferred over a clustered index in the following situations:
When the index key size is large.
To avoid the overhead cost associated with a clustered index since rebuilding the clustered index rebuilds all the nonclustered indexes of the table.
To resolve blocking by having a database reader work on the pages of a nonclustered index, while a database writer modifies other columns (not included in the nonclustered index) in the data page. In this case, the writer working on the data page won't block a reader that can get all the required column values from the nonclustered index without hitting the base table.
When all the columns (from a table) referred to by a query can be safely accommodated in the nonclustered index itself.
The data-retrieval performance when using a nonclustered index is generally poorer than that when using a clusterd index, because of the cost associated in jumping from the nonclustered index rows to the data rows in the base table. In cases where the jump to the data rows is not required, the performance of a nonclustered index should be just as good as - or even better than-a clustered index. This is possible if the nonclustered index key includes all the columns required from the table. [Source: SQL Server 2008 Query Performance Tuning Distilled By Grant Fritchey, Sajal Dam]
Clustered indexes are not a good choice for the following attributes:
Columns that undergo frequent changes: This causes in the whole row to move, because the Database Engine must keep the data values of a row in physical order. This is an important consideration in high-volume transaction processing systems in which data is typically volatile.
Wide keys: Wide keys are a composite of several columns or several large-size columns. The key values from the clustered index are used by all nonclustered indexes as lookup keys. Any nonclustered indexes defined on the same table will be significantly larger because the nonclustered index entries contain the clustering key and also the key columns defined for that nonclustered index. [Source]
Some more useful stuff on indexes from around the web:
How do indexes work on views?
How to create clustered indexes?
Table and Index Architecture
Guide to Indexes
Maintain Your Indexes