SQL Server 2012 introduces a new index type known as “Column Store Index”. This feature has been introduced with BI and OLAP in mind. Before diving into how Column Index works and how to use it, we need to take our focus back to the basics of Index in SQL Server.
What are Indexes?
Indexes form the most important element in the performance enhancement list. Even though performance depends on other factors as well, but Indexing is one of the most important but often neglected factor.
The index provides a fast way to look up data based on the values in the column(s) on which index is created.Traditional row-based indexes, as the name suggests, stores rows of data on a page. A Page is the fundamental unit of data storage in SQL Server. An index is made up of various pages that are organized in a B-Tree structure. This structure is hierarchical in nature where the root node is at the top with leaf nodes branched below.
Typical row-based indexes in SQL Server are of two types:
- Clustered Indexes
- Non-clustered Indexes
A Clustered index determines the physical order of data in a table and therefore a table can have only one Clustered Index. A clustered index stores data rows at the leaf level of the index structure.Clustered Indexes are arranged either in ascending or descending order.
The leaf node in a non-clustered index structure stores the location of the actual data row. These act as pointers to the actual row and therefore the query engine has one step more to do while locating a row.
Both the traditional indexes, Clustered and Non-Clustered, are based on B-Tree. B-Tree Indexes are fast when you need to scan the entire table and filter out rows.
How Column Store Index differs from Row Store Index?
A Column Store Index stores a single column in a separate SQL Server Page. As opposed to Row Store Index mentioned above where collection of rows reside in a Page, in a Column Store Index, a column is allocated a separate Page.
Column Store Indexes are designed for Business-Intelligence (BI) queries in Online Analytical Processing (OLAP) environment where you have to play with a huge volume of data. In such BI requests, data often needs to be queried as filtered columns from a data warehouse.
The question now arises what are the benefits of allocating a separate Page to each column? Here are the benefits:
(1) Only required pages can be fetched from the disk resulting in improved performance of query processing. As each column can be accessed independently, IO is reduced.
(2) Column Store Index technology makes use of compression algorithm to eliminate redundancy in column values.
(3) Frequently accessed columns remain in memory.
The Column Store Index in SQL Server employs Microsoft’s patented Vertipaq™ technology, which it shares with SQL Server Analysis Services and PowerPivot. SQL Server columnstore indexes don’t have to fit in main memory, but they can effectively use as much memory as is available on the server. Portions of columns are moved in and out of memory on demand.
- At this point of time, Column Store Indexes are read-only in nature and therefore DML statements are not possible in regular OLTP environment.
- Computed columns cannot be part of a Column Store Index.
- Column Store Index cannot be created on indexed views.
Using Column Store Index:
You don’t need to run anything special to use Column Store Index. You just need to specify a keyword “COLUMNSTORE” while creating the index. The Query Optimizer decides when to use Column Store Index and when other indexes. The Query Optimizer also decides when to use batch execution and when row execution mode.
CREATE NONCLUSTERED COLUMNSTORE INDEX <ColumnStoreIndexName> ON <Table> (col1, col2, col3);
(TransactionDate, StockQty, ItemID)
You can only create a non-clustered Column Store Index.