Insert with TABLOCK hint - Tables with Indexes

Columnstore index wins!

Dec 16, 2024

This is part 2 of my series on “Insert with TABLOCK hint”

In Part 1, we saw how an INSERT with TABLOCK hint runs faster due to minimal logging on a HEAP table. In this post, let’s see how INSERT with TABLOCK hint behaves when there are indexes on the target table.

Scenario

DBATest database is in simple recovery model, and target table dbo.t2 is an empty heap. Before we dive into the demo, I’d like to remind you that INSERT with TABLOCK hint on a heap table completed in less than a second, as demonstrated in Part 1 of this series.

Now, I’ll demonstrate how the behavior changes when the target table has a:

Clustered Index
Non-clustered Index
Columnstore Clustered Index

DEMO 1: Clustered index

Create clustered index on empty target table dbo.t2
```
CREATE CLUSTERED INDEX CI_ID ON dbo.t2(id);
```
Run insert with TABLOCK hint.
Result: 2 seconds. Insert in this case is a serial operation, because of clustered index, even though the subtree cost is higher than cost threshold for parallelism,
Clustered index disables parallel insert, because to insert data into sorted rows, multiple threads cannot insert and only thread can insert sequentially to maintain the sorted order.
```
INSERT INTO t2 WITH (TABLOCK) (id,a,b)
SELECT * FROM t1
```
Check if this is fully or minimally logged.
Result: Out of 1 million rows inserted, 332k rows are logged. So, the operation is neither completely fully logged nor minimally logged.
During this operation, 2,202 pages were formatted. But for a heap table, the number of pages formatted were 2 and 0 rows were logged.
```
SELECT Operation,COUNT(*)  AS 'Rows Logged'  
FROM sys.fn_dblog(NULL, NULL) 
WHERE
 Operation IN 
 ( N'LOP_INSERT_ROWS'
 ,N'LOP_FORMAT_PAGE')
GROUP BY Operation
ORDER BY COUNT(*) DESC
```
Reset the environment - Drop the index CI_ID ON dbo.t2 and truncate target table dbo.t2 and issue a checkpoint to flush the log.

DROP INDEX CI_ID ON dbo.t2; 

TRUNCATE TABLE dbo.t2;  

CHECKPOINT

When there is a clustered index, the runtime is 2 seconds, the operation is serial, and it is neither fully logged nor minimally logged.

DEMO 2: Non-Clustered index

Target table is now heap(reset in previous step) and I’m going to create a non-clustered index.

CREATE NONCLUSTERED INDEX NCI_a ON dbo.t2(a)

Run insert with TABLOCK hint.
Result: 9 seconds with serial insert, again non-clustered index disables parallel insert.

INSERT INTO t2 WITH (TABLOCK) (id,a,b)
SELECT * FROM t1

Check if this is fully or minimally logged.
Result: Fully logged, because along with table data update, it has to update the non-clustered index structure as well which gets fully logged in the transaction log, making the operation run longer.

Reset the environment - Drop the index NCI_a ON dbo.t2 and truncate target table dbo.t2 and issue a checkpoint to flush the log.

DROP INDEX CI_ID ON dbo.t2; 

TRUNCATE TABLE dbo.t2;  

CHECKPOINT

When there is a non-clustered index, the runtime is 9 seconds, the operation is serial, and it is fully logged.

DEMO 3: Columnstore Clustered index

Target table is now heap and I’m going to create a columnstore clustered index.
```
CREATE CLUSTERED COLUMNSTORE INDEX CCI ON dbo.t2
```
Run insert with TABLOCK hint.
Result: < 1 second, and columnstore clustered index enables parallel insert, due to the structure of this index which is optimized for bulk loading.

INSERT INTO t2 WITH (TABLOCK) (id,a,b)
SELECT * FROM t1

Check if this is fully or minimally logged.
Result: Minimally logged. Only 32 rows are logged out of a million rows and no pages have been formatted.

When there is a columnstore clustered index, the runtime is less than a second, the operation is parallel, and it is minimally logged.

Takeaways:

Insert with tablock hint on an empty table:

Clustered columnstore index is the fastest, with minimal logging and parallel insertion.

Insert with tablock hint on a non-empty table:

I performed the same experiment on a non-empty table, with 1 million records already in it. Clustered columnstore index appears to be the fastest and also performed consistently with or without data already in the table.
Inserts on tables with clustered and non-clustered index (in red) have performed poorly on a non-empty table compared to an empty table.

Summary:

When compared to rowstore clustered index, non-clustered index and columnstore index, inserting with TABLOCK hint on a table with a columnstore clustered index results in the fastest runtime, benefiting from minimal logging and parallel insertion.

Resources:

Part-1: Insert with TABLOCK hint - Faster

https://techcommunity.microsoft.com/blog/sqlserver/real-world-parallel-insert-select-what-else-you-need-to-know/305329
More about columnstore index data loading - https://learn.microsoft.com/en-us/sql/relational-databases/indexes/columnstore-indexes-data-loading-guidance?view=sql-server-ver16

Go High

Discussion about this post