Home > Performance, SQL Server > UPDATE STATISTICS: How to cheat the optimizer

UPDATE STATISTICS: How to cheat the optimizer

Did you ever need to see what execution plan will optimizer choose for your query on large set of data but you have a really small test database? Or you are testing your notification system for sort warning, for example, and you need to force query optimizer to think that tables used in a query are smaller? If your answer is yes, then keep reading.

Statistics is very important to the query optimizer for generating execution plans and it’s very important that statistics is up to date in a production database. To update statistics we use UPDATE STATISTICS statement. If you read BOL page for update statistics you can see that statement has some undocumented options. We can use these undocumented options, ROWCOUNT and PAGECOUNT, to cheat the optimizer. In the SQL Server Query Optimization Team blog,  we can read that ROWCOUNT and PAGECOUNT alter the internal metadata of the specified table or index by overriding the counters containing the row and page counts of the object.

In the next example, I will show how to use ROWCOUNT and PAGECOUNT options and how to cheat the optimizer. You will see that internal metadata for the table are changed and that statistics isn’t updated. Never use this in production database.

To see how it works I will use Sales.SalesOrderDetail table in AdventureWorks2008R2 database.
First, let us see statistics for table’s primary key and information’s about rows and pages in the table from sys.partitions and sys.dm_db_partition_stats.

DBCC SHOW_STATISTICS ('Sales.SalesOrderDetail',PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID);

SELECT * FROM sys.partitions
WHERE object_id = object_id('Sales.SalesOrderDetail')

SELECT * FROM sys.dm_db_partition_stats
WHERE object_id = object_id('Sales.SalesOrderDetail')

Results:

DBCC SHOW_STATISTICS tell us (1) that Aug 2 2010 is the last time when statistics was updated and that table has 121.317 rows. Sys.partition (2) also tells us that table has 121.317 rows. sys.dm_db_partition_stats (3) shows us that 1.234 pages are used for storing in-row data.

Now, let’s UPDATE STATISTICS with ROWCOUNT and PAGECOUNT options. I set 500.000 rows and 10.000 pages but you can set any other value.

UPDATE STATISTICS Sales.SalesOrderDetail
WITH ROWCOUNT = 500000,PAGECOUNT = 10000

After we updated statistics with ROWCOUNT and PAGECOUNT option let us see again what is written in statistics and in sys.partitions and sys.dm_db_partition_stats.

From (1) we can see that statistics isn’t updated. From (2) we can see that sys.partition has information that table has 500.000 rows and sys.dm_db_partition_stats thinks that table occupies 10.000 pages (3). These values are used by query optimizer during query processing.
Let’s run a simple query on Sales.SalesOrderDetail table and look at actual execution plan of the query.

SELECT SalesOrderID
FROM Sales.SalesOrderDetail

Query is very simple. We want all SalesOrderID from the table. We can see that (1) actual number of rows is 121.371 but query optimizer thought (2) that there are 500.000 rows in the table.

Using ROWCOUNT and PAGECOUNT options allow us to cheat the query optimizer and to easily see what kind of query plan will be generated on large amount of data.  You can also specify smaller number of rows and pages to simulate a sort warning for example.

Advertisements
Categories: Performance, SQL Server
  1. No comments yet.
  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: