Click to See Complete Forum and Search --> : Optimization With Clustering


Scott.Macmaster
September 3rd, 2008, 12:15 PM
I got involved in a project to optimize a database. The primary table, Receivables, has 3 fields that we mainly use in query conditions. The primary key which is a auto incrementing int, is rarely used in our queries. It is also currently set as the clustered index. The 3 fields I'm considering using in a clustered index are year, quarter, and customer id. For normal operations I don't think adjusting the indexes would have any affect. However, since there's around 50,000 records per quarter, reports tend to take a minimum of 10 seconds to generate and will take a long time when creating a report that compares quarters from several years.

An example query for a report for several years would look something like


SELECT *
FROM Receivables
WHERE (Year = 2006 AND Quarter = 2)
OR (Year = 2007 AND Quarter = 2)
OR (Year = 2008 AND Quarter = 2)

So would creating a clustered index on Year,Quarter,CustomerId make a significant improvement or should I attempt a different approach?

Also, this table has about 3 million records and takes up about 500 mb. So when I create this index will this table be locked for a few seconds, minutes, hours, or would access just be slow until the index is created?


Thanks,

compavalanche
September 3rd, 2008, 11:26 PM
The best suggestion I can give is to try it. As this is a production database you should take a backup and bring it up on a dev system. Then try it out. Use the profiling tools of you db (query analyzer for mssql, explain command for mysql,or equivalent for other db's) and figure out what is actually taking the time.

In this way you will also know how long adding/modifying index will take without resorting to guessing.

Hope that helps.

dglienna
September 3rd, 2008, 11:34 PM
Getting rid of AutoNumber Index Fields are a good thing, as I recall.

davide++
September 4th, 2008, 02:48 AM
Hi all.
You didn't say what db you're using...

You're using OR operator in your query, that slows down the execution.
Before trying "system optimization" as clustered indexing, I suggest to rewrite the query.

For example

SELECT *
FROM Receivables
WHERE Quarter = 2
AND Year IN (2006, 2007, 2008)


This query is equivalent to the previous one, but IN operator is faster than OR operator.