Login or Sign Up to become a member!
LessThanDot Site Logo

LessThanDot

A Technical Community for IT Professionals

Less Than Dot is a community of passionate IT professionals and enthusiasts dedicated to sharing technical knowledge, experience, and assistance. Inside you will find reference materials, interesting technical discussions, and expert tips and commentary. Once you register for an account you will have immediate access to the forums and all past articles and commentaries.

LTD Social Sitings

Lessthandot twitter Lessthandot Linkedin Lessthandot facebook Lessthandot rss

Note: Watch for social icons on posts by your favorite authors to follow their postings on these and other social sites.

Highly Rated Users

Forum
No Posts Rated

Top 50
Given
Received

Links

Wiki
Blog

Forum Statistics

Users
Members:
1879
Members Online:
1
Guests Online:
73

Total Post History
Posts:
81448
Topics:
18714

7-Day Post History
New Posts:
0
New Topics:
0
Active Topics:
0

Our newest member
mwojcik

Other

FAQ
All times are UTC [ DST ]

Google Ads

Identity Column / Clustered Index

Microsoft SQL Server
Please wait...

Identity Column / Clustered Index

Postby Remou on Fri Jan 04, 2013 4:42 pm

I found a comment:

I assumed identity columns generally made for good candidates for a clustered index. If you have found this not to be the case, do you have any further info on the topic you could point me toward?


-- http://stackoverflow.com/questions/3698 ... 24#3698824

I would imagine that a clustered index would be best if it used data that naturally formed clumps, region + something, or date + something.

Have I got all twisted about in the head again?
Stop quoting laws to us. We carry swords.
User avatar
Remou
LTD Admin
LTD Admin
LTD Gold - Rating: 971LTD Gold - Rating: 971LTD Gold - Rating: 971LTD Gold - Rating: 971LTD Gold - Rating: 971
LTD Gold - Rating: 971LTD Gold - Rating: 971LTD Gold - Rating: 971LTD Gold - Rating: 971LTD Gold - Rating: 971
LTD Gold - Rating: 971
 
Posts: 5414
Joined: Sun Oct 14, 2007 11:26 am
Unrated

Re: Identity Column / Clustered Index

Postby SQLDenis on Fri Jan 04, 2013 4:51 pm

See this section (Keep your clustered indexes narrow) in this post http://blogs.lessthandot.com/index.php/ ... of-indexes



another benefit is fragmentation since identity keys always insert at the end and not in the middle if you were to have an compound key
User avatar
SQLDenis
LTD Admin
LTD Admin
LTD Gold - Rating: 3467LTD Gold - Rating: 3467LTD Gold - Rating: 3467LTD Gold - Rating: 3467LTD Gold - Rating: 3467
LTD Gold - Rating: 3467LTD Gold - Rating: 3467LTD Gold - Rating: 3467LTD Gold - Rating: 3467LTD Gold - Rating: 3467
LTD Gold - Rating: 3467LTD Gold - Rating: 3467LTD Gold - Rating: 3467
 
Posts: 21784
Joined: Wed Oct 10, 2007 6:43 pm
Location: Princeton, New Jersey, USA,World, Solar System, Milky Way, Universe and Beyond

Re: Identity Column / Clustered Index

Postby Thirster42 on Fri Jan 04, 2013 4:53 pm

it depends on what you're going for: surrogate keys vs non-surrogate keys.

surrogate keys (ie id columns) are automatic when a new record is produce. the main issue you may run into is that it doesn't prevent the duplication of data. this can be fixed with a unique index.

non-surrogate keys provide a way of uniquely identifying a record using one or more columns. the issue you may have with non-surrogate keys is that data in your key may change, and make a non-unique key. for example, social security numbers are a horrible candidate in itself for a primary key. also, it's much more difficult to join tables using non-surrogate keys, as the ENTIRE key has to exist in both locations to join correctly. so if you need to join you may end up with a surrogate key anyway.

i prefer surrogate keys over non-surrogate keys, but to each his own.
Ask three DBAs a question, and you’ll get four different answers.

-Brent Ozar
User avatar
Thirster42
Guru
Guru
LTD Silver - Rating: 517LTD Silver - Rating: 517LTD Silver - Rating: 517LTD Silver - Rating: 517LTD Silver - Rating: 517
LTD Silver - Rating: 517LTD Silver - Rating: 517LTD Silver - Rating: 517
 
Posts: 4646
Joined: Tue Nov 11, 2008 5:49 pm
Location: Jefferson City, Mo

Re: Identity Column / Clustered Index

Postby Thirster42 on Fri Jan 04, 2013 4:54 pm

and each has their own issues, pros and cons, and denis pointed out.
Ask three DBAs a question, and you’ll get four different answers.

-Brent Ozar
User avatar
Thirster42
Guru
Guru
LTD Silver - Rating: 517LTD Silver - Rating: 517LTD Silver - Rating: 517LTD Silver - Rating: 517LTD Silver - Rating: 517
LTD Silver - Rating: 517LTD Silver - Rating: 517LTD Silver - Rating: 517
 
Posts: 4646
Joined: Tue Nov 11, 2008 5:49 pm
Location: Jefferson City, Mo
Unrated

Re: Identity Column / Clustered Index

Postby SQLDenis on Fri Jan 04, 2013 4:59 pm

I also asked on twitter as well

first answer I got back is

1) very-high-volume inserts can have latch contention
2) if update data post insert to larger size, can get lots of fragmentation



I agree with those but still think it is better than a compound key especially if you also have non clustered indexes on the table
User avatar
SQLDenis
LTD Admin
LTD Admin
LTD Gold - Rating: 3467LTD Gold - Rating: 3467LTD Gold - Rating: 3467LTD Gold - Rating: 3467LTD Gold - Rating: 3467
LTD Gold - Rating: 3467LTD Gold - Rating: 3467LTD Gold - Rating: 3467LTD Gold - Rating: 3467LTD Gold - Rating: 3467
LTD Gold - Rating: 3467LTD Gold - Rating: 3467LTD Gold - Rating: 3467
 
Posts: 21784
Joined: Wed Oct 10, 2007 6:43 pm
Location: Princeton, New Jersey, USA,World, Solar System, Milky Way, Universe and Beyond

Re: Identity Column / Clustered Index

Postby gmmastros on Fri Jan 04, 2013 6:29 pm

The thing to remember about clustered indexes is... the data is physically sorted and stored to disk based on the clustered index.

Under certain (relatively rare) conditions, using an identity column could be a bad idea, like Denis mentioned. The real answer to most SQL Server questions is... "It depends".

Basically, clustered indexes exist (primarily) to speed up queries. If you have many times more writes then reads from the table, then you may not want to use an identity column for the clustered index.

However, most systems have more reads than writes, so using an identity column for the clustered index is usually the best course of action.

Imagine building train tracks. You need to smooth the earth, lay gravel, lay the rail road ties, and finally the rails. If all four of these things were done simultaneously to the end of the track, there would be a lot of contention among the workers because each wants to work on the same section at a time.

I would imagine that a clustered index would be best if it used data that naturally formed clumps, region + something, or date + something.


Sometimes that is the right thing to do. Basically, it depends on the queries that you run against the table. If you do not cluster on an identity column, you run in to the potential problem of fragmentation.

Basically, fragmentation occurs when you have data pages than you really need. SQL Server stores data in 8kb pages. When you insert data, SQL server will use the clustered index to determine which page the data should be stored. If this page is in the middle of the table, and the page is too full to accommodate another row, SQL Server will split the page, putting 1/2 of the data on one page and the other 1/2 on another page. This effectively leaves you with 2 pages 1/2 full, which leads to less efficient queries because SQL server will now need to search through more data pages.

You can alleviate the fragmentation issue somewhat by allowing a certain amount of free space on each data page. 50% free space is usually way too much, but you can set it to 10%. If you do not cluster on an identity column, it is a good idea to track your fragmentation to see if it getting high. To correct the fragmentation problem, you can rebuild the index.

However, like I said earlier, it is almost always best to use an identity column as a clustered index. The exceptions are rare.
-George
User avatar
gmmastros
LTD Admin
LTD Admin
LTD Gold - Rating: 1630LTD Gold - Rating: 1630LTD Gold - Rating: 1630LTD Gold - Rating: 1630LTD Gold - Rating: 1630
LTD Gold - Rating: 1630LTD Gold - Rating: 1630LTD Gold - Rating: 1630LTD Gold - Rating: 1630LTD Gold - Rating: 1630
LTD Gold - Rating: 1630LTD Gold - Rating: 1630
 
Posts: 2367
Joined: Tue Oct 09, 2007 5:19 pm

Re: Identity Column / Clustered Index

Postby SQLDenis on Fri Jan 04, 2013 6:33 pm

well said George
User avatar
SQLDenis
LTD Admin
LTD Admin
LTD Gold - Rating: 3467LTD Gold - Rating: 3467LTD Gold - Rating: 3467LTD Gold - Rating: 3467LTD Gold - Rating: 3467
LTD Gold - Rating: 3467LTD Gold - Rating: 3467LTD Gold - Rating: 3467LTD Gold - Rating: 3467LTD Gold - Rating: 3467
LTD Gold - Rating: 3467LTD Gold - Rating: 3467LTD Gold - Rating: 3467
 
Posts: 21784
Joined: Wed Oct 10, 2007 6:43 pm
Location: Princeton, New Jersey, USA,World, Solar System, Milky Way, Universe and Beyond
Unrated