Home All Groups Group Topic Archive Search About

autoincrement vs uniqueidentifier (easy)

Author
8 Jul 2009 10:32 PM
yui
Hi there,

I need to create unique primary keys in my tables but I don't care about the
format of these keys. They're used simply to identify the record and serve
as foreign keys in other tables (users never see them). I've always relied
on autoincrement keys for this in the past (normally stored as integers) but
now I see SQL Server has a "uniqueidentifier" type which is just a GUID by
default (apparently you can change this though). It's been a long time since
I've done any SQL Server programming so:

1) How exactly would I create an integer field that serves as an
autoincrement key (what do I set in VS which is where I'm managing my DB for
now)
2) Why would (or should) I use a "uniqueidentifier" instead of 1 above. If
an autoincrement key is incremented for each new record and will never be
reused (and my tables will never approach 2+ billion records), what's the
point of a "uniqueidentifier" in this case (if any). It just takes up extra
space and buys me nothing in return (compared to the smaller 32 bit integer)

Thanks in advance.

Author
9 Jul 2009 2:13 AM
Gregory A. Beamer
Show quote Hide quote
"yui" <_nospam@_no_spam.com> wrote in
news:eoOMovBAKHA.4432@TK2MSFTNGP02.phx.gbl:

> Hi there,
>
> I need to create unique primary keys in my tables but I don't care
> about the format of these keys. They're used simply to identify the
> record and serve as foreign keys in other tables (users never see
> them). I've always relied on autoincrement keys for this in the past
> (normally stored as integers) but now I see SQL Server has a
> "uniqueidentifier" type which is just a GUID by default (apparently
> you can change this though). It's been a long time since I've done any
> SQL Server programming so:
>
> 1) How exactly would I create an integer field that serves as an
> autoincrement key (what do I set in VS which is where I'm managing my
> DB for now)


CREATE TABLE Table1
(
        Table1ID int IDENTITY(1,1) PRIMARY KEY
)

> 2) Why would (or should) I use a "uniqueidentifier" instead of 1
> above. If an autoincrement key is incremented for each new record and
> will never be reused (and my tables will never approach 2+ billion
> records), what's the point of a "uniqueidentifier" in this case (if
> any). It just takes up extra space and buys me nothing in return
> (compared to the smaller 32 bit integer)


If the database is spread over many servers, the GUID makes more sense, as
it is statistically guaranteed to be unique.

If you have clients that are detatched and can create new records, the GUID
makes more sense.

Peace and Grace,


--
Gregory A. Beamer
MVP; MCP: +I, SE, SD, DBA

Twitter: @gbworld
Blog: http://gregorybeamer.spaces.live.com

******************************************
Show quoteHide quote
|     Think outside the box!             |
******************************************
Are all your drivers up to date? click for free checkup

Author
9 Jul 2009 8:30 PM
yui
> CREATE TABLE Table1
> (
>    Table1ID int IDENTITY(1,1) PRIMARY KEY

Thanks. I should be able to (now) find the equivalent in VS itself.

> If the database is spread over many servers, the GUID makes more sense, as
> it is statistically guaranteed to be unique.

That's a good reason but not applicable in my case

> If you have clients that are detatched and can create new records, the
> GUID
> makes more sense.

Also N/A.

Thanks for your help!
Author
9 Jul 2009 2:20 AM
DDD
On Jul 9, 6:32 am, "yui" <_nospam@_no_spam.com> wrote:
> Hi there,
>
> I need to create unique primary keys in my tables but I don't care about the
> format of these keys. They're used simply to identify the record and serve
> as foreign keys in other tables (users never see them). I've always relied
> on autoincrement keys for this in the past (normally stored as integers) but
> now I see SQL Server has a "uniqueidentifier" type which is just a GUID by
> default (apparently you can change this though). It's been a long time since
> I've done any SQL Server programming so:
>
> 1) How exactly would I create an integer field that serves as an
> autoincrement key (what do I set in VS which is where I'm managing my DB for
> now)
You wanna create the field in codes? in GUI?

> 2) Why would (or should) I use a "uniqueidentifier" instead of 1 above. If
> an autoincrement key is incremented for each new record and will never be
> reused (and my tables will never approach 2+ billion records), what's the
> point of a "uniqueidentifier" in this case (if any). It just takes up extra
> space and buys me nothing in return (compared to the smaller 32 bit integer)

In my opinion, an autoincrement key will be a bottleneck if there are
many requests at the same time.
Because the key has to be added in order. GUID has no this problem,
there is no relations between two GUID number.

>
> Thanks in advance.

Good luck.


Maybe contain nuts.
Author
9 Jul 2009 8:35 PM
yui
> 1) How exactly would I create an integer field that serves as an
> autoincrement key (what do I set in VS which is where I'm managing my DB
> for
> now)
You wanna create the field in codes? in GUI?

GUI. The tables are statically created and the schema never changes.

> In my opinion, an autoincrement key will be a bottleneck if there are
> many requests at the same time.
> Because the key has to be added in order. GUID has no this problem,
> there is no relations between two GUID number.

I'd have to check that. Perhaps you're right but I honestly have my doubts
(about any noticeable performance hit). GUIDs would appear to be worse since
they're longer and are more time-consuming to generate :)

Thanks for your help.
Author
9 Jul 2009 3:39 AM
--CELKO--
>> I need to create unique primary keys [what other kind are there in your world?] in my tables but I don't care about the format of these keys. <<

You have no idea what a key is at all.  It is a subset of the
attrbutes of an entity that uniquely identify each occurence of the
entity.  You should care very much about it

>>  They're used simply to identify the record [rows are not records] and serve as foreign keys in other tables (users never see them).<<

That is an exposed physical locator.  It is like a 1950's pointer
chain; you (the user) have to maintain it manually and it has nothing
whatsoever to do with the logical data model.  It lets you avoid
learning RDBMS at the expense of data integrity.

>> I've always relied on autoincrement keys [sic] for this in the past (normally stored as integers) but now I see SQL Server has a "uniqueidentifier" type which is just a GUID by default (apparently you can change this though).<<

autoincrements look just like record numbering on a magantic tape
file, so you can stay in your comfort zone.

>>  It's been a long time since I've done any SQL Server programming so: <<

No, it sounds like you have been doing 1950's COBOL file sytrem
programming with T-SQL and have never actually written a proper
schema.  Get a book on basics and read it.  Look at most of the
posters in the newsgroups.  They do not post DDL -- often because they
have no idea what it is -- and invent their own pseudo-code.  They
have no idea what a spec is, but post requests for ways to implement
an approach they have locked into -- and it is usually a non-
relational approach at that.

What we have are some really lazy programmers who want to use the
newsgroups to do their job or homework for them.  Even worse, they
want to get an instant college education which is not possible in a
short reply.  The questions they ask can most often be answered by (1)
RTFM, well BOL (2) "Try it and see" (3) a quick Google search in the
newsgroup to which they are posting.

What they get instead from most replies is a kludge to get rid of
them.  If this was a woodworking newsgroup and someone posted "What is
the best kind of rocks to pound screws into fine furniture?" are you
really helping them when you say "Granite! Use big hunks of granite!"
I am the guy who replies with "Your question is bad.  Don't you know
about screwdrivers?"  And I like to remind them that it takes six
years to become a Journeyman Union Carpenter in New York State.  Not
Master, Journeyman.
Author
9 Jul 2009 9:20 AM
Erland Sommarskog
yui (_nospam@_no_spam.com) writes:
> 2) Why would (or should) I use a "uniqueidentifier" instead of 1 above.
> If an autoincrement key is incremented for each new record and will
> never be reused (and my tables will never approach 2+ billion records),
> what's the point of a "uniqueidentifier" in this case (if any). It just
> takes up extra space and buys me nothing in return (compared to the
> smaller 32 bit integer)

One possibly advantage with GUIDS is that these can be generated client-
side, and you don't have to make a round-trip to get the value. This can
make things easier if you need to insert, say, both Orders and OrderDetails.

On the other hand a GUID is 16 bytes, four times an integer. GUIDs can
also lead to performance problems with fragmentation if you don't deal
with them correctly.

In short, as long you don't have any problems with integer keys, stick to
them, and only use GUIDs if you find that you actually need to.


--
Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
Author
9 Jul 2009 8:56 PM
yui
> On the other hand a GUID is 16 bytes, four times an integer. GUIDs can
> also lead to performance problems with fragmentation if you don't deal
> with them correctly.

Can you elaborate on "don't deal with them correctly". What control do you
have over fragmentation without defragmenting the DB itself (during some
maintenance period in most cases)

> In short, as long you don't have any problems with integer keys, stick to
> them, and only use GUIDs if you find that you actually need to.

This is how I see it. Thanks for your insight.
Author
9 Jul 2009 10:25 PM
Erland Sommarskog
yui (_nospam@_no_spam.com) writes:
>> On the other hand a GUID is 16 bytes, four times an integer. GUIDs can
>> also lead to performance problems with fragmentation if you don't deal
>> with them correctly.
>
> Can you elaborate on "don't deal with them correctly". What control do you
> have over fragmentation without defragmenting the DB itself (during some
> maintenance period in most cases)

A good start is to not make the index on the GUID column clustered.

If you make it clustered, you should give it a low fillfactor, 30-50%.
When new rows will be added in the gaps, and there will be no page splits,
until you start to fill up the gaps. Then it's time for an index rebuild
with a low fillfactor again.

An index on GUIDs with a default fillfactor will see lots of page splits,
which leads to poor INSERT performance and also fragmentation.


--
Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
Author
9 Jul 2009 11:45 PM
yui
Show quote Hide quote
>>> On the other hand a GUID is 16 bytes, four times an integer. GUIDs can
>>> also lead to performance problems with fragmentation if you don't deal
>>> with them correctly.
>>
>> Can you elaborate on "don't deal with them correctly". What control do
>> you
>> have over fragmentation without defragmenting the DB itself (during some
>> maintenance period in most cases)
>
> A good start is to not make the index on the GUID column clustered.
>
> If you make it clustered, you should give it a low fillfactor, 30-50%.
> When new rows will be added in the gaps, and there will be no page splits,
> until you start to fill up the gaps. Then it's time for an index rebuild
> with a low fillfactor again.
>
> An index on GUIDs with a default fillfactor will see lots of page splits,
> which leads to poor INSERT performance and also fragmentation.

Thanks for the info. It's something worth investigating (to reduce
fragmention in general) but I'll be sticking with integer keys in any case.
It serves my purpose. Thanks again.
Author
10 Jul 2009 9:42 AM
Erland Sommarskog
yui (_nospam@_no_spam.com) writes:
> Thanks for the info. It's something worth investigating (to reduce
> fragmention in general) but I'll be sticking with integer keys in any
> case. It serves my purpose. Thanks again.

Just to make it clear: the stuff with fillfactor etc that I discussed
pertains to GUIDs. If you use monotonically integer keys, there will
not be issues with pages splits etc as you would get with GUIDs.


--
Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
Author
9 Jul 2009 9:23 AM
Alexey Smirnov
On Jul 9, 12:32 am, "yui" <_nospam@_no_spam.com> wrote:
> 2) Why would (or should) I use a "uniqueidentifier" instead of 1 above. If
> an autoincrement key is incremented for each new record and will never be
> reused (and my tables will never approach 2+ billion records), what's the
> point of a "uniqueidentifier" in this case (if any). It just takes up extra
> space and buys me nothing in return (compared to the smaller 32 bit integer)
>

"uniqueidentifier" field can help to keep id "secret"

e.g. when you have

site.com/userprofile.aspx?id=100
site.com/userprofile.aspx?id=101
site.com/userprofile.aspx?id=102

then everyone knows how many items you have because of 100, 101, 102

This

site.com/userprofile.aspx?id=6F9619FF-8B86-D011-B42D-00C04FC964FF

comes in handy if you like to hide that information.

As from the database point of view, uniqueidentifier is useful if
you're consolidating records from multiple databases into one table.
On another hand, uniqueidentifier has long values, uses more space and
it is recommended to use the IDENTITY when global uniqueness is not
required, or when having a serially incrementing key is preferred.
Author
9 Jul 2009 9:03 PM
yui
> "uniqueidentifier" field can help to keep id "secret"
> e.g. when you have
> site.com/userprofile.aspx?id=100
> site.com/userprofile.aspx?id=101
> site.com/userprofile.aspx?id=102
> then everyone knows how many items you have because of 100, 101, 102
> site.com/userprofile.aspx?id=6F9619FF-8B86-D011-B42D-00C04FC964FF
> comes in handy if you like to hide that information.

That's a good point though not applicable in my case (but useful to consider
for next time)

> As from the database point of view, uniqueidentifier is useful if
> you're consolidating records from multiple databases into one table.
> On another hand, uniqueidentifier has long values, uses more space and
> it is recommended to use the IDENTITY when global uniqueness is not
> required, or when having a serially incrementing key is preferred.

Merging DBs isn't an issue in my case but if it's occuring during some
maintenance period only, integer keys can easily be resequenced. The issue
only becomes significant IMO when dealing with a distributed DB system whose
keys must be kept unique at all times (which is what you're probably
alluding to). Based on your input and others, I should be ok with
autoincrement (integer) keys. Thanks for your feedback.
Author
9 Jul 2009 9:14 PM
Alexey Smirnov
Show quote Hide quote
On Jul 9, 11:03 pm, "yui" <_nospam@_no_spam.com> wrote:
> > "uniqueidentifier" field can help to keep id "secret"
> > e.g. when you have
> > site.com/userprofile.aspx?id=100
> > site.com/userprofile.aspx?id=101
> > site.com/userprofile.aspx?id=102
> > then everyone knows how many items you have because of 100, 101, 102
> > site.com/userprofile.aspx?id=6F9619FF-8B86-D011-B42D-00C04FC964FF
> > comes in handy if you like to hide that information.
>
> That's a good point though not applicable in my case (but useful to consider
> for next time)
>
> > As from the database point of view, uniqueidentifier is useful if
> > you're consolidating records from multiple databases into one table.
> > On another hand, uniqueidentifier has long values, uses more space and
> > it is recommended to use the IDENTITY when global uniqueness is not
> > required, or when having a serially incrementing key is preferred.
>
> Merging DBs isn't an issue in my case but if it's occuring during some
> maintenance period only, integer keys can easily be resequenced. The issue
> only becomes significant IMO when dealing with a distributed DB system whose
> keys must be kept unique at all times (which is what you're probably
> alluding to). Based on your input and others, I should be ok with
> autoincrement (integer) keys. Thanks for your feedback.

Glad to help

Bookmark and Share