What are the pros and cons for choosing a character varying data type for primary key in SQL? [closed]

Tags:

sql

sqldatatypes

postgresql-9.1

primary-key

foreign-keys

In the databases course I took during my studies (about 4 years ago), I learned that it is recommended to avoid using character strings as a primary key's data type.

Can someone tell me the pros and cons of choosing a character varying data type for a primary key in SQL, and how far the above premise holds?

N.B.: I'm using a PostgreSQL database. I'm also dealing with a situation where such a table needs to be referenced from another one, which means putting a foreign key on a character varying column. Please take that into account as well.


asked Mar 18 '13 12:03

artaxerxe

1 Answer

The advantage of choosing a character datatype for a primary key field is that you decide what data it shows. For example, you could use the email address as the key field of a users table, which eliminates the need for an additional column. Another advantage: if you have a common table that holds keys from multiple other tables (think of a NOTES table with an external reference to FINANCE, CONTACT, and ADMIN tables), you can easily tell which table a key came from (e.g. your FINANCE table has a key of F00001, your CONTACT table has a key of C00001, etc.). I'm afraid the list of disadvantages is going to be longer in this reply, as I'm against such an approach.

The disadvantages are as follows:

1. The serial datatype exists for exactly this purpose in PostgreSQL.
2. Numeric keys are inserted in order, so minimal index reorganisation is needed. For example, if a table has the keys Apple and Carrot and you insert Banana, the index has to make room for Banana in the middle. You will rarely insert in the middle of an index when the key is a sequence-generated number.
3. Numeric keys that are unlinked from the data are not going to change.
4. Numeric keys are shorter and have a fixed length (4 bytes for an integer vs. whatever you pick as your varchar length).
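
A minimal sketch of the surrogate-key layout these points argue for, assuming hypothetical `users` and `notes` tables (the names and columns are illustrative, not taken from the question). The email stays unique but is not the key, so it can change without touching any referencing rows:

```sql
-- Hypothetical tables for illustration only.
CREATE TABLE users (
    id    serial PRIMARY KEY,           -- 4-byte integer, generated from a sequence
    email varchar(254) NOT NULL UNIQUE  -- natural identifier, still enforced as unique
);

CREATE TABLE notes (
    id      serial PRIMARY KEY,
    user_id integer NOT NULL REFERENCES users (id),  -- foreign key on the integer
    body    text NOT NULL
);

-- Changing the email updates one row in users; notes.user_id is unaffected.
UPDATE users SET email = 'new@example.com' WHERE email = 'old@example.com';
```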

In your case you can still put a foreign key on a numeric column, so I'm not sure why you would want to force it to be a varchar type. Searching and filtering on a numeric field is theoretically faster than on a text field, because text comparisons have to handle variable-length values and collation rules, while integer comparisons do not. Generally speaking, in databases with clustered indexes you would have a non-clustered numeric primary key and then create a clustered index on the data column that you filter on a lot. (PostgreSQL does not maintain clustered indexes automatically; its CLUSTER command reorders a table by an index once.)

Those are general conventions when writing SQL, but when it comes to benchmarking you will find that varchar columns are only a little slower for joining and filtering than integer columns. As long as your primary keys NEVER change, you're alright.
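
If you do keep a character varying key, the foreign key is declared the same way as with an integer. A sketch, assuming hypothetical `country` and `customer` tables; `ON UPDATE CASCADE` is an optional addition here, not something the question asks for, shown as a safety net in case a key does change after all:

```sql
-- Hypothetical tables for illustration only.
CREATE TABLE country (
    code varchar(2) PRIMARY KEY,   -- character varying primary key
    name text NOT NULL
);

CREATE TABLE customer (
    id           serial PRIMARY KEY,
    country_code varchar(2) NOT NULL
        REFERENCES country (code)
        ON UPDATE CASCADE          -- propagate a changed code to referencing rows
);

-- With the cascade in place, this also rewrites customer.country_code
-- for every customer that referenced 'UK'.
UPDATE country SET code = 'GB' WHERE code = 'UK';
```

The cascade keeps the data consistent, but every referencing row has to be rewritten, which is the cost that keys which never change avoid.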

answered Sep 21 '22 20:09

DF_
