Why specify a length for character varying types

Referring to the Postgres Documentation on Character Types, I am unclear on the point of specifying a length for character varying (varchar) types.

Assumptions:

The length of the string doesn't matter to the application.
You don't care if someone stores a string of the maximum size in the database.
You have unlimited hard disk space.

The documentation does mention:

The storage requirement for a short string (up to 126 bytes) is 1 byte plus the actual string, which includes the space padding in the case of character. Longer strings have 4 bytes of overhead instead of 1. Long strings are compressed by the system automatically, so the physical requirement on disk might be less. Very long values are also stored in background tables so that they do not interfere with rapid access to shorter column values. In any case, the longest possible character string that can be stored is about 1 GB. (The maximum value that will be allowed for n in the data type declaration is less than that. It wouldn't be useful to change this because with multibyte character encodings the number of characters and bytes can be quite different.

This talks about the size of the string, not the declared size of the field. In other words, it sounds like a large string in a large varchar field will always be compressed, but a small string in a large varchar field will not?
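As a rough illustration of the quoted rule, here is a minimal Python sketch. The function name `estimated_storage_bytes` is made up for this example, it assumes UTF-8 encoding, and it ignores compression and out-of-line storage, so it is only an estimate of the uncompressed size and not how Postgres computes anything internally. It shows that the declared length n is not an input to the rule.

```python
def estimated_storage_bytes(value: str, encoding: str = "utf-8") -> int:
    """Estimate uncompressed storage using the rule quoted from the docs.

    Hypothetical helper: 1 byte of overhead for strings up to 126 bytes,
    4 bytes of overhead for longer ones. The declared varchar(n) limit
    does not appear anywhere in the calculation.
    """
    payload = len(value.encode(encoding))  # bytes, not characters
    overhead = 1 if payload <= 126 else 4
    return payload + overhead


place = "Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch"

# The same value gives the same estimate whatever the column declaration is.
for declared in ("varchar(100)", "varchar(500)", "text"):
    print(declared, estimated_storage_bytes(place))
```

Under this rule, the only thing that changes the estimate is the byte length of the value actually stored.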

I ask this question as it would be much easier (and lazier) to specify a much larger size so you never have to worry about a string being too large. For example, if I specify varchar(50) for a place name, I will get locations that have more characters (e.g. Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch), but if I specify varchar(100) or varchar(500), I'm less likely to get that problem.

So would there be a performance difference between varchar(500) and (arbitrarily) varchar(5000000) or text if your longest string was, say, 400 characters long?

Also, out of interest, if anyone knows the answer to this for other databases as well, please add that too.

I have googled, but not found a sufficiently technical explanation.

asked Sep 06 '11 13:09

Mr Shoubs

2 Answers

My understanding is that constraints are useful for data integrity, so I use column sizes both to validate the data at the lowest layer and to better describe the data model.
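To illustrate a length limit acting as an integrity check, here is a small sketch using Python's built-in sqlite3 module, chosen only because it runs without a server. The table names `loose` and `strict` and the helper `try_insert` are made up for this example. One caveat: SQLite, unlike Postgres, does not enforce the n in varchar(n), so the limit is written as an explicit CHECK constraint. This also touches on the "other databases" part of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SQLite accepts the varchar(5) declaration but does not enforce the length.
conn.execute("CREATE TABLE loose (name varchar(5))")
# An explicit CHECK constraint makes the database reject over-long values.
conn.execute("CREATE TABLE strict (name text CHECK (length(name) <= 5))")

# Stored without complaint in SQLite, despite the declared varchar(5).
conn.execute("INSERT INTO loose VALUES ('much too long')")


def try_insert(value: str) -> bool:
    """Return True if the value passes the length check, False if rejected."""
    try:
        conn.execute("INSERT INTO strict VALUES (?)", (value,))
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert("short"))          # within the limit
print(try_insert("much too long"))  # violates the CHECK constraint
```

In Postgres, a varchar(5) column would reject the over-long value by itself, so the explicit CHECK is only needed here because of the SQLite stand-in.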

Some links on the matter:

VARCHAR(n) Considered Harmful
CHAR(x) vs. VARCHAR(x) vs. VARCHAR vs. TEXT
In Defense of varchar(x)

answered Oct 10 '22 09:10

Marco Mariani

My understanding is that this is a legacy of older databases with storage that wasn't as flexible as that of Postgres. Some would use fixed-length structures to make it easy to find particular records and, since SQL is a somewhat standardized language, that legacy is still seen even when it doesn't provide any practical benefit.

Thus, your "make it big" approach should be an entirely reasonable one with Postgres, but it may not transfer well to other, less flexible RDBMSs.

answered Oct 10 '22 09:10

Sean McMains

Tags: types, database, postgresql, varchar, database-design
