Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the benefit of an empty string value vs. using a null value

Here is my problem:

Where I work, I need to run reports on a very large database. I'm using basic SQL queries in PHP to filter for exactly what I want. Every so often I'll get groups in my reports that do not meet the criteria of my search query.

I noticed that all of the "stragglers" that appear in my reports seemed to have NULL values in certain fields, consistently.

Looking at the database structure, I find that these NULL values in the database have a NOT NULL flag set on them.

It turns out that these are actually not NULL values and instead empty string values or $value = ''

In my past experience, when I was starting out I would make this mistake a lot, setting something = '' instead of making it NULL.

I've asked the DBA if he can think of any legitimate reason why these are set this way ( there are 300,000 some odd records that are like this ) and he had no clue.

I'm thinking it could be another programmers mistake or someone attempting to avoid not being able to insert the record due to the "NOT NULL" flag set on that specific field.

So I'm scratching my head over here trying to find a legitimate reason for these to exist, other than my own suspicions that the database was not designed for these 2 types of records: records with this value and records without this value.

What are your thoughts?

like image 445
Kevin Collins Avatar asked Sep 09 '11 17:09

Kevin Collins


2 Answers

I've seen this pretty often, especially in cases where:

  • The database is touched by a variety of clients (a primary application, ad-hoc queries, automated import interfaces...)
  • Data has been converted or migrated from one system to another

There may not be a legitimate reason for it now, but it may have seemed like a good idea to someone else at the time. When you don't know the history of a database, it's hard to know the constraints or philosophies behind structural decisions.

I tend to be very careful around empty strings and nulls because I've been bitten by the same issues. There are often compatibility reasons that prevent changing the DB structure, so you're left with workarounds like using ISNULL(c1,'') or COALESCE(c1,'') to capture both nulls and empty strings.

like image 199
ajk Avatar answered Sep 29 '22 06:09

ajk


What is the benefit of an empty string value vs. using a null value

There are no real problems with using one or the other, exclusively. Issues arise when a single column can contain either, mainly with sorting and searching:

Sorting

SELECT '' as c1
UNION 
SELECT NULL as c1
UNION
SELECT 'a' as c1
ORDER BY c1

You'll see that NULLs will be listed first, then string values (empty first). It's something to keep in mind if you're doing multi-column ORDER BYs and you're wondering why some columns are always on top.

Searching

When a column is NULL, you need to search by IS NULL or IS NOT NULL rather than = ''. Also, when you do LENGTH( c1 ) = 0, these NULL records won't be included while empty strings will.

Finally, when you're comparing the column to itself in a self join, you won't be able to do a simple =.


All in all, it would be best to conform the column values to one or the other and make your SQL consistent.

like image 22
webbiedave Avatar answered Sep 29 '22 08:09

webbiedave