Here is my problem:
Where I work, I need to run reports on a very large database. I'm using basic SQL queries in PHP to filter for exactly what I want. Every so often I'll get groups in my reports that do not meet the criteria of my search query.
I noticed that all of the "stragglers" that appear in my reports seemed to have NULL values in certain fields, consistently.
Looking at the database structure, I find that these NULL values in the database have a NOT NULL flag set on them.
It turns out that these are actually not NULL values and instead empty string values or $value = ''
In my past experience, when I was starting out I would make this mistake a lot, setting something = '' instead of making it NULL.
I've asked the DBA if he can think of any legitimate reason why these are set this way ( there are 300,000 some odd records that are like this ) and he had no clue.
I'm thinking it could be another programmers mistake or someone attempting to avoid not being able to insert the record due to the "NOT NULL" flag set on that specific field.
So I'm scratching my head over here trying to find a legitimate reason for these to exist, other than my own suspicions that the database was not designed for these 2 types of records: records with this value and records without this value.
What are your thoughts?
I've seen this pretty often, especially in cases where:
There may not be a legitimate reason for it now, but it may have seemed like a good idea to someone else at the time. When you don't know the history of a database, it's hard to know the constraints or philosophies behind structural decisions.
I tend to be very careful around empty strings and nulls because I've been bitten by the same issues. There are often compatibility reasons that prevent changing the DB structure, so you're left with workarounds like using ISNULL(c1,'')
or COALESCE(c1,'')
to capture both nulls and empty strings.
What is the benefit of an empty string value vs. using a null value
There are no real problems with using one or the other, exclusively. Issues arise when a single column can contain either, mainly with sorting and searching:
SELECT '' as c1
UNION
SELECT NULL as c1
UNION
SELECT 'a' as c1
ORDER BY c1
You'll see that NULL
s will be listed first, then string values (empty first). It's something to keep in mind if you're doing multi-column ORDER BY
s and you're wondering why some columns are always on top.
When a column is NULL
, you need to search by IS NULL
or IS NOT NULL
rather than = ''
. Also, when you do LENGTH( c1 ) = 0
, these NULL
records won't be included while empty strings will.
Finally, when you're comparing the column to itself in a self join, you won't be able to do a simple =
.
All in all, it would be best to conform the column values to one or the other and make your SQL consistent.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With