Why does PostgreSQL consider NULL boundaries in range types to be distinct from infinite boundaries?

Q: What does NULL mean in PostgreSQL?

In PostgreSQL, NULL means "no value". A column that is NULL holds no value at all; it is not equal to 0, an empty string, or a string of spaces. NULL cannot be tested with equality operators such as = or !=; use IS NULL or IS NOT NULL instead.
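
As a minimal sketch of that rule (the column aliases are made up for illustration, and the default setting of transform_null_equals is assumed), comparing with = gives NULL rather than true or false, while IS NULL gives a real boolean:

```sql
-- "=" against NULL yields NULL (unknown); IS NULL yields true or false.
SELECT NULL = NULL  AS equals_result,   -- NULL
       NULL IS NULL AS is_null_result,  -- true
       0 IS NULL    AS zero_is_null,    -- false
       '' IS NULL   AS empty_is_null;   -- false
```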

Q: What is tsrange in PostgreSQL?

Range types are data types representing a range of values of some element type (called the range's subtype). For instance, ranges of timestamp might be used to represent the ranges of time that a meeting room is reserved. In this case the data type is tsrange (short for “timestamp range”), and timestamp is the subtype.

Q: What is tstzrange?

tsrange is a range of timestamp without time zone; tstzrange is a range of timestamp with time zone.

Just to preface, I'm not asking what the difference is between a NULL boundary and an infinite boundary - that's covered in this other question. Rather, I'm asking why PostgreSQL makes a distinction between NULL and infinite boundaries when (as far as I can tell) they function exactly the same.

I started using PostgreSQL's range types recently, and I'm a bit confused by what NULL values in range types are supposed to mean. The documentation says:

> The lower bound of a range can be omitted, meaning that all values less than the upper bound are included in the range, e.g., (,3]. Likewise, if the upper bound of the range is omitted, then all values greater than the lower bound are included in the range. If both lower and upper bounds are omitted, all values of the element type are considered to be in the range.

This suggests to me that omitted boundaries in a range (which are the equivalent of NULL boundaries specified in a range type's constructor) should be considered infinite. However, PostgreSQL makes a distinction between NULL boundaries and infinite boundaries. The documentation continues:

> You can think of these missing values [in a range] as +/-infinity, but they are special range type values and are considered to be beyond any range element type's +/-infinity values.

This is puzzling. "beyond infinity" doesn't make sense, as the entire point of infinite values is that nothing can be greater than +infinity or less than -infinity. That doesn't break "element in range"-type checks, but it does introduce an interesting case for primary keys that I think most people wouldn't expect. Or at least, I didn't expect it.

Suppose we create a basic table whose sole field is a daterange, which is also the PK:

CREATE TABLE public.range_test
(
    id daterange NOT NULL,
    PRIMARY KEY (id)
);

Then we can populate it with the following data with no problem:

INSERT INTO range_test VALUES (daterange('-infinity','2021-05-21','[]'));
INSERT INTO range_test VALUES (daterange(NULL,'2021-05-21','[]'));

Selecting all the data reveals we have these two tuples:

[-infinity,2021-05-22)
(,2021-05-22)

So the two tuples are distinct, or there would have been a primary key violation. But again, NULL boundaries and infinite boundaries work exactly the same when we're dealing with the actual elements that make up the range. For example, there is no date value X for which X <@ [-infinity,2021-05-22) returns a different result than X <@ (,2021-05-22). This makes sense because NULL values can't have a type of date, so they can't even be compared to the range (and PostgreSQL even converted the inclusive boundary on the lower NULL bound in daterange(NULL,'2021-05-21','[]') to an exclusive boundary, (,2021-05-22), to be doubly sure). But why are two ranges that are identical in every practical way considered distinct?

When I was still in school, I remember overhearing a discussion about the difference between "unknown" and "doesn't exist". Two people who were smarter than me were talking about why NULL values often cause issues, and how replacing the single NULL with separate "unknown" and "doesn't exist" values might solve those issues, but the discussion was over my head at the time. This weird feature reminded me of that discussion.

So is the distinction between "unknown" and "doesn't exist" the reason why PostgreSQL treats NULL and +/-infinity as distinct? If so, why are ranges the only types that allow for that distinction in PostgreSQL? And if not, why does PostgreSQL treat functionally equivalent values as distinct?

asked May 19 '21 20:05

Nick Muise

1 Answer

> Rather, I'm asking why PostgreSQL makes a distinction between NULL and infinite boundaries when (as far as I can tell) they function exactly the same.

But they do not. NULL is a syntax convenience when used as the bound of a range, while -infinity / infinity are actual values in the domain of the range. They are abstract values meaning less than / greater than any other value, but values nonetheless (which can be included or excluded).
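
To make the "included or excluded" point concrete, here is a minimal sketch. It uses tsrange rather than the question's daterange, on the assumption that a continuous range type shows the bound flags without canonicalization rewriting them; the column aliases are made up for illustration.

```sql
-- An infinite bound is an ordinary element value, so it can be excluded.
SELECT '-infinity'::timestamp <@ tsrange('-infinity', '2021-05-22', '()') AS in_exclusive_inf,  -- false
       '-infinity'::timestamp <@ tsrange('-infinity', '2021-05-22', '[)') AS in_inclusive_inf,  -- true
       '-infinity'::timestamp <@ tsrange(NULL, '2021-05-22', '()')        AS in_unbounded;      -- true

-- lower_inf() reports only a missing (unbounded) bound, not an infinite value.
SELECT lower_inf(tsrange('-infinity', '2021-05-22')) AS inf_value,  -- false
       lower_inf(tsrange(NULL, '2021-05-22'))        AS unbounded;  -- true
```

So there is an element, '-infinity' itself, for which the two kinds of lower bound can give different containment results once the infinite bound is exclusive.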

Also, NULL works for any range type, while most data types don't have special values like -infinity / infinity. Take integer and int4range for example.
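
A short sketch of the int4range case, with illustrative column aliases. The failing statement is left commented out so the rest can be run as is:

```sql
-- Omitting a bound (NULL) works for a range over integers ...
SELECT int4range(NULL, 10)      AS unbounded_below,  -- (,10)
       5 <@ int4range(NULL, 10) AS contains_five;    -- true

-- ... but integer has no infinity value that could serve as a bound.
-- This statement is expected to fail with an "invalid input syntax for type integer" error:
-- SELECT int4range('-infinity', 10);
```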

For a better understanding, consider the thread in pgsql-general that a_horse provided:

https://www.postgresql.org/message-id/flat/OrigoEmail.bf5.ac6ff6ffeb116aec.13fc29939e0%40prod2#c9fabdc670211364636b733a79a04712

> This makes sense because NULL values can't have a type of date, so they can't even be compared to the range

Every data type can be NULL, even domains that are explicitly NOT NULL. See:

Why does PostgreSQL allow NULLs in domains that prohibit NULL?

That includes date, of course (like Adrian commented):

test=> SELECT NULL::date, pg_typeof(NULL::date);
 date | pg_typeof 
------+-----------
      | date
(1 row)

But trying to discuss NULL as a value (when used as the bound of a range) is a misleading approach to begin with. It's not a value.

> ... (and PostgreSQL even converted the inclusive boundary on the lower NULL bound in daterange(NULL,'2021-05-21','[]') to an exclusive boundary, (,2021-05-22) to be doubly sure).

Again, NULL is not treated as a value in the domain of the range. It just serves as convenient syntax to say: "unbounded". No more than that.
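
One way to see this, sketched with the ranges from the question (the column aliases are made up for illustration): an unbounded side has no bound value to extract and is never inclusive, whereas an infinite bound is a stored value like any other.

```sql
SELECT lower(daterange(NULL, '2021-05-21', '[]'))        AS lower_bound,      -- NULL: there is no bound value
       lower_inc(daterange(NULL, '2021-05-21', '[]'))    AS lower_inclusive,  -- false
       lower(daterange('-infinity', '2021-05-21', '[]')) AS inf_bound,        -- -infinity
       daterange(NULL, '2021-05-21', '[]')
         = daterange('-infinity', '2021-05-21', '[]')    AS same_range;       -- false
```

The last column matches what the question observed: the two ranges compare as different, so both rows fit under the primary key.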

answered Sep 20 '22 08:09

Erwin Brandstetter
