Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

PostgreSQL ORDER BY issue - natural sort

Tags:

I've got a Postgres ORDER BY issue with the following table:

em_code  name EM001    AAA EM999    BBB EM1000   CCC 

To insert a new record to the table,

  1. I select the last record with SELECT * FROM employees ORDER BY em_code DESC
  2. Strip alphabets from em_code usiging reg exp and store in ec_alpha
  3. Cast the remating part to integer ec_num
  4. Increment by one ec_num++
  5. Pad with sufficient zeors and prefix ec_alpha again

When em_code reaches EM1000, the above algorithm fails.

First step will return EM999 instead EM1000 and it will again generate EM1000 as new em_code, breaking the unique key constraint.

Any idea how to select EM1000?

like image 985
Mithun Sreedharan Avatar asked Feb 07 '12 09:02

Mithun Sreedharan


People also ask

How does Postgres ORDER BY default?

The ORDER BY clause in PostgreSQL is used together with the SELECT statement to sort table data. The table data can either be sorted in ascending or descending order. By default, the data is sorted in ascending order.

Which is natural sort order?

In computing, natural sort order (or natural sorting) is the ordering of strings in alphabetical order, except that multi-digit numbers are treated atomically, i.e., as if they were a single character.

Is Postgres ORDER BY case insensitive?

PostgreSQL is a case-sensitive database by default, but provides various possibilities for performing case-insensitive operations and working with collations.

Does Postgres guarantee order?

Sorting in PostgreSQLA particular output ordering can only be guaranteed if the sort step is explicitly chosen.


2 Answers

One approach you can take is to create a naturalsort function for this. Here's an example, written by Postgres legend RhodiumToad.

create or replace function naturalsort(text)     returns bytea language sql immutable strict as $f$     select string_agg(convert_to(coalesce(r[2], length(length(r[1])::text) || length(r[1])::text || r[1]), 'SQL_ASCII'),'\x00')     from regexp_matches($1, '0*([0-9]+)|([^0-9]+)', 'g') r; $f$; 

Source: http://www.rhodiumtoad.org.uk/junk/naturalsort.sql

To use it simply call the function in your order by:

SELECT * FROM employees ORDER BY naturalsort(em_code) DESC 
like image 188
Jonathan Avatar answered Sep 25 '22 08:09

Jonathan


The reason is that the string sorts alphabetically (instead of numerically like you would want it) and 1 sorts before 9. You could solve it like this:

SELECT * FROM employees ORDER  BY substring(em_code, 3)::int DESC; 

It would be more efficient to drop the redundant 'EM' from your em_code - if you can - and save an integer number to begin with.

Answer to question in comment

To strip any and all non-digits from a string:

SELECT regexp_replace(em_code, E'\\D','','g') FROM   employees; 

\D is the regular expression class-shorthand for "non-digits".
'g' as 4th parameter is the "globally" switch to apply the replacement to every occurrence in the string, not just the first.

After replacing every non-digit with the empty string, only digits remain.

like image 24
Erwin Brandstetter Avatar answered Sep 23 '22 08:09

Erwin Brandstetter