I've got a Postgres ORDER BY issue with the following table:

em_code  name
EM001    AAA
EM999    BBB
EM1000   CCC

To insert a new record to the table,

  1. I select the last record with SELECT * FROM employees ORDER BY em_code DESC
  2. Strip alphabets from em_code usiging reg exp and store in ec_alpha
  3. Cast the remating part to integer ec_num
  4. Increment by one ec_num++
  5. Pad with sufficient zeors and prefix ec_alpha again

When em_code reaches EM1000, the above algorithm fails.

First step will return EM999 instead EM1000 and it will again generate EM1000 as new em_code, breaking the unique key constraint.

Any idea how to select EM1000?

有帮助吗?

解决方案

The reason is that the string sorts alphabetically (instead of numerically like you would want it) and 1 sorts before 9. You could solve it like this:

SELECT * FROM employees ORDER BY substring(em_code, 3)::int DESC

It would be more efficient to drop the redundant 'EM' from your em_code - if you can - and save an integer number to begin with.


Additional answer to question in comment

To strip any and all non-digits from a string:

SELECT regexp_replace(em_code, E'\\D','','g')
FROM employees

\D is the regular expression class-shorthand for "non-digits".
'g' as 4th parameter is the "globally" switch to apply the replacement to every occurrence in the string, not just the first.

So I replace every non-digit with the empty string distilling solely digits from the string.

其他提示

One approach you can take is to create a naturalsort function for this. Here's an example, written by Postgres legend RhodiumToad.

create or replace function naturalsort(text)
    returns bytea language sql immutable strict as $f$
    select string_agg(convert_to(coalesce(r[2], length(length(r[1])::text) || length(r[1])::text || r[1]), 'SQL_ASCII'),'\x00')
    from regexp_matches($1, '0*([0-9]+)|([^0-9]+)', 'g') r;
$f$;

Source: http://www.rhodiumtoad.org.uk/junk/naturalsort.sql

To use it simply call the function in your order by:

SELECT * FROM employees ORDER BY naturalsort(em_code) DESC

This always comes up in questions and in my own development and I finally tired of tricky ways of doing this. I finally broke down and implemented it as a PostgreSQL extension:

https://github.com/Bjond/pg_natural_sort_order

It's free to use, MIT license.

Basically it just normalizes the numerics (zero pre-pending numerics) within strings such that you can create an index column for full-speed sorting au naturel. The readme explains.

The advantage is you can have a trigger do the work and not your application code. It will be calculated at machine-speed on the PostgreSQL server and migrations adding columns become simple and fast.

you can use just this line "ORDER BY length(substring(em_code FROM '[0-9]+')), em_code"

I wrote about this in detail in this related question:

Humanized or natural number sorting of mixed word-and-number strings

(I'm posting this answer as a useful cross-reference only, so it's community wiki).

I came up with something slightly different.

The basic idea is to create an array of tuples (integer, string) and then order by these. The magic number 2147483647 is int32_max, used so that strings are sorted after numbers.

  ORDER BY ARRAY(
    SELECT ROW(
      CAST(COALESCE(NULLIF(match[1], ''), '2147483647') AS INTEGER),
      match[2]
    )
    FROM REGEXP_MATCHES(col_to_sort_by, '(\d*)|(\D*)', 'g')
    AS match
  )

I thought about another way of doing this that uses less db storage than padding and saves time than calculating on the fly.

https://stackoverflow.com/a/47522040/935122

I've also put it on GitHub

https://github.com/ccsalway/dbNaturalSort

许可以下: CC-BY-SA归因
不隶属于 StackOverflow
scroll top