Sort varchar by its numeric fields
-
10-03-2021 - |
Question
I'm sure this has been asked before, but since the question can be formulated in many different ways it's difficult to find the proper answer.
I have an orders table with a varchar field for the order number, which is formatted with 4-digits year, a dash (-), and a progressive numeric value. For example it may contain the following values:
SELECT number FROM orders ORDER BY number LIMIT 10;
number
----------
1999-13
2019-11
2020-1
2020-10
2020-100
2020-12
2020-2
2020-21
2020-3
2021-1
I need to sort that field by year and then by the progressive number, with this expected result:
number
----------
1999-13
2019-11
2020-1
2020-2
2020-3
2020-10
2020-12
2020-21
2020-100
2021-1
My questions are:
- Which is the simplest way to achieve this with an ORDER subclause?
- How to add an efficient index using this custom sort without having to modify the table?
I would like to keep at least the first answer as database-agnostic as possible (that's why I haven't included db specific tags), but if different best answers are possible for different DBMS/versions, let's assume PostgreSQL 12.
Solution
I would convert that value to an integer array, then sort on that array:
SELECT number
FROM orders
ORDER BY string_to_array(number, '-')::int[]
LIMIT 10;
OTHER TIPS
You can splitthe number and sort by it parts
CREATE TABLE test ("number" varchar(8)) ; INSERT INTO test ("number") VALUES ('1999-13'), ('2019-11'), ('2020-1'), ('2020-10'), ('2020-100'), ('2020-12'), ('2020-2'), ('2020-21'), ('2020-3'), ('2021-1') ;
SELECT "number" FROM test ORDER BY split_part("number", '-', 1)::numeric,split_part("number", '-', 2)::numeric;
| number | | :------- | | 1999-13 | | 2019-11 | | 2020-1 | | 2020-2 | | 2020-3 | | 2020-10 | | 2020-12 | | 2020-21 | | 2020-100 | | 2021-1 |
db<>fiddle here
I have an orders table with a varchar field for the order number ... year ... dash ... progressive numeric value
Sorry, but I would suggest that this is a classic case of storing multiple pieces of information in a single database field - and that's a Bad Idea.
Most order tables I've ever seen simply use a single, numeric field to uniquely identify the order and include the order date as a separate field, from which the year can be derived, if required.