Question

I have a pretty basic categorical column in a Postgres database that is currently stored as VARCHAR. I can select a count of each with:

I though adding an ORDER BY array_position() would do it:

SELECT color, count(*)
FROM research
GROUP BY color 
ORDER BY array_position(ARRAY['Red','Orange','Yellow','Green','Blue'], color);

But I'm seeing a type error:

ERROR: function array_position(text[], character varying) does not exist
SQL state: 42883
Hint: No function matches the given name and argument types. You might need to add explicit type casts.
Character: 98

What do I need to cast color as so I can use the array_position function to order it?

Was it helpful?

Solution

Sample data

You have a table like,

CREATE TABLE research(colors)
  AS VALUES ('Blue'), ('Orange'), ('Yellow');

ENUM type

You have an enumerated list of colors. So the easy thing here would be to use an ENUM type

CREATE TYPE colors AS ENUM ('Red','Orange','Yellow','Green','Blue');

Then

ALTER TABLE research
  ALTER COLUMN colors      -- myColorsColumn
  SET DATA TYPE colors
  USING (colors::colors);  -- myColorsColumn::NewType

Now it's faster, more efficient, and cleaner. MORE WIN. MORE JOY. Etc. ENUM are stored as 4-byte internally.

ORDER BY colors

Is all you'll need.

Array Sort

But, you've got a God-given right to treat this like any other database that doesn't support ENUM types,

ORDER BY array_position(ARRAY['Red','Orange','Yellow','Green','Blue'], color);

I think this is perfectly fine, but I believe there is a limitation here in the implementation,

ARRAY['foo', 'bar', 'baz']

Is essentially

ARRAY['foo', 'bar', 'baz']::text

This means you need to either (speed doesn't matter as both options perform the same)

  1. Make the color column of the native text. This can be done in the call, or you can actually modify the table and should. In PostgreSQL, nothing should be varchar without a limit (since that's just a parallel type for text), and very few things should be varchar with a limit (since there is no advantage)

    color::text            -- PostgreSQL sexy sexy cast
    CAST(color AS TEXT))   -- ANSI SQL standardized vanilla and boring
    
  2. Or, you can construct the array itself as type varchar[]

    array_position(ARRAY['Red','Orange','Yellow','Green','Blue']::varchar[], color);
    

Further notes

  • In PostgreSQL, we wouldn't use varchar for colors. Even if you insist on not using an ENUM here (though I would), that should be text
  • The array_position is a shorthand, I expect it to be substantially slower than a similar operation though

    CASE
      WHEN color='Red' THEN 1::smallint
      WHEN color='Orange' THEN 2::smallint
      WHEN color='Yellow' THEN 3::smallint
      ... etc
    END;
    
Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange
scroll top