Question

I have two tables: users and user_statuses. A user status is of type varchar. I chose varchar for readability, and I am not considering an id/int in this case.

CREATE TABLE users(
 --............
 status varchar references user_statuses(name),
 --.....

The name column in user_statuses is UNIQUE.

At some point I may need to rename one or more statuses in the user_statuses table, because I haven't yet decided what to name them or how many statuses I need overall.

Question: How can I rename a status so that the operation performs well? And, more generally, what is the proper way to do it in this case?

P.S.

In my real system I have dozens of *_statuses tables, and any of them may have some of its statuses renamed.

Each table that references a *_statuses table may have 10M, 30M or 50M rows, up to 100M at most.

Solution

I would consider using a code for the status, i.e.:

CREATE TABLE USER_STATUS
( STATUS_CODE CHAR(?) NOT NULL PRIMARY KEY
, STATUS_NAME TEXT NOT NULL UNIQUE
);

The use of char is discouraged by the PostgreSQL community, so an alternative is to use text with an additional constraint:

CREATE TABLE USER_STATUS
( STATUS_CODE TEXT NOT NULL PRIMARY KEY
, STATUS_NAME TEXT NOT NULL UNIQUE
,     CHECK (LENGTH(STATUS_CODE) <= ?)
);

A code should be more stable than the name, yet more intuitive than an int.
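To illustrate why a code makes renaming cheap, here is a minimal sketch. The code 'NRM', the length limit of 8 and the user_id column are placeholders chosen for illustration; none of them comes from the original schema.

```sql
-- Hypothetical schema: the code values and the length limit are assumptions.
CREATE TABLE user_status
( status_code text NOT NULL PRIMARY KEY
, status_name text NOT NULL UNIQUE
, CHECK (length(status_code) <= 8)
);

CREATE TABLE users
( user_id     bigint NOT NULL PRIMARY KEY
, status_code text   NOT NULL REFERENCES user_status (status_code)
);

INSERT INTO user_status (status_code, status_name) VALUES ('NRM', 'Normal');

-- Renaming touches exactly one row in the lookup table;
-- the rows in users keep pointing at 'NRM' and are not rewritten.
UPDATE user_status
SET    status_name = 'Valid'
WHERE  status_code = 'NRM';
```

With this layout the cost of a rename no longer depends on how many rows the referencing tables hold.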

For your schema as it stands, though, I would probably insert the new value:

INSERT INTO USER_STATUSES (NAME) VALUES ('Valid');

then update all users:

UPDATE USERS SET STATUS = 'Valid' WHERE STATUS = 'Normal';

and finally, remove the old status:

DELETE FROM USER_STATUSES WHERE NAME = 'Normal';
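The three statements can be wrapped in a single transaction so that other sessions never see a half-renamed status. This is only a sketch against the question's schema (users.status referencing user_statuses.name). The index name is hypothetical, and on a table with tens of millions of rows the UPDATE will still rewrite every matching row.

```sql
-- Assumption: an index on the referencing column, created here if missing.
-- It serves both the UPDATE's WHERE clause and the foreign-key check
-- that runs when the old status row is deleted.
CREATE INDEX IF NOT EXISTS users_status_idx ON users (status);

BEGIN;

-- 1. Add the new name so the foreign key has something to reference.
INSERT INTO user_statuses (name) VALUES ('Valid');

-- 2. Repoint every user that still carries the old name.
UPDATE users SET status = 'Valid' WHERE status = 'Normal';

-- 3. Nothing references the old name any more, so it can be removed.
DELETE FROM user_statuses WHERE name = 'Normal';

COMMIT;
```

If any step fails, issuing ROLLBACK leaves both tables exactly as they were.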

For the more general case where a hierarchy of tables is involved, traverse the tables depth-first and insert the new rows on the way down. When you reach a leaf, update its rows, then delete the old rows as you climb back up the tree. You can use INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS to obtain the edges of the tree, and then use, for example, a topological sort to determine the order in which to process the tables.
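As a rough sketch of how the edges could be listed, the query below joins INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS to INFORMATION_SCHEMA.TABLE_CONSTRAINTS twice, once for the referencing table and once for the referenced one. It assumes that constraint names are unique within a schema, which PostgreSQL does not enforce, so treat the output as a starting point rather than a definitive list.

```sql
-- One row per foreign key: child_table references parent_table.
-- Assumption: constraint names are unique within each schema.
SELECT child.table_schema  AS child_schema
     , child.table_name    AS child_table
     , parent.table_schema AS parent_schema
     , parent.table_name   AS parent_table
FROM   information_schema.referential_constraints AS rc
JOIN   information_schema.table_constraints AS child
       ON  child.constraint_schema = rc.constraint_schema
       AND child.constraint_name   = rc.constraint_name
JOIN   information_schema.table_constraints AS parent
       ON  parent.constraint_schema = rc.unique_constraint_schema
       AND parent.constraint_name   = rc.unique_constraint_name
WHERE  child.constraint_type = 'FOREIGN KEY'
AND    parent.constraint_type IN ('PRIMARY KEY', 'UNIQUE')
ORDER  BY parent_table, child_table;
```

Each row is an edge from parent to child. Sorting the tables topologically over these edges gives the order for the inserts, and the reverse order for the deletes.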

Another piece of advice is to use the same name for the attribute throughout the model, i.e.:

CREATE TABLE USER_STATUS
( STATUS_CODE ... NOT NULL PRIMARY KEY
, STATUS_NAME TEXT NOT NULL UNIQUE
);

CREATE TABLE USERS
( ...
, STATUS_CODE ... REFERENCES USER_STATUS (STATUS_CODE)
...
); 

OTHER TIPS

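You can create an ENUM type and use that as the column type (instead of varchar) in both tables. Then you can use ALTER TYPE status_type RENAME VALUE... (available since PostgreSQL 10) to change the spelling of one of the type's values. This performs well because only the label stored in the system catalog changes; the table rows are not rewritten.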
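A minimal sketch of the ENUM approach follows. The value 'Blocked' and the user_id column are hypothetical and only there to make the example complete; for brevity it shows the users table alone.

```sql
-- Hypothetical type and values.
CREATE TYPE status_type AS ENUM ('Normal', 'Blocked');

CREATE TABLE users
( user_id bigint      NOT NULL PRIMARY KEY
, status  status_type NOT NULL
);

INSERT INTO users (user_id, status) VALUES (1, 'Normal');

-- Renames the label in place; existing rows now read back as 'Valid'.
ALTER TYPE status_type RENAME VALUE 'Normal' TO 'Valid';

SELECT user_id, status FROM users;
```

One trade-off to keep in mind: values can be added to or renamed in an ENUM, but PostgreSQL has no statement for dropping a single value from it.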

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange