How to properly and effectively rename a value (column) of a foreign key in both tables?
05-03-2021
Question
I have 2 tables: users and user_statuses. A user status is of type varchar. It's varchar for the sake of readability, and I'm not considering an id/int in this case.
CREATE TABLE users(
--............
status varchar references user_statuses(name),
--.....
The column name in user_statuses is UNIQUE.
At some point I may need to rename one or more statuses in the user_statuses table, because I haven't yet decided what to name them or how many statuses I need overall.
Question: How would I rename a status name in a way that performs well? And in general, how do I do it properly in this case?
P.S.
In my real system I have dozens of *_statuses tables, and each may have some of its statuses renamed.
Each of the tables that refers to a *_statuses table may have, say, 10M, 30M or 50M rows, or up to 100M at most.
Solution
I would consider using a code for the status, i.e.:
CREATE TABLE USER_STATUS
( STATUS_CODE CHAR(?) NOT NULL PRIMARY KEY
, STATUS_NAME TEXT NOT NULL UNIQUE
);
The use of char is discouraged by the PostgreSQL community, so an alternative would be to use text and an additional constraint:
CREATE TABLE USER_STATUS
( STATUS_CODE TEXT NOT NULL PRIMARY KEY
, STATUS_NAME TEXT NOT NULL UNIQUE
, CHECK (LENGTH(STATUS_CODE) <= ?)
);
A code should be more stable than the name, but more intuitive than an int.
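As a rough sketch of why a code helps: if the referencing table stores only the code, renaming a status touches a single row in the lookup table and none of the millions of referencing rows. The length limit of 4, the code 'NORM' and the user_id column below are assumptions made up for illustration; they are not from the question.

```sql
-- Lookup table keyed by a short, stable code (length limit 4 is an assumption).
CREATE TABLE user_status
( status_code text NOT NULL PRIMARY KEY
, status_name text NOT NULL UNIQUE
, CHECK (length(status_code) <= 4)
);

-- Referencing table stores the code, never the display name.
CREATE TABLE users
( user_id     bigint NOT NULL PRIMARY KEY   -- hypothetical column
, status_code text   NOT NULL REFERENCES user_status (status_code)
);

INSERT INTO user_status (status_code, status_name) VALUES ('NORM', 'Normal');
INSERT INTO users (user_id, status_code) VALUES (1, 'NORM');

-- Renaming the status updates one row in user_status; users is untouched.
UPDATE user_status SET status_name = 'Valid' WHERE status_code = 'NORM';
```

Queries that need the readable name would join users to user_status on status_code.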
But for your question, I would probably insert the new value:
INSERT INTO USER_STATUS (NAME) VALUES ('Valid');
then update all users:
UPDATE USERS SET STATUS = 'Valid' WHERE STATUS = 'Normal';
and finally, remove the old status:
DELETE FROM USER_STATUS WHERE NAME = 'Normal';
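The three statements above could be wrapped in one transaction so that other sessions never see a half-renamed status. The sketch below uses the table and column names from the question (users, user_statuses, name); the id column and the sample rows are assumptions added only to make it self-contained.

```sql
-- Schema as described in the question (id column is hypothetical).
CREATE TABLE user_statuses
( name varchar NOT NULL UNIQUE
);

CREATE TABLE users
( id     bigint  NOT NULL PRIMARY KEY
, status varchar REFERENCES user_statuses (name)
);

INSERT INTO user_statuses (name) VALUES ('Normal');
INSERT INTO users (id, status) VALUES (1, 'Normal');

BEGIN;

-- 1. Add the new name first so the foreign key check passes in step 2.
INSERT INTO user_statuses (name) VALUES ('Valid');

-- 2. Repoint every referencing row to the new name.
UPDATE users SET status = 'Valid' WHERE status = 'Normal';

-- 3. The old name is no longer referenced and can be removed.
DELETE FROM user_statuses WHERE name = 'Normal';

COMMIT;
```

The order matters: inserting first and deleting last means the foreign key is satisfied at every step. On tables with tens of millions of rows, step 2 still rewrites every matching row, so the transaction may run for a long time.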
For the more general case where a hierarchy of tables is involved, you traverse the tables depth-first (DFS) and insert the new rows on the way down. When you reach a leaf, you update it, and then delete the old rows as you climb back up the tree. You can use INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS to obtain the edges of the tree, and then use, for example, a topological sort to determine the order in which to process the tables.
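One possible way to list those edges is to join INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS to INFORMATION_SCHEMA.TABLE_CONSTRAINTS twice, once for the referencing (child) table and once for the referenced (parent) table. This is only a sketch: it assumes constraint names are unique within a schema, which PostgreSQL does not enforce, so duplicate names could produce spurious rows.

```sql
-- Each result row is one edge: child_table references parent_table.
SELECT child.table_schema  AS child_schema
     , child.table_name    AS child_table
     , parent.table_schema AS parent_schema
     , parent.table_name   AS parent_table
     , rc.constraint_name
FROM information_schema.referential_constraints AS rc
JOIN information_schema.table_constraints AS child
  ON  child.constraint_schema = rc.constraint_schema
  AND child.constraint_name   = rc.constraint_name
  AND child.constraint_type   = 'FOREIGN KEY'
JOIN information_schema.table_constraints AS parent
  ON  parent.constraint_schema = rc.unique_constraint_schema
  AND parent.constraint_name   = rc.unique_constraint_name
ORDER BY parent_table, child_table;
```

The resulting edge list can then be fed to whatever topological sort you prefer to decide the insert, update and delete order.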
Another piece of advice is to use the same name for the attribute throughout the model, i.e.:
CREATE TABLE USER_STATUS
( STATUS_CODE ... NOT NULL PRIMARY KEY
, STATUS_NAME TEXT NOT NULL UNIQUE
);
CREATE TABLE USERS
( ...
, STATUS_CODE ... REFERENCES USER_STATUS (STATUS_CODE)
...
);
Other tips
You can create an ENUM type and use that as the column type (instead of varchar) in both tables. Then you can use ALTER TYPE status_type RENAME VALUE... to change the spelling of one of the type's values in a performant way.
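A minimal sketch of the ENUM approach, assuming PostgreSQL 10 or later (where RENAME VALUE is available). The type name status_type comes from the tip above; the labels and the id column are assumptions for illustration.

```sql
-- The enum type holds the allowed status labels.
CREATE TYPE status_type AS ENUM ('Normal', 'Blocked');

CREATE TABLE users
( id     bigint      NOT NULL PRIMARY KEY   -- hypothetical column
, status status_type NOT NULL
);

INSERT INTO users (id, status) VALUES (1, 'Normal');

-- Renames the label in the type definition; existing rows in users
-- are not rewritten and now read back as 'Valid'.
ALTER TYPE status_type RENAME VALUE 'Normal' TO 'Valid';

SELECT id, status FROM users;
```

The rename is cheap because the rows store a reference to the enum value rather than the label text, so only the type's catalog entry changes.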