Can I update old data and insert new data if it does not currently exist, in a single query?
01-03-2021
Question
I am trying to work out how to update a table when a freshly fetched list of the same shape no longer contains one or more of the table's original rows. The list of services comes from an API and is usually the same set of services. Every now and again, however, the list changes: services become inactive and no longer show up in the results of the API call, so they should be updated in our database to set Active to FALSE. Likewise, if a new service comes on board, it should be added to the current list of available services. I had in mind to just drop all records and add the new ones fetched from the API. However, I have used the Ids of the existing services in other tables and still need to reference them, so I threw that idea out of the window, and now I am in a bit of a bind.
Services (currently in DB)
Name | Id | Active
Test1 | 3 | true
Test2 | 4 | true
Test3 | 5 | true
I wanted a query or trigger of some sort that runs when inserting possibly duplicate data into the Services table, subject to the following rules:
- If an existing 'Name' is found, skip the insert and move on to the next item in the array.
- If a new 'Name' comes up that is not found in the DB, add it as a new row. Example: [Test4 6 true]
- If the newly fetched list from the API does not contain one of the existing 'Name's (that is, Test1, Test2 or Test3), update that existing row to set the Active column to false. So if the new list does not have Test3, the existing Test3 row would be updated to show Active as false.
Solution
I think you will want something like the following - see the fiddle here. It's based on Common Table Expressions (CTEs) and the fact that, with PostgreSQL, you can perform not only SELECTs but also INSERTs, UPDATEs and DELETEs in them (see here also).
First, your service table:
CREATE TABLE service
(
name VARCHAR (10) NOT NULL PRIMARY KEY,
id INTEGER NOT NULL,
active BOOLEAN NOT NULL
);
Populate it with your data:
INSERT INTO service VALUES
('Test1', 3, true), ('Test2', 4, true), ('Test3', 5, true);
Now, you receive your data from your API. I'll assume that you put it into some sort of temporary table. The keyword TEMPORARY just means that the table will be dropped at the end of your session. I've tested with both TEMPORARY and normal tables in the fiddle and the results are the same, so we'll go with TEMPORARY:
CREATE TEMPORARY TABLE api
(
name VARCHAR (10) NOT NULL PRIMARY KEY,
id INTEGER NOT NULL
-- active BOOLEAN NOT NULL
);
I've assumed that your API doesn't know the status of the service, so it only has two fields - the name and the id.
Populate it:
INSERT INTO api VALUES
('Test2', 4), ('Test3', 5), ('Test4', 6);
Notice that service Test1 is missing and that service Test4 is an additional service.
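To check the difference between the two tables before changing anything, a read-only comparison along the following lines may help. It is only a sketch against the service and api tables defined above; the status column and its labels are made up for illustration.

```sql
-- Compare the stored services with the freshly loaded API list.
-- "status" and its labels are illustrative names, not part of the schema.
SELECT COALESCE(s.name, a.name) AS name,
       CASE
           WHEN s.name IS NULL THEN 'new in api'
           WHEN a.name IS NULL THEN 'missing from api'
           ELSE 'in both'
       END AS status
FROM service s
FULL OUTER JOIN api a ON a.name = s.name
ORDER BY 1;
```

With the sample rows above, this should list Test1 as missing from api, Test2 and Test3 as in both, and Test4 as new in api.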
So now, because of PostgreSQL's ability to perform INSERTs and UPDATEs within CTEs, we can do the following:
WITH cte1 (nom) AS
(
INSERT INTO service (name, id, active)
SELECT a.name, a.id, true FROM api a
WHERE a.name NOT IN (SELECT name FROM service) RETURNING name
),
cte2 (nom2) AS
(
UPDATE service s SET active = false
WHERE s.name NOT IN (SELECT name FROM api) RETURNING s.name
)
SELECT * FROM service;
The first CTE
INSERT INTO service (name, id, active)
SELECT a.name, a.id, true FROM api a
WHERE a.name NOT IN (SELECT name FROM service) RETURNING name
inserts new services from the api table into the service table, and the second:
UPDATE service s SET active = false
WHERE s.name NOT IN (SELECT name FROM api) RETURNING s.name
sets active = false where a service in the service table isn't present in the api table.
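As a possible variation, the insert can lean on the primary key on name instead of a NOT IN filter, and the update can use NOT EXISTS. This is a sketch rather than a drop-in replacement: it assumes PostgreSQL 9.5 or later (for ON CONFLICT), and the CTE names and the two count columns are invented for illustration.

```sql
WITH inserted AS
(
  -- Rows whose name already exists hit the primary key and are skipped.
  INSERT INTO service (name, id, active)
  SELECT a.name, a.id, true FROM api a
  ON CONFLICT (name) DO NOTHING
  RETURNING name
),
deactivated AS
(
  -- Services that no longer appear in the API list are flagged inactive.
  UPDATE service s SET active = false
  WHERE NOT EXISTS (SELECT 1 FROM api a WHERE a.name = s.name)
  RETURNING s.name
)
-- Report how many rows each step touched (hypothetical column names).
SELECT (SELECT count(*) FROM inserted)    AS inserted_rows,
       (SELECT count(*) FROM deactivated) AS deactivated_rows;
```

One reason to consider this form is that ON CONFLICT DO NOTHING skips a row quietly if another session has inserted the same name in the meantime, where the NOT IN version would raise a duplicate-key error.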
Now, the result of the SELECT * FROM service at the end of this query is:
name id active
Test1 3 t
Test2 4 t
Test3 5 t
So, you might think "Drat, it hasn't worked!" - but in fact, it has worked!
Next, as a separate statement, you rerun
SELECT * FROM service
ORDER BY name;
and you get:
name id active
Test1 3 f
Test2 4 t
Test3 5 t
Test4 6 t
So, we can see that service Test1's active field has been set to false and that service Test4 has been added. The reason this doesn't show up in the SELECT immediately after the CTEs has to do with snapshot visibility: every part of a WITH statement, including the final SELECT, runs against the same snapshot, so the first SELECT shows the service table as it was when the statement started. The second SELECT is a separate statement that runs afterwards, so it shows the table with the changes applied.
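If the changes need to be visible in the same statement, the RETURNING output of each CTE can be selected instead of the table itself, since RETURNING is how a data-modifying CTE passes its results to the rest of the query. A minimal sketch, assuming the service and api tables as first populated above (before any synchronisation has run); the CTE names and the action column are illustrative.

```sql
WITH ins AS
(
  -- Add services that are in the API list but not yet stored.
  INSERT INTO service (name, id, active)
  SELECT a.name, a.id, true FROM api a
  WHERE a.name NOT IN (SELECT name FROM service)
  RETURNING name, id, active
),
upd AS
(
  -- Flag stored services that are absent from the API list.
  UPDATE service s SET active = false
  WHERE s.name NOT IN (SELECT name FROM api)
  RETURNING s.name, s.id, s.active
)
-- Read the affected rows from the CTEs, not from the service table.
SELECT 'inserted' AS action, name, id, active FROM ins
UNION ALL
SELECT 'deactivated', name, id, active FROM upd
ORDER BY name;
```

On that starting data this should return Test1 as deactivated (active = f) and Test4 as inserted (active = t), without needing a second query.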