Tables accessed during last period
16-10-2019
Problem
I want to check which tables have been updated during a given period, for instance in descending order of access time per table.
How can I get that for PostgreSQL?
Solution
You can get some information about the last change to a table with xmin, e.g.:
select max(xmin::text::bigint) from t;
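But be aware of the caveats: transaction IDs are 32-bit counters that wrap around (they are compared modulo 2^32), so the largest xmin is not necessarily the most recent one; rows frozen by VACUUM count as older than any normal transaction; and xmin only reflects inserts and updates of rows that still exist, not deletes.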
testbed:
set role dba;
create role stack;
grant stack to dba;
create schema authorization stack;
set role stack;
--
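create or replace function f(p_schema in text, p_table in text)
returns bigint language plpgsql stable as $$ -- reads table data, so not immutable
declare
n bigint; -- xids go up to 2^32-1, which does not fit in integer
begin
-- %I quotes the identifiers, so mixed-case or unusual names are handled safely
execute format('select max(xmin::text::bigint) from %I.%I', p_schema, p_table) into n;
return n;
end;$$;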
--
create table foo as select generate_series(1, 100) as id;
create table bar as select generate_series(1, 100) as id;
create table baz as select generate_series(1, 100) as id;
--
method:
select table_name, f(table_schema, table_name)
from information_schema.tables
where table_schema='stack'
order by 2 desc;
/*
 table_name |   f
------------+--------
 baz        | 784657
 bar        | 784656
 foo        | 784655
*/
--
update foo set id=id+1 where id=100;
--
select table_name, f(table_schema, table_name)
from information_schema.tables
where table_schema='stack'
order by 2 desc;
/*
 table_name |   f
------------+--------
 foo        | 784658
 baz        | 784657
 bar        | 784656
*/
cleanup:
drop schema stack cascade;
set role dba;
drop role stack;
Other tips
A vanilla PostgreSQL installation does not log access to tables.
If you need that, you have to implement it yourself. I would use triggers for that. I use a setup like this for many of my tables. I add a column named log_up to tables I want to track updates for:
log_up timestamptz DEFAULT current_timestamp;
Use timestamptz (timestamp with time zone), which works across time zones.
Trigger function:
CREATE OR REPLACE FUNCTION trg_log_up()
RETURNS trigger AS
$func$
BEGIN
NEW.log_up := current_timestamp;
RETURN NEW;
END;
$func$
LANGUAGE plpgsql VOLATILE;
Trigger:
CREATE TRIGGER log_up
BEFORE UPDATE ON tbl
FOR EACH ROW EXECUTE PROCEDURE trg_log_up();
There are a couple of related logging parameters you may also be interested in, such as log_connections or log_statement.
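As a rough illustration of those two parameters, the following sketch turns them on with ALTER SYSTEM (available since Postgres 9.4, superuser only). The chosen values are assumptions for the example, not a recommendation; note that this writes statements to the server log and does not give you a per-table timestamp.

```sql
-- Log all DDL plus data-modifying statements (INSERT, UPDATE, DELETE, TRUNCATE, ...).
ALTER SYSTEM SET log_statement = 'mod';

-- Log each successful connection (affects sessions started after the reload).
ALTER SYSTEM SET log_connections = on;

-- Make the server re-read its configuration; no restart is needed for these two.
SELECT pg_reload_conf();
```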
Update: Also consider "commit timestamps" (the track_commit_timestamp setting), added in Postgres 9.5.
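A minimal sketch of the commit-timestamp approach, assuming track_commit_timestamp has been switched on (this needs a server restart and only covers transactions committed afterwards). The table name tbl is borrowed from the trigger example above; the query scans the whole table, so treat it as an illustration rather than a tuned solution.

```sql
-- Assumes track_commit_timestamp = on; the function raises an error otherwise.
SHOW track_commit_timestamp;

-- Commit time of the most recently inserted or updated row that still exists.
-- Deletes are not reflected, and rows whose commit timestamp was never
-- recorded (or is no longer kept) yield NULL and are ignored by max().
SELECT max(pg_xact_commit_timestamp(xmin)) AS last_change
FROM   tbl;
```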
Add trigger to all tables
You can create a script for all currently existing tables by querying the database catalog. For instance, to generate the DDL statements for all tables in the schema public:
SELECT string_agg(format('CREATE TRIGGER log_up BEFORE UPDATE ON %s '
'FOR EACH ROW EXECUTE PROCEDURE trg_log_up();'
, c.oid::regclass), E'\n')
FROM pg_namespace n
JOIN pg_class c ON c.relnamespace = n.oid
WHERE n.nspname = 'public'
AND c.relkind = 'r'; -- ordinary tables only (pg_class also lists indexes, views, sequences)
-- AND c.relname ~~* '%tbl%' -- to filter tables by name
Returns:
CREATE TRIGGER log_up BEFORE UPDATE ON tbl1 FOR EACH ROW EXECUTE PROCEDURE trg_log_up();
CREATE TRIGGER log_up BEFORE UPDATE ON tbl2 FOR EACH ROW EXECUTE PROCEDURE trg_log_up();
CREATE TRIGGER log_up BEFORE UPDATE ON tbl3 FOR EACH ROW EXECUTE PROCEDURE trg_log_up();
...
Of course, they all need to have a column log_up of type timestamptz first. You can create a DDL script to add the column to all tables in a similar fashion.
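One way such a column-adding script might look, following the same catalog query. This is a sketch under assumptions: it targets ordinary tables in the schema public, and ADD COLUMN IF NOT EXISTS requires Postgres 9.6 or later. On versions before 11, adding a column with a default rewrites each table, which can be slow for large ones.

```sql
-- Generates one ALTER TABLE statement per ordinary table in schema public.
-- IF NOT EXISTS skips tables that already have a log_up column.
SELECT string_agg(format('ALTER TABLE %s ADD COLUMN IF NOT EXISTS '
                         'log_up timestamptz DEFAULT current_timestamp;'
                       , c.oid::regclass), E'\n')
FROM   pg_namespace n
JOIN   pg_class     c ON c.relnamespace = n.oid
WHERE  n.nspname = 'public'
AND    c.relkind = 'r';  -- ordinary tables only
```

Review the generated statements before running them, just like the trigger script.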
Log only last UPDATE per table
If you are only interested in the last UPDATE per table, a simpler solution will do. Here is a demo of how to keep track in one centralized table:
CREATE TABLE lastup (
schema_name text
, tbl_name text
, ts timestamptz
, PRIMARY KEY (schema_name, tbl_name)
);
Trigger function. Consult the manual about the special variables I use (TG_TABLE_SCHEMA and TG_TABLE_NAME):
CREATE OR REPLACE FUNCTION trg_lastup()
RETURNS trigger AS
$func$
BEGIN
UPDATE lastup
SET ts = current_timestamp
WHERE schema_name = TG_TABLE_SCHEMA
AND tbl_name = TG_TABLE_NAME;
RETURN NULL; -- the return value is ignored for AFTER triggers
END
$func$ LANGUAGE plpgsql;
Dummy table for testing:
CREATE TABLE dummy (id int);
INSERT INTO dummy VALUES (1), (2), (3);
Enter row for table in log table:
INSERT INTO lastup(schema_name, tbl_name) VALUES ('public', 'dummy');
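If seeding a row per table by hand is inconvenient, the trigger function could create the row on first use instead. A minimal sketch, assuming Postgres 9.5 or later for INSERT ... ON CONFLICT; the function name trg_lastup_upsert is hypothetical and not part of the setup described here.

```sql
CREATE OR REPLACE FUNCTION trg_lastup_upsert()  -- hypothetical name
  RETURNS trigger AS
$func$
BEGIN
   -- Insert the row for the firing table, or refresh ts if it already exists.
   -- The conflict target matches the primary key of lastup.
   INSERT INTO lastup (schema_name, tbl_name, ts)
   VALUES (TG_TABLE_SCHEMA, TG_TABLE_NAME, current_timestamp)
   ON CONFLICT (schema_name, tbl_name)
   DO UPDATE SET ts = EXCLUDED.ts;
   RETURN NULL;  -- the return value is ignored for AFTER triggers
END
$func$ LANGUAGE plpgsql;
```

A trigger would then reference trg_lastup_upsert() in place of trg_lastup(), and the manual INSERT into lastup becomes optional.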
Trigger. Note that I use an AFTER trigger FOR EACH STATEMENT (cheaper than a row-level trigger, since it fires once per statement). See the manual on CREATE TRIGGER for more.
CREATE TRIGGER log_up
AFTER UPDATE ON dummy
FOR EACH STATEMENT EXECUTE PROCEDURE trg_lastup();
Test:
UPDATE dummy
SET id = id + 5
WHERE id < 3;
Voilà:
SELECT * FROM lastup;
Or, if you want to exclude empty updates (where nothing changed), use a row-level trigger instead. This comes at a higher cost, because every updated row triggers its own log update:
CREATE OR REPLACE FUNCTION trg_lastup()
RETURNS trigger AS
$func$
BEGIN
IF OLD IS DISTINCT FROM NEW THEN -- check for changes
UPDATE lastup
SET ts = current_timestamp
WHERE schema_name = TG_TABLE_SCHEMA
AND tbl_name = TG_TABLE_NAME;
END IF;
RETURN NULL; -- For AFTER trigger!
END;
$func$ LANGUAGE plpgsql;
DROP TRIGGER IF EXISTS log_up ON dummy; -- remove the statement-level trigger of the same name
CREATE TRIGGER log_up
AFTER UPDATE ON dummy
FOR EACH ROW EXECUTE PROCEDURE trg_lastup(); -- per ROW instead of STATEMENT
To create triggers for all tables you want to include in this regime, use a DDL creation script similar to the one above.
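Such a script might look like the sketch below. It assumes the statement-level version of trg_lastup() is installed, targets ordinary tables in the schema public, and uses ON CONFLICT (Postgres 9.5+) to skip rows that already exist. The log table lastup itself is excluded, because a trigger on it would fire on its own updates.

```sql
-- Seed one row per tracked table, so that the UPDATE in trg_lastup() finds a row.
INSERT INTO lastup (schema_name, tbl_name)
SELECT n.nspname, c.relname
FROM   pg_namespace n
JOIN   pg_class     c ON c.relnamespace = n.oid
WHERE  n.nspname = 'public'
AND    c.relkind = 'r'            -- ordinary tables only
AND    c.relname <> 'lastup'      -- do not track the log table itself
ON CONFLICT DO NOTHING;

-- Generate the CREATE TRIGGER statements for the same set of tables.
SELECT string_agg(format('CREATE TRIGGER log_up AFTER UPDATE ON %s '
                         'FOR EACH STATEMENT EXECUTE PROCEDURE trg_lastup();'
                       , c.oid::regclass), E'\n')
FROM   pg_namespace n
JOIN   pg_class     c ON c.relnamespace = n.oid
WHERE  n.nspname = 'public'
AND    c.relkind = 'r'
AND    c.relname <> 'lastup';
```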