Question

How can I implement an undo function for a MySQL database, like Gmail's undo when you delete/move/tag an email?

So far I have a system log table that holds the exact sql statements executed by the user.

For example, I'm trying to transform:

INSERT INTO table (id, column1, column2) VALUES (1,'value1', 'value2')

into:

DELETE FROM table WHERE id=1 AND column1='value1' AND column2='value2'

Is there a built-in function to do this, like the Cisco router commands, something like:

(NO|UNDO|REVERT) INSERT INTO table (id, column1, column2) VALUES (1,'value1', 'value2')

Maybe my approach is incorrect: should I save the current state of the row and the changed row, so I can restore the original state?

Something like:

original_query = INSERT INTO table (id, column1, column2) VALUES (1,'value1', 'value2')

executed_query = INSERT INTO table (id, column1, column2) VALUES (1,'change1', 'change2')

to later transform into:

INSERT INTO table (id, column1, column2) VALUES (1,'value1', 'value2') ON DUPLICATE KEY UPDATE
 column1=VALUES(column1), column2=VALUES(column2)

But maybe that won't work with newly inserted rows, and it could cause trouble if I modify the primary key, so I would rather leave primary keys unchanged.

This is my log table:

CREATE TABLE `log` (
 `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
 `date` datetime NOT NULL,
 `user` int(11) NOT NULL,
 `client` text,
 `module` int(11) unsigned NOT NULL,
 `query` text NOT NULL,
 `result` tinyint(1) NOT NULL,
 `comment` text,
 PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8

The objective is, as I said, to undo changes from a certain period of time based on the date of statement execution, for example (can be in PHP):

function_undo(startdate, enddate)
{
    RESULT = SELECT query FROM log WHERE date BETWEEN startdate AND enddate
    FOR EACH RESULT AS KEY - query
       REVERT query
}

or an undo button to revert a single action (a single logged query).

Is my concept of this 'incremental backup of changes' correct, or am I overcomplicating everything? Considering the obvious fact that the size of my database will double or maybe triple if I store the full queries: should I store the log in a different database, or simply empty the log table once I make a scheduled full backup, so it only keeps recent changes?

Any advice is welcome...


Solution

This has always been problematic; the SQL:2011 standard addresses it with temporal tables. The temporal model is simple: add interval columns (valid_from, valid_to), though implementing constraints on them is quite complicated. Data manipulation is also simple:

1. insert - create a new version with valid_from = now, valid_to = NULL
2. update - create a new version with valid_from = now, valid_to = NULL, and set the previous version's valid_to = now
3. delete - set the current version's valid_to = now
4. undo delete - set the last version's valid_to = NULL
5. undo update/insert - delete the current version if you do not need redo, and set the previous version's valid_to = NULL if a previous version exists

Redo is more complicated but similar. This model is typically used in data warehouses to track changes rather than to implement undo/redo, but it should work for that too. It is also known as a slowly changing dimension in data warehousing.
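The five operations above can be sketched end to end. Here is a minimal, runnable illustration in Python/SQLite (the `emails` table and function names are my own invention, not from the question; operation 5, undo update/insert, is omitted for brevity):

```python
import sqlite3
from datetime import datetime

# Toy temporal table: every logical row is a set of versions with an
# interval (valid_from, valid_to); valid_to IS NULL marks the current one.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE emails (
    id INTEGER, subject TEXT,
    valid_from TEXT NOT NULL, valid_to TEXT)""")

def now():
    return datetime.utcnow().isoformat()

def insert(id_, subject):
    # 1. insert: a new version with valid_to = NULL ("current")
    conn.execute("INSERT INTO emails VALUES (?, ?, ?, NULL)",
                 (id_, subject, now()))

def update(id_, subject):
    # 2. update: close the current version, then open a new one
    ts = now()
    conn.execute("UPDATE emails SET valid_to = ? "
                 "WHERE id = ? AND valid_to IS NULL", (ts, id_))
    conn.execute("INSERT INTO emails VALUES (?, ?, ?, NULL)",
                 (id_, subject, ts))

def delete(id_):
    # 3. delete: just close the current version; nothing is removed
    conn.execute("UPDATE emails SET valid_to = ? "
                 "WHERE id = ? AND valid_to IS NULL", (now(), id_))

def undo_delete(id_):
    # 4. undo delete: reopen the latest version (rowid stands in for
    # "last version" here to avoid timestamp ties in this toy example)
    conn.execute("UPDATE emails SET valid_to = NULL WHERE rowid = "
                 "(SELECT MAX(rowid) FROM emails WHERE id = ?)", (id_,))

def current(id_):
    row = conn.execute("SELECT subject FROM emails "
                       "WHERE id = ? AND valid_to IS NULL", (id_,)).fetchone()
    return row[0] if row else None
```

Note that delete never destroys data, which is exactly what makes undo trivial; the price is that the table keeps every version until you prune it.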

OTHER TIPS

I think you need to record the reverse of each INSERT/UPDATE/DELETE query and then execute those to perform the undo. Here is a solution for you, but it does not take foreign key relationships (cascade operations) into account. It is just a simple solution concept; hopefully it will give you more ideas. Here it goes:

Assume you have a table like this whose changes you want to undo:

create table if not exists table1 
(id int auto_increment primary key, mydata varchar(15));

Here is the table that records the reverse queries:

create table if not exists undoer
(id int auto_increment primary key,
 undoquery text, created datetime);

Create triggers for the INSERT, UPDATE and DELETE operations that save the reverse/rescue query:

create trigger  after_insert after insert on table1 for each row
    insert into undoer(undoquery,created) values 
(concat('delete from table1 where id = ', cast(new.id as char)), now());

create trigger after_update after update on table1 for each row
    insert into undoer(undoquery,created) values 
(concat('update table1 set mydata = \'',old.mydata,
        '\' where id = ', cast(new.id as char)), now());

create trigger after_delete after delete on table1 for each row
    insert into undoer(undoquery,created) values 
  (concat('insert into table1(id,mydata) 
   values(',cast(old.id as char), ', \'',old.mydata,'\') '), now());

To undo, execute the reverse queries from the undoer table between your two dates, sorted by date in descending order.
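The whole scheme can be exercised end to end. Here is a runnable translation to Python/SQLite (MySQL's CONCAT becomes ||; escaping of quotes inside mydata is deliberately ignored to keep it short, so a real version must escape values):

```python
import sqlite3

# table1 + undoer + the three reverse-query triggers, in SQLite syntax.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER PRIMARY KEY AUTOINCREMENT, mydata TEXT);
CREATE TABLE undoer (id INTEGER PRIMARY KEY AUTOINCREMENT,
                     undoquery TEXT, created TEXT);

CREATE TRIGGER after_insert AFTER INSERT ON table1 BEGIN
  INSERT INTO undoer(undoquery, created)
  VALUES ('DELETE FROM table1 WHERE id = ' || new.id, datetime('now'));
END;

CREATE TRIGGER after_update AFTER UPDATE ON table1 BEGIN
  INSERT INTO undoer(undoquery, created)
  VALUES ('UPDATE table1 SET mydata = ''' || old.mydata ||
          ''' WHERE id = ' || new.id, datetime('now'));
END;

CREATE TRIGGER after_delete AFTER DELETE ON table1 BEGIN
  INSERT INTO undoer(undoquery, created)
  VALUES ('INSERT INTO table1(id, mydata) VALUES (' || old.id ||
          ', ''' || old.mydata || ''')', datetime('now'));
END;
""")

def undo_between(start, end):
    # Replay reverse queries newest-first. Note that executing them fires
    # the triggers again, so the undo itself gets logged (which is one way
    # to get redo for free).
    rows = conn.execute("""SELECT undoquery FROM undoer
                           WHERE created BETWEEN ? AND ?
                           ORDER BY created DESC, id DESC""",
                        (start, end)).fetchall()
    for (query,) in rows:
        conn.execute(query)
```

The secondary `id DESC` sort matters: `created` has one-second resolution, so several operations can share a timestamp, and they must still be undone in strict reverse order.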

The best solution is a soft delete in the database table: usually a column named "is_deleted", plus a "datetime_deleted" column, populated automatically when the user deletes.

When the delete completes, the response includes the ID of the record, which populates a link calling an undo method; the user can click it and the record is simply undeleted by updating the row again.

You can then run a job, either triggered by the user or on a scheduled task, to permanently remove all data marked "is_deleted = 1" after a period of time.
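The soft-delete pattern is small enough to sketch in full. A minimal illustration in Python/SQLite (table and function names are mine, not from the answer):

```python
import sqlite3
from datetime import datetime

# Rows are never deleted directly; they are flagged, and a cleanup job
# removes old flagged rows later.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE emails (
    id INTEGER PRIMARY KEY, subject TEXT,
    is_deleted INTEGER NOT NULL DEFAULT 0,
    datetime_deleted TEXT)""")

def soft_delete(email_id):
    conn.execute("UPDATE emails SET is_deleted = 1, datetime_deleted = ? "
                 "WHERE id = ?", (datetime.utcnow().isoformat(), email_id))
    return email_id  # the response carries the id back for the undo link

def undo_delete(email_id):
    # the undo method: just clear the flag
    conn.execute("UPDATE emails SET is_deleted = 0, datetime_deleted = NULL "
                 "WHERE id = ?", (email_id,))

def purge(older_than):
    # the scheduled cleanup: really delete rows flagged before the cutoff
    conn.execute("DELETE FROM emails WHERE is_deleted = 1 "
                 "AND datetime_deleted < ?", (older_than,))

def visible():
    return [r[0] for r in
            conn.execute("SELECT subject FROM emails WHERE is_deleted = 0")]
```

The catch is that every normal query in the application must now filter on `is_deleted = 0`, which is why this is usually hidden behind a view or an ORM scope.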

I think a combination of techniques is needed here...

You could implement a queue system which executes a job (sending emails, etc.) after a certain time.

E.g. if the user deletes an object, send the job to the queue with a delay of 30 seconds or so, just in case the user clicks undo. If the user does click undo, you simply remove the job from the queue.

Combined with soft deleting, this may be a good option to look into.

I've used Laravel's Queue class, which is really good.
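The mechanics of a grace-period queue fit in a few lines. Here is an in-process sketch in Python (a real app would use a proper job queue such as Laravel's; the class and method names here are invented for illustration):

```python
import time

class DelayQueue:
    """Schedule a delete to run after a grace period; undo = drop the job."""

    def __init__(self, grace_seconds=30):
        self.grace = grace_seconds
        self.jobs = {}  # job_id -> time the delete becomes due

    def schedule_delete(self, job_id, now=None):
        now = time.time() if now is None else now
        self.jobs[job_id] = now + self.grace

    def undo(self, job_id):
        # clicking "undo" simply removes the pending job
        return self.jobs.pop(job_id, None) is not None

    def due(self, now=None):
        # jobs whose grace period has expired; these get really deleted
        now = time.time() if now is None else now
        ready = [j for j, t in self.jobs.items() if t <= now]
        for j in ready:
            del self.jobs[j]
        return ready
```

The `now` parameter exists only so the timing can be tested deterministically; a worker would poll `due()` on a schedule and perform the actual deletes.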



I'm not really sure there will ever be a single correct answer for this, as there's no one correct way of doing it. Good luck though :)

I would suggest you use something like the following table to log the changes to your database.

TABLE audit_entry_log
-- This is an audit entry log table where you can track changes and log them here.
( audit_entry_log_id    INTEGER         PRIMARY KEY
, audit_entry_type      VARCHAR2(10)    NOT NULL
     -- Stores the entry type or DML event - INSERT, UPDATE or DELETE.
, table_name            VARCHAR2(30)
    -- Stores the name of the table which got changed
, column_name           VARCHAR2(30)
    -- Stores the name of the column which was changed
, primary_key           INTEGER
    -- Stores the PK column value of the row which was changed.
    -- This is to uniquely identify the row which has been changed.
, ts                    TIMESTAMP
    -- Timestamp when the change was made.
, old_number            NUMBER(36, 2)
    -- If the changed field was a number, the old value should be stored here.
    -- If it's an INSERT event, this would be null.
, new_number            NUMBER(36,2)
    -- If the changed field was a number, the new value in it should be stored here.
    -- If it's a DELETE statement, this would be null.
, old_text              VARCHAR2(2000)
    -- Similar to old_number but for a text/varchar field.
, new_text              VARCHAR2(2000)
    -- Similar to new_number but for a text/varchar field.
, old_date              VARCHAR2(2000)
    -- Similar to old_number but for a date field.
, new_date              VARCHAR2(2000)
    -- Similar to new_number but for a date field.
, ...
, ... -- Any other data types you wish to include.
, ...
);

Now, suppose you have a table like this:

TABLE user
( user_id       INTEGER         PRIMARY KEY
, user_name     VARCHAR2(50)
, birth_date    DATE
, address       VARCHAR2(50)
)

On this table I have a trigger that populates audit_entry_log, tracking the changes. I am giving this code example for Oracle; you can tweak it a little to suit MySQL:

CREATE OR REPLACE TRIGGER user_id_trg
BEFORE INSERT OR UPDATE OR DELETE ON user
REFERENCING new AS new old AS old
FOR EACH ROW
BEGIN
    IF INSERTING THEN
        IF :new.user_name IS NOT NULL THEN
            INSERT INTO audit_entry_log (audit_entry_type,
                                         table_name,
                                         column_name,
                                         primary_key,
                                         ts,
                                         new_text)
            VALUES ('INSERT',
                    'USER',
                    'USER_NAME',
                    :new.user_id,
                    current_timestamp(),
                    :new.user_name);
        END IF;
        --
        -- Similar code would go for birth_date and address columns.
        --

    ELSIF UPDATING THEN
        IF :new.user_name != :old.user_name THEN
            INSERT INTO audit_entry_log (audit_entry_type,
                                         table_name,
                                         column_name,
                                         primary_key,
                                         ts,
                                         old_text,
                                         new_text)
            VALUES ('UPDATE',
                    'USER',
                    'USER_NAME',
                    :new.user_id,
                    current_timestamp(),
                    :old.user_name,
                    :new.user_name);
        END IF;
        --
        -- Similar code would go for birth_date and address columns
        --

    ELSIF DELETING THEN
        IF :old.user_name IS NOT NULL THEN
            INSERT INTO audit_entry_log (audit_entry_type,
                                         table_name,
                                         column_name,
                                         primary_key,
                                         ts,
                                         old_text)
            VALUES ('DELETE',
                    'USER',
                    'USER_NAME',
                    :old.user_id,
                    current_timestamp(),
                    :old.user_name);
        END IF;
        --
        -- Similar code would go for birth_date and address columns
        --
    END IF;
END;
/

Now, consider, as a simple example, you run this query on timestamp 31-JAN-2014 14:15:30:

INSERT INTO user (user_id, user_name, birth_date, address)
VALUES (100, 'Foo', '04-JUL-1995', 'Somewhere in New York');

Next you run an UPDATE query on timestamp 31-JAN-2014 15:00:00:

UPDATE user
   SET user_name = 'Bar',
       address = 'Somewhere in Los Angeles'
 WHERE user_id = 100;

Thus your user table would have data:

user_id user_name birth_date  address
------- --------- ----------- --------------------------
100     Bar       04-JUL-1995 Somewhere in Los Angeles

This results in following data in the audit_entry_log table:

audit_entry_type table_name column_name primary_key ts                   old_text              new_text                 old_date new_date
---------------- ---------- ----------- ----------- -------------------- --------------------- ------------------------ -------- -----------
INSERT           USER       USER_NAME   100         31-JAN-2014 14:15:30                       FOO
INSERT           USER       BIRTH_DATE  100         31-JAN-2014 14:15:30                                                         04-JUL-1995
INSERT           USER       ADDRESS     100         31-JAN-2014 14:15:30                       SOMEWHERE IN NEW YORK 
UPDATE           USER       USER_NAME   100         31-JAN-2014 15:00:00 FOO                   BAR
UPDATE           USER       ADDRESS     100         31-JAN-2014 15:00:00 SOMEWHERE IN NEW YORK SOMEWHERE IN LOS ANGELES

Create a procedure like the following, which accepts a table name and the timestamp to which that table has to be restored. The table is restored only up to a timestamp; there is no "from" timestamp. It always goes from the current state back to a point in the past.

CREATE OR REPLACE PROCEDURE restore_db (p_table_name varchar, p_to_timestamp timestamp)
AS
CURSOR cur_log IS
SELECT *
  FROM audit_entry_log
 WHERE table_name = p_table_name
   AND ts > p_to_timestamp
 ORDER BY ts DESC;  -- undo the newest changes first
BEGIN
    FOR i IN cur_log LOOP
        IF i.audit_entry_type = 'INSERT' THEN
            -- Delete the row that was inserted.
            EXECUTE IMMEDIATE 'DELETE FROM '||p_table_name
                ||' WHERE '||p_table_name||'_id = '||i.primary_key;
        ELSIF i.audit_entry_type = 'UPDATE' THEN
            -- Put the old data back into the table.
            IF i.old_number IS NOT NULL THEN
                EXECUTE IMMEDIATE 'UPDATE '||p_table_name||' SET '||i.column_name||' = '||i.old_number
                    ||' WHERE '||p_table_name||'_id = '||i.primary_key;
            ELSIF i.old_text IS NOT NULL THEN
                NULL; -- Similar EXECUTE IMMEDIATE for i.old_text (quote the value).
            ELSE
                NULL; -- Similar EXECUTE IMMEDIATE for i.old_date.
            END IF;
        ELSIF i.audit_entry_type = 'DELETE' THEN
            NULL; -- Build an INSERT statement for the row that was deleted.
        END IF;
    END LOOP;
END;
/

Now, if you want to restore the user table to its state at 31-JAN-2014 14:30:00, after the INSERT was fired but before the UPDATE, a procedure call like this would do the job:

restore_db ('USER', '31-JAN-2014 14:30:00');

I am reiterating this: treat all of the above code as pseudo-code and make the necessary changes before you try to run it. This is the most fail-proof design I have seen for manual query flashbacks.
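To make the restore idea concrete, here is a rough, runnable translation in Python/SQLite, handling only the text columns and following the audit_entry_log names above (the setup rows in any test are fabricated, and identifiers are assumed to come from the trusted audit table, not user input):

```python
import sqlite3

# Trimmed-down versions of the user and audit_entry_log tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (user_id INTEGER PRIMARY KEY, user_name TEXT);
CREATE TABLE audit_entry_log (
    audit_entry_type TEXT, table_name TEXT, column_name TEXT,
    primary_key INTEGER, ts TEXT, old_text TEXT, new_text TEXT);
""")

def restore_db(table_name, to_timestamp):
    # Replay audit rows newer than the target timestamp, newest-first,
    # putting old values back (the PK column is assumed to be <table>_id).
    rows = conn.execute("""SELECT * FROM audit_entry_log
                           WHERE table_name = ? AND ts > ?
                           ORDER BY ts DESC""",
                        (table_name, to_timestamp)).fetchall()
    for etype, _tbl, column, pk, _ts, old_text, _new_text in rows:
        if etype == "INSERT":
            # undo an insert: remove the row
            conn.execute(f"DELETE FROM {table_name} "
                         f"WHERE {table_name}_id = ?", (pk,))
        elif etype == "UPDATE":
            # undo an update: restore the old column value
            conn.execute(f"UPDATE {table_name} SET {column} = ? "
                         f"WHERE {table_name}_id = ?", (old_text, pk))
        elif etype == "DELETE":
            # undo a delete: re-insert the old value
            conn.execute(f"INSERT INTO {table_name} "
                         f"({table_name}_id, {column}) VALUES (?, ?)",
                         (pk, old_text))
```

The `ORDER BY ts DESC` is the crucial detail: an INSERT followed by an UPDATE must be undone as UPDATE-then-INSERT, or the delete of the inserted row would happen before its old value is restored.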

Have you considered passing the old values into a separate table as XML values? Then, if you need to restore them, you can retrieve the XML values from the table.

For this kind of system, a log table is the way to go. Yes, the table will most likely get big, but it all depends on how far back you want to be able to go. You could use a time limit, as you said, and delete all logs older than 6 months. You could also create some sort of recycle bin and not allow users more than, let's say, 100 "items" in it: always keep only the most recent 100 log entries for each user.

Regarding the issue of what queries to keep in your log table: there is no built-in function that does what you want. But since you only need to log updates and deletes (no need to log inserts, since users usually have the option to delete their stuff), you can easily build your own.

Before any UPDATE or DELETE statement, fetch the entire row from the database and create a REPLACE statement for it; it works both as an UPDATE and as an INSERT. The only thing to keep in mind is that every table needs a PRIMARY KEY or UNIQUE index.

Here is an idea of what the function could look like:

function translateStatement($table, $primaryKey, $id)
{
    $sql = "SELECT * FROM `$table` WHERE `$primaryKey` = '$id'"; //should always return one row
    $result = mysql_query($sql) or die(mysql_error());
    $row = mysql_fetch_assoc($result);

    $columns = implode(',', array_map( function($item){ return '`'.$item.'`'; }, array_keys($row)) ); //get column names
    $values = implode(',', array_map( function($item){ return '"'.mysql_real_escape_string($item).'"'; }, $row) ); //get escaped column values

    return 'REPLACE INTO `'.$table.'` ('.$columns.') VALUES ('.$values.')';
}
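A note of caution: the mysql_* functions used above were removed in PHP 7, so treat that as a sketch. The same idea can be expressed with bound parameters instead of manual string escaping; here it is in Python/SQLite (which also supports REPLACE INTO), with names of my own choosing:

```python
import sqlite3

def translate_statement(conn, table, primary_key, row_id):
    """Snapshot one row as a REPLACE statement plus its bound values.

    Run this before an UPDATE or DELETE; executing the returned pair
    later restores the row, whether it was modified or removed.
    Identifiers (table, primary_key) must come from trusted code, since
    placeholders cannot be used for them.
    """
    conn.row_factory = sqlite3.Row  # gives us column names with the row
    row = conn.execute(f"SELECT * FROM {table} WHERE {primary_key} = ?",
                       (row_id,)).fetchone()
    columns = ", ".join(row.keys())
    placeholders = ", ".join("?" for _ in row.keys())
    return (f"REPLACE INTO {table} ({columns}) VALUES ({placeholders})",
            tuple(row))
```

Usage: capture `(sql, params)` before the change, store both in the log table, and execute them together to undo.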
Licensed under: CC-BY-SA with attribution