Question

I have 50 GB of data in a table and need to remove the records older than a particular date, after backing them up.

Currently I follow these steps:

  1. Take a backup of the complete table.
  2. Run a delete query with a WHERE clause to remove the data that is no longer required:

    DELETE FROM <some-table-name> WHERE `creation_time` <= '<some-valid-time>'
    

The problems with the current approach are:

  1. It is painfully slow.
  2. Storage is redundant: the backup covers the whole table, although only selected records are removed and only an incremental backup of those records is needed.
  3. After deletion, the disk space is not returned to the OS (until the table is optimized).

I thought of splitting the table into smaller weekly or monthly tables, which would make backup and deletion easy, but querying them together would be very difficult and slow.

Please advise a smart and efficient way to do this.

Solution

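Use `creation_time` as the partitioning key and create one partition per week or per month. Dropping an old partition is far faster than a row-by-row DELETE, because the whole partition is discarded at once, and the table can still be queried as a single table.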
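
As a rough sketch of what this could look like, assuming the database is MySQL with InnoDB (the question does not say so explicitly; the backtick quoting only suggests it). The table name `events`, its columns other than `creation_time`, and the partition names are hypothetical, and `creation_time` is assumed to be a `DATETIME` column:

```sql
-- Hypothetical table; only creation_time comes from the question.
-- MySQL requires the partitioning column to be part of every unique key,
-- so creation_time is included in the primary key here.
CREATE TABLE events (
    id            BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    creation_time DATETIME NOT NULL,
    payload       VARCHAR(255),
    PRIMARY KEY (id, creation_time)
)
PARTITION BY RANGE COLUMNS (creation_time) (
    PARTITION p2024_01 VALUES LESS THAN ('2024-02-01'),
    PARTITION p2024_02 VALUES LESS THAN ('2024-03-01'),
    PARTITION p2024_03 VALUES LESS THAN ('2024-04-01'),
    -- Catch-all so inserts with later dates never fail.
    PARTITION p_future VALUES LESS THAN (MAXVALUE)
);

-- Add the next month by splitting the catch-all partition.
ALTER TABLE events REORGANIZE PARTITION p_future INTO (
    PARTITION p2024_04 VALUES LESS THAN ('2024-05-01'),
    PARTITION p_future VALUES LESS THAN (MAXVALUE)
);

-- Inspect only the rows of one partition (for example, before backing it up).
SELECT COUNT(*) FROM events PARTITION (p2024_01);

-- Remove a whole month at once instead of running a large DELETE.
ALTER TABLE events DROP PARTITION p2024_01;
```

Because each partition covers a known date range, the backup taken before the drop can be limited to that range instead of the whole table. Note that converting an existing table with `ALTER TABLE ... PARTITION BY` rebuilds it, so that one-time step may itself take a while on 50 GB.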

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow