What are best practices for safely permanently deleting a database?

https://dba.stackexchange.com/questions/3139

16-10-2019

Question

We have an "organic" environment, meaning people piled code on code for ten years with minimal oversight or documentation. The server I use has several databases which I believe are no longer being used; I'd love to delete them and leave just the three I actually use.

At the reckless extreme, I could disable these databases and wait for someone to scream; at the other I could leave them running forever "just in case". What steps have you found valuable in identifying whether a server is being used, and how?

Also, what steps would you recommend to ensure that, as one moves forward in disabling systems, they remain conveniently reversible for a period of time (e.g., renaming objects rather than deleting them outright)?

Thanks!

Solution

You also want to check the datetime stamps of every table. Look up whatever metadata the system keeps for each table and list the tables by last-updated time, most recent first. You could also watch the table sizes for even a slight change.

For example, in MySQL 5.x, you have information_schema.tables which looks like this:

mysql> desc information_schema.tables;
+-----------------+---------------------+------+-----+---------+-------+
| Field           | Type                | Null | Key | Default | Extra |
+-----------------+---------------------+------+-----+---------+-------+
| TABLE_CATALOG   | varchar(512)        | NO   |     |         |       |
| TABLE_SCHEMA    | varchar(64)         | NO   |     |         |       |
| TABLE_NAME      | varchar(64)         | NO   |     |         |       |
| TABLE_TYPE      | varchar(64)         | NO   |     |         |       |
| ENGINE          | varchar(64)         | YES  |     | NULL    |       |
| VERSION         | bigint(21) unsigned | YES  |     | NULL    |       |
| ROW_FORMAT      | varchar(10)         | YES  |     | NULL    |       |
| TABLE_ROWS      | bigint(21) unsigned | YES  |     | NULL    |       |
| AVG_ROW_LENGTH  | bigint(21) unsigned | YES  |     | NULL    |       |
| DATA_LENGTH     | bigint(21) unsigned | YES  |     | NULL    |       |
| MAX_DATA_LENGTH | bigint(21) unsigned | YES  |     | NULL    |       |
| INDEX_LENGTH    | bigint(21) unsigned | YES  |     | NULL    |       |
| DATA_FREE       | bigint(21) unsigned | YES  |     | NULL    |       |
| AUTO_INCREMENT  | bigint(21) unsigned | YES  |     | NULL    |       |
| CREATE_TIME     | datetime            | YES  |     | NULL    |       |
| UPDATE_TIME     | datetime            | YES  |     | NULL    |       |
| CHECK_TIME      | datetime            | YES  |     | NULL    |       |
| TABLE_COLLATION | varchar(32)         | YES  |     | NULL    |       |
| CHECKSUM        | bigint(21) unsigned | YES  |     | NULL    |       |
| CREATE_OPTIONS  | varchar(255)        | YES  |     | NULL    |       |
| TABLE_COMMENT   | varchar(2048)       | NO   |     |         |       |
+-----------------+---------------------+------+-----+---------+-------+
21 rows in set (0.01 sec)

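The column UPDATE_TIME records the last time an INSERT, UPDATE, or DELETE was applied to the table. It reflects writes only, so a database that is merely read from will look idle here. Also note that for InnoDB tables UPDATE_TIME is NULL before MySQL 5.7, and in later versions the value is not kept across a server restart. With those caveats, you could run queries like these to find out when each database was last written to: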

Last time a table was written to in each database:

SELECT table_schema, MAX(update_time) AS last_updated
FROM information_schema.tables
WHERE table_schema NOT IN ('information_schema','mysql')
AND update_time IS NOT NULL
GROUP BY table_schema;

Last time a table was written to in any database:

SELECT MAX(update_time) AS last_updated
FROM information_schema.tables
WHERE table_schema NOT IN ('information_schema','mysql');

The 10 most recent dates on which any table was written to, with the number of tables last updated on each date:

SELECT DATE(update_time) AS last_updated, COUNT(*) AS table_count
FROM information_schema.tables
WHERE table_schema NOT IN ('information_schema','mysql')
AND update_time IS NOT NULL
GROUP BY DATE(update_time)
ORDER BY last_updated DESC
LIMIT 10;
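The table-size check mentioned earlier can be sketched the same way. This is only an illustration: the column aliases are made up here, and TABLE_ROWS is an estimate for InnoDB, so treat small differences with caution. The idea is to save the output, run the query again some weeks later, and compare.

```sql
-- Per-database size snapshot (MySQL). Run it periodically and diff the results.
-- total_bytes, approx_rows and sampled_at are illustrative alias names.
SELECT table_schema,
       SUM(data_length + index_length) AS total_bytes,
       SUM(table_rows)                 AS approx_rows,
       NOW()                           AS sampled_at
FROM information_schema.tables
WHERE table_schema NOT IN ('information_schema','mysql')
GROUP BY table_schema
ORDER BY table_schema;
```

A database whose totals never move between snapshots is a stronger candidate for retirement, although a read-only reporting database would also look unchanged.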

These are just a few examples of how to get such metadata from MySQL. I'm sure Oracle and SQL Server have similar or better methods.
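For SQL Server, one commonly used counterpart is the sys.dm_db_index_usage_stats view, which tracks reads as well as writes. The query below is a rough sketch, not a complete audit: the view is cleared whenever the instance restarts, so a NULL only means "no recorded use since the last restart".

```sql
-- When did the instance last start? Usage stats only go back this far.
SELECT sqlserver_start_time FROM sys.dm_os_sys_info;

-- Most recent read and write activity per user database (SQL Server).
SELECT d.name                  AS database_name,
       MAX(u.last_user_seek)   AS last_seek,
       MAX(u.last_user_scan)   AS last_scan,
       MAX(u.last_user_lookup) AS last_lookup,
       MAX(u.last_user_update) AS last_write
FROM sys.databases AS d
LEFT JOIN sys.dm_db_index_usage_stats AS u
       ON u.database_id = d.database_id
WHERE d.database_id > 4          -- skip master, tempdb, model, msdb
GROUP BY d.name
ORDER BY d.name;
```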

Once you know how often (or how seldom) a database (or schema) is accessed, you should manually dump/export the stale databases, and keep a separate copy of the schema itself apart from the data. Apologies that my answer is not DB-agnostic. SQL Server and Oracle DBAs should weigh in here as well, since the concept of a schema as a collection within a database instance is blurred in MySQL but strictly followed in SQL Server and Oracle.

Other tips

You could set up a trace that captures only connections and the database each one connects to. I would leave this running for a while and then confirm that nothing is connecting to the database in question.

One problem with that approach: some code may open its connection on the master database but then call another database from within the code, so the trace would not show the real target. I'm not sure how bad the code pointing at your databases is.
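As a lightweight complement to a trace, a point-in-time snapshot of current sessions can be sampled on a schedule. This is a sketch rather than a trace: it only sees sessions that exist at the moment it runs, it assumes SQL Server 2012 or later (where sys.dm_exec_sessions exposes database_id), and it shares the caveat above, since database_id is just the session's current database context.

```sql
-- Who is connected right now, and to which database? (SQL Server 2012+)
SELECT DB_NAME(s.database_id) AS database_name,
       s.login_name,
       s.host_name,
       s.program_name,
       COUNT(*)               AS session_count,
       MAX(s.login_time)      AS latest_login
FROM sys.dm_exec_sessions AS s
WHERE s.is_user_process = 1    -- ignore system sessions
GROUP BY s.database_id, s.login_name, s.host_name, s.program_name
ORDER BY database_name, session_count DESC;
```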

I'd also query all your jobs and make sure none of them point to that database.
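A minimal sketch of that job check for SQL Server Agent follows. OldDb is a placeholder name, not something from your environment. The LIKE filter is a plain text search of each step's command, so it can produce false positives and will miss names built dynamically.

```sql
-- SQL Server Agent job steps that run in, or mention, the candidate database.
-- N'OldDb' is a hypothetical database name; substitute your own.
SELECT j.name          AS job_name,
       j.enabled,
       s.step_id,
       s.step_name,
       s.database_name
FROM msdb.dbo.sysjobs     AS j
JOIN msdb.dbo.sysjobsteps AS s
  ON s.job_id = j.job_id
WHERE s.database_name = N'OldDb'
   OR s.command LIKE N'%OldDb%'
ORDER BY j.name, s.step_id;
```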

You could also use SQL Server Audit if you have the right version and edition of SQL Server (2008 R2 Enterprise).

You could also use logon triggers to write a row to a table whenever someone logs on to that database. This would show you whether anything is connecting to it.
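A rough sketch of such a logon trigger is below. Everything named here is an assumption for illustration: the AuditDb database, the dbo.LogonLog table, the trigger name, and the use of EXECUTE AS 'sa' so that every login can write to the log. Be careful: if a logon trigger fails (for example because AuditDb is unavailable), logins are rejected, so try this on a non-production instance first.

```sql
-- Run in a hypothetical logging database named AuditDb.
CREATE TABLE dbo.LogonLog (
    logon_time   datetime2     NOT NULL DEFAULT SYSDATETIME(),
    login_name   sysname       NOT NULL,
    requested_db sysname       NULL,   -- database named in the connection, else the login's default
    client_host  nvarchar(128) NULL,
    client_app   nvarchar(128) NULL
);
GO

-- Server-level trigger: fires on every successful login.
CREATE TRIGGER trg_log_logons
ON ALL SERVER
WITH EXECUTE AS 'sa'
FOR LOGON
AS
BEGIN
    INSERT INTO AuditDb.dbo.LogonLog (login_name, requested_db, client_host, client_app)
    VALUES (ORIGINAL_LOGIN(), ORIGINAL_DB_NAME(), HOST_NAME(), APP_NAME());
END;
GO

-- To remove it again:
-- DROP TRIGGER trg_log_logons ON ALL SERVER;
```

After a few weeks, grouping dbo.LogonLog by requested_db shows which databases were never requested at login. It does not catch code that connects elsewhere and then switches database.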

"Also, what steps would you recommend to ensure that, as one moves forward in disabling systems, they remain conveniently reversible for a period of time?"

In SQL Server, you can take a database "offline", which leaves it in place on the server but makes it impossible for code to connect to it. An offline database is not lost, and bringing it back online takes only minutes.
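In T-SQL the round trip looks roughly like this. OldDb is a placeholder name, and ROLLBACK IMMEDIATE is a choice made for this sketch: it disconnects any open sessions and rolls back their work, so leave it off if you would rather the statement wait.

```sql
-- Take the candidate database offline (N'OldDb' is a hypothetical name).
ALTER DATABASE [OldDb] SET OFFLINE WITH ROLLBACK IMMEDIATE;

-- Confirm the state; state_desc should read OFFLINE.
SELECT name, state_desc
FROM sys.databases
WHERE name = N'OldDb';

-- If someone screams, bring it back.
ALTER DATABASE [OldDb] SET ONLINE;
```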

At my last job we had some products that were in operation for several months per year, so turning off, or taking offline, the database for months at a time would not have been noticed by the folks working with that product. As one example, one of the products involved W-2 forms, so 98% of the business happens in January and February (for most companies, the data is not available until the first week in January, and the federal regulatory deadline for filing the information is the last business day in January). The web server was usually turned off from May/June until December.

At that company, we had a spreadsheet with the "owner" of the database - one single person responsible for the product. While others could make updates to the structure of the tables, the "owner" was the go-to person when any questions had to be asked. If the owner left the company (rare until last year), someone would be assigned to be the new owner before they left.

At other companies, we have taken databases offline for a quarter. If they stay offline with nothing breaking (such as monthly or quarterly reporting), they get backed up one last time and deleted. This allows someone to come back later and restore the database (which takes a few minutes) for those situations which have stories like "oh, that was for the jones project that we had to set aside while we got the fred project finished."
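The final backup-and-delete step might look like the sketch below. The database name and backup path are placeholders. Two details are worth knowing: an offline database has to be brought online before it can be backed up, and dropping a database while it is still offline leaves its files on disk.

```sql
-- Bring the database back online so it can be backed up
-- (N'OldDb' and the file path are hypothetical).
ALTER DATABASE [OldDb] SET ONLINE;

-- Take the last full backup, with page checksums validated as it is written.
BACKUP DATABASE [OldDb]
TO DISK = N'D:\Archive\OldDb_final.bak'
WITH CHECKSUM, INIT;

-- Check that the backup file is readable before deleting anything.
RESTORE VERIFYONLY
FROM DISK = N'D:\Archive\OldDb_final.bak'
WITH CHECKSUM;

-- Only now remove the database.
DROP DATABASE [OldDb];
```

RESTORE VERIFYONLY is a sanity check, not proof. Restoring the file to a scratch server once is the more convincing test that the "jones project" can really be brought back.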

License: CC-BY-SA with attribution

Not affiliated with dba.stackexchange