“No database selected” or “Table 'DB.table' doesn't exist” failures in pt-upgrade run where queries from multiple DBs were collected with tcpdump

dba.stackexchange https://dba.stackexchange.com/questions/103198

  •  26-09-2020

Question

I collected queries on our Master using the following tcpdump command:

tcpdump -i any -s 65535 -x -n -nn -q -tttt 'port 3306' > tcpdump.tcp

The queries are executed amongst several databases.

I then ran this file through pt-query-digest using the following command:

pt-query-digest --output=slowlog --no-report --sample 100 --type tcpdump tcpdump.tcp > slow.log

Then I ran pt-upgrade against two Slaves like this:

pt-upgrade --user user --ask-pass --run-time=1h --upgrade-table percona.pt_upgrade h=10.1.1.1 h=10.1.1.2 slow.log

But I got a bunch of failures, since the slow log does not appear to specify which database each query should be executed against.

How is one supposed to use pt-upgrade when queries are collected amongst multiple DBs? AFAICT this isn't specified in the documentation anywhere.

Are you supposed to use --filter with pt-query-digest to just output queries for a particular database and then specify --database with pt-upgrade? Rinse and repeat per database.
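
I.e., something along these lines, once per database (app_db and the file names here are just placeholders):

pt-query-digest --output=slowlog --no-report --sample 100 --type tcpdump \
  --filter '($event->{db} || "") eq "app_db"' tcpdump.tcp > slow_app_db.log

pt-upgrade --user user --ask-pass --run-time=1h --upgrade-table percona.pt_upgrade \
  --database app_db h=10.1.1.1 h=10.1.1.2 slow_app_db.log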

It takes several hours to parse my gigantic tcpdump capture, so any guidance here is appreciated.

Thanks!

P.S.

I used this article as a starting point, but it is outdated; e.g., pt-query-digest no longer has a --print option.

Solution

  • Does it still fail to work if you omit --sample?

  • From the pt-query-digest manual:

Also note that pt-query-digest may fail to report the database for queries when parsing tcpdump output. The database is discovered only in the initial connect events for a new client or when USE db is executed. If the tcpdump output contains neither of these, then pt-query-digest cannot discover the database.
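
One quick sanity check: when pt-query-digest knows the database for an event, its slowlog output should include a use statement before the query, so you can count how many queries got one (assuming slow.log was produced as above):

grep -c '^use ' slow.log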

  • Consider using the general log instead of tcpdump?
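
A rough sketch of the general-log route (the file path is just an example; mind the overhead and disk use of the general log on a busy master). Because the general log records connect and Init DB events, the database should be discoverable:

mysql -e "SET GLOBAL general_log_file='/var/lib/mysql/general.log'"
mysql -e "SET GLOBAL general_log=ON"
# ... capture a representative window, then:
mysql -e "SET GLOBAL general_log=OFF"

pt-query-digest --output=slowlog --no-report --type genlog /var/lib/mysql/general.log > slow.log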

OTHER TIPS

I have a similar problem: missing tables and missing USE statements as well. Sometimes the USE statement is present but with the wrong database. I use a fairly simple workaround.

(Prelude) My first issue: my slow.log is 60 GB, so:

  1. Split it with split, with option -l set to 20 million (that gives me roughly 1 GB per file). Make sure you do not have statements that straddle two files; if any do, repair them with, for example:

    head -n numberoflines ab >> aa

  2. Find the statement that is the root of your problem (open the file with less; you do not want 1 GB in RAM with vi).

  3. sed it with

    sed -e '/nameofscriptinyourfilewhichisinthefirstline.php/,+nd'

where n is the number of lines to delete after the matched line. (From my point of view, the developer needs to put this information in for debugging purposes, and it is part of our historic framework anyway.)

  4. Communicate the issue to the developer and run your edited slow log again.

  5. Repeat steps 2 through 4 until you find no more of these errors. You can chain your list of sed commands through this kind of pipeline (a combined sketch of the whole pass follows this list):

    sed -e '...' aa | sed -e '...' | sed -e '...' > correct_aa
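
Put together, one pass looks roughly like this (the file names, the .php pattern, and the line counts are placeholders from my setup):

    split -l 20000000 slow.log part_                  # ~1 GB per piece: part_aa, part_ab, ...
    head -n 42 part_ab >> part_aa                     # re-attach a statement cut at the aa/ab boundary
    sed -i '1,42d' part_ab                            # ...and drop those lines from the next piece
    sed -e '/myscript.php/,+3d' part_aa > correct_aa  # delete the matched line plus the 3 lines after it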

Final issue: I found that queries with LIKE and "%" are a pain as well, because the parser cannot understand them. (Still looking for a solution.)

If this helps, I'm glad. It is not clean and it is slow, but it lets me find all the queries that do not fit our framework even though they somehow made it to production. And since we are very behind on our MySQL versions, it is my pleasure to get them working eventually. ;)

P.S.: This is my first post.

What we ended up doing is grepping out queries from the slow.log based on table name into database-specific (raw) log files. Then we specified --type rawlog and --database with pt-upgrade.
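
Roughly like this (the table names and the table-to-database mapping are placeholders for our actual schema; a rawlog is just one SQL statement per line, which is why grepping whole lines works here):

grep -iE 'from (users|orders)' slow.log > app_db.rawlog

pt-upgrade --user user --ask-pass --run-time=1h --upgrade-table percona.pt_upgrade \
  --type rawlog --database app_db h=10.1.1.1 h=10.1.1.2 app_db.rawlog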

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange