Question

After upgrading our database from 9.3.5 to 9.4.1 last night, the server has been suffering from high CPU spikes. The upgrade was done with pg_dump: the database was dumped to SQL and then imported into 9.4.

During the CPU spikes, there are a lot of these messages in the logs:

process X still waiting for ExclusiveLock on extension of relation Y of database Z 
after 1036.234 ms

And:

process X acquired ExclusiveLock on extension of relation Y of database Z
after 2788.050 ms

What looks suspicious is that there are sometimes several "acquired" messages for the exact same relation number in the exact same millisecond.

Why would Postgres grow a table twice in the same millisecond? Could it be an index with a high fill factor?

Any suggestions on how to approach this issue are welcome.
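For anyone hitting the same symptom: while a spike is in progress, the waiting backends are visible in the `pg_locks` system view. The query below is a sketch; `locktype = 'extend'` is the standard marker for relation-extension locks:

```sql
-- Show backends waiting for (or holding) relation extension locks.
-- locktype = 'extend' marks the lock taken while adding pages to a relation.
SELECT pid,
       relation::regclass AS table_name,
       granted
FROM   pg_locks
WHERE  locktype = 'extend'
ORDER  BY granted, pid;
```

Rows with `granted = false` correspond to the "still waiting for ExclusiveLock on extension" log messages.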

P.S. I've also asked this question on the Postgres mailing list, if that's not okay let me know.


Solution

The problem turned out to be a kernel feature called Transparent Huge Pages (THP). You can diagnose it with perf top:

 59.73%       postmaster  [kernel.kallsyms]      [k] compaction_alloc
  1.31%       postmaster  [kernel.kallsyms]      [k] _spin_lock
  0.94%       postmaster  [kernel.kallsyms]      [k] __reset_isolation_suitable
  0.78%       postmaster  [kernel.kallsyms]      [k] compact_zone
  0.67%       postmaster  [kernel.kallsyms]      [k] get_pageblock_flags_group
  0.64%       postmaster  [kernel.kallsyms]      [k] copy_page_c
  0.48%           :13410  [kernel.kallsyms]      [k] compaction_alloc
  0.45%           :13465  [kernel.kallsyms]      [k] compaction_alloc
  0.45%       postmaster  [kernel.kallsyms]      [k] clear_page_c
  0.44%       postmaster  postgres               [.] hash_search_with_hash_value
  0.41%           :13324  [kernel.kallsyms]      [k] compaction_alloc
  0.40%           :13561  [kernel.kallsyms]      [k] compaction_alloc

The kernel's compaction_alloc function dominating the profile points at THP page compaction as the culprit. You can turn off THP with:

echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
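The exact sysfs path depends on the kernel build; the `redhat_transparent_hugepage` path above is specific to RHEL 6 era kernels. A sketch covering both variants (requires root, and does not survive a reboot):

```sh
# RHEL 6 kernels expose the setting under a vendor-specific path:
echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled

# Mainline (and later RHEL) kernels use the standard path instead:
echo never > /sys/kernel/mm/transparent_hugepage/enabled

# Check the current setting; the active value is shown in brackets:
cat /sys/kernel/mm/transparent_hugepage/enabled
```

To make the change permanent, re-apply it at boot (e.g. from an init script) or via a kernel boot parameter.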

Postgres versions before 9.4 do not explicitly request huge pages, but the kernel can still force them onto the server process when THP is set to always.
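For completeness: 9.4 itself gained an explicit huge_pages setting in postgresql.conf. It controls classic, pre-allocated huge pages for the shared memory segment and is a separate mechanism from THP:

```
# postgresql.conf (PostgreSQL 9.4+)
huge_pages = try    # try (default), on, or off
```

With `try`, the server uses huge pages if the OS has them available and silently falls back to normal pages otherwise.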

Red Hat's own documentation also discourages THP for database workloads.

Other tips

A dump-and-restore cycle removes all bloat and dead tuples from your tables and restores them at the minimum possible size, unless you have a fillfactor setting below 100 that reserves some wiggle room per data page.

Immediately after the migration, you therefore get a lot more "extensions" (pages added at the physical end of the table, i.e. the file on disk).

The proactive solution would have been to set a fillfactor below 100 on tables with lots of random updates; you can do that in the dump before restoring. Since you are already through with the migration, you might just wait for the dust to settle: extensions typically become less frequent as wiggle room is reintroduced by dead row versions in your updated tables.

For tables with lots of random updates it can be a good idea to set FILLFACTOR below 100 in any case. The same applies to B-tree indexes. But the reserved space is only added to existing data pages by a VACUUM FULL or CLUSTER (or a dump-and-restore cycle).
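As a sketch of the above, assuming a table `foo` with a frequently updated column `bar` (both names are placeholders):

```sql
-- Reserve 10% free space in each heap page for future updates.
ALTER TABLE foo SET (fillfactor = 90);

-- Existing pages only pick up the new setting when they are rewritten:
VACUUM FULL foo;

-- B-tree indexes accept a fillfactor as well (their default is 90):
CREATE INDEX foo_bar_idx ON foo (bar) WITH (fillfactor = 80);
```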

More on FILLFACTOR, updates, and VACUUM can be found in the Postgres documentation on storage parameters and routine vacuuming.

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange