After my Btree bloat estimation query,
I found some time to work on a new query for tables. The goal here is still to
have a better bloat estimation using dedicated queries for each kind of object.
Compared to the well-known
bloat query, this query pays attention to:
I created the file sql/bloat_tables.sql with the version of the query for 9.0 and later.
I edited the query to add the bloat reported by pgstattuple (free_percent +
dead_tuple_percent) to compare both results and added the following filter:
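The exact demo query is not reproduced here, but as a minimal sketch (the table name rental is only an example taken from later in this post), the pgstattuple comparison and a filter keeping only tables with at least one block of real bloat could look like this:

```sql
-- Minimal sketch, not the exact demo query: the "real" bloat measured by
-- pgstattuple is free_percent + dead_tuple_percent, and tables whose
-- free + dead space is smaller than one block (8 kB by default) are ignored.
CREATE EXTENSION IF NOT EXISTS pgstattuple;

SELECT free_percent + dead_tuple_percent AS real_bloat_percent,
       free_space + dead_tuple_len       AS real_bloat_bytes
FROM pgstattuple('rental')
WHERE free_space + dead_tuple_len >= current_setting('block_size')::bigint;
```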
Here is the result on a fresh pagila database:
Well, not too bad. Let’s consider the largest table, clone it and create some
bloat:
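The exact table and figures from the demo are not shown here; a possible way to reproduce this kind of test (the table and row-selection choices are purely illustrative) would be:

```sql
-- Clone a pagila table, then create bloat by deleting part of its rows.
CREATE TABLE test AS SELECT * FROM payment;

-- Delete roughly half of the rows; VACUUM marks the space as reusable but
-- does not shrink the file, so the clone ends up bloated by about 50%.
DELETE FROM test WHERE payment_id % 2 = 0;
VACUUM test;
```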
Again, the bloat reported here is pretty close to the reality!
Some more tests:
Good, good, good. What next?
The alignment deviation
You might have noticed I did not mention this table with a large deviation
between the statistical bloat and the real one, called rental:
This particular situation is exactly why I loved writing these bloat queries
(including the btree one): confronting the statistics with reality and
finding a logical answer or a fix.
Statistical and real bloat are actually both right here. The statistical one is
just measuring the bloat AND something else we usually don’t pay attention
to. I’ll call it the alignment overhead.
Depending on the field types, PostgreSQL adds some padding before the values
to align them inside the row with regard to the CPU word size. This helps
ensure a value fits in a single CPU register when possible. The alignment of
each type is given in this pg_type
page of the PostgreSQL documentation, see the typalign field.
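For reference, the size and alignment of the types used below can be read directly from the catalog (nothing here is specific to my bloat query):

```sql
-- typlen is the storage size, typalign the required alignment:
-- 'd' = 8 bytes (double), 'i' = 4 bytes (int), 's' = 2 bytes (short).
SELECT typname, typlen, typalign
FROM pg_type
WHERE typname IN ('int2', 'int4', 'timestamp');
```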
So let’s demonstrate how it influences the bloat here. Back to the rental table,
here is its definition:
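For readers without a pagila database at hand, here is the column layout of rental as I understand it (defaults, constraints and indexes omitted); the column order matters for the computation below:

```sql
CREATE TABLE rental (
    rental_id    integer   NOT NULL,  -- 4 bytes
    rental_date  timestamp NOT NULL,  -- 8 bytes
    inventory_id integer   NOT NULL,  -- 4 bytes
    customer_id  smallint  NOT NULL,  -- 2 bytes
    return_date  timestamp,           -- 8 bytes
    staff_id     smallint  NOT NULL,  -- 2 bytes
    last_update  timestamp NOT NULL   -- 8 bytes
);
```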
All the fields here are fixed-size types, so it is quite easy to compute the
row size:
- rental_id and inventory_id are 4-byte integers; possible alignment is every 4 bytes from the beginning of the row
- customer_id and staff_id are 2-byte integers; possible alignment is every 2 bytes from the beginning of the row
- rental_date, return_date and last_update are 8-byte timestamps; possible alignment is every 8 bytes from the beginning of the row
The minimum row size would be 2*4 + 2*2 + 3*8 = 36 bytes. Considering the
alignment optimization and the order of the fields, we now have (ASCII art is
easier to explain):
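The original figure is not reproduced here; this is a reconstruction of the layout, with offsets computed from the type sizes and alignments above:

```
 offset | size | content
--------+------+--------------------------------------------
      0 |    4 | rental_id    (int4)
      4 |    4 | PADDING      (align next value on 8 bytes)
      8 |    8 | rental_date  (timestamp)
     16 |    4 | inventory_id (int4)
     20 |    2 | customer_id  (int2)
     22 |    2 | PADDING      (align next value on 8 bytes)
     24 |    8 | return_date  (timestamp)
     32 |    2 | staff_id     (int2)
     34 |    6 | PADDING      (align next value on 8 bytes)
     40 |    8 | last_update  (timestamp)
```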
That makes 12 bytes of padding and a total row size of 48 bytes instead of 36.
Here is the 10%! Let’s double-check this in practice:
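One possible way to cross-check it (not necessarily the exact experiment from the original post) is to compare the actual heap size with a naive estimation built on the 48-byte data area plus the 24-byte tuple header, ignoring line pointers, page headers and fillfactor:

```sql
-- Rough estimate only: 24 bytes of tuple header + 48 bytes of aligned data
-- per row, compared with the real size of the heap.
SELECT count(*) * (24 + 48)       AS roughly_expected_bytes,
       pg_relation_size('rental') AS actual_heap_bytes
FROM rental;
```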
Removing the “remove tables with real bloat < 1 block” filter from my demo
query, we now have:
Great!
Sadly, I couldn’t find a good way to measure this in the queries so far, so I
will live with it. By the way, this alignment overhead might be a nice
subject for a script measuring it per table.
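A possible starting point for such a script would be to read the physical column order, size and alignment of each attribute from the catalog, for instance:

```sql
-- List the columns of a table in physical order with their size and
-- alignment, the raw material needed to compute the padding per row.
SELECT a.attnum, a.attname, t.typname, t.typlen, t.typalign
FROM pg_attribute a
JOIN pg_type t ON t.oid = a.atttypid
WHERE a.attrelid = 'rental'::regclass
  AND a.attnum > 0
  AND NOT a.attisdropped
ORDER BY a.attnum;
```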
Known issues
The same as for the Btree statistical bloat query: I’m pretty sure the query
will give a pretty bad estimation with array types. I’ll investigate that
later.
Comments
Hi,
The article is really helpful.
The example you have explained uses fixed-size data types.
How can we order the columns of variable-length types like varchar, char, text, numeric, bytea, etc.?