Most of the PostgreSQL DBAs might know about this large bloat estimate query
integrated in check_postgres. It is
supposed to compute a rough estimate of the bloat for tables and indexes in a
database. As the PostgreSQL wiki page says:
This query is for informational purposes only. It provides a loose estimate
of table growth activity only, and should not be construed as a 100% accurate
portrayal of space consumed by database objects
Lately, a customer asked me about existing tools to help deciding when he is
supposed to REINDEX. While writing a detailed answer with examples, I had
quite a surprise seeing an estimated index size ratio of 0.7 for a simple
table! I realized I never payed much attention to this part of the result…
The problem
Here is a test table, a copy table rental from
pagila project:
I copy pasted the bloat query in file ~/tmp/bloat_original.sql, here is its output for this table:
A B-tree index is filled at 90% per default, see storage parameter
FILL FACTOR.
So, a freshly created B-tree index with no bloat is supposed to be 1.1x larger
than it should be with a FILL FACTOR of 100. As I had some time
to dive in this, I couldn’t resist to investigate these insane estimated size
factors for my test table indexes: 0.5, 0.8 and 0.7. How an index could be
smaller than it is supposed to be ?
I hadn’t to dive deap in the query, after a closer look at the query code and
comments, we find:
Oh, ok. The query estimates the ideal size of each index considering it
references ALL the table fields. It’s quite rare to find a btree index on
all fields of a table, and obviously there’s no point having multiple indexes
on a table, all of them referencing all fields of the table.
A look at the real bloat
First, let see how the indexes are really bloated:
First point, the bloat on indexes is not 10% everywhere, only for the PK. This
is because indexes were created BEFORE inserting data. So it looks like data
were naturally order on the PK on table “rental” when scanning it sequentially.
What if we load data sorting on “inventory_id” field ?
Ok, it totally makes sense there: index test_inventory_id_idx is now 10%
bloated. First lesson: Do not create your indexes before loading your datas!
Demo:
A better query to estimate index bloat?
Wait, you just showed us you can have the real bloat on indexes, why would I
want a loose estimate from a query relying on stats?!
Because I feel bad reading the whole index again and again from monitoring
tools every few minutes. Having a loose estimate is good enough in some cases.
And anyway, I want to dig into this for education :-)
So, I tried to write a query able to give a better estimate of bloat ONLY for
indexes. I picked some parts of the original bloat query, but rewrote mostly
everything. Here is the result:
How does it perform compared to the actual values? Let’s go back to a bloated
situation:
I created the file “~/tmp/bloat_index.sql” with this estimated indexes bloat
query filtering on table test, here is its result:
Well, pretty close :-)
Let’s REINDEX:
Not bad.
Let’s create some bloat. The table is approx. 160,000 rows, so the following
query updates ~10% of the table:
Erk, not so good. Moar bloat?
Well, better again.
Conclusion
This new query performs much better for indexes than the usual one everyone
knew for a long time. But it isn’t perfect, showing sometimes some quite wrong
results.
First issue, this query doesn’t make any difference between branch nodes and
leaves in indexes. Rows values and references are kept in leaves, and only a
fractional part of them are in branches. I should do some more tests with
larger indexes to see how it behaves with a lot of branch nodes. As the number
of branch nodes is supposed to be a logarithm of the total number of rows,
maybe we could include some more guessing in the optimal index size computing.
Second issue…keep in mind it is based on statistics.
Anyway, as useful as this query can be, at least it has been funny to dive in
indexes and pages guts.
As a side note, I had been surprised to see that padding to MAXALIGN was
applied on both tuple header AND tuple data. My understanding is that for each
row in an index leave page, we have:
the item_pointer at the beginning of the page,
the tuple header + data at the end of the page that must be aligned on
MAXALIGN.
Why ain’t they padded together to MAXALIGN? Wouldn’t this save
some space?
Thank you for reading till the end! Cheers, see you maybe in another post about
table bloat next time ;-)
About index_tuple_hdr, I just realize I used it as a field name AND a psql variable, my bad. See the code bellow this comment in the query:
/* per tuple header: add index_attribute_bm if some cols are null-able */
So if you are referring to index_tuple_hdr at lines 11, 13 and 14, those are referring to the computed field, so no :, it would produce bad stats with indexes on NULLable cols.
I fixed to query to remove this confusion.
Good catch about the \SET! It appears there was some magic in my old blog system transforming it in uppercase. I’ll not investigate further as I moved to jekyll…
Comments Leave a comment by email or Disqus
Note: I only publish your name/pseudo, mail subject and content. I will NOT publish your email address.
The check_postgres project welcomes patches! :)
Thanks for posting this! I’ve known for a while that the index bloat calculation was faulty, but haven’t quite known what was wrong with it.
I’m working on a 2nd bloat calculation involving dead row counts – just for tables,though.
Actually, there’s some bugs in the query version as you’ve written it above:
\set
, not\SET
index_tuple_hdr
is missing the:
in two places.Thanks!
Josh,
About
index_tuple_hdr
, I just realize I used it as a field name AND a psql variable, my bad. See the code bellow this comment in the query:/* per tuple header: add index_attribute_bm if some cols are null-able */
So if you are referring to
index_tuple_hdr
at lines 11, 13 and 14, those are referring to the computed field, so no:
, it would produce bad stats with indexes on NULLable cols.I fixed to query to remove this confusion.
Good catch about the
\SET
! It appears there was some magic in my old blog system transforming it in uppercase. I’ll not investigate further as I moved to jekyll…Thanks!