Tuesday, June 27, 2017

PostgreSQL: production-ready Hash Indexes



Up to PostgreSQL 9.6, hash indexes were second class. Using them in production was not recommended; they weren't crash-safe. That's going to change with PostgreSQL 10. And here's what's gonna change.


Write Ahead Logging for Hash Indexes
Commit Fest Entry and the relevant threads (1) + (2)

The actual problem with hash indexes were their inability of using the WAL. Quoting the docs:
Hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX after a database crash if there were unwritten changes.
With PostgreSQL 10, this drawback has gone. The community, with Amit Kapilla leading the way, worked hard to remove this restriction and made hash index a first class citizen of the PostgreSQL.

What's in for you, you might wonder. According to Amit Kapilla, hash indexes tend to perform better than B-trees if the column values are unique. So, it can be beneficial to use hash indexes instead of btrees. Performance FTW!

Given the production-readiness of hash indexes, it now makes sense to improve the long-neglected index type in general. It's hard to say if "WAL for Hash Indexes" was the trigger here but it makes subsequent patches to hash indexes usable in the first place. So, I picked the following two interesting ones.


Microvacuum Hash Indexes
Commit Fest Entry and the relevant thread

This patch reduces the index size by factors of 2 to 4. It does that by increasing the re-usability of hash index pages and avoiding page splits.
A small but nice side effect: while reviewing the patch, occasional deadlocks caught the community's eye and eventually led to improved testing code for other index types.


A Better Way to Expand Hash Indexes.
Commit Fest Entry and the relevant thread

Hash indexes need to grow when more and more items are inserted. Usually, there's a so-called bucket increase which, by this patch, is chopped into smaller pieces and allows for a more gradual way of increasing the size of hash indexes.


Given those improvements, I can imagine a lot of people now considering hash indexes as a viable alternative to the venerable B-tree. What about you?

Regards,
Sven

No comments:

Post a Comment