Hibernate's table generator optimizers
Posted 03 Apr 2022 by Erik Lumme
Using a database table to keep track of the next identifiers to assign works in all SQL implementations, but comes at a performance cost. Here we take a look at what Hibernate does to alleviate this issue.
If you want the Java Persistence API (JPA) to generate identifiers for you when you insert a new entity into your database, there are three strategies to use: table, sequence, and identity. The table strategy is the only one whose implementation is database agnostic.
The table strategy works by using a separate database table to store the next identifier to use, with one column for the table name, and one for the identifier. When inserting a new entity, you simply increment the identifier for that table, read the new value, and use it for your entity.
The performance cost comes from the multiple queries that must be executed just to get the next identifier to use, and the fact that the row must be locked while it is read and incremented. Vlad Mihalcea expands on this further in his blog post.
Hibernate implements a set of optimizers to alleviate this performance cost. These optimizers work by retrieving the identifiers to assign in bulk, and keeping them in memory. Let’s first look at how it would work without optimization.
In this example, the sequence value starts at 7, so the next sequence value is 8. By incrementing the sequence value, we can use 8 as our identifier. Every time a new identifier is needed, the sequence value is read and incremented.
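The unoptimized flow can be sketched in a few lines of Java. This is a minimal in-memory model, not Hibernate's code: the class `UnoptimizedSketch` and its `roundTrips` counter are hypothetical names, and the field `sequenceValue` stands in for the locked row of the sequence table.

```java
// Hypothetical in-memory model of the unoptimized table strategy.
class UnoptimizedSketch {
    // Stand-in for the identifier column of one sequence table row.
    private long sequenceValue;
    // Counts simulated database round trips.
    private int roundTrips;

    UnoptimizedSketch(long initialSequenceValue) {
        this.sequenceValue = initialSequenceValue;
    }

    // Simulates: lock the row, increment it, read the new value, release the lock.
    synchronized long nextId() {
        roundTrips++;
        sequenceValue += 1;
        return sequenceValue;
    }

    int roundTrips() {
        return roundTrips;
    }

    public static void main(String[] args) {
        UnoptimizedSketch generator = new UnoptimizedSketch(7);
        System.out.println(generator.nextId());     // 8
        System.out.println(generator.nextId());     // 9
        System.out.println(generator.roundTrips()); // 2
    }
}
```

Every identifier costs one simulated round trip, which is the overhead the optimizers try to reduce.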
To use a TableGenerator in JPA, you must first define it, e.g. using the @TableGenerator annotation. Through it, you will define a name for your generator, so that you can reference it later. You can also define the name of your sequence table, the names of the columns in that table, the value to look for in that table, and an allocation size. You use it through the
When you define a TableGenerator in Hibernate, it uses the PooledOptimizer by default. It considers the next sequence value (current sequence value + 1) to be the upper bound of the pool of identifiers it is allowed to assign. The pool size is determined by the allocation size of the table generator, by default 50. Because of this, it also increments the sequence value by the allocation size.
All examples below use an allocation size of 3, and an initial sequence value of 7. For the pooled optimizer, the sequence value is incremented by the allocation size, here from 7 to 10.
PooledOptimizer assigns identifiers that are smaller than the sequence value it read. This is different from our initial example, where sequence values smaller than or equal to the current are assumed to already have been assigned. Using both these methods together on the same sequence table row will lead to identifier clashes.
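To make the pool arithmetic concrete, here is a minimal sketch; it is not Hibernate's implementation. It takes the description above literally and assumes the optimizer reads 8 (the current sequence value 7 plus 1) as its upper bound while the row advances from 7 to 10. The class `PooledSketch` and the `upperBoundSource` callback are hypothetical names.

```java
import java.util.function.LongSupplier;

// Hypothetical model of a pooled allocation; not Hibernate's implementation.
class PooledSketch {
    private final int allocationSize;
    // Stand-in for the table access that yields the pool's upper bound.
    private final LongSupplier upperBoundSource;
    private long next;
    private long upperBound;
    private boolean initialized;

    PooledSketch(int allocationSize, LongSupplier upperBoundSource) {
        this.allocationSize = allocationSize;
        this.upperBoundSource = upperBoundSource;
    }

    synchronized long nextId() {
        if (!initialized || next > upperBound) {
            // The value read is the LAST identifier of the new pool.
            upperBound = upperBoundSource.getAsLong();
            next = upperBound - allocationSize + 1;
            initialized = true;
        }
        return next++;
    }

    public static void main(String[] args) {
        long[] row = {7}; // simulated sequence value
        PooledSketch generator = new PooledSketch(3, () -> {
            long read = row[0] + 1; // assumption: the value read is current + 1
            row[0] += 3;            // the row advances by the allocation size
            return read;
        });
        for (int i = 0; i < 4; i++) {
            System.out.println(generator.nextId()); // 6, 7, 8, then 9 from the next pool
        }
    }
}
```

Under these assumptions the first pool contains 6 and 7, which the unoptimized scheme would treat as already assigned. That overlap is the kind of clash described above.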
PooledLoOptimizer works in a very similar manner, except it considers the next sequence value to be the lower bound of its pool of identifiers.
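The lower-bound variant can be sketched the same way. Again this is an illustrative model rather than Hibernate's code: `PooledLoSketch` and `lowerBoundSource` are hypothetical names, and the simulated row is assumed to start at 7 and advance by the allocation size of 3.

```java
import java.util.function.LongSupplier;

// Hypothetical model of a pooled-lo allocation; not Hibernate's implementation.
class PooledLoSketch {
    private final int allocationSize;
    // Stand-in for the table access that yields the pool's lower bound.
    private final LongSupplier lowerBoundSource;
    private long next;
    private long upperBound;
    private boolean initialized;

    PooledLoSketch(int allocationSize, LongSupplier lowerBoundSource) {
        this.allocationSize = allocationSize;
        this.lowerBoundSource = lowerBoundSource;
    }

    synchronized long nextId() {
        if (!initialized || next > upperBound) {
            // The value read is the FIRST identifier of the new pool.
            next = lowerBoundSource.getAsLong();
            upperBound = next + allocationSize - 1;
            initialized = true;
        }
        return next++;
    }

    public static void main(String[] args) {
        long[] row = {7}; // simulated sequence value
        PooledLoSketch generator = new PooledLoSketch(3, () -> {
            long read = row[0] + 1; // next sequence value
            row[0] += 3;            // the row advances by the allocation size
            return read;
        });
        for (int i = 0; i < 4; i++) {
            System.out.println(generator.nextId()); // 8, 9, 10, then 11 from the next pool
        }
    }
}
```

In this model the pool starts where the unoptimized scheme would continue, so no identifier at or below the current sequence value is handed out.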
HiLoOptimizer only increments the sequence value by 1 at a time. However, it considers each sequence value as representing a pool of identifiers with the upper bound being (next sequence value ⋅ allocation size). If the next sequence value is 1, the upper bound would in our case be 1 ⋅ 3 = 3, and cover the identifiers 1, 2, and 3. If the next sequence value is 2, the upper bound would be 2 ⋅ 3 = 6, and cover the identifiers 4, 5, and 6.
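The hi/lo arithmetic is easy to check in code. The helper below is a hypothetical illustration of the formula above, not part of Hibernate: given a sequence value and an allocation size, it lists the identifiers that the sequence value covers.

```java
import java.util.Arrays;

// Hypothetical helper illustrating the hi/lo pool arithmetic.
class HiLoSketch {
    // Returns the identifiers covered by one sequence value.
    static long[] pool(long nextSequenceValue, int allocationSize) {
        long upperBound = nextSequenceValue * allocationSize;
        long lowerBound = upperBound - allocationSize + 1;
        long[] ids = new long[allocationSize];
        for (int i = 0; i < allocationSize; i++) {
            ids[i] = lowerBound + i;
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(pool(1, 3))); // [1, 2, 3]
        System.out.println(Arrays.toString(pool(2, 3))); // [4, 5, 6]
        System.out.println(Arrays.toString(pool(8, 3))); // [22, 23, 24]
    }
}
```

The last call uses the next sequence value after our initial value of 7, which by the same formula covers 22, 23, and 24.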
There is also a LegacyHiLoAlgorithmOptimizer that for legacy reasons works in mysterious ways.
NoopOptimizer performs no optimization, and if the allocation size is 1, it would work as our first example. By default the allocation size is 50, causing it to skip values as it increments the sequence value by the allocation size.
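The skipping can be illustrated with a small model. This sketch is not Hibernate's code: `NoopSketch` is a hypothetical class, and it assumes that the identifier handed out is the sequence value after the increment, as in our first example.

```java
// Hypothetical model of generating without an optimizer.
class NoopSketch {
    private final int allocationSize;
    // Stand-in for the identifier column of one sequence table row.
    private long sequenceValue;

    NoopSketch(long initialSequenceValue, int allocationSize) {
        this.sequenceValue = initialSequenceValue;
        this.allocationSize = allocationSize;
    }

    // Assumption: each call increments the row by the allocation size
    // and uses the new value as the identifier.
    synchronized long nextId() {
        sequenceValue += allocationSize;
        return sequenceValue;
    }

    public static void main(String[] args) {
        NoopSketch dense = new NoopSketch(7, 1);
        System.out.println(dense.nextId());  // 8
        System.out.println(dense.nextId());  // 9

        NoopSketch sparse = new NoopSketch(7, 3);
        System.out.println(sparse.nextId()); // 10
        System.out.println(sparse.nextId()); // 13, so 11 and 12 are skipped
    }
}
```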
There is a PooledLoThreadLocalOptimizer variant of the PooledLoOptimizer. For thread safety, all other optimizers read the next sequence value and initialize their pools of identifiers in a synchronized method. The PooledLoThreadLocalOptimizer has a separate pool of identifiers per thread, and can therefore avoid the use of synchronization and its possible performance costs.
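The per-thread idea can be sketched with a `ThreadLocal`. This is an illustrative model under stated assumptions, not Hibernate's implementation: `ThreadLocalPoolSketch` is a hypothetical class, and an `AtomicLong` stands in for the sequence table row so that reserving a pool remains atomic.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of a pooled-lo allocation with one pool per thread.
class ThreadLocalPoolSketch {
    // Per-thread pool state; starts exhausted because next > upperBound.
    private static final class Pool {
        long next = 0;
        long upperBound = -1;
    }

    private final int allocationSize;
    // Stand-in for the sequence table row.
    private final AtomicLong sequenceValue;
    private final ThreadLocal<Pool> pools = ThreadLocal.withInitial(Pool::new);

    ThreadLocalPoolSketch(long initialSequenceValue, int allocationSize) {
        this.sequenceValue = new AtomicLong(initialSequenceValue);
        this.allocationSize = allocationSize;
    }

    // Not synchronized: each thread only touches its own Pool.
    long nextId() {
        Pool pool = pools.get();
        if (pool.next > pool.upperBound) {
            // Reserve a block; the lower bound is the previous value + 1.
            long lowerBound = sequenceValue.getAndAdd(allocationSize) + 1;
            pool.next = lowerBound;
            pool.upperBound = lowerBound + allocationSize - 1;
        }
        return pool.next++;
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadLocalPoolSketch generator = new ThreadLocalPoolSketch(7, 3);
        System.out.println("main: " + generator.nextId()); // main: 8
        Thread other = new Thread(
                () -> System.out.println("other: " + generator.nextId())); // other: 11
        other.start();
        other.join();
        System.out.println("main: " + generator.nextId()); // main: 9
    }
}
```

The second thread reserves its own block starting at 11, while the main thread keeps drawing from 8 to 10. Only the block reservation touches shared state.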
Configuring the optimizer to use
The optimizer to use can be set through the hibernate.id.optimizer.pooled.preferred property. The Spring equivalent is
The possible values are