Hibernate audit log of fields' change

Question

How can I log changes to an entity in log files? Suppose I have a Person entity like this:

import org.hibernate.envers.Audited;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.GeneratedValue;
import javax.persistence.Column;

@Entity
@Audited
public class Person {
    @Id
    @GeneratedValue
    private Long id; // Long, to match the findOne(1L) lookup used below

    private String name;
    private String surname;
    // add getters, setters, constructors, equals and hashCode here
}

and some code that changes an existing Person:

Person p1 = new Person("name-1", "surname-1");
personRepository.save(p1);
Person p2 = personRepository.findOne(1L);
p2.setName("new-name");
personRepository.save(p2);

How can I get the following in my log file?

old entity
new entity
list of changed fields (something like Diffable's result)

I know that Envers can store changes in the database and let me extract them later with AuditReader, but I would like to store the changes in a JSON file so I can send them to third-party applications (like Elastic).

Naros · Accepted Answer

I would actually tackle this from two points of view.

One of the benefits you gain by using Envers is the fact you can very quickly annotate your entities and tell the framework exactly how you wish to track the changes for your entity models. What is even better is you can have Envers generate the diffable fields for you automatically.

Let's take this basic entity:

@Entity
@Audited(withModifiedFlag = true)
public class Person {
  @Id
  @GeneratedValue
  private Integer id;
  private String name;
}

When you enable the withModifiedFlag feature, this tells Envers to add an additional metadata column to the audit table for each audited property, so this entity's audit table would look like this:

+----+------+----------+-----+---------+
| ID | name | name_MOD | REV | REVTYPE |
+----+------+----------+-----+---------+

The benefit here is that if you're using some process to export & stream the data directly from the table, you no longer need to actually diff the current-row against the prior-row to know what changed; the schema tells you this automatically simply by seeing if the associated _MOD column is 1 (true) or 0 (false).
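To make the role of the _MOD columns concrete, here is a minimal sketch in plain Java. It assumes an audit row has already been read into a map keyed by column name (for example by a native query). ModFlagReader and changedFields are hypothetical names, and accepting both 1/0 and true/false is an assumption about how your JDBC driver returns the flag.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: derives the changed fields of one audit row from its *_MOD columns.
final class ModFlagReader {
    private static final String MOD_SUFFIX = "_MOD";

    // Returns the names of the audited fields whose modified flag is set in this row.
    static List<String> changedFields(Map<String, Object> auditRow) {
        List<String> changed = new ArrayList<>();
        for (Map.Entry<String, Object> column : auditRow.entrySet()) {
            String name = column.getKey();
            if (name.endsWith(MOD_SUFFIX) && isSet(column.getValue())) {
                changed.add(name.substring(0, name.length() - MOD_SUFFIX.length()));
            }
        }
        return changed;
    }

    // Assumption: the flag may arrive as a Boolean or as a number (1 = true, 0 = false).
    private static boolean isSet(Object flag) {
        if (flag instanceof Boolean) {
            return (Boolean) flag;
        }
        if (flag instanceof Number) {
            return ((Number) flag).intValue() != 0;
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("ID", 1);
        row.put("name", "new-name");
        row.put("name_MOD", 1);
        row.put("REV", 2);
        row.put("REVTYPE", 1);
        System.out.println(changedFields(row)); // prints [name]
    }
}
```

No comparison against the previous revision is needed here: the list of changed fields comes from the row itself.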

From this point you have a couple of options.

Native Query ETL

You can use Hibernate native queries to extract and transform the data into JSON for ES either via some background application thread or separate background process. Since the _MOD column provides you the indicator that a field changed, you can easily read the rows and build the data necessary without having to perform the diff operation at extract time.
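As a rough illustration of the transform step, the sketch below renders one extracted audit row as a JSON object using only the JDK. It assumes the row was already fetched from a table such as Person_AUD into a map. AuditRowJson is a hypothetical name, and a real pipeline would more likely hand this job to a JSON library.

```java
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical transform step: renders one extracted audit row as a single JSON object.
final class AuditRowJson {

    static String toJson(Map<String, Object> row) {
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, Object> column : row.entrySet()) {
            json.add(quote(column.getKey()) + ":" + value(column.getValue()));
        }
        return json.toString();
    }

    // Numbers, booleans and nulls are emitted bare; everything else becomes a JSON string.
    // Note: NaN and infinite doubles are not valid JSON and are not handled in this sketch.
    private static String value(Object v) {
        if (v == null || v instanceof Number || v instanceof Boolean) {
            return String.valueOf(v);
        }
        return quote(v.toString());
    }

    // Escapes quotes, backslashes and control characters as JSON requires.
    private static String quote(String s) {
        StringBuilder out = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.append('"').toString();
    }
}
```

Because the _MOD columns travel with the row, the resulting document already tells the receiving side which fields changed.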

I would also recommend configuring Envers to place the audit objects in a separate catalog/schema. Keeping them apart from the live tables gives the database more room to spread disk I/O between the two.
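One way to express that configuration is through the Envers settings org.hibernate.envers.default_schema (or org.hibernate.envers.default_catalog) and org.hibernate.envers.global_with_modified_flag. The sketch below only collects them in a Properties object. EnversAuditSettings is a hypothetical name, the schema name passed in is up to you, and how the properties reach Hibernate (persistence.xml, Spring's spring.jpa.properties.* prefix, and so on) depends on your setup.

```java
import java.util.Properties;

// Sketch: the Envers settings discussed above, collected as Hibernate configuration properties.
final class EnversAuditSettings {

    static Properties auditProperties(String auditSchema) {
        Properties props = new Properties();
        // Place the audit tables in their own schema (assumed to exist already).
        props.setProperty("org.hibernate.envers.default_schema", auditSchema);
        // Track modified flags for every audited entity instead of annotating each one.
        props.setProperty("org.hibernate.envers.global_with_modified_flag", "true");
        return props;
    }
}
```

For example, EnversAuditSettings.auditProperties("audit") would target a schema named audit, which is purely an illustrative name.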

Debezium ETL

The Debezium project is an excellent way to handle data replication. One of its strong benefits is that it enables users to do this across heterogeneous platforms, which fits your model precisely.

The big difference here is that Debezium does not read the database directly to determine changes, but rather reads the database transaction log file and generates a series of events that describes the DML operations that took place against that database. In short, you avoid the read operations you are so concerned about since Debezium rehydrates the state directly from the transaction log.

As an example:

1. Hibernate executes INSERT INTO Person (...) VALUES (...).
2. Hibernate Envers executes INSERT INTO Person_AUD (...) VALUES (...).
3. Database writes that operation to the redo/transaction logs.
4. Debezium notices the log was written, reads the entries.
5. Debezium generates an insert-event for Person_AUD (table subscribed to).
6. Any registered interested party of that event receives it and processes it.

It is in (6) where your transform/load code would exist to receive that insert event and generate a JSON output and send it to ES.
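The receiving end of that flow, where a subscriber gets the insert event for Person_AUD, might look roughly like the sketch below. It assumes the Debezium change event has already been deserialized into a map exposing the envelope's "op" and "after" fields (whether these sit under a "payload" wrapper depends on your converter configuration). AuditEventTransformer and the "changedFields" key are hypothetical, and reading the event from Kafka and sending the result to ES are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical consumer-side transform: turns one change event into an audit document.
final class AuditEventTransformer {
    private static final String MOD_SUFFIX = "_MOD";

    static Optional<Map<String, Object>> toDocument(Map<String, Object> event) {
        // "c" marks a create (INSERT); other operations are ignored in this sketch.
        if (!"c".equals(event.get("op")) || !(event.get("after") instanceof Map)) {
            return Optional.empty();
        }
        Map<?, ?> row = (Map<?, ?>) event.get("after");
        Map<String, Object> document = new LinkedHashMap<>();
        List<String> changedFields = new ArrayList<>();
        for (Map.Entry<?, ?> column : row.entrySet()) {
            String name = String.valueOf(column.getKey());
            if (name.endsWith(MOD_SUFFIX)) {
                // Fold each set *_MOD flag into a single list of changed field names.
                if (isSet(column.getValue())) {
                    changedFields.add(name.substring(0, name.length() - MOD_SUFFIX.length()));
                }
            } else {
                document.put(name, column.getValue());
            }
        }
        document.put("changedFields", changedFields);
        return Optional.of(document);
    }

    // Assumption: the flag may arrive as a Boolean or as a number (1 = true, 0 = false).
    private static boolean isSet(Object flag) {
        if (flag instanceof Boolean) {
            return (Boolean) flag;
        }
        return flag instanceof Number && ((Number) flag).intValue() != 0;
    }
}
```

The returned map can then be serialized to JSON and indexed. Events that are not inserts yield an empty Optional and can simply be skipped.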

Wrap-up

By using Debezium, you are able to effectively offline replicate the data across heterogeneous environments in an extremely efficient manner. Not only is the project great for your use case, but it's extremely valuable in today's world of microservice architectures, where data-sharing between services is crucial.

By using Envers, you are able to provide an online fallback solution for giving users historical audit data when your ES cluster is unavailable or overloaded, rather than giving users a "Service unavailable, come back later" response.

Whatever you decide, performance is not the only factor to consider. You should also be mindful of user experience, scalability, and reliability, which is why I believe the optimal solution is to pair both.

Sairam Cherupally · Answer

You could write a custom interceptor by extending org.hibernate.EmptyInterceptor. It has callbacks for update/insert/delete that receive the old and new snapshots of the entity.
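As a sketch of what such an interceptor could do with those snapshots: in the Hibernate 5 signature, onFlushDirty(Object entity, Serializable id, Object[] currentState, Object[] previousState, String[] propertyNames, Type[] types) hands you parallel arrays of property names and values. The comparison logic below is kept in plain Java so it can be tried without Hibernate on the classpath. StateDiff and FieldChange are hypothetical names, and returning an empty result when previousState is null is an assumption about how you want to treat entities that have no old snapshot.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical diff helper intended to be called from an interceptor's onFlushDirty callback.
final class StateDiff {

    // One changed property with its old and new value.
    static final class FieldChange {
        final String field;
        final Object oldValue;
        final Object newValue;

        FieldChange(String field, Object oldValue, Object newValue) {
            this.field = field;
            this.oldValue = oldValue;
            this.newValue = newValue;
        }

        @Override
        public String toString() {
            return field + ": " + oldValue + " -> " + newValue;
        }
    }

    // Compares the two snapshots position by position and collects the differences.
    static List<FieldChange> diff(String[] propertyNames, Object[] previousState, Object[] currentState) {
        List<FieldChange> changes = new ArrayList<>();
        if (previousState == null) {
            return changes; // no old snapshot available, so nothing can be compared
        }
        for (int i = 0; i < propertyNames.length; i++) {
            // Objects.equals compares array-valued properties by reference only.
            if (!Objects.equals(previousState[i], currentState[i])) {
                changes.add(new FieldChange(propertyNames[i], previousState[i], currentState[i]));
            }
        }
        return changes;
    }
}
```

Inside the callback you would pass the three arrays straight through and write the resulting list, together with the old and new state, to your log file.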

Refer to this article for more details.

Tags: java, hibernate, spring-data, hibernate-envers, audit-logging

mohsen Lzd
