
Apache pulsar infinite retention

The Apache Pulsar topic documentation says that a topic's time-based retention policy can be set to -1 for infinite retention. What are the downsides of infinite retention? Can we use Pulsar as a message store where data lives in topics forever, and build an event sourcing application around it?
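For reference, the setting in question can be applied with the `pulsar-admin` CLI. The sketch below only builds the command line and does not run it. It assumes retention is set at the namespace level, and the namespace name `my-tenant/my-namespace` is a placeholder.

```python
import shlex


def retention_command(namespace, size="-1", time="-1"):
    """Build the pulsar-admin argv that sets a namespace retention policy.

    Passing -1 for both size and time asks Pulsar to retain data forever.
    """
    return [
        "pulsar-admin", "namespaces", "set-retention", namespace,
        "--size", size, "--time", time,
    ]


# "my-tenant/my-namespace" is a hypothetical namespace used for illustration.
cmd = retention_command("my-tenant/my-namespace")
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` on a machine where `pulsar-admin` is installed and configured.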

Paul asked Dec 13 '22 17:12


2 Answers

The downside is that your data will grow forever. However, because the underlying storage layer (Apache BookKeeper) is segment-based, more space can be added by adding storage nodes; the data for a topic doesn't all have to fit on one machine, as it does in some other systems.
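To get a feel for what "grow forever" means in practice, here is a rough back-of-the-envelope sketch. The ingest rate and replica count are made-up numbers, and the estimate ignores compression, metadata, and BookKeeper journal overhead.

```python
GIB = 1024 ** 3


def stored_bytes(ingest_bytes_per_day, days, replicas=2):
    """Approximate raw disk needed when nothing is ever deleted.

    replicas stands in for the number of copies BookKeeper keeps of each
    entry; the value 2 is an assumption, not a recommendation.
    """
    return ingest_bytes_per_day * days * replicas


# Hypothetical workload: 50 GiB of new messages per day, kept for one year.
one_year = stored_bytes(50 * GIB, days=365, replicas=2)
print(one_year / GIB)  # 36500.0 GiB of raw disk across the cluster
```

With infinite retention this number never levels off, so capacity planning becomes a matter of how quickly storage nodes are added.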

The segment-based architecture also makes it fairly straightforward to move data to a bulk storage system (S3 or something similar) while still keeping it available from Pulsar. However, this is still in the early stages of discussion right now.

Ivan Kelly answered Dec 24 '22 03:12


Actually, you can and should use Pulsar's Tiered Storage option to offload your older data to more cost-effective storage such as S3, Google Cloud Storage, or HDFS. Unlike Kafka, Pulsar decouples the serving layer from the storage layer, which is what makes this possible. In Kafka, you would have to "add hard drives endlessly", along with broker instances to host them.
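As a sketch of what enabling automatic offload might look like, the snippet below builds the `pulsar-admin` command that sets an offload threshold on a namespace. It assumes an offload driver (for example S3) is already configured on the brokers. The namespace name and the `10G` threshold are placeholders.

```python
import shlex


def offload_threshold_command(namespace, size):
    """Build the pulsar-admin argv that sets a namespace offload threshold.

    Once a topic in the namespace holds more than `size` in BookKeeper,
    older segments become candidates for offload to tiered storage.
    """
    return [
        "pulsar-admin", "namespaces", "set-offload-threshold",
        "--size", size, namespace,
    ]


# Hypothetical values: keep roughly 10G per topic on BookKeeper.
offload_cmd = offload_threshold_command("my-tenant/my-namespace", "10G")
print(shlex.join(offload_cmd))
```

Combined with infinite retention, a threshold like this keeps only recent data on the bookies while older segments stay readable through Pulsar from the cheaper store.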

David Kjerrumgaard answered Dec 24 '22 04:12