Looking for some insight into how to configure Secor to output larger files that are partitioned by datetime rather than by Kafka offset, something akin to hourly backups of Kafka topic streams. Currently, my common.properties file contains these Secor configs:
secor.generation=1
secor.consumer.threads=7
secor.messages.per.second=10000
secor.offsets.per.partition=10000000
secor.topic_partition.forget.seconds=600
secor.local.log.delete.age.hours=-1
secor.file.reader.writer.factory=com.pinterest.secor.io.impl.SequenceFileReaderWriterFactory
secor.max.message.size.bytes=100000
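From what I can tell, the offsets-per-partition setting alone won't give me larger files; the output size seems to be governed by the upload thresholds instead. Below is what I was planning to add, though the property names (secor.max.file.size.bytes, secor.max.file.age.seconds) are taken from the sample secor.prod.properties in my checkout and may differ in other Secor versions:

# Upload thresholds (assumed names; verify against your version's secor.prod.properties).
# A file is uploaded once it grows past this size...
secor.max.file.size.bytes=200000000
# ...or once it is older than this, which should give roughly hourly output
secor.max.file.age.seconds=3600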
Secor's LogFilePath.java mentions that a partition can describe the date of a message:
(line 29) Log file path has the following form: prefix/topic/partition1/.../partitionN/generation_kafkaPartition_firstMessageOffset
(line 34) "partition1, ..., partitionN is the list of partition names extracted from message content. E.g., the partition may describe the message date such as dt=2014-01-01 [...]"
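If I'm reading that comment right, a date-partitioned path would end up looking roughly like the line below (the prefix, topic name, and offset padding are just my guesses to illustrate the layout, using generation=1 and Kafka partition 0):

secor_backup/mytopic/dt=2014-01-01/1_0_00000000000000001234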
And from Secor's README:
JSON date parser: parser that extracts timestamps from JSON messages and groups the output based on the date, similar to the Thrift parser above. To use this parser, start Secor with properties file secor.prod.partition.properties and set secor.message.parser.class=com.pinterest.secor.parser.JsonMessageParser. You may override the field used to extract the timestamp by setting the message.timestamp.name property.
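So my understanding is that the switch boils down to something like the properties below, plus messages that carry a parseable timestamp. The created_at field name and the example message are purely illustrative; per the README, the timestamp field is only overridden via message.timestamp.name if it differs from the default:

# Use the JSON date parser so output is grouped by date instead of offset
secor.message.parser.class=com.pinterest.secor.parser.JsonMessageParser
# Hypothetical field name -- only needed if the timestamp isn't in the default field
message.timestamp.name=created_at

A message such as {"created_at": 1388534400000, "event": "click"} should then land under dt=2014-01-01, assuming my Secor version's parser accepts epoch milliseconds. Is that the right approach, or am I missing a setting?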