I read about the Kafka Connect transformations introduced in Kafka 0.10.2.1: https://kafka.apache.org/documentation/#connect_transforms
I noticed that all the transformations are column-based. I have a use case where I need value-based filtering. For example,
consider the following dataset of a group of people:
{"firstName": "FirstName1", "lastName": "LastName1", "age": 30}
{"firstName": "FirstName2", "lastName": "LastName2", "age": 30}
{"firstName": "FirstName3", "lastName": "LastName1", "age": 60}
{"firstName": "FirstName4", "lastName": "LastName2", "age": 60}
I want my worker to filter all those records whose lastName is LastName2.
Is this possible using Kafka Connect, or do I need to write a separate program for this use case?
Thanks
No reason why you couldn't solve this with Single Message Transforms - but you'd need to write a custom one since what you're describing is not available through the transforms currently shipped.
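As a rough illustration of what such a custom transform would have to do, here is a minimal sketch of the filtering logic in plain Java. It is not a real SMT: records are modelled as Map<String, Object> rather than Kafka Connect's ConnectRecord, and the class name ValueFilterSketch and its methods are hypothetical. It also assumes that "filter" in the question means dropping the matching records. It follows the Kafka Connect convention that a transformation's apply method returns null to drop a record.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch: a plain-Java model of value-based filtering.
// A real SMT would implement org.apache.kafka.connect.transforms.Transformation
// and inspect ConnectRecord values instead of Map<String, Object>.
public class ValueFilterSketch {

    private final String field;         // field to inspect, e.g. "lastName"
    private final String excludedValue; // value to drop on, e.g. "LastName2"

    public ValueFilterSketch(String field, String excludedValue) {
        this.field = Objects.requireNonNull(field);
        this.excludedValue = Objects.requireNonNull(excludedValue);
    }

    // Returns the record unchanged to keep it, or null to drop it.
    public Map<String, Object> apply(Map<String, Object> value) {
        if (value == null) {
            return null; // nothing to inspect
        }
        // Records that lack the field are kept, since null never equals excludedValue.
        return excludedValue.equals(value.get(field)) ? null : value;
    }

    // Applies the filter to a batch and collects the records that survive.
    public List<Map<String, Object>> applyAll(List<Map<String, Object>> records) {
        List<Map<String, Object>> kept = new ArrayList<>();
        for (Map<String, Object> record : records) {
            Map<String, Object> result = apply(record);
            if (result != null) {
                kept.add(result);
            }
        }
        return kept;
    }
}
```

With new ValueFilterSketch("lastName", "LastName2"), the sample dataset from the question would be reduced to the FirstName1 and FirstName3 records. In an actual SMT the field name and value would come from the connector configuration rather than a constructor.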
Here is a useful talk on when to use, and when not to use, SMTs: Kafka Summit New York 2017: Single Message Transformations Are Not the Transformations You’re Looking For (Ewen Cheslack-Postava, Confluent)
Edit April 2020: There is now a Filter SMT provided as part of Confluent Platform
There's a ready-to-use filtering SMT coming with Debezium 1.2. It lets you write the filter condition in any JSR 223-compatible scripting language; records for which the condition evaluates to true are passed on, and all others are dropped. For example:
transforms=filter
transforms.filter.type=io.debezium.transforms.Filter
transforms.filter.language=jsr223.graal.js
transforms.filter.condition=value.lastName == 'LastName2'
Note that although it's part of Debezium, you can use this SMT separately if you want.