 

Kafka connect (Single message transform) row filtering

I read about the Kafka Connect transformations introduced in Kafka 0.10.2: https://kafka.apache.org/documentation/#connect_transforms

I noticed that all of the transformations are column-based. I have a use case that needs value-based filtering. For example, consider the following dataset of a group of people:

{"firstName": "FirstName1", "lastName": "LastName1", "age": 30}
{"firstName": "FirstName2", "lastName": "LastName2", "age": 30}
{"firstName": "FirstName3", "lastName": "LastName1", "age": 60}
{"firstName": "FirstName4", "lastName": "LastName2", "age": 60}

I want my worker to filter all records whose lastName is LastName2.

Is this possible with Kafka Connect, or do I need to write a separate program for this use case?

Thanks

afsd asked Nov 24 '25 11:11


2 Answers

There's no reason you couldn't solve this with a Single Message Transform (SMT), but you'd need to write a custom one, since what you're describing isn't available in the transforms currently shipped.

This talk is useful on when to use SMTs and when not to: Kafka Summit New York 2017: Single Message Transformations Are Not the Transformations You’re Looking For (Ewen Cheslack-Postava, Confluent)


Edit April 2020: There is now a Filter SMT provided as part of Confluent Platform
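
As a rough sketch of what that could look like for the dataset in the question: the class and property names below are as I recall them from the Confluent Filter SMT documentation, and the transform alias `dropLastName2` is just a made-up label, so check everything against the docs for the version you run.

```properties
# Hypothetical alias; any name works as long as the prefixes below match it
transforms=dropLastName2
# Filter$Value evaluates the condition against the record value (Filter$Key targets the key)
transforms.dropLastName2.type=io.confluent.connect.transforms.Filter$Value
# JsonPath predicate; the exact expression shape is an assumption, verify it for your data
transforms.dropLastName2.filter.condition=$[?(@.lastName == "LastName2")]
# exclude drops matching records; include would keep only the matching ones
transforms.dropLastName2.filter.type=exclude
# What to do when the field is missing or null: fail, include, or exclude
transforms.dropLastName2.missing.or.null.behavior=fail
```

With `filter.type=exclude`, the FirstName2 and FirstName4 records would be dropped and the other two passed through. Switch to `include` if the goal is to keep only the LastName2 records.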

Robin Moffatt answered Nov 27 '25 23:11


A ready-to-use filtering SMT comes with Debezium 1.2. It lets you write the filter condition in any JSR 223-compatible scripting language, for example:

transforms=filter
transforms.filter.type=io.debezium.transforms.Filter
transforms.filter.language=jsr223.graal.js
transforms.filter.condition=value.lastName == 'LastName2'

Note that although it's part of Debezium, you can use this SMT on its own if you want.

Gunnar answered Nov 27 '25 22:11