I'm dealing with a table in SQL Server that I need indexed in Elasticsearch in near real time. Records are added, updated, and deleted in this table from various places (legacy code, stored procedures, etc.), so instrumenting the code to find every place that touches this table is not feasible. What technique and/or tool would allow for this?
Note: The table in SQL Server contains roughly 10 columns and could hold up to a million rows.
If you want to do it with minimal code, you can set it up with the Logstash jdbc plugin (now that the river concept is no longer supported in Elasticsearch). Have a last-updated timestamp column in the table. You can configure your polling query to be based on it so that it picks up changes since the last run. Make sure an index is defined on this timestamp column. Since this will be a polling process, don't go overboard and set a very low polling interval, as that might put unnecessary load on the SQL Server database.
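To make the polling idea concrete, here is a rough Python sketch of the "changes since last run" query. It is not the Logstash configuration itself (the jdbc input does the equivalent bookkeeping for you through its `:sql_last_value` placeholder), and it uses an in-memory SQLite database as a stand-in for SQL Server. The table name `products`, its columns, and the helper `fetch_changes` are all made up for illustration.

```python
import sqlite3


def fetch_changes(conn, last_seen):
    """Return rows modified after last_seen, plus the new high-water mark."""
    rows = conn.execute(
        "SELECT id, name, last_updated FROM products "
        "WHERE last_updated > ? ORDER BY last_updated",
        (last_seen,),
    ).fetchall()
    # Advance the mark only if something changed; otherwise keep the old one.
    new_mark = rows[-1][2] if rows else last_seen
    return rows, new_mark


# In-memory SQLite stands in for SQL Server (assumption for this demo only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, last_updated TEXT)"
)
# The index on the timestamp column keeps each poll cheap.
conn.execute("CREATE INDEX ix_products_last_updated ON products (last_updated)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        (1, "widget", "2024-01-01T10:00:00"),
        (2, "gadget", "2024-01-01T10:05:00"),
    ],
)

# First poll: everything is newer than the initial mark.
first_batch, mark = fetch_changes(conn, "1970-01-01T00:00:00")

# A row is updated between polls; only that row should come back next time.
conn.execute(
    "UPDATE products SET name = ?, last_updated = ? WHERE id = ?",
    ("widget v2", "2024-01-01T10:10:00", 1),
)
second_batch, mark2 = fetch_changes(conn, mark)
```

Each batch would then be sent to Elasticsearch, and the mark persisted between runs. Keep in mind that a query like this only sees rows that still exist, so hard deletes would need something extra, such as a soft-delete flag column.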
If you are ready to do it the right way and to spend time coding, you should probably consider the SQL Server Service Broker option. You can have a trigger on the table write a message to a queue, and an external polling or triggered process on the queue pick up the message (the change) and push it to Elasticsearch.
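As a rough sketch of the consumer side of that setup, the Python below turns change messages into the newline-delimited body that the Elasticsearch bulk API expects. The message shape (`op`, `id`, `row`) and the function `to_bulk_body` are hypothetical; what the trigger actually writes is up to you. Reading from the Service Broker queue (for example with T-SQL `RECEIVE`) and the HTTP call to `_bulk` are left out.

```python
import json


def to_bulk_body(messages, index_name):
    """Convert change messages into an Elasticsearch bulk request body."""
    lines = []
    for msg in messages:
        doc_id = str(msg["id"])
        if msg["op"] == "delete":
            # Deletes are a single action line with no document source.
            lines.append(
                json.dumps({"delete": {"_index": index_name, "_id": doc_id}})
            )
        else:
            # Inserts and updates both become "index", which overwrites by _id.
            lines.append(
                json.dumps({"index": {"_index": index_name, "_id": doc_id}})
            )
            lines.append(json.dumps(msg["row"]))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n" if lines else ""


# Hypothetical messages, shaped as the table trigger might emit them.
messages = [
    {"op": "upsert", "id": 1, "row": {"name": "widget"}},
    {"op": "delete", "id": 2},
]
body = to_bulk_body(messages, "products")
```

Because the trigger fires on deletes as well, this route can carry removals through to the index, which a timestamp-based polling query cannot see on its own.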