Fulltext Search DynamoDB

Following situation:

I'm storing elements in DynamoDB for my customers. The hash key is an element ID and the range key is the customer ID. In addition to these fields, I'm storing an array of strings as tags (e.g. ["Pets", "House"]) and a multiline text.

I want to provide a search function in my application, where the user can type a free text or select tags and get all related elements.

In my opinion a plain DB query is not the right solution. I was playing around with CloudSearch, but I'm not really sure it is the right fit either, because every time the user adds a tag the index must be updated...

I hope you have some hints for me.

SnowMax asked May 31 '17 17:05

1 Answer

You can use an instant-search engine like Typesense to search through data in your DynamoDB table:

https://github.com/typesense/typesense

There's also Elasticsearch, but it has a steep learning curve and can become a beast to manage, given the number of features and configuration options it supports.

At a high level:

  1. Turn on DynamoDB Streams for the table
  2. Set up an AWS Lambda trigger to listen to these change events
  3. Write code inside your Lambda function to index data into Typesense:
import typesense

def lambda_handler(event, context):
    # Values in angle brackets are placeholders; replace them with your own.
    client = typesense.Client({
        'nodes': [{
            'host': '<Endpoint URL>',
            'port': '<Port Number>',
            'protocol': 'https',
        }],
        'api_key': '<API Key>',
        'connection_timeout_seconds': 2
    })
    documents = client.collections['<collection-name>'].documents

    processed = 0
    for record in event['Records']:
        ddb_record = record['dynamodb']
        if record['eventName'] == 'REMOVE':
            # OldImage is only present if the stream view type includes old images.
            res = documents[str(ddb_record['OldImage']['id']['N'])].delete()
        else:
            # NewImage arrives in DynamoDB's typed JSON format; convert it to a
            # plain document (with a string 'id') here before upserting it.
            document = ddb_record['NewImage']
            res = documents.upsert(document)
        print(res)
        processed = processed + 1
    print('Successfully processed {} records'.format(processed))
    return processed
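
The "format your document here" step is left open in the handler. As a rough sketch, assuming the item has attributes named `id`, `customer_id`, `tags` and `text` (only `id` appears in the code above; the other names are hypothetical, chosen to match the question), the typed stream image could be flattened like this. The helper names `from_dynamodb` and `to_typesense_document` are made up for illustration, and only the common attribute types are handled:

```python
from decimal import Decimal


def from_dynamodb(value):
    """Convert one DynamoDB-typed value, e.g. {'S': 'abc'}, to a plain Python value."""
    (type_tag, raw), = value.items()
    if type_tag == 'S':
        return raw
    if type_tag == 'N':
        # Numbers arrive as strings; keep whole numbers as int, others as float.
        number = Decimal(raw)
        return int(number) if number == number.to_integral_value() else float(number)
    if type_tag == 'BOOL':
        return raw
    if type_tag == 'NULL':
        return None
    if type_tag == 'L':
        return [from_dynamodb(item) for item in raw]
    if type_tag == 'M':
        return {key: from_dynamodb(item) for key, item in raw.items()}
    if type_tag == 'SS':
        return sorted(raw)
    if type_tag == 'NS':
        return sorted(from_dynamodb({'N': n}) for n in raw)
    raise ValueError('Unsupported DynamoDB type: {}'.format(type_tag))


def to_typesense_document(new_image):
    """Flatten a stream record's NewImage into a plain dict for indexing."""
    document = {key: from_dynamodb(value) for key, value in new_image.items()}
    # Typesense expects the document id to be a string.
    document['id'] = str(document['id'])
    return document
```

Inside the handler you would then pass `to_typesense_document(ddb_record['NewImage'])` to the upsert call instead of the raw image. If boto3 is available in your Lambda runtime, its `TypeDeserializer` can do the type conversion for you.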

Here's a detailed article from Typesense's docs on how to do this: https://typesense.org/docs/0.19.0/guide/dynamodb-full-text-search.html
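
Once the data is indexed, the search from the question (free text and/or tags, limited to one customer) might be expressed roughly as below. This is only a sketch: the field names `customer_id`, `tags` and `text` are assumptions about how you define your collection, and `build_search_params` is a hypothetical helper, not part of the Typesense client. It uses the standard `q`, `query_by` and `filter_by` search parameters:

```python
def build_search_params(customer_id, text='', tags=()):
    """Build Typesense search parameters for one customer's elements."""
    # Always restrict results to the current customer.
    filters = ['customer_id:={}'.format(customer_id)]
    if tags:
        # A bracketed list matches documents that carry any of the given tags.
        filters.append('tags:=[{}]'.format(','.join(tags)))
    return {
        'q': text or '*',  # '*' matches every document when no text is given
        'query_by': 'text',
        'filter_by': ' && '.join(filters),
    }


# Usage against a running Typesense server (not executed here):
# results = client.collections['<collection-name>'].documents.search(
#     build_search_params('<customer-id>', 'garden', ['Pets', 'House']))
```

If an element should only match when it has all of the selected tags, one filter clause per tag joined with `&&` (for example `tags:=Pets && tags:=House`) would be the variant to try. Tag values containing commas or other special characters would need extra escaping that this sketch does not handle.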

ErJab answered Sep 21 '22 12:09