I am trying to build something using the Bloodhound search engine, and I noticed that it has two tokenizers, datumTokenizer and queryTokenizer.
The initializer code example given in the documentation looks like this:
var engine = new Bloodhound({
  local: ['dog', 'pig', 'moose'],
  queryTokenizer: Bloodhound.tokenizers.whitespace,
  datumTokenizer: Bloodhound.tokenizers.whitespace
});
What do these two Tokenizers do?
EDIT
Bloodhound documentation defines these two as follows:
datumTokenizer – A function with the signature (datum) that transforms a datum into an array of string tokens. Required.
queryTokenizer – A function with the signature (query) that transforms a query into an array of string tokens. Required.
It still doesn't explain what the difference between a datum and a query is.
A datum is an element of the index that is searched through, and the query is what is being searched for. If either contains more than one token (a word, when the whitespace tokenizer is used), the engine needs a function that splits the string into those tokens. See more info on why tokenization is needed.
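To make the distinction concrete, here is a minimal sketch in plain JavaScript. It does not use Bloodhound itself: `whitespaceTokenizer` and `search` are hypothetical helpers written for illustration, and the prefix-matching rule is a simplified assumption about how such an engine compares tokens, not its actual implementation. The datums are objects, so the datum tokenizer has to pick a field to tokenize, while the query tokenizer receives the raw string the user typed.

```javascript
// Hypothetical stand-in for a whitespace tokenizer: split a string on runs of whitespace.
function whitespaceTokenizer(str) {
  const s = String(str).trim();
  return s ? s.split(/\s+/) : [];
}

// Hypothetical helper (not part of Bloodhound): a datum matches when every
// query token is a prefix of at least one of the datum's tokens.
function search(data, query, datumTokenizer, queryTokenizer) {
  const queryTokens = queryTokenizer(query).map((t) => t.toLowerCase());
  if (queryTokens.length === 0) return [];
  return data.filter((datum) => {
    const datumTokens = datumTokenizer(datum).map((t) => t.toLowerCase());
    return queryTokens.every((q) => datumTokens.some((d) => d.startsWith(q)));
  });
}

// The datums: the things being searched through.
const data = [{ name: 'big dog' }, { name: 'small pig' }, { name: 'moose' }];

// The datum tokenizer turns each datum (an object here) into tokens.
const datumTokenizer = (datum) => whitespaceTokenizer(datum.name);
// The query tokenizer turns what the user typed (a string) into tokens.
const queryTokenizer = whitespaceTokenizer;

const results = search(data, 'do bi', datumTokenizer, queryTokenizer);
console.log(results.map((d) => d.name)); // [ 'big dog' ]
```

In the example from the question both tokenizers are the same function because the datums are plain strings. They are separate options because a datum can have a different shape than a query, as in this sketch.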