Understanding Bloodhound.tokenizers.obj.whitespace

Tags: jquery, twitter, twitter-typeahead, bloodhound

I was trying to use Twitter typeahead and Bloodhound in my project, based on a working sample, but I can't understand the code below.

datumTokenizer: Bloodhound.tokenizers.obj.whitespace('songs'),
queryTokenizer: Bloodhound.tokenizers.whitespace,

The original code looks like this:

var songlist = new Bloodhound({
  datumTokenizer: Bloodhound.tokenizers.obj.whitespace('songs'),
  queryTokenizer: Bloodhound.tokenizers.whitespace,
  limit: 10,
  remote: '/api/demo/GetSongs?searchTterm=%QUERY'
});

The official documentation only says:

datumTokenizer – A function with the signature (datum) that transforms a datum into an array of string tokens. Required.

queryTokenizer – A function with the signature (query) that transforms a query into an array of string tokens. Required.

What does this mean? Could someone explain it in more detail so that I understand it better?


asked Oct 28 '15 02:10

Joe.wang

1 Answer

I found some helpful information here:

https://github.com/twitter/typeahead.js/blob/master/doc/migration/0.10.0.md#tokenization-methods-must-be-provided

The most common tokenization methods split a given string on whitespace or non-word characters. Bloodhound provides implementations for those methods out of the box:
  // returns ['one', 'two', 'twenty-five']
  Bloodhound.tokenizers.whitespace('  one two  twenty-five');

  // returns ['one', 'two', 'twenty', 'five']
  Bloodhound.tokenizers.nonword('  one two  twenty-five');
For query tokenization, you'll probably want to use one of the above methods. For datum tokenization, this is where you may want to do something a tad bit more advanced.
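
To make the split between the two tokenizers concrete, here is a minimal sketch of how I understand the matching to work: each datum is turned into tokens by the datum tokenizer, the typed text is turned into tokens by the query tokenizer, and a datum is suggested when every query token is a prefix of one of its tokens. The `whitespace` and `matches` functions below are stand-ins written for illustration, not Bloodhound's source (the real library builds a search index rather than scanning linearly), and the song titles are made up.

```javascript
// Stand-in for Bloodhound.tokenizers.whitespace (an assumption, not the library source).
function whitespace(str) {
  return String(str).trim().split(/\s+/).filter(Boolean);
}

// Hypothetical helper: a datum matches when every query token is a
// prefix of at least one datum token, ignoring case.
function matches(datum, query, datumTokenizer, queryTokenizer) {
  var datumTokens = datumTokenizer(datum).map(function (t) { return t.toLowerCase(); });
  var queryTokens = queryTokenizer(query).map(function (t) { return t.toLowerCase(); });
  return queryTokens.every(function (q) {
    return datumTokens.some(function (d) { return d.indexOf(q) === 0; });
  });
}

// Made-up data: the datums here are plain strings, so the same
// tokenizer works for both the datum side and the query side.
var datums = ['Hey Jude', 'Hello Goodbye', 'Help'];

var results = datums.filter(function (d) {
  return matches(d, 'he ju', whitespace, whitespace);
});

console.log(results); // ['Hey Jude']
```

Under this model, typing `he ju` only suggests `Hey Jude`, because `ju` is not a prefix of any token in the other two titles.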

For datums, sometimes you want tokens to be derived from more than one property. For example, if you were building a search engine for GitHub repositories, it'd probably be wise to have tokens derived from the repo's name, owner, and primary language:
  var repos = [
    { name: 'example', owner: 'John Doe', language: 'JavaScript' },
    { name: 'another example', owner: 'Joe Doe', language: 'Scala' }
  ];

  function customTokenizer(datum) {
    var nameTokens = Bloodhound.tokenizers.whitespace(datum.name);
    var ownerTokens = Bloodhound.tokenizers.whitespace(datum.owner);
    var languageTokens = Bloodhound.tokenizers.whitespace(datum.language);
    
    return nameTokens.concat(ownerTokens).concat(languageTokens);
  }
There may also be the scenario where you want datum tokenization to be performed on the backend. The best way to do that is to just add a property to your datums that contains those tokens. You can then provide a tokenizer that just returns the already existing tokens:
  var sports = [
    { value: 'football', tokens: ['football', 'pigskin'] },
    { value: 'basketball', tokens: ['basketball', 'bball'] }
  ];

  function customTokenizer(datum) { return datum.tokens; }
There are plenty of other ways you could go about tokenizing datums; it really just depends on what you are trying to accomplish.
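
Coming back to the line in the question: as far as I can tell, `Bloodhound.tokenizers.obj.whitespace('songs')` is a factory that returns a datum tokenizer which reads the `songs` property of each datum and splits it on whitespace. A rough sketch of the idea, where `objWhitespace` and the datum shape are my own assumptions rather than the library's code:

```javascript
// Stand-in for Bloodhound.tokenizers.whitespace (an assumption, not the library source).
function whitespace(str) {
  return String(str).trim().split(/\s+/).filter(Boolean);
}

// Sketch of an "obj" tokenizer factory: given a property name, return a
// datum tokenizer that reads that property and splits it on whitespace.
function objWhitespace(key) {
  return function tokenize(datum) {
    var value = datum[key];
    // A missing property yields no tokens instead of the string "undefined".
    return value == null ? [] : whitespace(value);
  };
}

var datumTokenizer = objWhitespace('songs');

// Hypothetical datum shape; the question does not show what the API returns.
var datum = { songs: 'Let It Be' };
var tokens = datumTokenizer(datum);

console.log(tokens); // ['Let', 'It', 'Be']
```

So `'songs'` has to name a string property on the objects your endpoint returns. If your datums use a different key (say `name`), pass that key instead.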

It seems unfortunate that this information isn't easier to find in the main documentation.


answered Sep 23 '22 12:09

davew
