I am parsing a large CSV file in a Ruby script and need to find the closest match for a title from a set of search keys. There may be one or more search keys, and they may not match the title exactly (they only need to be close), as shown below:
search_keys = ["big", "bear"]
I have a large array of data to search through, and I only want to search on the title column:
array = [
          ["id", "title",            "code", "description"],
          ["1",  "once upon a time", "3241", "a classic story"],
          ["2",  "a big bad wolf",   "4235", "a little scary"],
          ["3",  "three big bears",  "2626", "a heart warmer"]
        ]
In this case I would want it to return the row ["3",  "three big bears",  "2626", "a heart warmer"], as this is the closest match to the given search keys.
Are there any helpers/libraries/gems I can use? Has anyone done this before?
I am concerned that this task really belongs in a search engine or at the database level; fetching all the data into the app and searching across rows and columns there is likely to be expensive. For now, though, here is a plain, simple approach :)
array = [
          ["id", "title",            "code", "description"],
          ["1",  "once upon a time", "3241", "a classic story"],
          ["2",  "a big bad wolf",   "4235", "a little scary"],
          ["3",  "three big bears",  "2626", "a heart warmer"]
        ]
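h = Hash.new(0) # record id => number of search keys found in the title
search_keys = ["big", "bear"]
array[1..-1].each do |rec| # skip the header row
  rec_id = rec[0]
  search_keys.each do |key|
    h[rec_id] += 1 if rec[1].include?(key)
  end
end
# Pick the id with the highest count (nil if no key matched any title)
closest, _count = h.max_by { |_id, count| count }
# Look the row up by its id rather than by array position, since ids
# are not guaranteed to line up with row indexes
array.find { |rec| rec[0] == closest } # => ["3", "three big bears", "2626", "a heart warmer"]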
I think you can do this yourself without any gems. This may be close to what you need: search the title of each row for the keys and assign a rank to each row that matches.
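result = []
array.drop(1).each do |ar| # skip the header row
    rank = 0
    search_keys.each do |key|
        # count how many search keys appear in the title column
        if ar[1].include?(key)
            rank += 1
        end
    end
    if rank > 0
        result << [rank, ar]
    end
end
# result holds [rank, row] pairs; the pair with the highest rank is the closest match
best = result.max_by { |rank, _row| rank }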
This code could be written more concisely, but I wanted to show you the details.
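For comparison, here is a minimal sketch of a more compact version of the same ranking idea. The helper name `closest_row` and its `title_col` parameter are made up for illustration and do not come from either answer. It assumes the first row is a header and that "closest" simply means "title contains the most search keys as substrings"; it does no real fuzzy matching.

```ruby
# Hypothetical helper (name and signature are assumptions for this sketch).
# Returns the data row whose title contains the most search keys,
# or nil when no key appears in any title.
def closest_row(rows, search_keys, title_col = 1)
  data = rows.drop(1) # assume the first row is the header

  # Score a row by the number of search keys found in its title
  score = ->(row) { search_keys.count { |key| row[title_col].include?(key) } }

  best = data.max_by(&score)
  return nil if best.nil? || score.call(best).zero?

  best
end

array = [
  ["id", "title",            "code", "description"],
  ["1",  "once upon a time", "3241", "a classic story"],
  ["2",  "a big bad wolf",   "4235", "a little scary"],
  ["3",  "three big bears",  "2626", "a heart warmer"]
]
search_keys = ["big", "bear"]

p closest_row(array, search_keys)
# => ["3", "three big bears", "2626", "a heart warmer"]
```

The match is case-sensitive as written; downcasing both the title and the keys before comparing would be one way to relax that.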