
Ruby: searching an array for keywords

I am parsing a large CSV file in a Ruby script and need to find the closest match for a title from a set of search keys. There may be one or more search keys, and they may not match the title exactly (they only need to be close), as in the example below:

search_keys = ["big", "bear"]

This is the large array of data I need to search through. I only want to search on the title column:

array = [
          ["id", "title",            "code", "description"],
          ["1",  "once upon a time", "3241", "a classic story"],
          ["2",  "a big bad wolf",   "4235", "a little scary"],
          ["3",  "three big bears",  "2626", "a heart warmer"]
        ]

In this case I would want it to return the row ["3", "three big bears", "2626", "a heart warmer"] as this is the closest match to my search keys.

I want it to return the closest match from the search keys given.

Are there any helpers, libraries, or gems I can use? Has anyone done this before?

Norto23 asked May 30 '12 07:05



2 Answers

I am worried about this approach: a task like this should be handled by a search engine at the database level or similar. There is little point in fetching the data into the app and searching across rows and columns there, since that is likely to be expensive. For now, though, here is a plain, simple approach :)

array = [
          ["id", "title",            "code", "description"],
          ["1",  "once upon a time", "3241", "a classic story"],
          ["2",  "a big bad wolf",   "4235", "a little scary"],
          ["3",  "three big bears",  "2626", "a heart warmer"]
        ]


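# Count how many search keys each title contains, keyed by row index
# (not by the id column, so the lookup below works for any id values).
h = Hash.new(0)

search_keys = ["big", "bear"]

array.each_with_index do |rec, idx|
  next if idx.zero? # skip the header row

  search_keys.each do |key|
    h[idx] += 1 if rec[1].include?(key)
  end
end

# Row index with the highest count; nil when no title matched any key.
closest, _count = h.max_by { |_idx, count| count }

closest && array[closest] # => ["3", "three big bears", "2626", "a heart warmer"]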
Amol Pujari answered Oct 27 '22 00:10



I think you can do it yourself, with no need for any gems! This may be close to what you need: search the array for the keys and assign a rank to each row whose title matches.

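result = []
array.drop(1).each do |ar| # drop(1) skips the header row
    rank = 0
    search_keys.each do |key|
        # one point for every key found in the title column
        if ar[1].include?(key)
            rank += 1
        end
    end

    # keep only rows that matched at least one key
    if rank > 0
        result << [rank, ar]
    end
end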

This code could be written more compactly, but I wanted to show you the details. Afterwards, result holds [rank, row] pairs, and the pair with the highest rank is the closest match.
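
For comparison, here is one possible tighter version. It is only a sketch: it assumes the same array and search_keys as in the question, and best_match is a hypothetical helper name, not something from a gem. It ranks each title by how many keys it contains as substrings, picks the top row with max_by, and returns nil when nothing matches.

```ruby
array = [
  ["id", "title",            "code", "description"],
  ["1",  "once upon a time", "3241", "a classic story"],
  ["2",  "a big bad wolf",   "4235", "a little scary"],
  ["3",  "three big bears",  "2626", "a heart warmer"]
]
search_keys = ["big", "bear"]

# best_match is a hypothetical helper name used only for this sketch.
# rows: array of rows whose first element is the header row.
# keys: substrings to look for in the title column (index 1).
def best_match(rows, keys)
  ranked = rows.drop(1).map do |row|
    # rank = number of keys that occur as substrings of the title
    [keys.count { |key| row[1].include?(key) }, row]
  end

  rank, row = ranked.max_by { |r, _row| r }
  # nil when there are no data rows or no title matched any key
  rank && rank > 0 ? row : nil
end

p best_match(array, search_keys)
# prints ["3", "three big bears", "2626", "a heart warmer"]
```

Because this relies on plain substring matching, "bear" matches "bears", but a misspelled key would not match at all.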

M.ElSaka answered Oct 27 '22 01:10
