I have the following search term:
"login:17639 email:[email protected] ref:co-10000 common_name:testingdomain organization:'Internet Company'"
This term is derived from a params variable, where everything to the left of a : is a filter name and everything to the right of it is that filter's value. What I'm trying to do is split the term into keys and values and create a hash from them. This is the end goal:
search_filters = {
login:17639,
email:'[email protected]',
etc, etc,
}
I'm playing around with split, gsub and tr to get these values, but I'm having a problem with the organization field. Here is what I have so far:
term.gsub(/'/,'').tr(':', ' ').split(" ")
term.gsub(":")
And basically many other variations like the above. The problem is the organization field: every iteration results in something like ["organization", "Internet", "Company"], so "Internet Company" is being split. I can't add a simple if/else statement just for this filter to glue the pieces back together, because there are more filters to process. Is there an easier way to divide the filter term on the colon? Thank you.
Here's an example of how to start:
def splart(input)
  # Keys are runs of non-space, non-colon characters; values may be quoted.
  input.scan(/([^\s:]+):('[^']*'|"[^"]*"|\S+)/).to_h
end
That will tease out the data you need. Quoted values keep their surrounding quotes, so you may have to clean it up afterwards.
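As a sketch of that cleanup step, assuming keys consist only of word characters and values contain no embedded quotes, the pairs returned by scan can be mapped to symbol keys and unquoted values. The method name parse_filters and the sample email address are hypothetical, chosen only for illustration:

```ruby
# Hypothetical helper: scan for key:value pairs, then tidy each pair.
def parse_filters(input)
  pairs = input.scan(/(\w+):('[^']*'|"[^"]*"|\S+)/)
  # Symbolize the key and strip one leading and one trailing quote, if present.
  pairs.map { |key, value| [key.to_sym, value.gsub(/\A['"]|['"]\z/, "")] }.to_h
end

# Sample input; the email address is a placeholder, not the asker's data.
term = "login:17639 email:user@example.com ref:co-10000 " \
       "common_name:testingdomain organization:'Internet Company'"
filters = parse_filters(term)
filters[:organization]  #=> "Internet Company"
filters[:login]         #=> "17639"
```

All values come back as strings, so numeric filters such as login would still need an explicit conversion if an integer is wanted.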
str = "login:17639 email:[email protected] ref:co-10000 " +
"common_name:testingdomain organization:'ABC Internet Company'"
Hash[*str.split(/:| +(?![^'":]+['"])/)].transform_keys(&:to_sym)
#=> {:login=>"17639", :email=>"[email protected]",
# :ref=>"co-10000", :common_name=>"testingdomain",
# :organization=>"'ABC Internet Company'"}
See Hash::[] and Hash#transform_keys.
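Note that the organization value above still carries its single quotes. A minimal sketch of one way to drop them, assuming quotes only ever appear at the very start and end of a value, is to chain Hash#transform_values (the shortened sample string is illustrative only):

```ruby
# Shortened sample input for illustration.
str = "login:17639 ref:co-10000 organization:'ABC Internet Company'"

filters = Hash[*str.split(/:| +(?![^'":]+['"])/)]
  .transform_keys(&:to_sym)
  # Remove one leading and one trailing quote from each value, if present.
  .transform_values { |v| v.gsub(/\A['"]|['"]\z/, "") }

filters[:organization]  #=> "ABC Internet Company"
```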
We can document the regular expression by writing it in free-spacing mode:
/
  :          # match a colon
  |          # or
  [ ]+       # match one or more spaces
  (?!        # begin negative lookahead
    [^'":]+  # match one or more chars other than ', " or :
    ['"]     # match ' or "
  )          # end negative lookahead
/x           # free-spacing regex definition mode
In free-spacing mode, whitespace in the pattern is removed before the expression is parsed. That is why spaces intended to be part of the regex must be protected. I've done that by enclosing a space in a character class ([ ]), but one could instead escape the space character or, if appropriate, use a whitespace class such as the POSIX bracket expression [[:space:]], the Unicode property \p{Space}, or \s, all of which also match tabs and newlines (and a few more characters).
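To see the effect of protecting a space, here is a small sketch using throwaway patterns (the variable names are made up for this illustration):

```ruby
# A bare space in a free-spacing (x) pattern is ignored, so this behaves like the pattern ab.
unprotected = /a b/x
unprotected.match?("ab")   #=> true
unprotected.match?("a b")  #=> false

# Ways to keep the space significant in free-spacing mode.
in_class = /a[ ]b/x   # space inside a character class
escaped  = /a\ b/x    # escaped space
any_ws   = /a\sb/x    # any whitespace, including tabs and newlines

in_class.match?("a b")   #=> true
escaped.match?("a b")    #=> true
any_ws.match?("a\tb")    #=> true
in_class.match?("a\tb")  #=> false
```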
Suppose str were shorter and contained only two key-value pairs, and we computed:
arr = str.split(/:| +(?![^'":]+['"])/)
#=> ["login", "17639", "email", "[email protected]"]
We would use Hash::[] as follows:
Hash["login", "17639", "email", "[email protected]"]
#=> {"login"=>"17639", "email"=>"[email protected]"}
which is the same as:
Hash[*arr]
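An equivalent that avoids the splat, offered here as an alternative rather than as part of the approach above, is to pair up consecutive elements with each_slice and convert the pairs with to_h. The email address in this sketch is a placeholder:

```ruby
# Flat array of alternating keys and values (sample data).
arr = ["login", "17639", "email", "user@example.com"]

# Group the elements two at a time, then build a hash from the pairs.
pairs = arr.each_slice(2).to_h

pairs == Hash[*arr]  #=> true
```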