I have a string like this:
ticket:1 priority:5 delay:'2019-08-31 02:53:27.720422' delay:'2019-08-30 00:04:10.681242'
I successfully extracted ticket and priority, but failed on delay.
What I want is to extract the delays as an array, so the output looks like this:
#delays =>
[
"delay:'2019-08-31 02:53:27.720422'",
"delay:'2019-08-30 00:04:10.681242'"
]
What I've tried so far?
str = "ticket:1 priority:5 delay:'2019-08-31 02:53:27.720422' delay:'2019-08-30 00:04:10.681242'"
delays = str.scan(/delay:\w+(?:'\w+)*/).flatten
How can I extract them in my case? Note that there is no guarantee that the date format will match the examples; it can be anything. So we should focus on the strings between single quotes.
If possible, the result could look like this (so that I don't have to extract the dates again):
#delays =>
[
"2019-08-31 02:53:27.720422",
"2019-08-30 00:04:10.681242"
]
This expression might be close to what you have in mind:
\bdelay\s*:\s*['][^']*[']
If the delay values could also be wrapped in other characters, such as ", those would go in the character class:
\bdelay\s*:\s*['"][^'"]*['"]
or:
\bdelay\s*:\s*'(\d{4}-\d{1,2}-\d{1,2})\s*([^']*)'
or:
\bdelay\s*:\s*'(\d{4}-\d{1,2}-\d{1,2}\s*[^']*)'
or more simplified:
\bdelay\s*:\s*'([^']*)'
re = /\bdelay\s*:\s*'([^']*)'/
str = 'ticket:1 priority:5 delay:\'2019-08-31 02:53:27.720422\' delay:\'2019-08-30 00:04:10.681242\''
str.scan(re) do |match|
puts match.to_s
end
["2019-08-31 02:53:27.720422"]
["2019-08-30 00:04:10.681242"]
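Each match is printed as a one-element array because scan returns one array per match, holding that match's capture groups. If a flat array of strings is preferred, as asked in the question, the result can be flattened. A minimal sketch; the method name extract_delays is only illustrative:

```ruby
# Hypothetical helper: returns the text between the single quotes
# of every delay:'...' entry as a flat array of strings.
def extract_delays(str)
  # scan yields one array per match (one element per capture group),
  # so flatten turns [["a"], ["b"]] into ["a", "b"].
  str.scan(/\bdelay\s*:\s*'([^']*)'/).flatten
end

str = "ticket:1 priority:5 delay:'2019-08-31 02:53:27.720422' delay:'2019-08-30 00:04:10.681242'"
extract_delays(str)
#=> ["2019-08-31 02:53:27.720422", "2019-08-30 00:04:10.681242"]
```

A string without any delay entries simply gives an empty array.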
If you wish to explore, simplify, or modify the expression, it is explained in the top-right panel of regex101.com, where you can also see how it matches against some sample inputs.
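For the earlier variant with two capture groups, scan returns one pair per match instead of a single string. A small sketch, assuming the values really start with a YYYY-MM-DD style date (which the question says is not guaranteed); the method name delay_parts is made up for illustration:

```ruby
# Hypothetical helper: splits each delay value into [date, rest].
# Assumption: every value begins with a date such as 2019-08-31.
def delay_parts(str)
  str.scan(/\bdelay\s*:\s*'(\d{4}-\d{1,2}-\d{1,2})\s*([^']*)'/)
end

str = "ticket:1 priority:5 delay:'2019-08-31 02:53:27.720422' delay:'2019-08-30 00:04:10.681242'"
delay_parts(str)
#=> [["2019-08-31", "02:53:27.720422"], ["2019-08-30", "00:04:10.681242"]]
```

Values that do not start with such a date are skipped by this variant, so the simplified expression is the safer choice when the format is unknown.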
This is a suggestion for how you might extract all values of interest, not just the values for "delay". It permits any number of instances of "delay:'...'" in the string.
str = "ticket:1 priority:5 delay:'2019-08-31 02:53:27.720422' delay:'2019-08-30 00:04:10.681242'"
str.delete("'").
split(/ +(?=ticket|priority|delay)/).
each_with_object({}) do |s,h|
key, value = s.split(':', 2)
case key
when 'delay'
(h[key] ||= []) << value
else
h[key] = value
end
end
#=> {"ticket"=>"1", "priority"=>"5",
# "delay"=>["2019-08-31 02:53:27.720422", "2019-08-30 00:04:10.681242"]}
The regular expression passed to String#split reads, "match one or more spaces followed immediately by the string 'ticket', 'priority' or 'delay'"; the expression (?=ticket|priority|delay) is a positive lookahead.
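To see why the lookahead is needed, it may help to compare it with a plain split on spaces, which also cuts every delay value in two. A short sketch; the method name split_fields is only illustrative:

```ruby
# Hypothetical helper: split the quote-free string into key:value fields.
def split_fields(str)
  # The lookahead (?=...) is zero-width, so only the spaces are consumed
  # and each field keeps its leading key.
  str.split(/ +(?=ticket|priority|delay)/)
end

a = "ticket:1 priority:5 delay:2019-08-31 02:53:27.720422 delay:2019-08-30 00:04:10.681242"

# Splitting on every space separates the date from the time.
a.split(' ')
#=> ["ticket:1", "priority:5", "delay:2019-08-31", "02:53:27.720422",
#    "delay:2019-08-30", "00:04:10.681242"]

# Splitting only before a known key keeps each value intact.
split_fields(a)
#=> ["ticket:1", "priority:5", "delay:2019-08-31 02:53:27.720422",
#    "delay:2019-08-30 00:04:10.681242"]
```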
The steps are as follows.
a = str.delete("'")
#=> "ticket:1 priority:5 delay:2019-08-31 02:53:27.720422 delay:2019-08-30 00:04:10.681242"
b = a.split(/ +(?=ticket|priority|delay)/)
#=> ["ticket:1", "priority:5", "delay:2019-08-31 02:53:27.720422",
# "delay:2019-08-30 00:04:10.681242"]
c = b.each_with_object({}) do |s,h|
key, value = s.split(':', 2)
case key
when 'delay'
(h[key] ||= []) << value
else
h[key] = value
end
end
#=> {"ticket"=>"1", "priority"=>"5",
# "delay"=>["2019-08-31 02:53:27.720422", "2019-08-30 00:04:10.681242"]}
Let's examine more closely the calculation of c.
enum = b.each_with_object({})
#=> #<Enumerator: ["ticket:1", "priority:5", "delay:2019-08-31 02:53:27.720422",
# "delay:2019-08-30 00:04:10.681242"]:each_with_object({})>
The first value is generated by this enumerator and passed to the block, and the two block variables are assigned these values using array decomposition.
s, h = enum.next
#=> ["ticket:1", {}]
s #=> "ticket:1"
h #=> {}
The block calculation is then performed.
key, value = s.split(':', 2)
#=> ["ticket", "1"]
key
#=> "ticket"
value
#=> "1"
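The limit argument 2 makes no difference for "ticket:1", but it matters for the delay entries, whose values contain colons themselves. A quick comparison:

```ruby
s = "delay:2019-08-31 02:53:27.720422"

# Without a limit, every colon is a split point, so the time is broken up.
s.split(':')
#=> ["delay", "2019-08-31 02", "53", "27.720422"]

# With a limit of 2, only the first colon separates key from value.
s.split(':', 2)
#=> ["delay", "2019-08-31 02:53:27.720422"]
```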
The else branch of the case statement applies, so
h[key] = value
#=> h["ticket"] = "1"
h #=> {"ticket"=>"1"}
The next element is generated by enum, the block variables are assigned values, and the block calculation is performed.
s, h = enum.next
#=> ["priority:5", {"ticket"=>"1"}]
key, value = s.split(':', 2)
#=> ["priority", "5"]
The else branch again applies, so we execute
h[key] = value
#=> h["priority"] = "5"
h #=> {"ticket"=>"1", "priority"=>"5"}
Next,
s, h = enum.next
#=> ["delay:2019-08-31 02:53:27.720422", {"ticket"=>"1", "priority"=>"5"}]
key, value = s.split(':', 2)
#=> ["delay", "2019-08-31 02:53:27.720422"]
The when 'delay' branch now applies, so we compute
(h[key] ||= []) << value
#=> h[key] = (h[key] || []) << value
#=> h["delay"] = (h["delay"] || []) << "2019-08-31 02:53:27.720422"
#=> h["delay"] = (nil || []) << "2019-08-31 02:53:27.720422"
#=> h["delay"] = [] << "2019-08-31 02:53:27.720422"
#=> h["delay"] = ["2019-08-31 02:53:27.720422"]
h #=> {"ticket"=>"1", "priority"=>"5", "delay"=>["2019-08-31 02:53:27.720422"]}
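The ||= idiom in this step can be tried in isolation. The sketch below uses a made-up helper, append_delay, and also shows a common alternative, a hash with a default block, which is not what the code above uses:

```ruby
# Hypothetical helper: append a value to the array stored under 'delay',
# creating the array on first use.
def append_delay(h, value)
  (h['delay'] ||= []) << value
  h
end

h = {}
append_delay(h, 'first')   # key missing: [] is assigned, then 'first' appended
append_delay(h, 'second')  # key present: the existing array is reused
h #=> {"delay"=>["first", "second"]}

# Alternative (an assumption, not part of the answer above): let the hash
# create the array itself whenever a missing key is read.
g = Hash.new { |hash, key| hash[key] = [] }
g['delay'] << 'first'
g['delay'] << 'second'
g #=> {"delay"=>["first", "second"]}
```

The default-block form removes the need for ||=, at the cost of creating an empty array for any key that is merely read.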
Lastly,
s, h = enum.next
#=> ["delay:2019-08-30 00:04:10.681242",
# {"ticket"=>"1", "priority"=>"5", "delay"=>["2019-08-31 02:53:27.720422"]}]
key, value = s.split(':', 2)
#=> ["delay", "2019-08-30 00:04:10.681242"]
(h[key] ||= []) << value
#=> ["2019-08-31 02:53:27.720422", "2019-08-30 00:04:10.681242"]
h #=> {"ticket"=>"1", "priority"=>"5",
# "delay"=>["2019-08-31 02:53:27.720422", "2019-08-30 00:04:10.681242"]}
In this last step, unlike the previous one, h[key] already holds an array, so ||= leaves it unchanged:
h[key] ||= []
#=> ["2019-08-31 02:53:27.720422"] ||= []
#=> ["2019-08-31 02:53:27.720422"]
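One caveat with this approach is that delete("'") would also strip apostrophes that are part of a value, and the lookahead needs the key names to be known in advance. As a possible variation (a different technique from the one walked through above, not a replacement for it), the key:value pairs can be matched directly with scan, treating a value as either a single-quoted string or a run of non-space characters. The method name parse_fields is made up for illustration:

```ruby
# Hypothetical helper: builds a hash from key:value pairs, where a value is
# either '...' (quotes dropped, spaces allowed) or a bare token without spaces.
# Assumption: keys consist of word characters only.
def parse_fields(str)
  result = {}
  str.scan(/(\w+):(?:'([^']*)'|(\S+))/) do |key, quoted, bare|
    value = quoted || bare   # exactly one of the two groups is non-nil
    if key == 'delay'
      (result[key] ||= []) << value   # delay may repeat, so collect in an array
    else
      result[key] = value
    end
  end
  result
end

str = "ticket:1 priority:5 delay:'2019-08-31 02:53:27.720422' delay:'2019-08-30 00:04:10.681242'"
parse_fields(str)
#=> {"ticket"=>"1", "priority"=>"5",
#    "delay"=>["2019-08-31 02:53:27.720422", "2019-08-30 00:04:10.681242"]}
```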