To be honest, I'm struggling with the regular expression I need to extract parts of a character sequence. The sequence contains key/value pairs, each enclosed in / characters. So a pair could be /KEY/VALUE/, but also /KEY/VAL/UE/. The pairs sit next to each other in the sequence.
Let's look at the example sequence:
/ABCD/value1//ECFG/value2//HIJK/value3a/value3b/
What I'd like to be able to do is to get the list of the key value pairs like this:
ABCD -> value1
ECFG -> value2
HIJK -> value3a/value3b
This should work:
/(.+?)/(.+?)/(?=/|$)
The first paren will capture the key, the second the value.
The lookahead matches either a 2nd /, indicating a new key/value pair, or the string end for the last key/value pair.
Edit: Here is some Python code:
import re
s = "/ABCD/value1//ECFG/value2//HIJK/value3a/value3b/"
re.findall(r'/(.+?)/(.+?)/(?=/|$)', s)
# [('ABCD', 'value1'), ('ECFG', 'value2'), ('HIJK', 'value3a/value3b')]
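As a small extension of the snippet above, here is a sketch that compiles the same pattern once and collects the pairs into a dict. The names PAIR_RE and parse_pairs are made up for illustration, and the sketch assumes keys are unique (a repeated key would overwrite the earlier value).

```python
import re

# Same pattern as above: lazy key, lazy value, and a lookahead that only
# accepts a closing "/" when it is followed by another "/" or the string end.
PAIR_RE = re.compile(r"/(.+?)/(.+?)/(?=/|$)")


def parse_pairs(sequence):
    """Return a dict mapping each key to its value (hypothetical helper)."""
    return {key: value for key, value in PAIR_RE.findall(sequence)}


pairs = parse_pairs("/ABCD/value1//ECFG/value2//HIJK/value3a/value3b/")
for key, value in pairs.items():
    print(f"{key} -> {value}")
```

Because the lookahead rejects a closing / that is followed by ordinary text, a value such as value3a/value3b stays in one piece.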
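Try this: /(.*?)/(.*?)/
Here's how you would use it from the command line. sed supports neither non-greedy quantifiers nor $1-style backreferences, so this uses perl instead:
perl -pe 's,/(.*?)/(.*?)/,$1 --> $2\n,g' inputfile.txt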
The key is the non-greedy match .*? (instead of the greedy .*). Note that each value ends at the first /, so this handles /KEY/VALUE/ but not /KEY/VAL/UE/.
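To make the greedy versus non-greedy difference concrete, here is a minimal Python sketch that runs both variants on a shortened sample string. The variable names and the sample are chosen only for illustration.

```python
import re

s = "/ABCD/value1//ECFG/value2/"

# Greedy: the first .* grabs as much as it can, so a single match
# swallows everything up to the last key/value boundary.
greedy = re.findall(r"/(.*)/(.*)/", s)

# Non-greedy: each .*? stops at the first "/" that lets the match succeed,
# so every pair is matched separately.
lazy = re.findall(r"/(.*?)/(.*?)/", s)

print(greedy)  # [('ABCD/value1//ECFG', 'value2')]
print(lazy)    # [('ABCD', 'value1'), ('ECFG', 'value2')]
```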