I have a file with contents like those below. I am trying to extract the word that follows "-x" on each line, and in the end I need only the unique results. I tried the regex below, but the output contains only single and double quotes. When I use a regex that handles only double quotes, I get the expected values.
File Content
00 04 * * 2-6 testuser /get_results.sh -q -x 'igp_srm_m' -s 'yesterday' -e 'yesterday' -m '2048' -b >>'/var/log/process/srm-console.log' 2>&1
00 10 * * 2-6 testuser /get_results.sh -q -x 'igp_srm_m' -s 'yesterday' -e 'yesterday' -m '2048' -w '720' >>'/var/log/process/srm-console.log' 2>&1
00 08 * * 1-5 testuser /get_results.sh -q -x "igp_france" -s "today" -e "today" -m "90000" -b -z partA >>"/var/log/process/france-partA-console.log" 2>&1
00 12 * * 2-6 testuser /get_results.sh -q -x "igp_france" -s "yesterday" -e "yesterday" -m "90000" -w "900" -z partA >>"/var/log/process/france-partA-console.log" 2>&1
00 08 * * 1-5 testuser /get_results.sh -q -x "igp_france" -s "today" -e "today" -m "90000" -b -z partB >>"/var/log/process/france-partB-console.log" 2>&1
00 12 * * 2-6 testuser /get_results.sh -q -x "igp_france" -s "yesterday" -e "yesterday" -m "90000" -w "900" -z partB >>"/var/log/process/france-partB-console.log" 2>&1
00 12 * * 2-6 testuser JAVA_OPTS='-server -Xmx512m' /merge.sh "yesterday" "igp_france" "partA,partB" >>"/var/log/process/france-console.log" 2>&1
00 08 * * 1-5 testuser /get_results.sh -q -x "igpswitz_france" -s "today" -e "today" -m "15000" -b >>'/var/log/process/igpswitz_france-console.log' 2>&1
00 12 * * 2-6 testuser /get_results.sh -q -x "igpswitz_france" -s "yesterday" -e "yesterday" -m "15000" -Dapc.maxalerts=8000 -w "900" >>'/var/log/process/igpswitz_france-console.log' 2>&1
30 07 * * 2-6 testuser /get_results.sh -q -x "igp_franced" -s 'yesterday' -e 'yesterday' -m "105000" -b >>"/var/log/process/franced-console.log" 2>&1
15 12 * * 2-6 testuser /get_results.sh -q -x "igp_franced" -s 'yesterday' -e 'yesterday' -m "105000" -w "960" >>"/var/log/process/franced-console.log" 2>&1
Tried syntax
import re
with open("test2") as file:
    for line in file:
        try:
            m=re.search('(?<=\-x (\"|\'))(\w+)',line)
            print m.group(1)
        except:
            m = None
Expected output
igp_srm_m
igp_france
igpswitz_france
igp_franced
Received Output
'
'
"
"
"
"
"
"
"
"
I am unsure what is going wrong, because the same approach works correctly when I search for double quotes only.
Working script only for double quotes
import re
with open("test2") as file:
    for line in file:
        try:
            m = re.search('(?<=\-x \")(\w*)', line)
            print m.group(1)
        except:
            m = None
Received Output - Search for double quotes only
igp_france
igp_france
igp_france
igp_france
igpswitz_france
igpswitz_france
igp_franced
igp_franced
You can use a set to get the unique values.
In your pattern, the values are in group 2, but you can optimize the pattern a bit. The single and double quotes can be put in a character class (["']) and captured in group 1. Then you can use the backreference \1 to require that the closing quote is the same as the opening one:
-x (["'])(\w+)\1
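To illustrate what the backreference does, here is a minimal sketch that runs the same pattern over three sample strings. The first two are shortened from the file; the third, with mismatched quotes, is a hypothetical line that does not occur in the file and is only there to show that \1 rejects it. The helper name extract is also just an illustration.

```python
import re

# Group 1 is the opening quote; \1 requires the same character to close the value.
pattern = re.compile(r"-x ([\"'])(\w+)\1")

samples = [
    "-x 'igp_srm_m' -s 'yesterday'",   # single quotes on both sides
    '-x "igp_france" -s "today"',      # double quotes on both sides
    "-x 'igp_broken\" -s 'today'",     # hypothetical: opens with ' but closes with "
]

def extract(line):
    """Return the value after -x, or None if the line does not match."""
    m = pattern.search(line)
    return m.group(2) if m else None

results = [extract(s) for s in samples]
print(results)
```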
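import re

result = set()  # a set keeps each value only once
with open("test2") as file:
    for line in file:
        # group 1 is the opening quote, \1 requires the same closing quote
        m = re.search(r"-x ([\"'])(\w+)\1", line)
        if m:  # lines without -x (such as the merge.sh entry) do not match
            result.add(m.group(2))
print(result)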
Output
{'igp_france', 'igp_srm_m', 'igp_franced', 'igpswitz_france'}
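A set has no defined order, while the expected output in the question lists the values in the order they first appear. If that order matters, one possible variation (not part of the answer above) is to collect the values as dict keys, which keep insertion order in Python 3.7+. The sketch below uses shortened copies of the file's lines in a string instead of reading test2; the function name unique_x_values is hypothetical.

```python
import re

pattern = re.compile(r"-x ([\"'])(\w+)\1")

def unique_x_values(lines):
    """Return the -x values in first-seen order, without duplicates."""
    seen = {}  # dict keys are unique and keep insertion order
    for line in lines:
        m = pattern.search(line)
        if m:
            seen.setdefault(m.group(2), None)
    return list(seen)

# Shortened copies of lines from the file, including the merge.sh line without -x.
sample = """\
/get_results.sh -q -x 'igp_srm_m' -s 'yesterday'
/get_results.sh -q -x "igp_france" -s "today"
JAVA_OPTS='-server -Xmx512m' /merge.sh "yesterday" "igp_france"
/get_results.sh -q -x "igpswitz_france" -s "today"
/get_results.sh -q -x "igp_franced" -s 'yesterday'
/get_results.sh -q -x "igp_france" -s "yesterday"
"""

values = unique_x_values(sample.splitlines())
for value in values:
    print(value)
```

With the real file, the same function could be called as unique_x_values(file) inside the with open("test2") block, since a file object yields its lines when iterated.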
In your code
m=re.search('(?<=\-x (\"|\'))(\w+)',line)
print m.group(1)
use group(2) instead of group(1):
m=re.search('(?<=\-x (\"|\'))(\w+)',line)
print m.group(2)
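Trying it out on https://regex101.com/ shows that group 1 holds the quote character, because the parenthesized (\"|\') inside the lookbehind is itself a capturing group. Group 2 holds the required value.
The double-quotes version works because it has only one capturing group, so the required value is already in group 1.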
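The group numbering can be checked directly on one line from the file. The sketch below prints each group of the original pattern, and then shows an alternative that is an assumption on my side rather than something from the question: making the quote group non-capturing with (?:...), so that the value moves to group 1.

```python
import re

# One shortened line from the file.
line = "/get_results.sh -q -x 'igp_srm_m' -s 'yesterday'"

# Original pattern: the quote alternation inside the lookbehind is capturing group 1.
m = re.search(r"(?<=-x (\"|'))(\w+)", line)
print(repr(m.group(0)))  # whole match; the lookbehind itself consumes nothing
print(repr(m.group(1)))  # the quote character
print(repr(m.group(2)))  # the value

# Variant with a non-capturing group: the value is now group 1.
m2 = re.search(r"(?<=-x (?:\"|'))(\w+)", line)
print(repr(m2.group(1)))
```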