I'm writing a text-to-cdr (ChordPro) converter and I'm having trouble detecting chord lines of the form:
Cmaj7 F#m C7
Xxx xxxxxx xxx xxxxx xx x xxxxxxxxxxx xxx
This is my python code:
def getChordMatches(line):
    import re
    notes = "[CDEFGAB]";
    accidentals = "(#|##|b|bb)?";
    chords = "(maj|min|m|sus|aug|dim)?";
    additions = "[0-9]?"
    return re.findall(notes + accidentals + chords + additions, line)
I want it to return a list ["Cmaj7", "F#m", "C7"]. The above code doesn't work. I've struggled with the documentation, but I'm not getting anywhere.
Why doesn't it work to just chain the classes and groups together?
edit
Thanks, I ended up with the following, which covers most of my needs (it won't match E#m11, for instance).
def getChordMatches(line):
    import re
    notes = "[ABCDEFG]";
    accidentals = "(?:#|##|b|bb)?";
    chords = "(?:maj|min|m|sus|aug|dim)?"
    additions = "[0-9]?"
    chordFormPattern = notes + accidentals + chords + additions
    # Raw string so that \s reaches the regex engine as written
    fullPattern = chordFormPattern + r"(?:/%s)?\s" % (notes + accidentals)
    matches = [x.replace(' ', '').replace('\n', '') for x in re.findall(fullPattern, line)]
    positions = [x.start() for x in re.finditer(fullPattern, line)]
    return matches, positions
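For the E#m11 case mentioned above, one possible variation is sketched here. It is not the code from the edit: it assumes additions of up to two digits are enough, tries the longer accidentals first, and replaces the trailing \s with a lookahead so a chord at the very end of a line is also found. The names CHORD, BASS, TOKEN and get_chord_matches are made up for this illustration.

```python
import re

# Assumption: additions are at most two digits (covers 7, 9, 11, 13).
CHORD = r"[A-G](?:##|#|bb|b)?(?:maj|min|m|sus|aug|dim)?[0-9]{0,2}"
# Optional slash bass note, e.g. the /G in C7/G.
BASS = r"(?:/[A-G](?:##|#|bb|b)?)?"
# (?!\S) requires whitespace or end of string after the chord without consuming it.
TOKEN = re.compile(CHORD + BASS + r"(?!\S)")

def get_chord_matches(line):
    # One pass with finditer yields both the text and the position of each chord.
    found = list(TOKEN.finditer(line))
    return [m.group() for m in found], [m.start() for m in found]

matches, positions = get_chord_matches("Cmaj7 F#m E#m11 C7/G")
print(matches)
print(positions)
```

Because the lookahead does not consume the whitespace, there is nothing to strip from the matches afterwards.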
You should make your groups non-capturing by changing (...) to (?:...).
accidentals = "(?:#|##|b|bb)?";
chords = "(?:maj|min|m|sus|aug|dim)?";
See it working online: ideone
The reason why it doesn't work when you have capturing groups is that it only returns those groups and not the entire match. From the documentation:
re.findall(pattern, string, flags=0)
Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match.
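A small sketch of that difference, using the pattern from the question on the example chord line. The variable names are only for illustration.

```python
import re

line = "Cmaj7 F#m C7"

# Two capturing groups: findall returns one tuple of group contents per match,
# with an empty string for any group that did not take part in the match.
capturing = "[CDEFGAB](#|##|b|bb)?(maj|min|m|sus|aug|dim)?[0-9]?"
grouped = re.findall(capturing, line)

# Same pattern with non-capturing groups: findall returns the whole match.
non_capturing = "[CDEFGAB](?:#|##|b|bb)?(?:maj|min|m|sus|aug|dim)?[0-9]?"
whole = re.findall(non_capturing, line)

print(grouped)  # [('', 'maj'), ('#', 'm'), ('', '')]
print(whole)    # ['Cmaj7', 'F#m', 'C7']
```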
There is a specific syntax for writing a verbose regex:
import re

regex = re.compile(
    r"""[CDEFGAB]                 # Notes
    (?:\#|\#\#|b|bb)?             # Accidentals ('#' must be escaped in verbose mode)
    (?:maj|min|m|sus|aug|dim)?    # Chords
    [0-9]?                        # Additions
    """, re.VERBOSE
)
result_list = regex.findall(line)
It's arguably a bit clearer than joining the strings together.
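One caveat with verbose mode: an unescaped # outside a character class starts a comment, so the sharp sign has to be written as \# (or put in a character class). The sketch below shows this. It also lists ## before # and bb before b so the longer accidental is tried first; that ordering is my own adjustment, and CHORD_RE is just an illustrative name.

```python
import re

CHORD_RE = re.compile(
    r"""
    [A-G]                        # note
    (?:\#\#|\#|bb|b)?            # accidental, longest alternative first
    (?:maj|min|m|sus|aug|dim)?   # chord quality
    [0-9]?                       # addition
    """,
    re.VERBOSE,
)

print(CHORD_RE.findall("Cmaj7 F#m C7"))  # ['Cmaj7', 'F#m', 'C7']
print(CHORD_RE.findall("C##dim Bbb7"))   # ['C##dim', 'Bbb7']
```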