The Issue At Hand
I have a CAN log file which contains a series of messages in the following format. I've identified each string in the log file by naming it 'String n:' followed by the actual content of the file.
String 1: 01 3E 55 55 55 55 55 55
String 2: 01 7E 00 00 00 00 00 00
String 3: 21 51 00 00 66 63 51 00
String 4: 22 00 00 00 00 37 41 31
String 5: 30 00 00 55 55 55 55 55
There is more content on each log line, but this regex will only be run once I've extracted just this portion of each line from the original log file contents. I've provided a sample of a raw log line below, just in case that somehow helps anyone figure this out more easily.
Sample Line: 2023-07-07 05:07:48.896 Tx 7e0 01 3E 55 55 55 55 55 55
I'd like to make a regular expression which returns only the pairs of characters that come before the point where I see all 00 for the remainder of a string, or all 55 for the remainder of a string. I'm expecting to see results as follows for the 5 input strings, but I can't seem to build the correct regular expression to produce these results.
String 1: 01 3E
String 2: 01 7E
String 3: 21 51 00 00 66 63 51
String 4: 22 00 00 00 00 37 41 31
String 5: 30
Can someone help me build this regex correctly?
What I've Tried
I've tried using positive lookahead regular expression patterns, but no matter how I try and configure my positive lookaheads, I am struggling to get the right characters back. I'm always either dropping one pair of characters (the 3E in string 1, or the 7E in string 2), or I'm not getting matches at all (string 5 gives me back nothing). I've dropped the regex I've been messing with below along with an example of what it's not returning out.
Regular Expression: ([0-9A-F]{2,} (?!55|00))+
String 1: Returns 01
String 2: Returns 01
String 3: Returns 21 00 66 63 (No idea how to fix this issue)
String 4: Returns 00 37 41 (Again, no idea how to fix this issue)
String 5: Returns null (Why doesn't it even see the 30?)
You can match any chars up to the first position where only a space followed by 55 or 00, repeated till the end of the line, remains, with
^.*?(?=(?: (?:00|55))*$)
See the regex demo.
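As a quick check, here is a minimal Python sketch that runs this pattern over the five sample strings from the question. The names TRAILING_FILL and strip_fill are made up for this illustration, and the sketch assumes each input is a single, already-extracted payload string with no trailing whitespace.

```python
import re

# Pattern from the answer: lazily consume characters from the start of the
# string until everything that remains is a run of " 00" / " 55" pairs.
# TRAILING_FILL and strip_fill are hypothetical names for this sketch.
TRAILING_FILL = re.compile(r"^.*?(?=(?: (?:00|55))*$)")


def strip_fill(payload: str) -> str:
    """Return the payload without its trailing run of 00/55 pairs."""
    match = TRAILING_FILL.match(payload)
    # The pattern can match an empty prefix, so single-line input always
    # produces a match; the fallback is purely defensive.
    return match.group(0) if match else payload


samples = [
    "01 3E 55 55 55 55 55 55",
    "01 7E 00 00 00 00 00 00",
    "21 51 00 00 66 63 51 00",
    "22 00 00 00 00 37 41 31",
    "30 00 00 55 55 55 55 55",
]

for number, payload in enumerate(samples, start=1):
    print(f"String {number}: {strip_fill(payload)}")
```

For these inputs the printed values should line up with the expected results listed in the question. Note that 00 pairs in the middle of a string are kept, because the lookahead only succeeds once nothing but 00/55 pairs remains up to the end.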
Details:
^ - start of the string (or line)
.*? - any zero or more chars other than line break chars, as few as possible
(?=(?: (?:00|55))*$) - a positive lookahead that matches a location that is immediately followed with zero or more repetitions of a space + 00 or 55 till the end of the string/line.

UPDATE
To match these texts inside larger strings, you can use
(?<!\S)[a-fA-F0-9]{2}(?: [a-fA-F0-9]{2})*?(?=(?: (?:00|55))*$)
See the regex demo.
Details:
(?<!\S) - left-hand whitespace boundary
[a-fA-F0-9]{2} - two hex chars
(?: [a-fA-F0-9]{2})*? - zero or more, but as few as possible, occurrences of a space + two hex chars
(?=(?: (?:00|55))*$) - a positive lookahead that matches a position immediately followed with zero or more repetitions of a space and then either 00 or 55, till end of string.

This works for you as you extract a single match from any given input string.
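To illustrate the second pattern on a full log line, here is a small Python sketch using the raw sample line from the question. The names PAYLOAD and extract_payload are invented for this example, and stripping trailing whitespace before matching is an assumption about how the lines are read (for example, from a file with line endings still attached).

```python
import re
from typing import Optional

# Second pattern from the answer. PAYLOAD and extract_payload are
# hypothetical names used only in this sketch.
PAYLOAD = re.compile(
    r"(?<!\S)[a-fA-F0-9]{2}(?: [a-fA-F0-9]{2})*?(?=(?: (?:00|55))*$)"
)


def extract_payload(line: str) -> Optional[str]:
    """Return the data bytes before the trailing 00/55 run, or None."""
    # Assumption: trailing whitespace (e.g. "\r\n") is not part of the data,
    # so it is removed to let the $ anchor sit right after the last pair.
    match = PAYLOAD.search(line.rstrip())
    return match.group(0) if match else None


raw = "2023-07-07 05:07:48.896 Tx 7e0 01 3E 55 55 55 55 55 55"
print(extract_payload(raw))  # 01 3E
```

The timestamp and the 7e0 identifier are skipped because no token there is a whitespace-delimited hex pair that can be followed only by more pairs and then a 00/55 run up to the end of the line. Only the first match is taken, which fits the single-match-per-line usage described above.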