I'm analysing some very large log files using Python regex. I need to substitute every number in the log file, except the numbers that are preceded by 'java:' (the log files are made by a java program).
This means that given we have a line saying:
This is a bogus test line with limit=300 doing 53 rounds and the error is in (Abc.java:417) and some more
The numbers 300 and 53 should be replaced, but not 417.
I filter on a line basis, and it should be noted that not all lines contain java:[number].
The closest I have gotten is ((?<!java:)[0-9]+)
Probably what's happening with ((?<!java:)[0-9]+) is that, sure, the match at this point,
java:
     ^
fails, but then at this point,
java:4
      ^
succeeds, because indeed, the five characters before it, ava:4, are not java:. So for java:417 the regex skips the 4 but still matches 17.
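A quick way to see this is to list what the original pattern matches on the sample line from the question. This is only a small sketch; the variable names are mine.

```python
import re

# Sample line taken from the question.
line = ("This is a bogus test line with limit=300 doing 53 rounds "
        "and the error is in (Abc.java:417) and some more")

# The pattern from the question, unchanged.
naive = re.compile(r"((?<!java:)[0-9]+)")

# findall returns the contents of the single capture group for each match.
naive_matches = naive.findall(line)
print(naive_matches)  # ['300', '53', '17']
```

The 4 is skipped, but the engine simply retries one character later and picks up 17.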
You'll just need to add one more negative lookbehind,
((?<!java:)(?<![0-9])[0-9]+)
           ^^^^^^^^^^
so that only "complete" numbers are considered.
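Here is a minimal sketch of the corrected pattern used with re.sub. The question doesn't say what the numbers should be replaced with, so the helper name mask_numbers and the replacement token "N" are placeholders of mine.

```python
import re

# Both lookbehinds must hold at the start of the match: the number is not
# preceded by "java:" and does not start in the middle of a longer number.
NUMBER = re.compile(r"(?<!java:)(?<![0-9])[0-9]+")

def mask_numbers(line, token="N"):
    """Replace every number in line except one directly after 'java:'.

    The name and the default token are illustrative assumptions.
    """
    return NUMBER.sub(token, line)

line = ("This is a bogus test line with limit=300 doing 53 rounds "
        "and the error is in (Abc.java:417) and some more")
print(mask_numbers(line))
```

On the sample line this replaces 300 and 53 and leaves 417 alone. Lines without java:[number] are handled the same way, with every number replaced.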