I'm not a Python developer, but I'm using a Python script to convert SQLite to MySQL
The suggested script gets close, but no cigar, as they say.
The line giving me a problem is:
line = re.sub(r"([^'])'t'(.)", r"\1THIS_IS_TRUE\2", line)
...along with the equivalent line for false ('f'), of course.
The problem I'm seeing is that only the first occurrence of 't' in any given line is replaced.
So, input to the script,
INSERT INTO "cars" VALUES(56,'Bugatti Veyron','BUG 1',32,'t','t','2011-12-14 18:39:16.556916','2011-12-15 11:25:03.675058','81');
...gives...
INSERT INTO "cars" VALUES(56,'Bugatti Veyron','BUG 1',32,THIS_IS_TRUE,'t','2011-12-14 18:39:16.556916','2011-12-15 11:25:03.675058','81');
I mentioned I'm not a Python developer, but I have tried to fix this myself. According to the documentation, I understand that re.sub should replace all occurrences of 't'.
I'd appreciate a hint as to why I'm only seeing the first occurrence replaced, thanks.
By default, the count is set to zero, which means the re. sub() method will replace all pattern occurrences in the target string.
Put a capture group around the part that you want to preserve, and then include a reference to that capture group within your replacement text. @Amber: I infer from your answer that unlike str. replace(), we can't use variables a) in raw strings; or b) as an argument to re. sub; or c) both.
If you want to replace a string that matches a regular expression (regex) instead of perfect match, use the sub() of the re module. In re. sub() , specify a regex pattern in the first argument, a new string in the second, and a string to be processed in the third.
re. sub() function is used to replace occurrences of a particular sub-string with another sub-string. This function takes as input the following: The sub-string to replace.
The two substitutions you'd want in your example overlap - the comma between your two instances of 't'
will be matched by (.)
in the first case, so ([^'])
in the second case never gets a chance to match it. This slightly modified version might help:
line = re.sub(r"(?<!')'t'(?=.)", r"THIS_IS_TRUE", line)
This version uses lookahead and lookbehind syntax, described here.
How about
line = line.replace("'t'", "THIS_IS_TRUE").replace("'f'", "THIS_IS_FALSE")
without using re
. This replaces all occurrences of 't'
and 'f'
. Just make sure that no car is named t
.
The first match you see is ,'t',
. Python proceeds starting with the next character, which is '
(before the second t
), subsequently, it cannot match the ([^'])
part and skips the second 't'
.
In other words, subsequent matches to be replaced cannot overlap.
using re.sub(r"\bt\b","THIS_IS_TRUE",line)
:
In [21]: strs="""INSERT INTO "cars" VALUES(56,'Bugatti Veyron','BUG 1',32,'t','t','2011-12-14 18:39:16.556916','2011-12-15 11:25:03.675058','81');"""
In [22]: print re.sub(r"\bt\b","THIS_IS_TRUE",strs)
INSERT INTO "cars" VALUES(56,'Bugatti Veyron','BUG 1',32,'THIS_IS_TRUE','THIS_IS_TRUE','2011-12-14 18:39:16.556916','2011-12-15 11:25:03.675058','81');
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With