re.sub not replacing all occurrences

Tags: python, regex

I'm not a Python developer, but I'm using a Python script to convert SQLite to MySQL.

The suggested script gets close, but no cigar, as they say.

The line giving me a problem is:

line = re.sub(r"([^'])'t'(.)", r"\1THIS_IS_TRUE\2", line)

...along with the equivalent line for false ('f'), of course.

The problem I'm seeing is that only the first occurrence of 't' in any given line is replaced.

So, input to the script,

INSERT INTO "cars" VALUES(56,'Bugatti Veyron','BUG 1',32,'t','t','2011-12-14 18:39:16.556916','2011-12-15 11:25:03.675058','81');

...gives...

INSERT INTO "cars" VALUES(56,'Bugatti Veyron','BUG 1',32,THIS_IS_TRUE,'t','2011-12-14 18:39:16.556916','2011-12-15 11:25:03.675058','81');

I mentioned I'm not a Python developer, but I have tried to fix this myself. According to the documentation, I understand that re.sub should replace all occurrences of 't'.

I'd appreciate a hint as to why I'm only seeing the first occurrence replaced, thanks.

asked Nov 13 '12 15:11

4 Answers

The two substitutions you'd want in your example overlap: the comma between your two instances of 't' is consumed by (.) in the first match, so ([^']) in the second never gets a chance to match it. This slightly modified version might help:

line = re.sub(r"(?<!')'t'(?=.)", r"THIS_IS_TRUE", line)

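This version uses lookahead and lookbehind assertions, which check the surrounding characters without consuming them. The syntax is described in the documentation for Python's re module.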
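As a rough sketch, the lookaround pattern can be applied to both flags and checked against the line from the question. The helper name convert_booleans is made up for illustration, and THIS_IS_FALSE is assumed to be the placeholder used by the script's equivalent 'f' line.

```python
import re

def convert_booleans(line):
    # Hypothetical helper, not part of the original script.
    # Lookarounds only assert context; they do not consume the comma,
    # so adjacent 't','t' fields no longer compete for it.
    line = re.sub(r"(?<!')'t'(?=.)", "THIS_IS_TRUE", line)
    line = re.sub(r"(?<!')'f'(?=.)", "THIS_IS_FALSE", line)  # assumed placeholder
    return line

sample = (
    "INSERT INTO \"cars\" VALUES(56,'Bugatti Veyron','BUG 1',32,'t','t',"
    "'2011-12-14 18:39:16.556916','2011-12-15 11:25:03.675058','81');"
)
converted = convert_booleans(sample)
print(converted)
```

Both 't' fields come out as THIS_IS_TRUE here, while the quoted strings around them are left alone.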


answered Sep 21 '22 11:09

Zero Piraeus

How about

line = line.replace("'t'", "THIS_IS_TRUE").replace("'f'", "THIS_IS_FALSE")

without using re. This replaces every occurrence of 't' and 'f', so make sure that no text value (a car named t, say) is exactly 't' or 'f'.
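A small sketch of that caveat, using shortened rows invented for illustration (the function name replace_flags is also hypothetical). The second row has a text column whose whole value is the letter t, which plain replacement cannot tell apart from a boolean flag.

```python
def replace_flags(line):
    # Plain string replacement: every 't' / 'f' literal is rewritten,
    # regardless of which column it belongs to.
    return line.replace("'t'", "THIS_IS_TRUE").replace("'f'", "THIS_IS_FALSE")

# Hypothetical, shortened rows based on the example in the question.
row = "INSERT INTO \"cars\" VALUES(56,'Bugatti Veyron','BUG 1',32,'t','t');"
risky = "INSERT INTO \"cars\" VALUES(57,'t','BUG 2',32,'f','t');"

safe_result = replace_flags(row)
risky_result = replace_flags(risky)

print(safe_result)
print(risky_result)  # the car name 't' is rewritten too
```

If the data can contain such values, the replacement would need to know which columns are booleans, which neither this nor the regex approach does.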

answered Sep 20 '22 11:09

eumiro

The first match is ,'t', (including both commas). Python then resumes searching at the next character, which is the ' before the second t. The comma that ([^']) would need has already been consumed, so the pattern cannot match there and the second 't' is skipped.

In other words, subsequent matches to be replaced cannot overlap.
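To see the consumption directly, one can list the match spans on a shortened fragment of the row (the fragment and variable names below are made up for illustration). The original pattern finds a single match that swallows the shared comma, while a lookaround pattern finds both fields.

```python
import re

# Shortened, hypothetical fragment of the row from the question.
text = "32,'t','t','2011'"

# Original pattern: the groups consume the characters on either side.
consuming = [
    (m.start(), m.end(), m.group(0))
    for m in re.finditer(r"([^'])'t'(.)", text)
]
print(consuming)   # [(2, 7, ",'t',")]

# Lookaround pattern: the neighbours are checked but not consumed.
lookaround = [
    (m.start(), m.end())
    for m in re.finditer(r"(?<!')'t'(?=.)", text)
]
print(lookaround)  # [(3, 6), (7, 10)]
```

The first match ends at index 7, so the next search starts on the opening quote of the second 't' with no unconsumed character left for ([^']) to match.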

answered Sep 18 '22 11:09

Alexander Pavlov

Using re.sub(r"\bt\b", "THIS_IS_TRUE", line) replaces every standalone t, although the surrounding quotes stay in place:

In [21]: strs="""INSERT INTO "cars" VALUES(56,'Bugatti Veyron','BUG 1',32,'t','t','2011-12-14 18:39:16.556916','2011-12-15 11:25:03.675058','81');"""

In [22]: print(re.sub(r"\bt\b", "THIS_IS_TRUE", strs))

INSERT INTO "cars" VALUES(56,'Bugatti Veyron','BUG 1',32,'THIS_IS_TRUE','THIS_IS_TRUE','2011-12-14 18:39:16.556916','2011-12-15 11:25:03.675058','81');

answered Sep 21 '22 11:09

Ashwini Chaudhary

