I'm trying to search for a substring within lines of a file and insert similar lines immediately after the found line. Although there were similar solutions using the fileinput
method, I could not figure out how to use it in my case.
Here is what I have tried:
list = ["abc", "pqr", "xyz"]
inputfile = open (somefile.txt, 'a+')
for line in <inputfile>:
if 'stringstosearch' in line:
for <item> in list:
new_line = "new_line with %s" %(item)
inputfile.write(new_line + "\n")
break
inputfile.close()
for example if the text file is:
Torquent scelerisque aptent hac rhoncus vel
Turpis vestibulum tellus laoreet mollis conubia facilisis tempor nec semper
In mi mauris etiam quisque sem congue est velit lacus convallis amet ante ad
Integer maecenas semper quisque nisi hendrerit, libero feugiat cursus euismod accumsan
Dui sed magna vivamus augue ac quisque ac mauris torquent eros taciti
Conubia curae vel himenaeos dictumst sed at
string to search = "mauris etiam quisque"
list = ["abc", "pqr", "xyz" ]
Expected output after file write:
Torquent scelerisque aptent hac rhoncus vel
Turpis vestibulum tellus laoreet mollis conubia facilisis tempor nec semper
In mi mauris etiam quisque sem congue est velit lacus convallis amet ante ad
new_line with abc
new_line with pqr
new_line with xyz
Integer maecenas semper quisque nisi hendrerit, libero feugiat cursus euismod accumsan
Dui sed magna vivamus augue ac quisque ac mauris torquent eros taciti
Conubia curae vel himenaeos dictumst sed at
you cant just insert in middle of file,so 1st read the file entirely, for small files. then open the same file in write mode and append when you find the string.
list = ["abc", "pqr", "xyz"]
inputfile = open('somefile.txt', 'r').readlines()
write_file = open('somefile.txt','w')
for line in inputfile:
write_file.write(line)
if 'stringstosearch' in line:
for item in list:
new_line = "new_line with %s" %(item)
write_file.write(new_line + "\n")
write_file.close()
You can't generally insert into the middle of a file.*
The generic solution to this is to copy to a new file, inserting in the midst of copying, and then move the new file on top of the old one. For example:
with tempfile.NamedTemporaryFile('w', delete=False) as outfile:
with open(inpath) as infile,
for line in infile:
outfile.write(line)
if needs_inserting_after(line):
outfile.write(stuff_to_insert_after(line))
os.replace(outfile.name, inpath)
Note that os.replace
doesn't exist in Python 2.7. If you don't care about Windows, you can use os.rename
instead. If you do, I'd strongly suggest looking for a backport of os.replace
on PyPI; there are at least two of them. Otherwise, you have to learn about the whole mess with exclusive locks and atomic moves on Windows.
There are also some higher-level libraries that wrap the whole thing up for you. (I wrote one called fatomic
that I think serves as nice sample code, but I'm not sure I'd trust it for production code without a lot more testing. I'm sure if you search PyPI you can find other alternatives.)
Of course there are alternatives:
You can move the original file to a backup path, then copy it into a new file at the normal path, instead of copying to a new file at a temporary path and then moving after the fact. This has the disadvantage of leaving you with half a file if you fail in the middle, but the advantage of not needing to deal with the exclusive-locks-on-Windows problem. This is effectively what fileinput.FileInput
with inplace=True
automates for you.
You can read the whole file into memory, process it in-memory, then write the whole file back out. This has the advantage of being dead simple, not needing any extra files, and meaning that if anyone has a file handle to your file (rather than a pathname) they see the new version once you're done. But the last of those can be a disadvantage. And of course this means that you need enough memory to hold all your data at once.
Finally, you can always shift the whole file from the current position up by N bytes before writing N bytes. This has most of the advantages of both of the above, but it's also messy and slow.
* Why did I say "generally" there? Well, ultimately, the filesystem has to have some way of inserting a new block in the middle of a file. And some filesystems will expose this to the user level. Some older platforms used to have user-level features built on top of this, like "random access text files" on Apple ][ ProDOS or the thingy I forget in VMS. So, it's not literally true that you can't ever insert into the middle of a file. It's just true in every case you care about.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With