
Replace a substring when it is a separate word

I am trying to replace a string, e.g. "H3", in a file with "H1", but I want only the standalone "H3" to be replaced, so that "mmmoleculeH3" does not become "mmmoleculeH1". I tried re, but my limited knowledge of Python didn't get me anywhere. If there is any other method, that would be great. The script that I am using now is:

#!/usr/bin/python

import fileinput
import sys
def replaceAll(file,searchExp,replaceExp):
    for line in fileinput.input(file, inplace=1):
        if searchExp in line:
            line = line.replace(searchExp,replaceExp)
        sys.stdout.write(line)

replaceAll("boxFile.cof","H3","H1")

If there is any way I can do it with this script itself, without using re, that would be great. Thanks in advance.

Mohit Dixit asked Jan 12 '23 09:01


2 Answers

As others have said, this is a case when regexes are the proper tool.

You can replace only whole words by using \b:

>>> import re
>>> text = 'H3 foo barH3 H3baz H3 quH3ux'
>>> re.sub(r'\bH3\b', 'H1', text)
'H1 foo barH3 H3baz H1 quH3ux'
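
To connect this back to the script in the question, here is a minimal sketch of how the \b pattern might be dropped into the same fileinput loop. The names replace_word and replace_all are hypothetical, not from the original post. One assumption to be aware of: \b treats letters, digits and the underscore as word characters, so "H3" in "H3-foo" or "(H3)" would still be replaced.

```python
import fileinput
import re
import sys


def replace_word(line, search, replacement):
    # \b matches the boundary between a word character and a non-word
    # character, so the "H3" inside "mmmoleculeH3" is left alone.
    # re.escape guards against regex metacharacters in the search text.
    pattern = r"\b" + re.escape(search) + r"\b"
    # A function as the replacement avoids backslash interpretation
    # in the replacement text.
    return re.sub(pattern, lambda match: replacement, line)


def replace_all(path, search, replacement):
    # With inplace=True, whatever is written to stdout goes back into the file.
    for line in fileinput.input(path, inplace=True):
        sys.stdout.write(replace_word(line, search, replacement))
```

It would then be called the same way as the original function, e.g. replace_all("boxFile.cof", "H3", "H1").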

Daniel Roseman answered Jan 23 '23 19:01


Since I was curious about doing this without regex, here's a version without it:

MYSTR = ["H3", "H3b", "aH3", "H3 mmmoleculeH3 H3",
         "H3 mmmoleculeH3 H3b", "H3 mmmoleculeH3 H3b H3"]
FIND = "H3"
LEN_FIND = len( FIND )
REPLACE = "H1"

for entry in MYSTR:
    index = 0
    foundat = []
    # Get all positions where FIND is found
    while index < len( entry ):
        index = entry.find( FIND, index )
        if index == -1:
            break
        foundat.append( index )
        index += LEN_FIND

    print("IN: ", entry, end=" ")
    # Note: the positions in foundat stay valid only because REPLACE
    # has the same length as FIND.
    for loc in foundat:
        # Check if the string starts with FIND
        if loc == 0:
            # Check if the string contains only FIND
            if LEN_FIND == len( entry ):
                entry = REPLACE
            # Check if the character after FIND is a blank
            elif entry[LEN_FIND] == " ":
                entry = entry[:loc] + REPLACE + entry[loc + LEN_FIND:]
        else:
            # Check if the character before FIND is a blank
            if entry[loc - 1] == " ":
                # Check if FIND is the last part of the string
                if loc + LEN_FIND + 1 > len( entry ):
                    entry = entry[:loc] + REPLACE + entry[loc + LEN_FIND:]
                # Check if the character after FIND is a blank
                elif entry[loc + LEN_FIND] == " ":
                    entry = entry[:loc] + REPLACE + entry[loc + LEN_FIND:]

    print(" OUT: ", entry)

The output is:

IN:  H3  OUT:  H1
IN:  H3b  OUT:  H3b
IN:  aH3  OUT:  aH3
IN:  H3 mmmoleculeH3 H3  OUT:  H1 mmmoleculeH3 H1
IN:  H3 mmmoleculeH3 H3b  OUT:  H1 mmmoleculeH3 H3b
IN:  H3 mmmoleculeH3 H3b H3  OUT:  H1 mmmoleculeH3 H3b H1
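
The same space-delimited rule can probably be written more compactly by splitting each line on single spaces and comparing whole tokens. This is only a sketch: replace_token is a hypothetical name, and it assumes words are separated by spaces only (tabs and punctuation are not treated as separators). Unlike the index-based loop, it does not rely on the replacement having the same length as the search string.

```python
def replace_token(line, find, replace):
    # Set the trailing newline aside so the last word still compares equal.
    body = line.rstrip("\n")
    ending = line[len(body):]
    # Splitting on a single space (not arbitrary whitespace) keeps runs of
    # spaces intact when the pieces are joined again.
    words = body.split(" ")
    return " ".join(replace if word == find else word for word in words) + ending


for entry in ["H3", "H3b", "aH3", "H3 mmmoleculeH3 H3b H3"]:
    print(entry, "->", replace_token(entry, "H3", "H1"))
```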

PS: I would prefer the solution from Daniel Roseman.

Robert Caspary answered Jan 23 '23 18:01