Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Google Sheets multiple search and replace from a list

I am looking for a solution to search for certain strings in a Google Sheet and, when found, replace them with another string from a list in another sheet.

For better understanding, I prepared a Sheet for you:

https://docs.google.com/a/vicampo.de/spreadsheets/d/1mETtAY72K6ST-hg1qOU9651265nGq0qvcgvzMRqHDO8/edit?usp=sharing

So here's the exact task I want to achieve:

In every single cell in column A of sheet "Text", look for the strings given in column A in sheet "List" and, when found, replace it with the corresponding string in column B of the sheet "List".

See my Example: Look in cell A1 for the string "Lorem" and replace it with "Xlorem", then look for the string "Ipsum" and replace it with "Xipsum", then look for the string "amet" and replace it with "Xamet" then move on to cell B1 and start again looking for the strings...

I have tried different functions and managed to do this with a function for one cell. But how to do it in a loop?

Thanks everyone who is interested in helping out with this problem!

like image 674
Christoph Avatar asked Apr 02 '15 10:04

Christoph


People also ask

Can you select multiple items from a list in Google Sheets?

In such cases, it's possible that the user knows more than one and there is a need to select multiple options from the drop-down. Therefore, multiple selections in dropdown lists can be quite useful. Unfortunately, this option is not traditionally allowed in Google Sheets. You are only allowed one option at a time.

How do I select multiple options from a drop-down list in Google Sheets?

Select the cell you want to fill with multiple items from your validation range. Go to Scripts > Multi-select for this cell... and the sidebar should open, showing a checklist of valid items. Tick the items you want and click the 'Set' button to fill your cell with those selected items, comma separated.

How do you do multiple search in Google Sheets?

In the β€œFind and Replace” window that opens, click the β€œFind” box at the top and type your search query. Next, click the β€œSearch” drop-down menu and select β€œAll sheets” to search in all sheets. Then, click β€œFind” at the bottom to start the search.


2 Answers

Copy Sample File With Explanation

enter image description here

Problem

enter image description here

The challenge is: Find & Replace multiple values in the input of multiple cells.

ArrayFormula's

Solutions which I account as Array-Solution must be:

  1. based on open ranges
  2. no need to drag the formula down
  3. no need to modify the formula when new items in lists appear

These tests must be passed:

  • Is ArrayFormula
  • User can set Case Sensitivity
  • Replaces Emojis
  • Replaces Special Chars $\[]. etc.
  • CrashTest. Works for 10K rows of data
  • CrashTest. Works for 2K replacements

Script

I recommend using the not-regex-based script in this case. This algorithm finds and replaces text by chars:

Usage

Use as a regular formula from sheet:

=substitutes(A12:A;List!A1:B)

enter image description here

Code

Save this code to use the formula above:

/**
 * Substitutes in every entry in array
 * Text from prefilled array
 *
 * @param {array} input The array of strings.
 * @param {array} subTable The array of string pairs: search texts / replace texts.
 * @param {boolean} caseSensitive [optional=false] 
 * TRUE to match Apple and apple as different words
 * @return The input with all replacement made
 * @customfunction
 */
function substitutes(input, subTable,caseSensitive) {
  //  default behavior it is not case sensitive
  caseSensitive = caseSensitive || false;
  // if the input is not a list, become a list */
  if( typeof input != "object" ) {
    input = [ input ];
  }
  var res = [], text;
  for (var i = 0; i < input.length; i++) {
    // force each array element in the input be a string
    text = input[i].toString();
    for (var ii = 0; ii < subTable.length; ii++) {
      text = replaceAll_(
        text, 
        subTable[ii][0], 
        subTable[ii][1], 
        caseSensitive);
    }
    res.push(text);
  }
  return res;
}


/***
 * JavaScript Non-regex Replace
 * 
 * Original code sourse:
 * https://stackoverflow.com/a/56989647/5372400
 */
function replaceAll_(str, find, newToken, caseSensitive) {
    var i = -1;
    // sanity check & defaults
    if (!str) {
        // Instead of throwing, act as 
        // COALESCE if find == null/empty and str == null
        if ((str == null) && (find == null))
            return newToken;
        return str;
    }
    if (!find || find === ''){ return str; }
    if (find === newToken) { return str; }
    caseSensitive = caseSensitive || false;
    find = !caseSensitive ? find.toLowerCase() : find;
    // search process, search by char
    while ((
        i = (!caseSensitive ? str.toLowerCase() : str).indexOf(
            find, i >= 0 ? i + newToken.length : 0
        )) !== -1
    ) {
        str = str.substring(0, i) +
            newToken +
            str.substring(i + find.length);
    } 
    return str;
}

Monster Formula

I've used the RegEx algorithm to solve it with native functions. This method is not recommended as it slows down your Worksheet.

The formula is:

=INDEX(SUBSTITUTE(REGEXREPLACE(TRANSPOSE(QUERY(TRANSPOSE(IFERROR(SPLIT(SUBSTITUTE(TRANSPOSE(QUERY(TRANSPOSE(IFERROR(VLOOKUP(SPLIT(REGEXREPLACE(A12:A;SUBSTITUTE(REGEXREPLACE(REGEXREPLACE(A12:A;"(?i)"&SUBSTITUTE(SUBSTITUTE(QUERY(FILTER(REGEXREPLACE(List!A1:A;"(\\|\+|\*|\?|\[|\^|\]|\$|\(|\)|\{|\}|\=|\!|\<|\>|\||\:|\-)";"\\$1")&"𑇦";List!A1:A<>"");;2^99);"𑇦 ";"|");"𑇦";"");"𑇑");"(\\|\+|\*|\?|\[|\^|\]|\$|\(|\)|\{|\}|\=|\!|\<|\>|\||\:|\-)";"\\$1");"𑇑";"(.*)");INDEX(REGEXREPLACE(TRIM(TRANSPOSE(QUERY(TRANSPOSE(IF(SEQUENCE(COUNTA(INDEX(LEN(REGEXREPLACE(REGEXREPLACE(A12:A;"(?i)"&SUBSTITUTE(SUBSTITUTE(QUERY(FILTER(REGEXREPLACE(List!A1:A;"(\\|\+|\*|\?|\[|\^|\]|\$|\(|\)|\{|\}|\=|\!|\<|\>|\||\:|\-)";"\\$1")&"𑇦";List!A1:A<>"");;2^99);"𑇦 ";"|");"𑇦";"");"𑇑");"[^𑇑]";""))/2));MAX(INDEX(LEN(REGEXREPLACE(REGEXREPLACE(A12:A;"(?i)"&SUBSTITUTE(SUBSTITUTE(QUERY(FILTER(REGEXREPLACE(List!A1:A;"(\\|\+|\*|\?|\[|\^|\]|\$|\(|\)|\{|\}|\=|\!|\<|\>|\||\:|\-)";"\\$1")&"𑇦";List!A1:A<>"");;2^99);"𑇦 ";"|");"𑇦";"");"𑇑");"[^𑇑]";""))/2)))-(SEQUENCE(COUNTA(INDEX(LEN(REGEXREPLACE(REGEXREPLACE(A12:A;"(?i)"&SUBSTITUTE(SUBSTITUTE(QUERY(FILTER(REGEXREPLACE(List!A1:A;"(\\|\+|\*|\?|\[|\^|\]|\$|\(|\)|\{|\}|\=|\!|\<|\>|\||\:|\-)";"\\$1")&"𑇦";List!A1:A<>"");;2^99);"𑇦 ";"|");"𑇦";"");"𑇑");"[^𑇑]";""))/2)))-1)*MAX(INDEX(LEN(REGEXREPLACE(REGEXREPLACE(A12:A;"(?i)"&SUBSTITUTE(SUBSTITUTE(QUERY(FILTER(REGEXREPLACE(List!A1:A;"(\\|\+|\*|\?|\[|\^|\]|\$|\(|\)|\{|\}|\=|\!|\<|\>|\||\:|\-)";"\\$1")&"𑇦";List!A1:A<>"");;2^99);"𑇦 ";"|");"𑇦";"");"𑇑");"[^𑇑]";""))/2))<=INDEX(LEN(REGEXREPLACE(REGEXREPLACE(A12:A;"(?i)"&SUBSTITUTE(SUBSTITUTE(QUERY(FILTER(REGEXREPLACE(List!A1:A;"(\\|\+|\*|\?|\[|\^|\]|\$|\(|\)|\{|\}|\=|\!|\<|\>|\||\:|\-)";"\\$1")&"𑇦";List!A1:A<>"");;2^99);"𑇦 ";"|");"𑇦";"");"𑇑");"[^𑇑]";""))/2);"𑇣"&SEQUENCE(COUNTA(INDEX(LEN(REGEXREPLACE(REGEXREPLACE(A12:A;"(?i)"&SUBSTITUTE(SUBSTITUTE(QUERY(FILTER(REGEXREPLACE(List!A1:A;"(\\|\+|\*|\?|\[|\^|\]|\$|\(|\)|\{|\}|\=|\!|\<|\>|\||\:|\-)";"\\$1")&"𑇦";List!A1:A<>"");;2^99);"𑇦 ";"|");"𑇦";"");"𑇑");"[^𑇑]";""))/2));MAX(INDEX(LEN(REGEXREPLACE(REGEXREPLACE(A12:A;"(?i)"&SUBSTITUTE(SUBSTITUTE(QUERY(FILTER(REGEXREPLACE(List!A1:A;"(\\|\+|\*|\?|\[|\^|\]|\$|\(|\)|\{|\}|\=|\!|\<|\>|\||\:|\-)";"\\$1")&"𑇦";List!A1:A<>"");;2^99);"𑇦 ";"|");"𑇦";"");"𑇑");"[^𑇑]";""))/2)))-(SEQUENCE(COUNTA(INDEX(LEN(REGEXREPLACE(REGEXREPLACE(A12:A;"(?i)"&SUBSTITUTE(SUBSTITUTE(QUERY(FILTER(REGEXREPLACE(List!A1:A;"(\\|\+|\*|\?|\[|\^|\]|\$|\(|\)|\{|\}|\=|\!|\<|\>|\||\:|\-)";"\\$1")&"𑇦";List!A1:A<>"");;2^99);"𑇦 ";"|");"𑇦";"");"𑇑");"[^𑇑]";""))/2)))-1)*MAX(INDEX(LEN(REGEXREPLACE(REGEXREPLACE(A12:A;"(?i)"&SUBSTITUTE(SUBSTITUTE(QUERY(FILTER(REGEXREPLACE(List!A1:A;"(\\|\+|\*|\?|\[|\^|\]|\$|\(|\)|\{|\}|\=|\!|\<|\>|\||\:|\-)";"\\$1")&"𑇦";List!A1:A<>"");;2^99);"𑇦 ";"|");"𑇦";"");"𑇑");"[^𑇑]";""))/2))&"𑇀";));;2^99)));" ?𑇣";"$")));"𑇀");{List!A1:A\List!B1:B};2;)&"𑇩"));;2^99));"𑇩 ";"𑇩")&"π…˜";"𑇩")&SPLIT(REGEXREPLACE(A12:A;"(?i)"&SUBSTITUTE(SUBSTITUTE(QUERY(FILTER(REGEXREPLACE(List!A1:A;"(\\|\+|\*|\?|\[|\^|\]|\$|\(|\)|\{|\}|\=|\!|\<|\>|\||\:|\-)";"\\$1")&"𑇦";List!A1:A<>"");;2^99);"𑇦 ";"|");"𑇦";"");"𑇑")&"π…˜";"𑇑")))&"𝅗";;2^99));"𝅗 *";"");"π…˜";""))

Other Solutions

Nested formulas

Nested SUBSTITUTE or REGEXREPLACE formulas as was noted in other answers.

Formulas you need to drag down for the result

Here's a sample formula. Basic logic - split the text into parts β†’ modify parts individually β†’ to join the new result.

This formula must be copied down:

=JOIN(" ";
ArrayFormula(
IFERROR(VLOOKUP(TRANSPOSE(SPLIT(A1;" "));List!A:B;2;0);TRANSPOSE(SPLIT(A1;" ")))))
like image 74
Max Makhrov Avatar answered Nov 02 '22 20:11

Max Makhrov


Probably the best for you, in this case, should be creating a new function to your Google Spreadsheet. It tends to be, in the general case, more simple, clear and powerfull than that kind of complex formulas that should do the same.

In this particular case, I have the same problem, so you can use the same function:

Click on "Tools" menu, then click on the "Script Editor" option. Into the script editor, erase the draft and paste this function:

function preg_quote( str ) {
  // http://kevin.vanzonneveld.net
  // +   original by: booeyOH
  // +   improved by: Ates Goral (http://magnetiq.com)
  // +   improved by: Kevin van Zonneveld (http://kevin.vanzonneveld.net)
  // +   bugfixed by: Onno Marsman
  // *     example 1: preg_quote("$40");
  // *     returns 1: '\$40'
  // *     example 2: preg_quote("*RRRING* Hello?");
  // *     returns 2: '\*RRRING\* Hello\?'
  // *     example 3: preg_quote("\\.+*?[^]$(){}=!<>|:");
  // *     returns 3: '\\\.\+\*\?\[\^\]\$\(\)\{\}\=\!\<\>\|\:'

  return (str+'').replace(/([\\\.\+\*\?\[\^\]\$\(\)\{\}\=\!\<\>\|\:])/g, "\\$1");
}

function ARRAYREPLACE(input,fromList,toList,caseSensitive){
  /* default behavior it is not case sensitive */
  if( caseSensitive === undefined ){
    caseSensitive = false;
  }
  /* if the from list it is not a list, become a list */
  if( typeof fromList != "object" ) {
    fromList = [ fromList ];
  }
  /* if the to list it is not a list, become a list */
  if( typeof toList != "object" ) {
    toList = [ toList ];
  }
  /* force the input be a string */
  var result = input.toString();

  /* iterates using the max size */
  var bigger  = Math.max( fromList.length, toList.length) ;

  /* defines the words separators */
  var arrWordSeparator = [ ".", ",", ";", " " ];

  /* interate into the lists */
  for(var i = 0; i < bigger; i++ ) {
    /* get the word that should be replaced */
    var fromValue = fromList[ ( i % ( fromList.length ) ) ]
    /* get the new word that should replace */
    var toValue = toList[ ( i % ( toList.length ) ) ]

    /* do not replace undefined */
    if ( fromValue === undefined ) {
      continue;
    }
    if ( toValue == undefined ) {
      toValue = "";
    }

    /* apply case sensitive rule */
    var caseRule = "g";
    if( !caseSensitive ) {
      /* make the regex case insensitive */
      caseRule = "gi";
    }

    /* for each end word char, make the replacement and update the result */
    for ( var j = 0; j < arrWordSeparator.length; j++ ) {
      
      /* from value being the first word of the string */
      result =  result.replace( new RegExp( "^(" + preg_quote( fromValue + arrWordSeparator[ j ] ) + ")" , caseRule ), toValue + arrWordSeparator[ j ] );
      
      /* from value being the last word of the string */
      result =  result.replace( new RegExp( "(" + preg_quote( arrWordSeparator[ j ] + fromValue ) + ")$" , caseRule ), arrWordSeparator[ j ] + toValue );
      
      /* from value in the middle of the string between two word separators */
      for ( var k = 0; k < arrWordSeparator.length; k++ ) {
        result =  result.replace( 
          new RegExp( 
            "(" + preg_quote( arrWordSeparator[ j ] + fromValue + arrWordSeparator[ k ] ) + ")" , 
            caseRule 
          ), 
          /* need to keep the same word separators */
          arrWordSeparator[ j ] + toValue + arrWordSeparator[ k ] 
        );
      }
    }
    
    /* from value it is the only thing in the string */
    result =  result.replace( new RegExp( "^(" + preg_quote( fromValue ) + ")$" , caseRule ), toValue );
  }
  /* return the new result */
  return result;
}

Just save your script and the new function it will be available to you. Now, you have the function that replaces all the first values list by the second value list.

=ARRAYREPLACE(C2;A1:A4;B1:B4)

for example, takes the C2 text and replaces all the elements found in the A1:A4 list by the equivalent into the B1:B4 list.

example of the use of the function into Google Spreadsheet

like image 25
Thiago Mata Avatar answered Nov 02 '22 19:11

Thiago Mata