
find and remove words matching a substring in a sentence

Is it possible to use a regex to find all words in a sentence that contain a given substring?

Example:

var sentence = "hello my number is 344undefined848 undefinedundefined undefinedcalling whistleundefined";

I need to find all words in this sentence that contain 'undefined' and remove them.

Output should be "hello my number is ";

FYI - currently I tokenize the string (in JavaScript), iterate through the tokens to find and remove the matching ones, then join what is left back into a string. I would like to do this with a regex instead. Please help.

Thanks!

user3658423 asked Jan 02 '15 08:01



2 Answers

You can use:

str = str.replace(/ *\b\S*?undefined\S*\b/g, '');
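As a quick check of this pattern against the sentence from the question, here is a minimal sketch. The function name removeWordsContaining is made up for illustration; only the regex comes from the answer.

```javascript
// Hypothetical helper (name is an assumption); the pattern is the one from the answer.
//   " *"          optional spaces in front of the word
//   "\b\S*?"      start of the word, as few non-space characters as possible
//   "undefined"   the substring to look for
//   "\S*\b"       the rest of the word, up to a word boundary
function removeWordsContaining(str) {
  return str.replace(/ *\b\S*?undefined\S*\b/g, '');
}

const sentence = "hello my number is 344undefined848 undefinedundefined undefinedcalling whistleundefined";
const cleaned = removeWordsContaining(sentence);

console.log(JSON.stringify(cleaned)); // "hello my number is"
```

The result has no trailing space, because each match takes the space in front of the removed word with it. If the trailing space shown in the question's expected output matters, it would have to be added back separately.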

anubhava answered Nov 03 '22 11:11


It certainly is possible.

Something like: start of word, zero or more word characters, "undefined", zero or more word characters, end of word should do it.

A word boundary is \b outside a character class, so:

\b\w*?undefined\w*?\b

This uses non-greedy repetition so that the word-character match does not first consume "undefined" itself, which would lead to a lot of backtracking.

Edit: switched [a-zA-Z] to \w because the example includes numbers in the "words".
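Used with replace(), this pattern removes only the words themselves and leaves the spaces between them in place. The sketch below shows that on the sentence from the question. The second step, which collapses runs of spaces, is an assumption added for illustration and is not part of the pattern above.

```javascript
const sentence = "hello my number is 344undefined848 undefinedundefined undefinedcalling whistleundefined";

// Step 1: remove every word that contains "undefined" (pattern from this answer).
const stripped = sentence.replace(/\b\w*?undefined\w*?\b/g, '');

// Step 2 (assumed extra cleanup): collapse each run of two or more spaces into one.
const tidied = stripped.replace(/ {2,}/g, ' ');

console.log(JSON.stringify(stripped)); // "hello my number is    " (four trailing spaces)
console.log(JSON.stringify(tidied));   // "hello my number is "
```

With the cleanup step, the result matches the expected output in the question, including its single trailing space. Since \w only covers letters, digits and underscore, a word such as "un-undefined" would be removed only in part.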

Richard answered Nov 03 '22 09:11