
How to remove HTML markup from a body of text within a Google Spreadsheet?

I am cleaning up text in a Google Spreadsheet. The copy contains HTML markup, and I want to remove it so that only clean text is left.

I have created the following formula, but it seems to remove only the first instance of HTML code in the cell. How do I get it all out?

= regexreplace(C9,"\<[a-zA-Z0-9-?]*\>","")
Greg Hay asked Apr 04 '13 14:04




1 Answer

Try this regular expression:

= regexreplace(C9,"<.*?>","")
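
For anyone who wants to see why this pattern catches tags that the one in the question misses, here is a minimal Python sketch. It is not Google Sheets itself: Python's `re` module stands in for the spreadsheet's regex engine, and the sample string and the `strip_with` helper are made up for illustration. The character class `[a-zA-Z0-9-?]` from the question contains no `/`, space, `=` or quote, so it appears to skip closing tags and tags with attributes, whereas the lazy `<.*?>` takes everything up to the next `>`.

```python
import re

# The pattern from the question, copied as-is.
QUESTION_PATTERN = r"\<[a-zA-Z0-9-?]*\>"
# The pattern from the answer: "<", then as few characters as possible, then ">".
ANSWER_PATTERN = r"<.*?>"


def strip_with(pattern: str, text: str) -> str:
    """Replace every match of `pattern` in `text` with an empty string."""
    return re.sub(pattern, "", text)


# Hypothetical cell content, invented for this example.
sample = '<p class="intro">Hello <b>world</b></p>'

print(strip_with(QUESTION_PATTERN, sample))  # <p class="intro">Hello world</b></p>
print(strip_with(ANSWER_PATTERN, sample))    # Hello world
```

Neither pattern is a real HTML parser. A `>` inside an attribute value, or a tag split across lines (in Python, `.` does not match a newline by default), would still trip it up, so treat this as a quick clean-up rather than a robust solution.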
Oussama Jilal answered Jan 01 '23 07:01