
Stripping out HTML tags from a string

Tags: html, ios, swift

How do I remove HTML tags from a string so that I can output clean text?

let str = string.stringByReplacingOccurrencesOfString("<[^>]+>", withString: "", options: .RegularExpressionSearch, range: nil)
print(str)
arled asked Sep 22 '14 21:09




2 Answers

Hmm, I tried your function and it worked on a small example:

var string = "<!DOCTYPE html> <html> <body> <h1>My First Heading</h1> <p>My first paragraph.</p> </body> </html>"
let str = string.stringByReplacingOccurrencesOfString("<[^>]+>", withString: "", options: .RegularExpressionSearch, range: nil)
print(str)
//output "  My First Heading My first paragraph. "

Can you give an example of a problem?

Swift 4 and 5 version:

var string = "<!DOCTYPE html> <html> <body> <h1>My First Heading</h1> <p>My first paragraph.</p> </body> </html>"
let str = string.replacingOccurrences(of: "<[^>]+>", with: "", options: .regularExpression, range: nil)
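The Swift 4 and 5 call above leaves behind the whitespace that sat between the tags. As a rough sketch, it could be wrapped in a small extension that also collapses that whitespace. The name `strippingTags()` is made up for illustration and is not part of Foundation or of the original answer:

```swift
import Foundation

extension String {
    /// Hypothetical helper (not a Foundation API): removes anything that looks
    /// like a tag, then collapses the leftover runs of whitespace.
    func strippingTags() -> String {
        // Same pattern as in the answer: "<", one or more non-">" characters, ">".
        let withoutTags = replacingOccurrences(of: "<[^>]+>", with: "", options: .regularExpression)
        // Collapse consecutive whitespace into a single space.
        let collapsed = withoutTags.replacingOccurrences(of: "\\s+", with: " ", options: .regularExpression)
        // Drop the leading and trailing whitespace left by the outer tags.
        return collapsed.trimmingCharacters(in: .whitespacesAndNewlines)
    }
}

let html = "<!DOCTYPE html> <html> <body> <h1>My First Heading</h1> <p>My first paragraph.</p> </body> </html>"
print(html.strippingTags())  // My First Heading My first paragraph.
```

This only tidies the output. It does not decode entities such as `&amp;`, and it inherits every limitation of the pattern itself.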
Steve Rosenberg answered Oct 15 '22 04:10


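HTML is not a regular language (it is closer to a context-free one), so regular expressions cannot parse it reliably. A pattern like the one above works on simple markup but breaks on input such as a ">" inside an attribute value. See: Using regular expressions to parse HTML: why not?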
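To make the limitation concrete, here is a small sketch that runs the `<[^>]+>` pattern over two inputs picked to trip it up. The sample strings and variable names are illustrative assumptions, not taken from the question:

```swift
import Foundation

// Same pattern as in the regex-based answer.
let tagPattern = "<[^>]+>"

// Hypothetical inputs chosen to trip the pattern up.
let attributeWithBracket = "<a title=\"a > b\">link</a>"
let comparisonText = "1 < 2 and 3 > 2"

let strippedAttribute = attributeWithBracket.replacingOccurrences(of: tagPattern, with: "", options: .regularExpression)
let strippedComparison = comparisonText.replacingOccurrences(of: tagPattern, with: "", options: .regularExpression)

// The match stops at the first ">", so part of the attribute leaks into the text.
print(strippedAttribute == " b\">link")   // true
// Plain text between "<" and ">" is mistaken for a tag and deleted.
print(strippedComparison == "1  2")       // true
```

Neither result is what a reader would call clean text, which is the practical consequence of the point above.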

I would consider using NSAttributedString instead.

let htmlString = "LCD Soundsystem was the musical project of producer <a href='http://www.last.fm/music/James+Murphy' class='bbcode_artist'>James Murphy</a>, co-founder of <a href='http://www.last.fm/tag/dance-punk' class='bbcode_tag' rel='tag'>dance-punk</a> label <a href='http://www.last.fm/label/DFA' class='bbcode_label'>DFA</a> Records. Formed in 2001 in New York City, New York, United States, the music of LCD Soundsystem can also be described as a mix of <a href='http://www.last.fm/tag/alternative%20dance' class='bbcode_tag' rel='tag'>alternative dance</a> and <a href='http://www.last.fm/tag/post%20punk' class='bbcode_tag' rel='tag'>post punk</a>, along with elements of <a href='http://www.last.fm/tag/disco' class='bbcode_tag' rel='tag'>disco</a> and other styles. <br />"
let htmlStringData = htmlString.dataUsingEncoding(NSUTF8StringEncoding)!
let options: [String: AnyObject] = [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType, NSCharacterEncodingDocumentAttribute: NSUTF8StringEncoding]
let attributedHTMLString = try! NSAttributedString(data: htmlStringData, options: options, documentAttributes: nil)
let string = attributedHTMLString.string

Or, as Irshad Mohamed suggested in the comments:

let attributed = try NSAttributedString(data: htmlString.data(using: .unicode)!, options: [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType], documentAttributes: nil)
print(attributed.string)
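Both snippets above use the Swift 2 and Swift 3 spellings of the option keys. With Swift 4 and later the same idea might look like the sketch below. The function name `plainText(fromHTML:)` is invented for illustration. The code assumes an Apple platform, since the HTML importer comes from UIKit or AppKit, and it should be called on the main thread:

```swift
#if canImport(UIKit)
import UIKit
#elseif canImport(AppKit)
import AppKit
#endif
import Foundation

#if canImport(UIKit) || canImport(AppKit)
/// Hypothetical helper: converts an HTML fragment to plain text via
/// NSAttributedString. Returns nil if the HTML cannot be imported.
func plainText(fromHTML html: String) -> String? {
    guard let data = html.data(using: .utf8) else { return nil }
    // Swift 4+ option keys replacing NSDocumentTypeDocumentAttribute and friends.
    let options: [NSAttributedString.DocumentReadingOptionKey: Any] = [
        .documentType: NSAttributedString.DocumentType.html,
        .characterEncoding: String.Encoding.utf8.rawValue
    ]
    // try? avoids the crash that try! would cause on malformed input.
    let attributed = try? NSAttributedString(data: data, options: options, documentAttributes: nil)
    return attributed?.string
}

if let text = plainText(fromHTML: "<p>Tom &amp; Jerry <b>rock</b></p>") {
    print(text)
}
#endif
```

Unlike the regex approach, this route also decodes entities such as `&amp;`, at the cost of a heavier import step.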
Joony answered Oct 15 '22 05:10