Convert HTML to plain text in JS without browser environment

Tags: javascript, regex, couchdb

I have a CouchDB view map function that generates an abstract of a stored HTML document (first x characters of text). Unfortunately I have no browser environment to convert HTML to plain text.

Currently I use this multi-stage regexp:

    html.replace(/<style([\s\S]*?)<\/style>/gi, ' ')
        .replace(/<script([\s\S]*?)<\/script>/gi, ' ')
        .replace(/(<(?:.|\n)*?>)/gm, ' ')
        .replace(/\s+/gm, ' ');

While it's a very good filter, it's obviously not a perfect one, and some leftovers occasionally slip through. Is there a better way to convert HTML to plain text without a browser environment?


asked Mar 02 '13 22:03

Era

1 Answer

This simple regular expression works:

text.replace(/<[^>]*>/g, '');

It removes all tags.

Entities such as &lt; do not contain a literal < character, so they do not interfere with this regex.
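Note that stripping tags alone leaves the contents of script and style elements in the output and leaves entities undecoded. A minimal sketch that combines the tag regex with the question's extra stages might look like the following. The names htmlToText, abstractOf and NAMED_ENTITIES are hypothetical helpers, the entity table is deliberately tiny, and the code sticks to ES5 syntax on the assumption that it has to run inside a CouchDB view server. It is still a regex heuristic, not a parser: a `>` inside an attribute value or a stray `<` in the text can confuse it.

```javascript
// Hypothetical helpers for illustration; not a complete HTML parser.
// Small, deliberately incomplete table of named entities.
var NAMED_ENTITIES = { amp: '&', lt: '<', gt: '>', quot: '"', apos: "'", nbsp: ' ' };

// Decode one entity; unknown or out-of-range entities are left untouched.
function decodeEntity(match, body) {
  if (body.charAt(0) === '#') {
    var isHex = body.charAt(1).toLowerCase() === 'x';
    var code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
    // Only code points in the Basic Multilingual Plane are decoded here.
    return code > 0 && code <= 0xFFFF ? String.fromCharCode(code) : match;
  }
  return NAMED_ENTITIES.hasOwnProperty(body) ? NAMED_ENTITIES[body] : match;
}

function htmlToText(html) {
  return String(html)
    // Drop comments and the contents of <script>/<style>, which are not visible text.
    .replace(/<!--[\s\S]*?-->/g, ' ')
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, ' ')
    // Remove every remaining tag, leaving a space so adjacent words stay apart.
    .replace(/<[^>]*>/g, ' ')
    // Decode entities only after the tags are gone, so "&lt;b&gt;" stays text.
    .replace(/&(#x[0-9a-f]+|#[0-9]+|[a-z]+);/gi, decodeEntity)
    // Collapse whitespace runs and trim the ends.
    .replace(/\s+/g, ' ')
    .replace(/^ | $/g, '');
}

// First maxLength characters of the plain text.
function abstractOf(html, maxLength) {
  return htmlToText(html).slice(0, maxLength);
}
```

Inside a map function this could be used as emit(doc._id, abstractOf(doc.html, 200)), where the field name doc.html and the length 200 are assumptions about the stored documents.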


answered Sep 28 '22 04:09

Gaël Barbin
