Basically, I just need the effect of copying that HTML from a browser window and pasting it into a textarea element.
For example, I want this:
<p>Some</p>
<div>text<br />Some</div>
<div>text</div>
to become this:
Some
text
Some
text
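The easiest way would be to strip all the HTML tags using JavaScript's replace() method with a regular expression. It finds everything enclosed in angle brackets and replaces each tag with a space. Note that this drops the line breaks the tags implied, so the result ends up on one line. var text = html.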
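Here is a minimal sketch of that tag-stripping idea, assuming the HTML is already available as a string. The function name stripTags is made up for illustration, the whitespace-collapsing step is an extra not mentioned above, and the regex is naive (a ">" inside an attribute value would confuse it).

```javascript
// Hypothetical helper: strip anything that looks like a tag.
function stripTags(html) {
  return html
    .replace(/<[^>]*>/g, " ") // replace each <...> with a space
    .replace(/\s+/g, " ")     // extra step (assumption): collapse runs of whitespace
    .trim();
}

console.log(stripTags("<p>Some</p><div>text</div>")); // prints "Some text"
```

As the output shows, the words survive but the line breaks do not, so this alone doesn't give the copy-and-paste effect asked for.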
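Using the innerText property: innerText returns the text of an HTML element roughly as it is rendered, so line breaks from br tags and block elements are kept. Using the textContent property: textContent also returns an element's text, but as the plain concatenation of its text nodes, without those layout line breaks.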
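A small sketch of how the two properties could be combined, preferring innerText and falling back to textContent. The function name getVisibleText and the element ids in the usage comment are hypothetical.

```javascript
// Hypothetical helper: prefer innerText (keeps rendered line breaks),
// fall back to textContent where innerText is not available.
function getVisibleText(el) {
  if (typeof el.innerText === "string") {
    return el.innerText;
  }
  return el.textContent || "";
}

// Possible usage in a browser (the ids "source" and "target" are assumptions):
// document.getElementById("target").value =
//     getVisibleText(document.getElementById("source"));
```

Keep in mind that innerText depends on how the element is rendered, so an element hidden with CSS may give a different result than a visible one.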
If that HTML is visible within your web page, you could do it with the user selection (or just a TextRange in IE). This does preserve line breaks, if not necessarily leading and trailing white space.
UPDATE 10 December 2012
However, the toString() method of Selection objects is not yet standardized and works inconsistently between browsers, so this approach is based on shaky ground and I don't recommend using it now. I would delete this answer if it weren't accepted.
Demo: http://jsfiddle.net/wv49v/
Code:
function getInnerText(el) {
    var sel, range, innerText = "";
    if (typeof document.selection != "undefined" && typeof document.body.createTextRange != "undefined") {
        // Old IE: read the text through a TextRange
        range = document.body.createTextRange();
        range.moveToElementText(el);
        innerText = range.text;
    } else if (typeof window.getSelection != "undefined" && typeof document.createRange != "undefined") {
        // Other browsers: select the element's contents and stringify the selection
        sel = window.getSelection();
        sel.selectAllChildren(el);
        innerText = "" + sel;
        sel.removeAllRanges(); // note: this also clears whatever the user had selected
    }
    return innerText;
}
I tried to find some code I wrote and used for this a while back. It worked nicely. Let me outline what it did, and hopefully you can duplicate its behavior.
You could even expand this more to format things like ordered and unordered lists. It really just depends on how far you'll want to go.
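For instance, list formatting might look something like the sketch below. This is only an illustration: the function name formatLists is made up, the "1." and "*" markers are one arbitrary choice, and nested lists are not handled.

```javascript
// Hypothetical helper: turn <ol>/<ul> items into numbered or bulleted lines.
function formatLists(html) {
  // Ordered lists: number each item, restarting at 1 for every <ol>.
  html = html.replace(/<ol\b[^>]*>([\s\S]*?)<\/ol>/gi, function (match, body) {
    var n = 0;
    return body.replace(/<li\b[^>]*>([\s\S]*?)<\/li>/gi, function (m, item) {
      n += 1;
      return n + ". " + item.trim() + "\n";
    });
  });
  // Unordered lists: prefix each item with a bullet.
  html = html.replace(/<ul\b[^>]*>([\s\S]*?)<\/ul>/gi, function (match, body) {
    return body.replace(/<li\b[^>]*>([\s\S]*?)<\/li>/gi, function (m, item) {
      return "* " + item.trim() + "\n";
    });
  });
  return html;
}

console.log(formatLists("<ol><li>one</li><li>two</li></ol>")); // prints "1. one" and "2. two" on separate lines
```

You would run something like this before removing the remaining tags, since it relies on the li markup still being there.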
EDIT
Found the code!
// Requires: using System.Text.RegularExpressions;
public static string Convert(string template)
{
    template = Regex.Replace(template, "<img .*?alt=[\"']?([^\"']*)[\"']?.*?/?>", "$1"); /* Use image alt text. */
    template = Regex.Replace(template, "<a .*?href=[\"']?([^\"']*)[\"']?.*?>(.*?)</a>", "$2 [$1]"); /* Convert links to something useful; the lazy (.*?) keeps two links on one line separate. */
    template = Regex.Replace(template, "<(/p|/div|/h\\d|br)\\s*/?>", "\n"); /* Let's try to keep vertical whitespace intact; \s* lets "<br />" match too. */
    template = Regex.Replace(template, "<[A-Za-z/][^<>]*>", ""); /* Remove the rest of the tags. */
    return template;
}
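Since the question is about JavaScript, here is a rough port of the same four steps. It is a sketch, not a literal translation: the function name htmlToPlainText is made up, and the patterns are tightened a little (whitespace is allowed before "/>", and the link text is matched lazily). Regex-based conversion like this is still fragile on messy HTML.

```javascript
// Hypothetical JavaScript port of the four replacement steps.
function htmlToPlainText(html) {
  return html
    // 1. Use image alt text.
    .replace(/<img\b[^>]*?\balt=["']?([^"'>]*)["']?[^>]*>/gi, "$1")
    // 2. Convert links to "text [url]".
    .replace(/<a\b[^>]*?\bhref=["']?([^"'>\s]*)["']?[^>]*>([\s\S]*?)<\/a>/gi, "$2 [$1]")
    // 3. Closing block tags and <br> become newlines.
    .replace(/<(?:\/p|\/div|\/h\d|br)\s*\/?>/gi, "\n")
    // 4. Remove the rest of the tags.
    .replace(/<[A-Za-z\/][^<>]*>/g, "");
}

var sample = "<p>Some</p><div>text<br />Some</div><div>text</div>";
console.log(JSON.stringify(htmlToPlainText(sample)));
// prints "Some\ntext\nSome\ntext\n"
```

With the example from the question (written on one line), each of the four words lands on its own line, followed by a trailing newline you may want to trim.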