This is my string. It contains some HTML:
First sentence. Here is a <a href="http://google.com">Google</a> link in the second sentence! The third sentence might contain an image like this <img src="http://link.to.image.com/hello.png" /> and ends with !? The last sentence looks like <b>this</b>??
I want to split the string into an array of sentences, keeping the HTML as well as the separators, like this:
[0] = First sentence.
[1] = Here is a <a href="http://google.com">Google</a> link in the second sentence!
[2] = The third sentence might contain an image like this <img src="http://link.to.image.com/hello.png" /> and ends with !?
[3] = The last sentence looks like <b>this</b>??
Can anybody suggest a way to do this, please? Maybe using a regex and match?
This is very close to what I’m after, but it doesn't handle the HTML bits: "JavaScript Split Regular Expression keep the delimiter"
The easy part is the parsing: assign the string to the innerHTML of a wrapper element and let the browser do it. Splitting the sentences is somewhat more intricate; this is my first stab at it:
var s = 'First sentence. Here is a <a href="http://google.com">Google.</a> link in the second sentence! The third sentence might contain an image like this <img src="http://link.to.image.com/hello.png" /> and ends with !? The last sentence looks like <b>this</b>??';
var wrapper = document.createElement('div');
wrapper.innerHTML = s; // let the browser parse the HTML

var sentences = [],
    buffer = [],
    re = /[^.!?]+[.!?]+/g,
    match; // declared here so it does not leak into the global scope

[].forEach.call(wrapper.childNodes, function(node) {
  if (node.nodeType == 1) {
    buffer.push(node.outerHTML); // element node: save its HTML
  } else if (node.nodeType == 3) {
    var str = node.textContent; // text node
    // shift complete sentences off the front of the text
    while ((match = re.exec(str)) !== null) {
      sentences.push(buffer.join('') + match[0]);
      buffer = [];
      // drop the matched part and any whitespace that follows it
      str = str.slice(re.lastIndex).replace(/^\s+/, '');
      re.lastIndex = 0; // reset regexp
    }
    if (str) {
      buffer.push(str); // unfinished sentence, keep it for later
    }
  }
});
if (buffer.length) {
  sentences.push(buffer.join(''));
}
console.log(sentences);
Every node that is either an element or an unfinished sentence is added to a buffer until the end of a sentence is found; the buffered content is then prepended to that sentence, and the result is appended to the sentences array.
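To try the buffering logic outside a browser, here is a minimal sketch that runs in plain JavaScript (for example under Node.js). It is an illustration only, not part of the answer above: the function name splitSentences and the token objects with their type and value fields are hypothetical stand-ins for wrapper.childNodes, and the sketch assumes the HTML has already been separated into element and text pieces.

```javascript
// Hypothetical helper: `tokens` stands in for wrapper.childNodes so the
// buffering logic can run without a DOM. Each token is assumed to be
// { type: 'html' | 'text', value: string }.
function splitSentences(tokens) {
  var sentences = [];
  var buffer = [];
  var re = /[^.!?]+[.!?]+/g;

  tokens.forEach(function (token) {
    if (token.type === 'html') {
      buffer.push(token.value); // elements are kept whole, never split
      return;
    }
    var str = token.value;
    var match;
    while ((match = re.exec(str)) !== null) {
      // a sentence ended: emit the buffered content plus the matched text
      sentences.push(buffer.join('') + match[0]);
      buffer = [];
      // continue with the rest, minus leading whitespace
      str = str.slice(re.lastIndex).replace(/^\s+/, '');
      re.lastIndex = 0;
    }
    if (str) {
      buffer.push(str); // unfinished sentence, keep it for later
    }
  });

  if (buffer.length) {
    sentences.push(buffer.join('')); // flush whatever is left
  }
  return sentences;
}

var result = splitSentences([
  { type: 'text', value: 'First sentence. Here is a ' },
  { type: 'html', value: '<a href="http://google.com">Google.</a>' },
  { type: 'text', value: ' link in the second sentence! The last sentence looks like ' },
  { type: 'html', value: '<b>this</b>' },
  { type: 'text', value: '??' }
]);
console.log(result);
```

With this input the sketch yields three sentences. The period inside the link does not end a sentence, because element HTML goes into the buffer untouched, and the trailing ?? is attached to the last sentence by the final flush.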