I am trying to parse a text document containing multiple bullets.
I would like a bullet point to keep going across single newline characters, but to end when two or more newline characters are found.
For example:
-----------------------------------
* bullet
text on new line
more text

this should be a separate block
-----------------------------------
When passed through the function, this should capture:
-----------------------------------
-> start
bullet
text on new line
more text
<- end capture
this should be a separate block
-----------------------------------
This is what I have so far. I have written a JavaScript function that recursively parses ordered/unordered MediaWiki-style lists to HTML. The only difference is that blocks are inserted on two line breaks, versus the MediaWiki way of one line break.
function parseLists(str)
{
    //How can I capture bulleted lines with less than or equal to "1" newline character?
    return str.replace(/(?:(?:(?:^|\n)[\*#].*)+)/g, function (match) {
        var listType = match.match(/(^|\n)#/) ? 'ol' : 'ul';
        match = match.replace(/(^|\n)[\*#][ ]{0,1}/g, "$1");
        match = parseLists(match);
        return '<'
            + listType + '><li>'
            + match.replace(/^\n/, '').split(/\n/).join('</li><li>')
            + '</li></' + listType
            + '>';
    });
}
http://jsfiddle.net/epinapala/L18y7zyx/7/
I think the problem is with the first regex, /(?:(?:(?:^|\n)[*#].*)+)/g, which matches the bullets: it stops as soon as a newline character is found. How can I capture bulleted lines that contain at most one consecutive newline character?
I would like to parse bullets that contain newlines, and to break a bullet only where two or more newline characters precede the content that follows.
[Edit] I was able to make some changes, and the current version of my function looks like this:
function parseLists2(str)
{
    return str.replace(/(?:(?:(?:^|\n)[\*#](?:.+\n)+.*))/g, function (match) {
        match = match.replace(/\n(?![#\*])/g, " ");
        //alert(match);
        var listType = match.match(/(^|\s)#/) ? 'ol' : 'ul';
        match = match.replace(/(^|\s)[\*#][ ]{0,1}/g, "$1");
        match = parseLists2(match);
        return '<'
            + listType + '><li>'
            + match.replace(/^\s/, '')
                .split(/\n/).join('</li><li>')
            + '</li></' + listType
            + '>';
    });
}
The only problem I am facing is that if I have a pattern like the one below:
* some ul item
* some ul item
# some ol item
the ul items are not split off as their own block unless a double line break separates them from the ol item.
Thanks!
You can first create the lists and the <li>s for your bullets using these two regexes:
/\*\s*(([^\n]*(\n|$))*?)(?=\n|#|\*|<[uo]l>|$)/g;
/#\s*(([^\n]*(\n|$))*?)(?=\n|#|\*|<[uo]l>|$)/g;
You can then join adjacent <ul>s and <ol>s using another regex:
/(<\/ul>\n?<ul>|<\/ol>\n?<ol>)/g;
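As a DOM-free sketch of the three steps above, the same replacements can be chained in a plain function and run on the example from the question. The name `listsToHtml` is a placeholder chosen here for illustration. The sketch assumes "\n" line endings and does not escape HTML in the input.

```javascript
// Hypothetical wrapper (the name is an assumption): chains the three regexes above.
function listsToHtml(text) {
  return text
    // Step 1: wrap each "*" bullet, including its continuation lines, in <ul><li>.
    .replace(/\*\s*(([^\n]*(\n|$))*?)(?=\n|#|\*|<[uo]l>|$)/g, '<ul><li>\n$1</li></ul>')
    // Step 2: same for "#" bullets, wrapped in <ol><li>.
    .replace(/#\s*(([^\n]*(\n|$))*?)(?=\n|#|\*|<[uo]l>|$)/g, '<ol><li>\n$1</li></ol>')
    // Step 3: drop the close/open tag pairs between adjacent lists of the same type.
    .replace(/(<\/ul>\n?<ul>|<\/ol>\n?<ol>)/g, '');
}

var sample = '* bullet\ntext on new line\nmore text\n\nthis should be a separate block';
console.log(listsToHtml(sample));
```

The lazy group consumes whole lines one at a time and stops at the first position where the lookahead sees a blank line, another bullet marker, or the end of the input, so the single newlines stay inside the <li> and the text after the blank line is left outside the list.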
The following snippet demonstrates the three replacements in the browser:
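// Re-render whenever the textarea content may have changed.
txt1.onkeyup = txt1.onkeydown = txt1.onchange = replace;
replace();

function replace() {
  txt2.innerHTML = txt1.value
    // Wrap each "*" bullet in its own <ul><li>.
    .replace(/\*\s*(([^\n]*(\n|$))*?)(?=\n|#|\*|<[uo]l>|$)/g, "<ul><li>\n$1</li></ul>")
    // Wrap each "#" bullet in its own <ol><li>.
    .replace(/#\s*(([^\n]*(\n|$))*?)(?=\n|#|\*|<[uo]l>|$)/g, "<ol><li>\n$1</li></ol>")
    // Join adjacent lists of the same type.
    .replace(/(<\/ul>\n?<ul>|<\/ol>\n?<ol>)/g, "");
}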
#txt1, #txt2 {
  width: 40%;
  height: 150px;
  display: inline-block;
  overflow-y: scroll;
}
<textarea id="txt1">
* aaaa
* bbbb
# cccc
# dddd

This text is separate.
</textarea><div id="txt2"></div>
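If a single regex gets hard to maintain, a different technique is to split the text on runs of two or more newlines first and then handle each block on its own. The sketch below is not the method used in the snippet above. The names `parseBlocks` and `renderBlock` are hypothetical, and it assumes "\n" line endings, flat (non-nested) lists, and input that needs no HTML escaping.

```javascript
// Hypothetical helper: turns one block that starts with "*" or "#" into list HTML.
function renderBlock(block) {
  var items = [];
  block.split('\n').forEach(function (line) {
    var m = /^([*#]) ?(.*)$/.exec(line);
    if (m) {
      // A marker at the start of a line begins a new item.
      items.push({ type: m[1] === '#' ? 'ol' : 'ul', text: m[2] });
    } else if (items.length > 0) {
      // A line after a single newline continues the current item.
      items[items.length - 1].text += ' ' + line;
    }
  });

  var html = '';
  items.forEach(function (item, i) {
    var prev = items[i - 1];
    var next = items[i + 1];
    // Open a list when the type changes, close it before the next change.
    if (!prev || prev.type !== item.type) html += '<' + item.type + '>';
    html += '<li>' + item.text + '</li>';
    if (!next || next.type !== item.type) html += '</' + item.type + '>';
  });
  return html;
}

// Hypothetical entry point: two or more newlines end a block.
function parseBlocks(text) {
  return text
    .split(/\n{2,}/)
    .filter(function (block) { return block.trim() !== ''; })
    .map(function (block) {
      // Blocks that do not start with a marker are passed through unchanged.
      return /^[*#]/.test(block) ? renderBlock(block) : block;
    })
    .join('\n');
}

console.log(parseBlocks('* some ul item\n* some ul item\n# some ol item'));
```

Because the list type is tracked per item, a run of "*" items directly followed by a "#" item closes the <ul> and opens an <ol> without needing a blank line between them.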