I have the text below:
This is a code update
* Official Name: Noner
* Pub: https://content.upcodes.co/viewer/washington/wa-mechanical-code-2021
* Agency: Agency Ni
* Reference:
https://web.archive.org/web/20230226234118/https://lawfilesext.leg.wa.gov/law/wsr/agency/BuildingCodeCouncil.htm
https://web.archive.org/web/20230303022030/https://lawfilesext.leg.wa.gov/law/wsr/2023/02/23-02-055.htm (#1)
* Citation: WAC 51-52 / WSR 23-02-055
* Draft Doc Title:
WSR 23-02-055 (#1)
* Draft Source Doc: https://web.archive.org/web/20230303022030/https://lawfilesext.leg.wa.gov/law/wsr/2023/02/23-02-055.htm (#1)
* Draft Drive: https://drive.google.com/file/d/1pYmwQS3t-ZX-Vyg9yBabtIpXZ7By2G6f/view?usp=share_link ( #1)
* Final Doc Title:
IECC Com Update(#1)
IECC Res Update (#2)
IECC Res Update (#3)
* Final Source Doc:
https://web.archive.org/web/20230303022130/https://apps.leg.wa.gov/wac/default.aspx?cite=51-52&full=true&pdf=true (#1)
https://web.archive.org/web/20230303022030/https://lawfilesext.leg.wa.gov/law/wsr/2023/02/23-02-055.htm (#2)
* Final Drive: https://web.archive.org/web/20230303022130/https://apps.leg.wa.gov/wac/default.aspx?cite=51-52&full=true&pdf=true (#1)
https://web.archive.org/web/2023030302fdfdfg2130/https://apps.legfdg.gov/wac/default.aspx?cite=51-52&fdsfullfdsf=true&pfdsfdf=true (#2)
* Effective Date: January 4, 2023
I want to extract the information under the 'Reference:' tag, but the code below only gives me one line. I want to capture all of the text up to the next asterisk.
//Extract Reference
var reference = description.search("Reference:");
if(reference != -1){
reference = description.match(/(?<=^\* Reference\s*:)[\s]*[\n]*[^\n\r]*/m);
reference = reference?.[0].trim();
}else{
reference = '';
}
console.log('Reference: ' + reference);
Expected Output:
https://web.archive.org/web/20230226234118/https://lawfilesext.leg.wa.gov/law/wsr/agency/BuildingCodeCouncil.htm
https://web.archive.org/web/20230303022030/https://lawfilesext.leg.wa.gov/law/wsr/2023/02/23-02-055.htm (#1)
I decided to follow @Nick's idea of not making any assumptions about the subject string.
I came up with two approaches that are lenient about the input.
The first works whatever the content is:
let ref_pat = /^\* Reference:\s*(.*\S(?:\s+.*\S)*?)??\s*(?:^\*|(?![\s\S]))/m;
let reference = description.match(ref_pat)?.[1] ?? '';
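To see how the first pattern behaves at the edges, here is a minimal sketch that wraps it in a helper and runs it on a few made-up inputs. The function name `extractReference` and the `example.com` sample text are assumptions for illustration only; the regular expression is the one shown above.

```javascript
// Hypothetical helper name; the pattern itself is unchanged from the answer.
function extractReference(description) {
  const refPat = /^\* Reference:\s*(.*\S(?:\s+.*\S)*?)??\s*(?:^\*|(?![\s\S]))/m;
  // Group 1 is undefined when the block is empty, and match() returns null
  // when there is no Reference tag, so both cases fall back to ''.
  return description.match(refPat)?.[1] ?? '';
}

// Shortened, made-up sample with the same shape as the question's text.
const sample = [
  '* Agency: Agency Ni',
  '* Reference:',
  'https://example.com/a',
  'https://example.com/b (#1)',
  '* Citation: WAC 51-52',
].join('\n');

// Several lines: the lazy group grows until the next line starting with '*'.
const multiLine = extractReference(sample);
console.log(multiLine);
// https://example.com/a
// https://example.com/b (#1)

// Reference is the last block: the (?![\s\S]) branch matches the end of the text.
const lastBlock = extractReference('* Reference:\nhttps://example.com/a\n');

// Empty block: the optional group is skipped and the result is ''.
const emptyBlock = extractReference('* Reference:\n* Citation: x');

// No Reference tag at all: the result is ''.
const missing = extractReference('* Citation: x');
```

The `??` after the capture group makes it optional and lazy, which is what lets an empty Reference block match without swallowing the following tag.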
A second more efficient pattern is possible if you assume the content doesn't contain asterisk characters:
let ref_pat = /^\* Reference:\s*([^*]*[^*\s])/m;
let reference = description.match(ref_pat)?.[1] ?? '';
That is its only limitation, but this pattern is by far the simpler of the two.
Whichever one you choose, the result is already trimmed.
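The trade-off between the two patterns can be checked with a short sketch like the one below. The names `extractAnyContent` and `extractUntilAsterisk` and the `example.com` inputs are made up for illustration; only the two regular expressions come from the answer.

```javascript
// First pattern: stops only at a line that starts with '*' (or at the end).
const anyContentPat = /^\* Reference:\s*(.*\S(?:\s+.*\S)*?)??\s*(?:^\*|(?![\s\S]))/m;
// Second pattern: stops at the first '*' anywhere in the content.
const untilAsteriskPat = /^\* Reference:\s*([^*]*[^*\s])/m;

// Hypothetical helper names, one per pattern.
const extractAnyContent = (text) => text.match(anyContentPat)?.[1] ?? '';
const extractUntilAsterisk = (text) => text.match(untilAsteriskPat)?.[1] ?? '';

// Content without asterisks: both patterns return the same two lines.
const clean = '* Reference:\nhttps://example.com/a\nhttps://example.com/b (#1)\n* Citation: x';
const cleanFirst = extractAnyContent(clean);
const cleanSecond = extractUntilAsterisk(clean);

// Content with an asterisk inside a line: the second pattern cuts it short.
const withStar = '* Reference:\nhttps://example.com/a*b\n* Citation: x';
const starFirst = extractAnyContent(withStar);     // 'https://example.com/a*b'
const starSecond = extractUntilAsterisk(withStar); // 'https://example.com/a'

// Extra whitespace around the content: both results come back trimmed.
const padded = '* Reference:   \n  https://example.com/a   \n\n* Citation: x';
const paddedFirst = extractAnyContent(padded);
const paddedSecond = extractUntilAsterisk(padded);

console.log({ cleanFirst, cleanSecond, starFirst, starSecond, paddedFirst, paddedSecond });
```

If the Reference lines can ever contain an asterisk, such as in a URL or a footnote marker, the first pattern is the safer choice.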