How to convert a camelCased string to sentence case without dropping any special characters?
Can you suggest a regex for converting a camelCased string that contains special characters and numbers to sentence case? For example:
const string = `includes:SummaryFromDetailHistory1990-AsAbstract`
Expected outcome:
Includes : Summary From Detail History 1990 - As Abstract
Currently I'm using lodash startCase to convert camelCase to sentence case. The issue with this approach is that it removes special characters like brackets, numbers, parentheses, hyphens, colons, etc. (most of the special characters).
So the idea is to convert camelCased strings to sentence case while preserving every character of the original string.
For example:
const anotherString = `thisIsA100CharactersLong:SampleStringContaining-SpecialChar(s)10&20*`
const expectedReturn = `This Is A 100 Characters Long : Sample String Containing - Special Char(s) 10 & 20 *`
Is that possible with regex?
You'll have to deal with all the cases yourself:
[a-z](?=[A-Z]) : lowercase followed by uppercase
[a-zA-Z](?=[0-9]) : letter followed by digit
[0-9](?=[a-zA-Z]) : digit followed by letter
[a-zA-Z0-9](?=[^a-zA-Z0-9]) : letter or digit followed by neither letter nor digit (\w and \W could be used, but \w also covers _, so that is up to you)
[^a-zA-Z0-9](?=[a-zA-Z0-9]) : neither letter nor digit, followed by either letter or digit
Then, you can or them together:
([a-z](?=[A-Z])|[a-zA-Z](?=[0-9])|[0-9](?=[a-zA-Z])|[a-zA-Z0-9](?=[^a-zA-Z0-9])|[^a-zA-Z0-9](?=[a-zA-Z0-9]))
And replace with "$1 " (note the space after $1; the quotes are only there to make it visible).
See https://regex101.com/r/4AVbAs/1 for instance.
You will hit edge cases though, e.g. Char(s), so you'll need special rules for the parentheses, for instance (see the following section about lookbehinds, which can help with that). It is a bit of a tough job, quite error-prone and hard to maintain, I'm afraid.
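To try the capture-group pattern outside regex101, it can be passed to String.prototype.replace. This is only a minimal sketch: the helper name toSentenceCase and the step that upper-cases the first character are assumptions added for illustration, not part of the pattern itself.

```javascript
// The five alternatives from above, or-ed together inside one capture group.
const BOUNDARY =
  /([a-z](?=[A-Z])|[a-zA-Z](?=[0-9])|[0-9](?=[a-zA-Z])|[a-zA-Z0-9](?=[^a-zA-Z0-9])|[^a-zA-Z0-9](?=[a-zA-Z0-9]))/g;

// Hypothetical helper: insert a space after every captured character,
// then upper-case the first character of the result.
function toSentenceCase(input) {
  const spaced = input.replace(BOUNDARY, '$1 ');
  return spaced.charAt(0).toUpperCase() + spaced.slice(1);
}

console.log(toSentenceCase('includes:SummaryFromDetailHistory1990-AsAbstract'));
// Includes : Summary From Detail History 1990 - As Abstract

console.log(toSentenceCase('thisIsA100CharactersLong:SampleStringContaining-SpecialChar(s)10&20*'));
// This Is A 100 Characters Long : Sample String Containing - Special Char ( s ) 10 & 20 *
```

The second call shows the parenthesis edge case: Char(s) comes out as Char ( s ), because every letter/non-letter boundary gets a space.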
If lookbehinds are allowed, you do not need to capture the first character in each group. Instead, wrap the left-hand patterns in (?<=...) and replace with a single space directly:
(?<=[a-z])(?=[A-Z]) : preceded by lowercase, followed by uppercase
(?<=[a-zA-Z])(?=[0-9]) : preceded by letter, followed by digit
(?<=[0-9])(?=[a-zA-Z]) : preceded by digit, followed by letter
(?<=[a-zA-Z0-9])(?=[^a-zA-Z0-9])(?!(?:\(s)?\)) : preceded by letter or digit, followed by neither letter nor digit, and not followed by (s) nor )
(?<=[^a-zA-Z0-9])(?<!\()(?=[a-zA-Z0-9]) : preceded by neither letter nor digit, but not preceded by (, followed by letter or digit
Or-ed together:
(?<=[a-z])(?=[A-Z])|(?<=[a-zA-Z])(?=[0-9])|(?<=[0-9])(?=[a-zA-Z])|(?<=[a-zA-Z0-9])(?=[^a-zA-Z0-9])(?!(?:\(s)?\))|(?<=[^a-zA-Z0-9])(?<!\()(?=[a-zA-Z0-9])
Replace with a single space; see https://regex101.com/r/DB91DE/1.
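In JavaScript, lookbehind assertions were added in ES2018, so this pattern should work in engines that implement them (recent Node.js and current browsers, as far as I know). A minimal sketch, where the helper name splitCamel and the first-letter capitalisation are assumptions added for illustration:

```javascript
// Zero-width boundaries: nothing is captured, so the replacement is a plain space.
const GAP =
  /(?<=[a-z])(?=[A-Z])|(?<=[a-zA-Z])(?=[0-9])|(?<=[0-9])(?=[a-zA-Z])|(?<=[a-zA-Z0-9])(?=[^a-zA-Z0-9])(?!(?:\(s)?\))|(?<=[^a-zA-Z0-9])(?<!\()(?=[a-zA-Z0-9])/g;

// Hypothetical helper: put a space at every boundary, then upper-case the first character.
function splitCamel(input) {
  const spaced = input.replace(GAP, ' ');
  return spaced.charAt(0).toUpperCase() + spaced.slice(1);
}

console.log(splitCamel('includes:SummaryFromDetailHistory1990-AsAbstract'));
// Includes : Summary From Detail History 1990 - As Abstract

console.log(splitCamel('thisIsA100CharactersLong:SampleStringContaining-SpecialChar(s)10&20*'));
// This Is A 100 Characters Long : Sample String Containing - Special Char(s) 10 & 20 *
```

Note that the parenthesis exception is hard-coded to (s) and ), so other parenthesised text may be spaced differently.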
The wanted result doesn't seem to be regular: some special characters are supposed to be preceded by a space and some are not. Treating the parentheses the way you want is a bit tricky. You can use a replacer function to handle the parentheses, like this:
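// Flag that is set right after an opening parenthesis has been seen.
let parenth = 0;
const str = `thisIsA100CharactersLong:SampleStringContaining-SpecialChar(s)10&20*`,
  // Match each uppercase letter, run of digits, or non-word character.
  spaced = str.replace(/[A-Z]|\d+|\W/g, (m) => {
    // Opening parenthesis: keep it attached to the preceding word.
    if (m === '(') {
      parenth = 1;
      return m;
    }
    // First match after "(", or a closing parenthesis: no space added.
    if (parenth || m === ')') {
      parenth = 0;
      return m;
    }
    // Everything else gets a leading space.
    return ` ${m}`;
  });
console.log(spaced);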
If the data can contain other brackets, instead of checking only for parentheses, use a RegExp to test for any opening bracket: if (/[({[]/.test(m)) ..., and for any closing bracket: if (/[)}\]]/.test(m)) ....
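Put together, the bracket-aware variant could look roughly like the sketch below. The function name spaceCamel, the local afterOpening flag (instead of a variable outside the callback) and the final trim-and-capitalise step are assumptions added for illustration.

```javascript
// Hypothetical helper wrapping the replacer approach; treats (), [] and {} alike.
function spaceCamel(str) {
  let afterOpening = false; // true right after an opening bracket was seen
  const spaced = str.replace(/[A-Z]|\d+|\W/g, (m) => {
    if (/[({[]/.test(m)) {
      // Opening bracket: keep it attached to the preceding word.
      afterOpening = true;
      return m;
    }
    if (afterOpening || /[)}\]]/.test(m)) {
      // First match after an opening bracket, or a closing bracket: no space.
      afterOpening = false;
      return m;
    }
    // Everything else gets a leading space.
    return ` ${m}`;
  });
  // Assumption: drop the space a leading capital would produce,
  // then upper-case the first character.
  const trimmed = spaced.trimStart();
  return trimmed.charAt(0).toUpperCase() + trimmed.slice(1);
}

console.log(spaceCamel('thisIsA100CharactersLong:SampleStringContaining-SpecialChar(s)10&20*'));
// This Is A 100 Characters Long : Sample String Containing - Special Char(s) 10 & 20 *

console.log(spaceCamel('list[0]Item{x}Done'));
// List[0] Item{x} Done
```

Keeping the flag inside the function means each call starts from a clean state, which matters if the helper is applied to many strings.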
You can test the snippet with different data at jsFiddle.