I have a very large file containing thousands of sentences. In all of them, the first word of each sentence begins with lowercase, but I need them to begin with uppercase.
I looked through the site trying to find a regex to do this but I was unable to. I learned a lot about regex in the process, which is always a plus for my job, but I was unable to find specifically what I am looking for.
I tried combining the code from several answers, but for different reasons none of them served my purpose.
I am working with a translation-specific application which accepts regex.
Do you think this is possible at all? It would save me hours of tedious work.
You can use this regex to search for the first letters of sentences:
(?<=[\.!?]\s)([a-z])
It matches a lowercase letter [a-z], following the end of a previous sentence (which might end with one of the following: [\.!?]) and a space character \s.
Then make a substitution with \U$1.
The only case it doesn't handle is the very first sentence, which has no preceding punctuation to match. I intentionally kept the regex simple, because it's easy to capitalize the very first letter manually.
Working example: https://regex101.com/r/hqwK26/1
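If the text can be processed outside the translation tool, the same search can be sketched in Python. This is only an illustration, not what the tool does internally: Python's re module does not accept \U in replacement strings, so a replacement function does the uppercasing instead. The names SENTENCE_START and capitalize_after_punctuation are made up for this example.

```python
import re

# The search pattern from the answer, unchanged: a lowercase ASCII letter
# preceded by sentence-ending punctuation and one whitespace character.
SENTENCE_START = re.compile(r"(?<=[\.!?]\s)([a-z])")


def capitalize_after_punctuation(text: str) -> str:
    """Uppercase every letter matched by SENTENCE_START (hypothetical helper)."""
    # Python's re has no \U replacement escape, so a callable replacement
    # stands in for the \U$1 substitution.
    return SENTENCE_START.sub(lambda m: m.group(1).upper(), text)


sample = "first sentence. second one! third one? fourth."
result = capitalize_after_punctuation(sample)
print(result)
```

As described above, the first letter of the text is left alone, because nothing precedes it for the lookbehind to match.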
UPD: If your software doesn't support \U, you might want to copy your text to Notepad++ and make the replacement there. \U is fully supported there; I just checked.
UPD2: According to the comments, the task is slightly different, and just the first letters of each line should be capitalized.
There is a simple regex for that: ^([a-z]), with the same substitution pattern.
Here is a working example: https://regex101.com/r/hqwK26/2
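A rough Python equivalent of the per-line version, for anyone who wants to try it outside the tool. One assumption to flag: ^ only matches at the start of every line when multiline mode is on (re.MULTILINE here); whether your application enables that by default is something to check. LINE_START and capitalize_line_starts are illustrative names.

```python
import re

# ^ matches at the start of each line only because of re.MULTILINE;
# without the flag it would match at the start of the whole text only.
LINE_START = re.compile(r"^([a-z])", re.MULTILINE)


def capitalize_line_starts(text: str) -> str:
    """Uppercase a lowercase ASCII letter at the start of each line (hypothetical helper)."""
    # Callable replacement, since Python's re has no \U escape.
    return LINE_START.sub(lambda m: m.group(1).upper(), text)


sample = "first line\nsecond line\n3rd line"
result = capitalize_line_starts(sample)
print(result)
```

Lines that start with a digit or any other non-letter are left unchanged, because [a-z] does not match them.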
Taking Ildar's answer and combining both of his patterns should work with no compromises.
(?<=[\.!?]\s)([a-z])|^([a-z])
This basically says: match the first pattern OR the second pattern. But because you're now technically capturing 2 groups instead of one, you'll also have to refer to the second group as $2. That should be fine, because only one of the two alternatives can match at any given position, so the other group stays empty.
So your substitution pattern would then be as follows...
\U$1$2
Here's a working example, again based on Ildar's answer... https://regex101.com/r/hqwK26/13
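The combined pattern can be sketched in Python as well, under the same assumptions as any port outside the tool: multiline mode is switched on explicitly, and a replacement function replaces \U$1$2, since Python's re has no \U escape. COMBINED and capitalize_all are names invented for this sketch.

```python
import re

# Both alternatives in one pattern: after sentence-ending punctuation
# plus whitespace, or at the start of a line (hence re.MULTILINE).
COMBINED = re.compile(r"(?<=[\.!?]\s)([a-z])|^([a-z])", re.MULTILINE)


def capitalize_all(text: str) -> str:
    """Uppercase sentence-initial and line-initial letters (hypothetical helper)."""
    # Only one group participates in a given match; the other is None,
    # so "group 1 or group 2" picks whichever letter was captured.
    return COMBINED.sub(lambda m: (m.group(1) or m.group(2)).upper(), text)


sample = "first line. second sentence\nnext line! another one"
result = capitalize_all(sample)
print(result)
```

Unlike the first pattern on its own, this version also capitalizes the very first letter of the text, because ^ matches there.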