I'm trying to write a program that capitalizes the first letter of each sentence. This is what I have so far, but I cannot figure out how to add back the period in between sentences. For example, if I input:
hello. goodbye
the output is
Hello Goodbye
and the period has disappeared.
string=input('Enter a sentence/sentences please:')
sentence=string.split('.')
for i in sentence:
    print(i.capitalize(),end='')
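The period disappears because split('.') removes the separator it splits on. A minimal sketch of one way to keep it is to join the pieces back together with '.' after capitalizing them. The function name capitalize_sentences is only illustrative, and the sketch assumes that sentences end with a period only:

```python
def capitalize_sentences(text):
    """Uppercase the first letter of each '.'-separated piece, keeping the periods."""
    fixed = []
    for part in text.split('.'):
        stripped = part.lstrip()
        # Preserve whatever whitespace followed the previous period.
        leading = part[:len(part) - len(stripped)]
        # Uppercase only the first character; leave the rest untouched.
        fixed.append(leading + stripped[:1].upper() + stripped[1:])
    # Joining with '.' restores the separators that split() removed.
    return '.'.join(fixed)


print(capitalize_sentences('hello. goodbye'))  # Hello. Goodbye
```

Unlike capitalize(), this leaves the remaining characters of each sentence as they were typed.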
The first letter of a string can be capitalized using the capitalize() method. It returns a copy of the string with the first character uppercased and the rest lowercased. If you want to capitalize the first letter of every word in the string, use the title() method instead.
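A quick comparison of the two built-in string methods, using an arbitrary sample string:

```python
s = 'hello world. goodbye NOW'

# capitalize(): first character uppercased, everything else lowercased.
print(s.capitalize())  # Hello world. goodbye now

# title(): first letter of every word uppercased, the rest lowercased.
print(s.title())       # Hello World. Goodbye Now
```

Neither method knows about sentence boundaries, which is why the text has to be split into sentences first.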
You should always capitalize the first letter of the first word in a sentence, no matter what the word is. Take, for example, the following sentences: "The weather was beautiful. It was sunny all day." Even though "the" and "it" aren't proper nouns, they're capitalized here because they're the first words in their sentences.
You could use nltk for sentence segmentation:
#!/usr/bin/env python3
import textwrap
from pprint import pprint
import nltk.data # $ pip install http://www.nltk.org/nltk3-alpha/nltk-3.0a3.tar.gz
# python -c "import nltk; nltk.download('punkt')"
sent_tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
text = input('Enter a sentence/sentences please:')
print("\n" + textwrap.fill(text))
sentences = sent_tokenizer.tokenize(text)
sentences = [sent.capitalize() for sent in sentences]
pprint(sentences)
Enter a sentence/sentences please: a period might occur inside a sentence e.g., see! and the sentence may end without the dot!
['A period might occur inside a sentence e.g., see!',
 'And the sentence may end without the dot!']
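Note that str.capitalize() also lowercases the rest of each sentence, so an acronym inside a sentence would be changed too. If that matters, a small helper could replace the capitalize() call in the list comprehension. The name upper_first is hypothetical, and the sample sentences below are hard-coded so the sketch runs without nltk:

```python
def upper_first(sentence):
    """Uppercase the first character and leave the rest of the sentence as is."""
    return sentence[:1].upper() + sentence[1:]


# Stand-in for the list returned by sent_tokenizer.tokenize(text).
sentences = ['see the NLTK docs!', 'and the sentence may end without the dot']
print([upper_first(sent) for sent in sentences])
# ['See the NLTK docs!', 'And the sentence may end without the dot']
```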
You could use regular expressions. Define a regex that matches the first word of a sentence:
import re
p = re.compile(r'(?<=[\.\?!]\s)(\w+)')
This regex contains a positive lookbehind assertion (?<=...) which matches either a ., ? or !, followed by a whitespace character \s. This is followed by a group that matches one or more alphanumeric characters \w+. In effect, it matches the next word after the end of a sentence.
You can define a function that will capitalise regex match objects, and feed this function to sub():
def cap(match):
    return match.group().capitalize()
p.sub(cap, 'Your text here. this is fun! yay.')
You might want to do the same for another regex that matches the word at the beginning of a string:
p2 = re.compile(r'^\w+')
Or make the original regex even harder to read, by combining them:
p = re.compile(r'((?<=[\.\?!]\s)(\w+)|(^\w+))')
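Putting the pieces together, here is a self-contained sketch of the combined approach. The wrapper name capitalize_with_regex is only illustrative, and the pattern is a slightly simplified form of the combined regex above, with no capture groups or escapes inside the character class:

```python
import re

# Matches the first word of the string, or the first word that follows
# '.', '?' or '!' plus exactly one whitespace character.
pattern = re.compile(r'(?<=[.?!]\s)\w+|^\w+')


def cap(match):
    """Return the matched word with its first character uppercased."""
    word = match.group()
    return word[:1].upper() + word[1:]


def capitalize_with_regex(text):
    # sub() calls cap() for every match and substitutes its return value.
    return pattern.sub(cap, text)


print(capitalize_with_regex('your text here. this is fun! yay.'))
# Your text here. This is fun! Yay.
```

Because the lookbehind only checks for punctuation followed by whitespace, abbreviations such as "e.g. see" are also treated as sentence ends, which is the case the nltk tokenizer above handles better.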