I am trying to read in a file and remove all punctuation from it. I've been using ispunct() while iterating through each line to check whether a character is punctuation, but it doesn't seem to catch all of the punctuation. I wanted to know if I am doing something wrong. Here is my input, my code, and the output I get:
Input file (2.txt):
How are you?
I'm fine, thanks.

Code:
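#include <cctype>
#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
using namespace std;

// Removes punctuation, digits, and extra spaces in place.
void removeNonAlph(string &tmp)
{
    for (int i = 0; i < static_cast<int>(tmp.length()); i++)
    {
        // Cast to unsigned char: passing a negative char to ispunct/isdigit is undefined behavior.
        unsigned char c = static_cast<unsigned char>(tmp[i]);
        if (ispunct(c))
            tmp.erase(i--, 1);
        else if (isdigit(c))
            tmp.erase(i--, 1);
        else if ((tmp[i] == ' ') && (tmp[i+1] == ' '))
            tmp.erase(i--, 1);
    }
}

// toLower was not shown in the original post; assumed here to lowercase the string in place.
void toLower(string &tmp)
{
    for (char &c : tmp)
        c = static_cast<char>(tolower(static_cast<unsigned char>(c)));
}

int main(int argc, const char * argv[])
{
    ifstream file("2.txt");
    string tmp;
    string words[500];
    while (getline(file, tmp))
    {
        removeNonAlph(tmp);
        toLower(tmp);
        cout << tmp << endl;
    }
    file.close();
}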

Output:
how are you
i'm fine thanks
(Comments moved to answer for easy discovery by future readers)
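Beware of editors putting non-ASCII quotes into your text files. Many editors generate "smart quotes", which look nicer because the left and right quotes are drawn differently, but they are stored using non-ASCII character codes. In the default "C" locale, ispunct only recognizes ASCII punctuation, so a curly apostrophe is not removed. That would explain why the apostrophe in "i'm" survives in your output while the comma, period, and question mark are stripped.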
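If the file does turn out to contain smart quotes, one possible workaround is to map them to plain ASCII quotes before calling ispunct. The sketch below assumes the file is UTF-8 encoded, where the curly quotes U+2018, U+2019, U+201C, and U+201D are each stored as three bytes. The helper names asciiQuotes and stripPunct are made up for illustration and are not part of the original code.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: replace the UTF-8 byte sequences of common
// "smart" quotes with their plain ASCII equivalents.
// Assumes the input text is UTF-8 encoded.
std::string asciiQuotes(std::string text)
{
    struct Mapping { std::string from; std::string to; };
    static const Mapping table[] = {
        {"\xE2\x80\x98", "'"},   // U+2018 left single quotation mark
        {"\xE2\x80\x99", "'"},   // U+2019 right single quotation mark (curly apostrophe)
        {"\xE2\x80\x9C", "\""},  // U+201C left double quotation mark
        {"\xE2\x80\x9D", "\""},  // U+201D right double quotation mark
    };
    for (const Mapping &m : table) {
        std::string::size_type pos = 0;
        while ((pos = text.find(m.from, pos)) != std::string::npos) {
            text.replace(pos, m.from.size(), m.to);
            pos += m.to.size();  // continue searching after the replacement
        }
    }
    return text;
}

// Hypothetical helper: return a copy of text without ASCII punctuation.
// Each char is read as unsigned char so bytes above 0x7F are passed to
// ispunct safely.
std::string stripPunct(const std::string &text)
{
    std::string out;
    for (unsigned char c : text) {
        if (!std::ispunct(c))
            out += static_cast<char>(c);
    }
    return out;
}
```

With these helpers, calling stripPunct(asciiQuotes(tmp)) on each line should remove the curly apostrophe along with the rest of the punctuation. Other non-ASCII punctuation (dashes, ellipses, and so on) would need its own entries in the table.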