
ispunct() does not detect single quote character

Tags:

c++

I am trying to read in a file and remove all punctuation from it. I've been using ispunct() to check each character of every line, but it doesn't seem to catch all punctuation: the single quote in "I'm" is left in the output. Am I doing something wrong? Here are my input file and my code:

2.txt

How are you?

I'm fine, thanks.

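#include <cctype>
#include <fstream>
#include <iostream>
#include <string>
using namespace std;

//removes punctuation, numbers, and extra spaces
void removeNonAlph(string &tmp)
{
    for (int i = 0; i < static_cast<int>(tmp.length()); i++)
    {
        // cast first: passing a negative char to the <cctype> functions is undefined behavior
        unsigned char c = static_cast<unsigned char>(tmp[i]);
        if (ispunct(c))
            tmp.erase(i--, 1);
        else if (isdigit(c))
            tmp.erase(i--, 1);
        else if ((tmp[i] == ' ') && (tmp[i+1] == ' '))
            tmp.erase(i--, 1);
    }
}

//converts the string to lowercase in place
//(the post calls toLower() without showing it; this minimal version is assumed)
void toLower(string &tmp)
{
    for (char &c : tmp)
        c = static_cast<char>(tolower(static_cast<unsigned char>(c)));
}

int main(int argc, const char * argv[])
{
    ifstream file("2.txt");
    string tmp;

    while (getline(file, tmp))
    {
        removeNonAlph(tmp);
        toLower(tmp);
        cout << tmp << endl;
    }

    file.close();
}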

Output:

how are you

i'm fine thanks

asked Apr 24 '26 17:04 by Brenda Gonzalez


1 Answer

(Comments moved to answer for easy discovery by future readers)

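Beware of editors putting non-ASCII quotes into your text files. Many editors generate "smart quotes", which look nicer because the left and right quotes are drawn differently, and they store them using non-ASCII character codes. In the default "C" locale, ispunct only recognizes ASCII punctuation. A right single quote (U+2019) saved as UTF-8 is the three bytes 0xE2 0x80 0x99, and none of those bytes is classified as punctuation, so the character passes through untouched.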

answered May 01 '26 02:05 by Tony Delroy


