
Separating alphabetic characters in C++ STL

Tags: c++, algorithm, stl

I've been practicing C++ for a competition next week. The sample problem I've been working on requires splitting paragraphs into words. Of course, that's easy. But this problem is weird: words like isn't should be split as well, into isn and t. I know it's weird, but I have to follow it.

I have a function split() that takes a constant char delimiter as one of its parameters. It's what I use to separate words on spaces. But I can't figure this one out. Even strings with digits, like phil67bs, should be separated into phil and bs.

And no, I'm not asking for full code. Pseudocode will do, or anything that helps me understand what to do. Thanks!

PS: Please no recommendations for external libs. Just the STL. :)

LOLcode asked Jan 28 '26 03:01


1 Answer

Filter out numbers, spaces, and anything else that isn't a letter by using a custom locale. See this SO thread about treating everything but numbers as whitespace. Use a mask and do something similar to what Jerry Coffin suggests, but only for letters:

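#include <algorithm>
#include <locale>
#include <vector>

// ctype facet that classifies every character except A-Z and a-z as whitespace.
struct alphabet_only: std::ctype<char> 
{
    alphabet_only(): std::ctype<char>(get_table()) {}

    static std::ctype_base::mask const* get_table()
    {
        // Start with every character classified as space...
        static std::vector<std::ctype_base::mask> 
            rc(std::ctype<char>::table_size,std::ctype_base::space);

        // ...then mark the letters. std::fill excludes the end of the range,
        // so the ranges end at '[' and '{', the characters after 'Z' and 'z' in ASCII.
        std::fill(&rc['A'], &rc['['], std::ctype_base::upper);
        std::fill(&rc['a'], &rc['{'], std::ctype_base::lower);
        return &rc[0];
    }
};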

And, boom! You're golden.
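To actually use the facet, it has to be imbued into the stream the words are read from. Here is a minimal sketch of that step, assuming the input is already in a std::string and C++11 is available. The helper name split_letters is made up for illustration, and the facet is repeated so the snippet compiles on its own:

```cpp
#include <algorithm>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Same facet as above, repeated so this sketch is self-contained.
struct alphabet_only : std::ctype<char>
{
    alphabet_only() : std::ctype<char>(get_table()) {}

    static std::ctype_base::mask const* get_table()
    {
        static std::vector<std::ctype_base::mask>
            rc(std::ctype<char>::table_size, std::ctype_base::space);
        std::fill(&rc['A'], &rc['['], std::ctype_base::upper);
        std::fill(&rc['a'], &rc['{'], std::ctype_base::lower);
        return &rc[0];
    }
};

// Hypothetical helper (not from the question): split text into runs of letters.
std::vector<std::string> split_letters(const std::string& text)
{
    std::istringstream in(text);
    // The locale takes ownership of the facet and deletes it when done.
    in.imbue(std::locale(std::locale(), new alphabet_only));

    // operator>> now treats every non-letter as a separator.
    std::vector<std::string> words;
    std::string word;
    while (in >> word)
        words.push_back(word);
    return words;
}
```

With this, split_letters("isn't phil67bs") should give isn, t, phil and bs. The same imbue call works on std::cin if the input comes from standard input.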

Or... you could just do a transform:

// Needs <algorithm>, <cctype>, <iterator>, <vector>; myVector holds the input chars.
char changeToLetters(char input){ return isalpha(static_cast<unsigned char>(input)) ? input : ' '; }

vector<char> output;
output.reserve( myVector.size() );
transform( myVector.begin(), myVector.end(), back_inserter(output), changeToLetters );

Which, um, is much easier to grok, just not as efficient as Jerry's idea.
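Once everything that isn't a letter has become a space, an ordinary whitespace split finishes the job. A rough sketch of the whole thing, assuming the text is in a std::string and C++11 lambdas are allowed. The name split_by_transform is hypothetical, and it replaces characters in place rather than copying into a second vector:

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: blank out non-letters, then split on whitespace.
std::vector<std::string> split_by_transform(std::string text)
{
    // Replace every non-letter with a space, in place.
    // Taking the character as unsigned char keeps std::isalpha well defined.
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char c) -> char {
                       return std::isalpha(c) ? static_cast<char>(c) : ' ';
                   });

    // A plain stream extraction now yields one word per run of letters.
    std::istringstream in(text);
    std::vector<std::string> words;
    std::string word;
    while (in >> word)
        words.push_back(word);
    return words;
}
```

For example, split_by_transform("isn't phil67bs") should return isn, t, phil and bs. If you already have your own split() that takes a char delimiter, you could call it with ' ' after the transform instead of using the stream, provided it skips empty pieces.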

Edit:

Changed 'Z' to '[' so that 'Z' itself is filled (the end of the range passed to std::fill is exclusive). Likewise 'z' to '{'.

wheaties answered Jan 29 '26 19:01