Word Frequency Statistics

In a pre-interview, I was faced with a question like this:

Given a string consisting of words separated by a single white space, print out the words in descending order, sorted by the number of times they appear in the string.

For example, an input string of “a b b” would generate the following output:

b : 2
a : 1

Firstly, I'd say it is not clear whether the input string is made up of single-letter words or multiple-letter words. If the former is the case, it could be simple.

Here is my thought:

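int c[26] = {0};
char *pIn = strIn;

/* walk the whole string; assumes every word is a single lowercase letter */
while (*pIn != 0)
{
    if (*pIn >= 'a' && *pIn <= 'z')
        ++c[*pIn - 'a'];   /* index by letter offset, not by raw character code */
    ++pIn;
}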

/* how to sort the array c[26] and remember the original index? */

I can count the frequency of every single-letter word in the input string, and I can sort the counts (using QuickSort or whatever). But after the count array is sorted, how do I recover the single-letter word associated with each count so that I can print them out in pairs later?

If the input string is made up of multiple-letter words, I plan to use a map<const char *, int> to track the frequency. But again, how do I sort the map's key-value pairs by value?

The question is in C or C++, and any suggestions are welcome.

Thanks!


asked Dec 30 '11 15:12

Qiang Xu

2 Answers

I would use a std::map<std::string, int> to store the words and their counts. Then I would use something like this to get the words:

while(std::cin >> word) {
    // increment map's count for that word
}

Finally, you just need to figure out how to print them in order of frequency; I'll leave that as an exercise for you.
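
For readers who want a starting point on that exercise, here is one possible sketch, not the only way to do it. It assumes the input arrives as a std::string rather than on std::cin, so it reads words through a std::istringstream. The function name wordsByFrequency is made up for this example. Because a std::map is ordered by key, the pairs are copied into a vector and sorted by count there.

```cpp
#include <algorithm>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (name invented for this sketch): counts the words in
// `text` and returns (word, count) pairs ordered by descending count.
std::vector<std::pair<std::string, int>> wordsByFrequency(const std::string& text) {
    std::map<std::string, int> counts;
    std::istringstream in(text);  // assumption: input is a string, not std::cin
    std::string word;
    while (in >> word) {
        ++counts[word];  // operator[] value-initializes a missing count to 0
    }

    // A std::map cannot be re-sorted by value, so copy the pairs out first.
    std::vector<std::pair<std::string, int>> result(counts.begin(), counts.end());

    // stable_sort keeps words with equal counts in the map's alphabetical order.
    std::stable_sort(result.begin(), result.end(),
                     [](const std::pair<std::string, int>& a,
                        const std::pair<std::string, int>& b) {
                         return a.second > b.second;
                     });
    return result;
}
```

Looping over the result and printing each pair as `p.first << " : " << p.second` gives "b : 2" followed by "a : 1" for the input "a b b".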


answered Oct 17 '22 01:10

Evan Teran

You're definitely wrong in assuming that you need only 26 options, 'cause your employer will want to allow multiple-character words as well (and maybe even numbers?).

This means you're going to need an array with a variable length. I strongly recommend using a vector or, even better, a map.

To find the character sequences in the string, take your current position (start at 0) and find the position of the next space. The characters between the two are the word. Set the current position to just past the space and do it again. Keep repeating this until you reach the end of the string.
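
The scan described above could be sketched roughly as follows, using std::string::find. The function name splitOnSpaces is invented for this example, and skipping empty pieces (from doubled or trailing spaces) is an extra assumption the question does not require, since it promises single spaces.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (name invented for this sketch): splits `text` into
// words by repeatedly looking for the next space from the current position.
std::vector<std::string> splitOnSpaces(const std::string& text) {
    std::vector<std::string> words;
    std::string::size_type pos = 0;  // current position, starts at 0
    while (pos < text.size()) {
        std::string::size_type space = text.find(' ', pos);
        if (space == std::string::npos) {
            space = text.size();  // no more spaces: last word runs to the end
        }
        if (space > pos) {  // assumption: ignore empty pieces from stray spaces
            words.push_back(text.substr(pos, space - pos));
        }
        pos = space + 1;  // continue just past the space
    }
    return words;
}
```

Each returned word can then be fed into the map with `++counts[word]`.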

By using the map you'll already have the word/count available.

If the job you're applying for requires university skills, I strongly recommend optimizing the map by adding some kind of hashing function. However, judging by the difficulty of the question I assume that that is not the case.
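
The answer does not say which hashed container it has in mind. One reading, assuming a C++11 compiler is available, is to swap std::map for std::unordered_map, which is a hash table with average constant-time lookup instead of the logarithmic lookup of the tree-based std::map. The function name countWordsHashed is invented for this sketch.

```cpp
#include <sstream>
#include <string>
#include <unordered_map>

// Hypothetical helper (name invented for this sketch): counts words using a
// hash table. std::hash<std::string> is used by default, so no custom
// hashing function has to be written.
std::unordered_map<std::string, int> countWordsHashed(const std::string& text) {
    std::unordered_map<std::string, int> counts;
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        ++counts[word];  // a missing key is inserted with a count of 0 first
    }
    return counts;
}
```

Note that an unordered_map has no useful iteration order, so the counts still need to be copied into a vector and sorted before printing by frequency.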

answered Oct 17 '22 01:10

Tom van der Woerdt


Tags:

c++

c

word-frequency