
C Isolating "only strings" in a text file

Tags: c, parsing

I have a text file that contains one word followed by about 100 floating-point numbers. The numbers are separated by spaces, tabs, or newlines. This pattern repeats several times throughout the file.

For example, this is what the text file looks like:

one 0.00591 0.07272 -0.78274 ... 
0.0673 ...
0.0897 ...
two 0.0654 ...
0.07843 ...
0.0873 ...
three ...
...
...

My question is: how can I count the number of words in the file? I tried using fscanf, but once it reads the first word, I have to skip all the floats until the next word.

Any help would be much appreciated.

Thanks.

M. Averbach asked Feb 09 '23 01:02



1 Answer

I'll give you a high-level overview of a possible solution and leave it to you to figure out how to translate it into C.

  • Initialize a counter for the number of words (non-numbers) to zero.
  • Read the file line-by-line. For each line, repeat the following:
    • Tokenize the line into white-space separated words. For each word, repeat the following:
      • If the word can be parsed into a number, do nothing and continue.
      • Otherwise, increment the counter.
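For reference, here is a minimal sketch of how these steps might look in C. It assumes a POSIX system where getline is available, and the names is_number and count_words are made up for this illustration. Treat it as one possible translation, not the only one.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* assumption: POSIX getline is available */
#endif
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the whole token parses as a number. */
static int is_number(const char *token)
{
    char *end;

    assert(token != NULL);
    (void)strtod(token, &end);
    /* strtod consumed at least one character and nothing is left over. */
    return end != token && *end == '\0';
}

/* Counts the white-space separated tokens in fp that are not numbers. */
static long count_words(FILE *fp)
{
    char *line = NULL;   /* buffer allocated and grown by getline */
    size_t cap = 0;
    long words = 0;

    assert(fp != NULL);
    while (getline(&line, &cap, fp) != -1) {
        for (char *tok = strtok(line, " \t\r\n"); tok != NULL;
             tok = strtok(NULL, " \t\r\n")) {
            if (!is_number(tok))
                ++words;
        }
    }
    free(line);
    return words;
}
```

You would call count_words on a stream opened with fopen and print the result. One caveat: strtod also accepts tokens such as "inf", "nan", and hexadecimal floats, so a word like "nan" would be treated as a number by this sketch.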

Some library functions that you might find useful:

  • getline to read a single line of input. It is specified by POSIX rather than by the C standard, but many implementations provide it, including GNU's libc. If you don't have it, you can roll your own using fgets and realloc.
  • strtok to tokenize a string, though it is a little awkward to use. If you want to tokenize the line yourself, you'll find isspace useful. You will want to replace white-space characters with NUL bytes so you can treat the characters between them as individual NUL-terminated strings.
  • strtod to try parsing a character array into a double. Its second argument reports where parsing stopped, so you can check whether the whole token was consumed.
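If you go the do-it-yourself route mentioned above, the two pieces could be sketched roughly like this. The names read_line and next_token are hypothetical, and the code assumes the input contains no embedded NUL bytes.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Reads one line of any length from fp into a malloc'ed buffer.
 * Returns NULL at end of input or if allocation fails; the caller frees.
 */
static char *read_line(FILE *fp)
{
    size_t cap = 128, len = 0;
    char *buf;

    assert(fp != NULL);
    buf = malloc(cap);
    if (buf == NULL)
        return NULL;
    for (;;) {
        size_t room = cap - len;
        if (fgets(buf + len, room > INT_MAX ? INT_MAX : (int)room, fp) == NULL)
            break;                   /* end of input or read error */
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n')
            return buf;              /* got a complete line */
        if (cap - len < 2) {         /* buffer is full: double it */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
    }
    if (len > 0)
        return buf;                  /* last line had no trailing newline */
    free(buf);
    return NULL;
}

/*
 * Splits s in place: overwrites the white space after the next token with
 * a NUL byte, stores where to resume in *rest, and returns the token.
 * Returns NULL when only white space is left.
 */
static char *next_token(char *s, char **rest)
{
    char *tok;

    assert(s != NULL && rest != NULL);
    while (isspace((unsigned char)*s))
        ++s;                         /* skip leading white space */
    if (*s == '\0')
        return NULL;
    tok = s;
    while (*s != '\0' && !isspace((unsigned char)*s))
        ++s;                         /* walk to the end of the token */
    if (*s != '\0')
        *s++ = '\0';                 /* terminate the token, step past it */
    *rest = s;
    return tok;
}
```

A caller would loop on read_line, then repeatedly call next_token(rest, &rest) starting with rest pointing at the line, and free each line when done. Passing the resume position explicitly avoids the hidden state that makes strtok awkward.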

Instead of using a library function to parse a token as a double, you could also implement your own little finite automaton. This is a classic teaching example in automata theory. See for example this lecture (scroll down for “The Language of Floating Point Numbers”).

5gon12eder answered Feb 19 '23 02:02
