
Tokenizing an arithmetic expression?

Tags: c++, stl

Say I have:

3.14 + 3 * (7.7/9.8^32.9  )

I need to tokenize this input string:

3.14
+
3
*
(
7.7
/
9.8
^
32.9
)

Is there a convenient way to do this with a stringstream or something else from the STL, or should I look at the input one character at a time and do it myself?

jmasterx asked Dec 06 '25 16:12


2 Answers

It depends on what you mean by 'convenient'. You can do it easily with a stringstream, but I don't know if that's what you're looking for:

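#include <cctype>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

using namespace std;

// A token is either a number (stored in f) or a single character (stored in c).
struct token{
      char c;
      float f;
      bool number;

      token():c(0),f(0),number(false){}
};

vector<token> split(const string& input)
{
   stringstream parser(input);
   vector<token> output;
   // Skip whitespace first so that peek() sees the first character of the next token.
   while(parser >> ws && parser.peek() != stringstream::traits_type::eof())
   {
      token t;
      if(isdigit(parser.peek()) || parser.peek() == '.')
      {
         parser >> t.f;
         t.number = true;
      }
      else
         parser >> t.c;
      if(!parser) break;   // stop on malformed input such as a lone '.'
      output.push_back(t);
   }
   return output;
}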

int main()
{
    string input = "3.14 + 3 * (7.7/9.8^32.9  )";
    vector<token> tokens = split(input);
    for(unsigned int i=0;i<tokens.size();i++)
    {
       if(tokens[i].number) cout << "number: " << tokens[i].f << endl;
       else cout << "sign: " << tokens[i].c << endl;
    }
}
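For comparison, the question's other option (walking the input one character at a time) might look roughly like the sketch below. It is not part of the answer above: the names `Tok` and `scan` are made up for illustration, and it assumes that `std::strtod` is an acceptable definition of what a number looks like (it is locale-dependent and also accepts forms such as hexadecimal literals).

```cpp
#include <cctype>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical token type for this sketch: a number or a one-character symbol.
struct Tok {
    bool is_number;
    double value;   // meaningful only when is_number is true
    char symbol;    // meaningful only when is_number is false
};

// Scans the string one character at a time and returns the tokens found.
std::vector<Tok> scan(const std::string& s)
{
    std::vector<Tok> out;
    const char* p = s.c_str();
    while (*p != '\0') {
        unsigned char ch = static_cast<unsigned char>(*p);
        if (std::isspace(ch)) {                  // skip blanks between tokens
            ++p;
        } else if (std::isdigit(ch) || ch == '.') {
            char* end = nullptr;
            double v = std::strtod(p, &end);     // consumes the longest valid number
            if (end == p) {                      // e.g. a lone '.', which is not a number
                out.push_back(Tok{false, 0.0, *p});
                ++p;
            } else {
                out.push_back(Tok{true, v, '\0'});
                p = end;
            }
        } else {                                 // operators, parentheses, anything else
            out.push_back(Tok{false, 0.0, *p});
            ++p;
        }
    }
    return out;
}
```

Calling `scan("3.14 + 3 * (7.7/9.8^32.9  )")` and printing each element in the style of the `main` above should list the same eleven tokens as in the question. Because a sign is never treated as part of a number here, `1+2` comes out as three tokens.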
Paweł Stawarz answered Dec 09 '25 22:12


Normally you would use Flex/Bison to generate a simple lexer and, optionally, a parser. If you want a solution that needs nothing but a C++ compiler, there is Boost.Spirit. I don't believe there is a pure STL solution that you would like.

I favor the Flex/Bison approach; your tokenizer written in Flex would be:

%{

#include <iostream>
#include <memory>

%}


%option prefix="Calc"
%option noyywrap
%option c++

ws      [ \t]+

dig     [0-9]
/* Numbers are unsigned: a leading [-+] would make "1+2" lex as the two numbers 1 and +2. */
num1    {dig}+\.?([eE][-+]?{dig}+)?
num2    {dig}*\.{dig}+([eE][-+]?{dig}+)?
number  {num1}|{num2}

%%

{ws} /* skip */
{number} std::cout << "=> number " << YYText() << '\n';

"+"|"-"|"*"|"/"|"^" std::cout << "=> operator " << YYText() << '\n';

"("|")" std::cout << "=> parenthesis " << YYText() << '\n';

. std::cout << "=> unknown " << YYText() << '\n';

%%

int main( int argc, char **argv )
{
    std::unique_ptr<FlexLexer> lexer(new CalcFlexLexer);
    while(lexer->yylex());
    return 0;
}

And the compilation commands, followed by a sample run:

$ flex calc.l
$ g++-4.7 -std=c++11 -o calc lex.Calc.cc
$ ./calc
1 + (2e4^3)
=> number 1
=> operator +
=> parenthesis (
=> number 2e4
=> operator ^
=> number 3
=> parenthesis )
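If a generator is not an option, the same token categories can be approximated with `std::regex` from the C++11 standard library. The following is only a rough sketch, not a replacement for the Flex scanner: the helper name `lex` and the (kind, text) pair representation are invented here, and numbers are matched without a leading sign so that `1+2` yields three tokens.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns (kind, text) pairs using the categories
// number, operator, parenthesis and unknown.
std::vector<std::pair<std::string, std::string>> lex(const std::string& input)
{
    // Group 1: number, group 2: operator, group 3: parenthesis,
    // group 4: any other non-space character. Whitespace matches nothing
    // and is therefore skipped by the search.
    static const std::regex token_re(
        R"(((?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)|([-+*/^])|([()])|(\S))");

    std::vector<std::pair<std::string, std::string>> tokens;
    for (std::sregex_iterator it(input.begin(), input.end(), token_re), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        const char* kind = m[1].matched ? "number"
                         : m[2].matched ? "operator"
                         : m[3].matched ? "parenthesis"
                                        : "unknown";
        tokens.emplace_back(kind, m.str());
    }
    return tokens;
}
```

For the input `1 + (2e4^3)` this should produce the same seven classifications as the sample run. `std::regex` is usually much slower than a generated scanner, so this is likely to suit short expressions better than large inputs.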
bobah answered Dec 09 '25 21:12