Say I have:
3.14 + 3 * (7.7/9.8^32.9 )
I need to tokenize this input string:
3.14
+
3
*
(
7.7
/
9.8
^
32.9
)
Is there a convenient way to do this with a stringstream or something else from the STL, or should I look at the input one char at a time and do it myself?
It depends on what you mean by 'convenient'. You can do it easily with a stringstream, but I don't know if that's what you're looking for:
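#include <cctype>
#include <cstdio>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
using namespace std;

struct token {
    char c;       // operator or parenthesis (valid when number is false)
    float f;      // numeric value (valid when number is true)
    bool number;
    token() : c(0), f(0), number(false) {}
};

vector<token> split(const string& input)
{
    stringstream parser(input);
    vector<token> output;
    // Skip whitespace before peeking; otherwise a digit that follows a
    // space would be read as a char instead of a number.
    while (parser >> ws)
    {
        int next = parser.peek();
        if (next == EOF)
            break;
        token t;
        if (isdigit(next))
        {
            parser >> t.f;
            t.number = true;
        }
        else
            parser >> t.c;
        output.push_back(t);
    }
    return output;
}

int main()
{
    string input = "3.14 + 3 * (7.7/9.8^32.9 )";
    vector<token> tokens = split(input);
    for (unsigned int i = 0; i < tokens.size(); i++)
    {
        if (tokens[i].number) cout << "number: " << tokens[i].f << endl;
        else cout << "sign: " << tokens[i].c << endl;
    }
}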
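For comparison, the other option the question mentions, looking at the input one char at a time, could be sketched roughly as follows. This is only an illustration: `tokenize` is a hypothetical helper, it returns the tokens as strings rather than the `token` struct above, and it assumes numbers are plain runs of digits and dots (no exponents, no validation of things like "1.2.3").

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits an arithmetic expression into number tokens
// and single-character operator/parenthesis tokens.
std::vector<std::string> tokenize(const std::string& input)
{
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < input.size())
    {
        unsigned char ch = static_cast<unsigned char>(input[i]);
        if (std::isspace(ch))
        {
            ++i;  // whitespace only separates tokens
        }
        else if (std::isdigit(ch) || ch == '.')
        {
            // Consume a run of digits and dots, e.g. "3.14".
            // Assumption: no exponent handling and no validation.
            std::size_t start = i;
            while (i < input.size() &&
                   (std::isdigit(static_cast<unsigned char>(input[i])) ||
                    input[i] == '.'))
                ++i;
            tokens.push_back(input.substr(start, i - start));
        }
        else
        {
            // Anything else is treated as a one-character token.
            tokens.push_back(std::string(1, input[i]));
            ++i;
        }
    }
    return tokens;
}
```

Calling `tokenize("3.14 + 3 * (7.7/9.8^32.9 )")` yields the eleven tokens listed in the question, in the same order.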
Normally you would use Flex/Bison to generate a simple lexer and, optionally, a parser. Or, if you want a solution that needs nothing beyond a C++ compiler, there is Boost.Spirit. I don't believe there is a pure STL solution that you would like.
I favor the Flex/Bison approach, so your tokenizer written in Flex would be:
%{
#include <iostream>
#include <memory>
%}
%option prefix="Calc"
%option noyywrap
%option c++
ws [ \t]+
dig [0-9]
num1 [-+]?{dig}+\.?([eE][-+]?{dig}+)?
num2 [-+]?{dig}*\.{dig}+([eE][-+]?{dig}+)?
number {num1}|{num2}
%%
{ws} /* skip */
{number} std::cout << "=> number " << YYText() << '\n';
"+"|"-"|"*"|"/"|"^" std::cout << "=> operator " << YYText() << '\n';
"("|")" std::cout << "=> parenthesis " << YYText() << '\n';
. std::cout << "=> unknown " << YYText() << '\n';
%%
int main( int argc, char **argv )
{
std::unique_ptr<FlexLexer> lexer(new CalcFlexLexer);
while(lexer->yylex());
return 0;
}
The compilation command line, followed by a sample run:
$ flex calc.l
$ g++-4.7 -std=c++11 -o calc lex.Calc.cc
$ calc
1 + (2e4^3)
=> number 1
=> operator +
=> parenthesis (
=> number 2e4
=> operator ^
=> number 3
=> parenthesis )