I am given a file with ten matrices, and I would like to read these matrices from the file and store each one in its own vector or array. However, the format of the file makes it hard for me to read the data (I'm not good at reading from input files).
The file has the following format: elements within a row are separated by ",", rows are separated by ";", and matrices are separated by "|". For example, three 2-by-2 matrices look like this:
1,2;3,4|0,1;1,0|5,3;3,1|
I just want to save the matrices into three different vectors, but I am not sure how to do this.
I tried:
while(getline(inFile,line)){
stringstream linestream(line);
string value;
while(getline(linestream, value, ',')){
//save into vector
}
}
But this is obviously very crude, and it only separates the data by commas. Is there a way to separate the data using multiple delimiters?
Thank you!
string line;
while(getline(infile, line, '|'))
{
stringstream rowstream(line);
string row;
while(getline(rowstream, row, ';'))
{
stringstream elementstream(row);
string element;
while(getline(elementstream, element, ','))
{
cout << element << endl;
}
}
}
Using the code above, you can build the logic to store each individual element
however you like.
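One way that storage logic could look is sketched below. It reuses the same three nested getline loops, but pushes each element into a vector instead of printing it. The names readMatrices and Matrix are hypothetical, and the sketch assumes every element is an integer; std::stoi throws if an element is empty or not a number.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical alias: a matrix is a vector of rows, a row is a vector of ints.
using Matrix = std::vector<std::vector<int>>;

// Reads every '|'-terminated matrix from the stream.
std::vector<Matrix> readMatrices(std::istream& in)
{
    std::vector<Matrix> matrices;
    std::string matrixText;
    while (std::getline(in, matrixText, '|'))
    {
        // Skip a chunk with no digits, e.g. the newline after the final '|'.
        if (matrixText.find_first_of("0123456789") == std::string::npos)
            continue;

        Matrix m;
        std::stringstream rowStream(matrixText);
        std::string rowText;
        while (std::getline(rowStream, rowText, ';'))
        {
            std::vector<int> row;
            std::stringstream elementStream(rowText);
            std::string element;
            while (std::getline(elementStream, element, ','))
                row.push_back(std::stoi(element)); // assumes integer elements
            m.push_back(row);
        }
        matrices.push_back(m);
    }
    return matrices;
}
```

Passing an opened std::ifstream to readMatrices would then give one Matrix per matrix in the file, so matrices[0][1][0] is the first element of the second row of the first matrix.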
I use my own function to split a string into a vector of strings:
/**
* \brief Split a string into substrings
* \param sep Symbol separating the parts
* \param str String to split
* \return Vector containing the split parts
* \pre The separator can not be 0
* \details Example :
* \code
* std::string str = "abc.def.ghi..jkl.";
* std::vector<std::string> split_str = split('.', str); // the vector is ["abc", "def", "ghi", "", "jkl", ""]
* \endcode
*/
std::vector<std::string> split(char sep, const std::string& str);
std::vector<std::string> split(char sep, const std::string& str)
{
assert(sep != 0 && "PRE: the separator is null");
std::vector<std::string> s;
unsigned long int i = 0;
for(unsigned long int j = 0; j < str.length(); ++j)
{
if(str[j] == sep)
{
s.push_back(str.substr(i, j - i));
i = j + 1;
}
}
s.push_back(str.substr(i, str.size() - i));
return s;
}
Then, assuming you have a Matrix class, you can do something like:
std::string matrices_str;
std::ifstream matrix_file(matrix_file_name.c_str());
matrix_file >> matrices_str;
const std::vector<std::string> matrices = split('|', matrices_str);
std::vector<Matrix<double> > M(matrices.size());
for(unsigned long int i = 0; i < matrices.size(); ++i)
{
const std::string& matrix = matrices[i];
const std::vector<std::string> rows = split(';', matrix);
for(unsigned long int j = 0; j < rows.size(); ++j)
{
const std::string& row = rows[j];
const std::vector<std::string> elements = split(',', row);
for(unsigned long int k = 0; k < elements.size(); ++k)
{
const std::string& element = elements[k];
if(j == 0 && k == 0)
M[i].resize(rows.size(), elements.size());
std::istringstream iss(element);
iss >> M[i](j,k);
}
}
}
Or, more compactly:
std::string matrices_str;
std::ifstream matrix_file(matrix_file_name.c_str());
matrix_file >> matrices_str;
const std::vector<std::string> matrices = split('|', matrices_str);
std::vector<Matrix<double> > M(matrices.size());
for(unsigned long int i = 0; i < matrices.size(); ++i)
{
const std::vector<std::string> rows = split(';', matrices[i]);
for(unsigned long int j = 0; j < rows.size(); ++j)
{
const std::vector<std::string> elements = split(',', rows[j]);
for(unsigned long int k = 0; k < elements.size(); ++k)
{
if(j == 0 && k == 0)
M[i].resize(rows.size(), elements.size());
std::istringstream iss(elements[k]);
iss >> M[i](j,k);
}
}
}
You can use the finite state machine concept. You need to define a state for each step:
read one char, then decide what it is (a number or a delimiter).
Here is a sketch of how you could do it.
For further reading, search for these topics: text parsing, finite state machine, lexical analyzer, formal grammar.
#include <cctype> // isdigit
enum State
{
DECIMAL_NUMBER,
COMMA_D,
SEMICOLON_D,
PIPE_D,
ERROR_STATE,
};
char GetChar()
{
// implement proper reading from file
static const char* input = "1,2;3,4|0,1;1,0|5,3;3,1|";
static int index = 0;
return input[index++];
}
State GetState(char c)
{
if ( isdigit(c) )
{
return DECIMAL_NUMBER;
}
else if ( c == ',' )
{
return COMMA_D;
}
else if ( c == ';' )
{
return SEMICOLON_D;
}
else if ( c == '|' )
{
return PIPE_D;
}
return ERROR_STATE;
}
int main(int argc, char* argv[])
{
char c;
while ( (c = GetChar()) != '\0' ) // stop at the terminating null character
{
State s = GetState(c);
switch ( s ) // switch on the state, not on the raw character
{
case DECIMAL_NUMBER:
// read numbers
break;
case COMMA_D:
// append into row
break;
case SEMICOLON_D:
// next row
break;
case PIPE_D:
// finish one matrix
break;
case ERROR_STATE:
// syntax error
break;
default:
break;
}
}
return 0;
}
The example you have actually maps to a very simple byte machine.
Start with a zeroed matrix and something that keeps track of where in the matrix you're writing. Read one character at a time. If the character is a digit, multiply the current number in the matrix by 10 and add the digit to it; if it is a comma, advance to the next number in the row; if it is a semicolon, go to the next row; if it is a pipe, start a new matrix.
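The steps just described can be sketched roughly as follows. This is only an illustration under some assumptions: the elements are non-negative integers, the whole input has already been read into a string, and the names scanMatrices and IntMatrix are made up for the example. Any other character (such as a newline) is ignored, and a matrix that is not terminated by a pipe is dropped.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical alias for this sketch.
using IntMatrix = std::vector<std::vector<int>>;

std::vector<IntMatrix> scanMatrices(const std::string& text)
{
    std::vector<IntMatrix> done;
    // The "write position" is always the last cell of the last row.
    IntMatrix current(1, std::vector<int>(1, 0));

    for (char c : text)
    {
        if (std::isdigit(static_cast<unsigned char>(c)))
        {
            // Digit: shift the current number one decimal place and add it.
            int& cell = current.back().back();
            cell = cell * 10 + (c - '0');
        }
        else if (c == ',')
        {
            current.back().push_back(0);               // next number in the row
        }
        else if (c == ';')
        {
            current.push_back(std::vector<int>(1, 0)); // next row
        }
        else if (c == '|')
        {
            done.push_back(current);                   // matrix finished
            current.assign(1, std::vector<int>(1, 0)); // start a new one
        }
        // Anything else is skipped; real code might report an error here.
    }
    return done;
}
```

The only state carried between characters is the matrix being built, which is why no separate tokenizer or parser is needed for this format.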
You might not want to do it exactly this way if the numbers are floating point. I'd save them in a buffer and use a standard method of parsing floating point numbers. But other than that you don't really need to keep much complex state or build a large parser. You might want to add error handling at a later stage, but even there the error handling is pretty trivial and only depends on the current character you're scanning.
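For floating point input, the buffering idea might look something like the sketch below: characters are collected until a delimiter arrives, and the buffer is then handed to std::stod. The names scanRealMatrices and RealMatrix are hypothetical, and the error handling is left to std::stod, which throws std::invalid_argument when the buffer does not start with a number.

```cpp
#include <string>
#include <vector>

// Hypothetical alias for this sketch.
using RealMatrix = std::vector<std::vector<double>>;

std::vector<RealMatrix> scanRealMatrices(const std::string& text)
{
    std::vector<RealMatrix> done;
    RealMatrix current(1); // starts with one empty row
    std::string buffer;    // characters of the number being read

    for (char c : text)
    {
        if (c == ',' || c == ';' || c == '|')
        {
            // Every delimiter ends a number: parse the buffer and store it.
            current.back().push_back(std::stod(buffer));
            buffer.clear();

            if (c == ';')
            {
                current.emplace_back();                    // next row
            }
            else if (c == '|')
            {
                done.push_back(current);                   // matrix finished
                current.assign(1, std::vector<double>());  // start a new one
            }
        }
        else
        {
            buffer += c; // part of a number (or whitespace std::stod skips)
        }
    }
    return done;
}
```

Compared with the digit-by-digit version, this handles signs, decimal points, and exponents without any extra state, at the cost of one small string buffer.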