
How can I correctly parse a text file delimited by white space?

Below is my sample text file

[image: sample text file]

Here is my schema file

[Sample File.txt]
ColNameHeader=True
Format=TabDelimited
CharacterSet=ANSI

Here is the code I have written so far to read the sample file above. The rows read from the text file are supposed to be returned for display in a DataGridView control. The problem is that the data comes back as a single column, whereas I want the white space to act as the column delimiter. I have tried different delimiter characters without success.

public DataSet LoadCSV(int numberOfRows)
{
    DataSet ds = new DataSet();
    // Creates and opens an ODBC connection
    string strConnString = "Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=" + this.dirCSV.Trim() + ";Extensions=asc,csv,tab,txt;Persist Security Info=False";

    string sql_select;
    OdbcConnection conn;
    conn = new OdbcConnection(strConnString.Trim());
    conn.Open();

    //Creates the select command text
    if (numberOfRows == -1)
    {
        sql_select = "select * from [" + this.FileNevCSV.Trim() + "]";
    }
    else
    {
        sql_select = "select top " + numberOfRows + " * from [" + this.FileNevCSV.Trim() + "]";
    }

    //Creates the data adapter
    OdbcDataAdapter obj_oledb_da = new OdbcDataAdapter(sql_select, conn);

    //Fills dataset with the records from CSV file
    obj_oledb_da.Fill(ds, "csv");

    //closes the connection
    conn.Close();

    return ds;
}

And I set the DataGridView's data source like so:

// loads the first 500 rows from CSV file
this.dataGridView_preView.DataSource = LoadCSV(500);
this.dataGridView_preView.DataMember = "csv";

This is what I get in the DataGridView: a single column of data, although I expect the data to be returned as 7 columns.

Plus, I have no idea where the F2 and F3 columns are coming from.

[image: DataGridView output]

StackTrace asked Nov 03 '22 19:11


1 Answer

I would probably do this a different way. I would use a StreamReader to read the file line by line, break each line up into object properties, and store the objects in a list. Then you bind the list to the DataGridView's DataSource. I demonstrate two quick ways to do this.

1 - Tab-separated data

If the file is tab separated, as it seems to be, split each line into an array and assign each element to a property, like so.

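using System;
using System.Collections.Generic;
using System.IO;
using System.Windows.Forms;

public partial class Form1 : Form
{
    private void Form1_Load(object sender, EventArgs e)
    {
        var rows = new List<Row>();
        // The using block disposes the reader even if parsing a row throws.
        using (var sr = new StreamReader(@"C:\so_test.txt"))
        {
            while (!sr.EndOfStream)
            {
                string s = sr.ReadLine();
                // Skip blank lines.
                if (!String.IsNullOrWhiteSpace(s))
                {
                    rows.Add(new Row(s));
                }
            }
        }
        dataGridView1.DataSource = rows;
    }
}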

public class Row
{
    public double Number1 { get; set; }
    public double Number2 { get; set; }
    public double Number3 { get; set; }
    public double Number4 { get; set; }
    public double Number5 { get; set; }
    public double Number6 { get; set; }
    public double Number7 { get; set; }
    public string Date1 { get; set; }

    public Row(string str)
    {
        // Split on tabs; the assignments below expect at least eight fields per line.
        string[] separator = { "\t" };
        var arr = str.Split(separator, StringSplitOptions.None);
        Number1 = Convert.ToDouble(arr[0]);
        Number2 = Convert.ToDouble(arr[1]);
        Number3 = Convert.ToDouble(arr[2]);
        Number4 = Convert.ToDouble(arr[3]);
        Number5 = Convert.ToDouble(arr[4]);
        Number6 = Convert.ToDouble(arr[5]);
        Number7 = Convert.ToDouble(arr[6]);
        Date1 = arr[7];
    }
}
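
If the columns turn out to be padded with runs of spaces rather than single tabs (the question does say "white space"), splitting on any white space may work better than splitting on "\t". The following is only a rough sketch: WhitespaceLine, SplitCells and ParseNumber are hypothetical names that are not part of the code above, and it assumes that no cell contains a space of its own.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of the original answer.
public static class WhitespaceLine
{
    // Splits a line on any run of white space (spaces or tabs) and drops empty entries.
    public static string[] SplitCells(string line)
    {
        // A null separator array makes String.Split treat white-space characters as delimiters.
        return line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
    }

    // Parses a numeric cell with the invariant culture, so "1.5" is read the same way
    // regardless of the machine's regional settings.
    public static double ParseNumber(string cell)
    {
        return double.Parse(cell, CultureInfo.InvariantCulture);
    }
}
```

In the Row constructor this would replace the Split call, for example var arr = WhitespaceLine.SplitCells(str);. If the date column itself contains spaces, it will be broken into several cells, so the trailing cells would have to be joined back together (for example with string.Join) or the fixed-width approach used instead.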

2 - Hard start points and lengths

If the data is not actually tab separated, but each column conforms to strict start and end points, you could declare the start point and length of each column as constants and extract the values with Substring. This only needs a change in your Row class, like this. I have left out the constants for brevity and just hardcoded the values.

    public Row(string str)
    {
        Number1 = Convert.ToDouble(str.Substring(4, 6));
        Number2 = Convert.ToDouble(str.Substring(16, 6));
        Number3 = Convert.ToDouble(str.Substring(28, 7));
        Number4 = Convert.ToDouble(str.Substring(40, 7));
        Number5 = Convert.ToDouble(str.Substring(52, 6));
        Number6 = Convert.ToDouble(str.Substring(64, 6));
        Number7 = Convert.ToDouble(str.Substring(76, 6));
        Date1 = str.Substring(88, 24);
    }
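
For completeness, here is one way the constants could be declared instead of hardcoding the numbers. Treat it as a sketch under assumptions: FixedColumn and RowLayout are hypothetical names, and the offsets are simply copied from the Substring calls above, so they need to be checked against the real file. Read also guards against lines that are shorter than expected, which a bare Substring would reject with an ArgumentOutOfRangeException.

```csharp
using System;

// Hypothetical helper: one fixed-width column described by its start index and length.
public struct FixedColumn
{
    public readonly int Start;
    public readonly int Length;

    public FixedColumn(int start, int length)
    {
        Start = start;
        Length = length;
    }

    // Returns the trimmed cell text, or an empty string if the line ends before the column starts.
    public string Read(string line)
    {
        if (line == null || Start >= line.Length)
        {
            return string.Empty;
        }
        // Clamp the length so a short final column does not throw.
        int available = Math.Min(Length, line.Length - Start);
        return line.Substring(Start, available).Trim();
    }
}

// Hypothetical layout table; the offsets mirror the hardcoded Substring arguments.
public static class RowLayout
{
    public static readonly FixedColumn Number1 = new FixedColumn(4, 6);
    public static readonly FixedColumn Number2 = new FixedColumn(16, 6);
    public static readonly FixedColumn Number3 = new FixedColumn(28, 7);
    public static readonly FixedColumn Number4 = new FixedColumn(40, 7);
    public static readonly FixedColumn Number5 = new FixedColumn(52, 6);
    public static readonly FixedColumn Number6 = new FixedColumn(64, 6);
    public static readonly FixedColumn Number7 = new FixedColumn(76, 6);
    public static readonly FixedColumn Date1 = new FixedColumn(88, 24);
}
```

Inside the Row constructor a property would then be filled with something like Number1 = Convert.ToDouble(RowLayout.Number1.Read(str));, which keeps all the offsets in one place if the file layout changes.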

Screenshot

GrayFox374 answered Nov 14 '22 00:11