
Best Practices for uploading files to database

I am looking for best practices or ideas on how to build an interface from a .NET web application to a database for uploading data from Excel files. Should I use a mechanism that loads all the records and flags the errors, or one that stops the load when an error occurs?

I've never had to deal with this type of requirement before so any help would be super!

Thanks

WACM161 asked Nov 13 '08 17:11


3 Answers

I would try the following approach, which has worked well for me in the past.

  1. Allow the user to upload the file and save it somewhere on disk.
  2. Bind the contents of the file to a grid (you can connect to Excel files through ODBC/OLE DB using the traditional Connection/Command objects).
  3. Validate the rows in the grid against a set of business rules (Excel data is normally quite dirty).
  4. Allow the user to update values in the grid and correct any validation issues.
  5. When all the data is kosher and the user is happy with it, perform a bulk insert in a transaction.
  6. If anything "bad" happens, roll back and give the user some feedback.
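
As a rough illustration of step 3, here is a minimal sketch that checks every row of a `DataTable` and collects all the problems instead of stopping at the first one. The column names `Name` and `Quantity`, the two rules, and the `RowError` and `UploadValidator` types are hypothetical, made up for the example; your own business rules would go in their place.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;

// One validation problem, tied to a row and a column so the grid can highlight it.
public sealed class RowError
{
    public int RowIndex { get; }
    public string Column { get; }
    public string Message { get; }

    public RowError(int rowIndex, string column, string message)
    {
        RowIndex = rowIndex;
        Column = column;
        Message = message;
    }
}

public static class UploadValidator
{
    // Checks every row and returns all problems found, so the user can
    // fix them in one pass rather than one error at a time.
    public static List<RowError> Validate(DataTable table)
    {
        var errors = new List<RowError>();
        for (int i = 0; i < table.Rows.Count; i++)
        {
            DataRow row = table.Rows[i];

            // Hypothetical rule 1: Name is required.
            string name = Convert.ToString(row["Name"]);
            if (string.IsNullOrWhiteSpace(name))
                errors.Add(new RowError(i, "Name", "Name is required."));

            // Hypothetical rule 2: Quantity must be a whole number of 0 or more.
            string raw = Convert.ToString(row["Quantity"]);
            int qty;
            bool parsed = int.TryParse(raw, NumberStyles.Integer,
                                       CultureInfo.InvariantCulture, out qty);
            if (!parsed || qty < 0)
                errors.Add(new RowError(i, "Quantity",
                    "Quantity must be a whole number of 0 or more."));
        }
        return errors;
    }
}
```

An empty result means the data is ready for the bulk insert in step 5; otherwise each `RowError` tells the grid which cell to mark for correction in step 4.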
Tyler answered Oct 12 '22 09:10


You should upload the data and then flag it if it fails validation checks. For actually loading the data, you have a few options:

  • The ADO.NET bulk-load API: use it to put the data in a staging table. The snippet below shows a process that opens a .CSV file and programmatically loads it into a staging table.

  // Requires: using System; using System.IO;
  //           using System.Data.OleDb; using System.Data.SqlClient;
  // _filename, FileName, ConnectionString and Destination are instance
  // members of the enclosing class.
  public void Load() {
        if (!File.Exists(_filename)) return;

        string sql = String.Format("Select * from {0}", FileName);
        OleDbConnection csv = null;
        OleDbCommand cmd = null;
        OleDbDataReader rs = null;
        SqlConnection db = null;
        SqlCommand clear = null;
        SqlBulkCopy bulk_load = null;
        try {
            // Note two connections: one from the csv file
            // and one to the database.
            csv = new OleDbConnection();
            csv.ConnectionString = ConnectionString;
            csv.Open();
            cmd = new OleDbCommand(sql, csv);
            rs = cmd.ExecuteReader();

            // Clear out the staging table.
            // OpenDatabaseConnection() is a placeholder left to the reader;
            // it must return an open SqlConnection.
            db = OpenDatabaseConnection();
            // Truncate the same table we load into (Destination is an instance var).
            clear = new SqlCommand("Truncate table " + Destination, db);
            clear.ExecuteNonQuery();

            // Import into the staging table
            bulk_load = new SqlBulkCopy(db);
            bulk_load.DestinationTableName = Destination;
            bulk_load.WriteToServer(rs);
        } catch (Exception ee) {
            string summary = ee.Message;
            string detail = ee.StackTrace;
            //Notify(DisplayType.error, summary, detail);
        } finally {
            // Release everything that was opened, including the DB connection.
            if (rs != null) rs.Close();
            if (csv != null) csv.Close();
            if (bulk_load != null) bulk_load.Close();
            if (db != null) db.Close();
        }
  }
  • Use BCP or SSIS to import it, either directly from the spreadsheet or from a .CSV file.
ConcernedOfTunbridgeWells answered Oct 12 '22 10:10


If data integrity in your DB is important, do not allow data to be imported that has errors or does not meet the validation requirements of your DB.

Since these are Excel files, it should be easy enough for the user to correct the data in the Excel file itself, instead of using another interface to fix it. Just make sure the error messages point the user to the field that is the problem and clearly explain what is wrong.
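
One way to point the user at the exact spot is to report errors by Excel cell address. The sketch below is only an illustration: the `ExcelErrorFormatter` class and its method names are made up, and it assumes the sheet has a single header row so the first data row sits on sheet row 2.

```csharp
using System;
using System.Text;

public static class ExcelErrorFormatter
{
    // Converts a zero-based column index to Excel column letters
    // (0 becomes A, 25 becomes Z, 26 becomes AA).
    public static string ColumnLetters(int columnIndex)
    {
        if (columnIndex < 0)
            throw new ArgumentOutOfRangeException(nameof(columnIndex));

        var letters = new StringBuilder();
        int n = columnIndex + 1; // work in one-based numbering
        while (n > 0)
        {
            int remainder = (n - 1) % 26;
            letters.Insert(0, (char)('A' + remainder));
            n = (n - 1) / 26;
        }
        return letters.ToString();
    }

    // Builds a message such as "Cell B2 (Quantity): must be a whole number".
    // dataRowIndex is the zero-based index of the data row.
    public static string Describe(int dataRowIndex, int columnIndex,
                                  string fieldName, string problem)
    {
        // +2: one for one-based row numbers, one for the assumed header row.
        int sheetRow = dataRowIndex + 2;
        return string.Format("Cell {0}{1} ({2}): {3}",
            ColumnLetters(columnIndex), sheetRow, fieldName, problem);
    }
}
```

With a message like this the user can jump straight to the cell in Excel, fix it, and upload the file again.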

DCNYAM answered Oct 12 '22 09:10