SQL Server 2008 Tuning with large transactions (700k+ rows/transaction)

So, I'm working on a database that I will be adding to my future projects as sort of a supporting db, but I'm having a bit of an issue with it, especially the logs.

The database basically needs to be updated once a month. The main table has to be purged and then refilled from a CSV file. The problem is that SQL Server generates a log for it which is MEGA big. I was successful in filling it up once, but wanted to test the whole process by purging it and then refilling it.

That's when I get an error that the log file is full. It jumps from 88MB (after shrinking via a maintenance plan) to 248MB, then the process stops altogether and never completes.

I've capped its growth at 256MB, incrementing by 16MB, which is why it failed, but in reality I don't need it to log anything at all. Is there a way to just completely bypass logging on any query being run against the database?

Thanks for any responses in advance!

EDIT: Per the suggestions of @mattmc3, I've implemented SqlBulkCopy for the whole procedure. It works AMAZING, except my loop is somehow crashing on the very last remaining chunk that needs to be inserted. I'm not too sure where I'm going wrong; heck, I don't even know if this is a proper loop, so I'd appreciate some help on it.

I do know that it's an issue with the very last GetDataTable or SetSqlBulkCopy call. I'm trying to insert 788189 rows; 788000 get in and the remaining 189 are crashing...

string[] Rows;

using (StreamReader Reader = new StreamReader("C:/?.csv")) {
    Rows = Reader.ReadToEnd().TrimEnd().Split(new char[1] {
        '\n'
     }, StringSplitOptions.RemoveEmptyEntries);
};

int RowsInserted = 0;

using (SqlConnection Connection = new SqlConnection("")) {
    Connection.Open();

    DataTable Table = null;

    while ((RowsInserted < Rows.Length) && ((Rows.Length - RowsInserted) >= 1000)) {
        Table = GetDataTable(Rows.Skip(RowsInserted).Take(1000).ToArray());

        SetSqlBulkCopy(Table, Connection);

        RowsInserted += 1000;
    };

    Table = GetDataTable(Rows.Skip(RowsInserted).ToArray());

    SetSqlBulkCopy(Table, Connection);

    Connection.Close();
};

static DataTable GetDataTable(
    string[] Rows) {
    using (DataTable Table = new DataTable()) {
        Table.Columns.Add(new DataColumn("A"));
        Table.Columns.Add(new DataColumn("B"));
        Table.Columns.Add(new DataColumn("C"));
        Table.Columns.Add(new DataColumn("D"));

        for (short a = 0, b = (short)Rows.Length; a < b; a++) {
            string[] Columns = Rows[a].Split(new char[1] {
                ','
            }, StringSplitOptions.RemoveEmptyEntries);

            DataRow Row = Table.NewRow();

            Row["A"] = Columns[0];
            Row["B"] = Columns[1];
            Row["C"] = Columns[2];
            Row["D"] = Columns[3];

            Table.Rows.Add(Row);
        };

        return (Table);
    };
}

static void SetSqlBulkCopy(
    DataTable Table,
    SqlConnection Connection) {
    using (SqlBulkCopy SqlBulkCopy = new SqlBulkCopy(Connection)) {
        SqlBulkCopy.ColumnMappings.Add(new SqlBulkCopyColumnMapping("A", "A"));
        SqlBulkCopy.ColumnMappings.Add(new SqlBulkCopyColumnMapping("B", "B"));
        SqlBulkCopy.ColumnMappings.Add(new SqlBulkCopyColumnMapping("C", "C"));
        SqlBulkCopy.ColumnMappings.Add(new SqlBulkCopyColumnMapping("D", "D"));

        SqlBulkCopy.BatchSize = Table.Rows.Count;
        SqlBulkCopy.DestinationTableName = "E";
        SqlBulkCopy.WriteToServer(Table);
    };
}

EDIT/FINAL CODE: So the app is now finished and works AMAZING, and it's quite speedy! @mattmc3, thanks for all the help! Here is the final code for anyone who may find it useful:

List<string> Rows = new List<string>();

// Read the CSV one line at a time; the loop stops at the first blank line or at end of file.
using (StreamReader Reader = new StreamReader(@"?.csv")) {
    string Line = string.Empty;

    while (!String.IsNullOrWhiteSpace(Line = Reader.ReadLine())) {
        Rows.Add(Line);
    };
};

if (Rows.Count > 0) {
    int RowsInserted = 0;

    DataTable Table = new DataTable();

    Table.Columns.Add(new DataColumn("Id"));
    Table.Columns.Add(new DataColumn("A"));

    // Send the rows to the server in chunks of 1,000; whatever remains (fewer than 1,000 rows) is flushed after the loop.
    while ((RowsInserted < Rows.Count) && ((Rows.Count - RowsInserted) >= 1000)) {
        Table = GetDataTable(Rows.Skip(RowsInserted).Take(1000).ToList(), Table);

        PerformSqlBulkCopy(Table);

        RowsInserted += 1000;

        Table.Clear();
    };

    Table = GetDataTable(Rows.Skip(RowsInserted).ToList(), Table);

    PerformSqlBulkCopy(Table);
};

static DataTable GetDataTable(
    List<string> Rows,
    DataTable Table) {
    for (short a = 0, b = (short)Rows.Count; a < b; a++) {
        string[] Columns = Rows[a].Split(new char[1] {
            ','
        }, StringSplitOptions.RemoveEmptyEntries);

        DataRow Row = Table.NewRow();

        Row["A"] = "";

        Table.Rows.Add(Row);
    };

    return (Table);
}

static void PerformSqlBulkCopy(
    DataTable Table) {
    // TableLock takes a bulk update lock on the destination table, which helps the insert qualify for minimal logging.
    using (SqlBulkCopy SqlBulkCopy = new SqlBulkCopy(@"", SqlBulkCopyOptions.TableLock)) {
        SqlBulkCopy.BatchSize = Table.Rows.Count;
        SqlBulkCopy.DestinationTableName = "";
        SqlBulkCopy.WriteToServer(Table);
    };
}
Asked Jul 29 '10 by Gup3rSuR4c


2 Answers

If you are doing a Bulk Insert into the table in SQL Server, which is how you should be doing this (BCP, Bulk Insert, Insert Into...Select, or, in .NET, the SqlBulkCopy class), you can use the "Bulk Logged" recovery model. I highly recommend reading the MSDN articles on recovery models: http://msdn.microsoft.com/en-us/library/ms189275.aspx
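For illustration, here's a minimal sketch of what that could look like from .NET: switch the database to the BULK_LOGGED recovery model for the load, do the bulk copy with a table lock, then switch back. The database name, table name, connection string handling, and method name below are placeholder assumptions, not part of the original answer.

using System.Data;
using System.Data.SqlClient;

static void BulkLoadWithMinimalLogging(DataTable Table, string ConnectionString) {
    using (SqlConnection Connection = new SqlConnection(ConnectionString)) {
        Connection.Open();

        // Bulk-logged recovery keeps logging to a minimum for bulk operations
        // (SqlBulkCopy, BCP, BULK INSERT). "MyDatabase" is a placeholder name.
        using (SqlCommand Command = new SqlCommand("ALTER DATABASE MyDatabase SET RECOVERY BULK_LOGGED;", Connection)) {
            Command.ExecuteNonQuery();
        }

        // TableLock takes a bulk update lock, one of the requirements for the
        // insert itself to be minimally logged. "MyTable" is a placeholder.
        using (SqlBulkCopy BulkCopy = new SqlBulkCopy(Connection, SqlBulkCopyOptions.TableLock, null)) {
            BulkCopy.DestinationTableName = "MyTable";
            BulkCopy.WriteToServer(Table);
        }

        // Switch back to the regular recovery model once the load is done,
        // assuming the database normally runs under FULL recovery.
        using (SqlCommand Command = new SqlCommand("ALTER DATABASE MyDatabase SET RECOVERY FULL;", Connection)) {
            Command.ExecuteNonQuery();
        }
    }
}

One caveat: log backups that contain bulk-logged operations can't be used for point-in-time restore, so it's common to take a fresh log or full backup right after switching back to FULL.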

Answered by mattmc3


You can set the recovery model for each database separately. Maybe the simple recovery model will work for you. The simple model:

Automatically reclaims log space to keep space requirements small, essentially eliminating the need to manage the transaction log space.

Read up on it in the MSDN documentation on recovery models.
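If you go that route, a one-line ALTER DATABASE is all it takes; as a rough sketch from .NET (the connection string and database name are placeholders, not from the answer):

using System.Data.SqlClient;

using (SqlConnection Connection = new SqlConnection(@"Data Source=.;Initial Catalog=master;Integrated Security=True")) {
    Connection.Open();

    // Under SIMPLE recovery the log is truncated automatically at checkpoints,
    // so the monthly reload no longer has to fit inside a capped log file.
    using (SqlCommand Command = new SqlCommand("ALTER DATABASE MyDatabase SET RECOVERY SIMPLE;", Connection)) {
        Command.ExecuteNonQuery();
    }
}

The trade-off is that you lose log backups and point-in-time restore, which is usually acceptable for a supporting database that gets rebuilt from a CSV every month anyway.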

Answered by Aheho