Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is NHibernate refusing to batch inserts?

Using NHibernate 2.1.2.4000 against SQL Server 2008. The target table has no triggers or extraneous indexes. It is simply:

create table LogEntries (
    Id INT IDENTITY NOT NULL,
   HostName NVARCHAR(32) not null,
   UserName NVARCHAR(64) not null,
   LogName NVARCHAR(512) not null,
   Timestamp DATETIME not null,
   Level INT not null,
   Thread NVARCHAR(64) not null,
   Message NVARCHAR(MAX) not null,
   primary key (Id)
)

My entity mapping is:

<class name="LogEntry" table="LogEntries">
    <id name="Id" unsaved-value="0">
        <generator class="native"/>
    </id>
    <property name="HostName" length="32" not-null="true"/>
    <property name="UserName" length="64" not-null="true"/>
    <property name="LogName" length="512" not-null="true"/>
    <property name="Timestamp" type="utcdatetime" not-null="true"/>
    <property name="Level" not-null="true"/>
    <property name="Thread" length="64" not-null="true"/>
    <property name="Message">
        <column name="Message" sql-type="NVARCHAR(MAX)" not-null="true"/>
    </property>
</class>

Now, consider the following test case:

[Fact]
public void bulk_insert_test()
{
    var batchSize = 100;
    var numberItems = 10000;

    var configuration = new NHibernate.Cfg.Configuration().Configure();
    configuration.SetProperty("connection.connection_string", @"my_conn_string");
    configuration.SetProperty("adonet.batch_size", batchSize.ToString());
    var sessionFactory = configuration.BuildSessionFactory();

    var ts = this.WriteWithNH(sessionFactory, numberItems);
    ////var ts = this.WriteWithBC(sessionFactory, numberItems, batchSize);

    Console.WriteLine("Saving {0} items with batch size {1}: {2}", numberItems, batchSize, ts);
}

public TimeSpan WriteWithNH(ISessionFactory sessionFactory, int numberItems)
{
    using (var session = sessionFactory.OpenStatelessSession())
    using (var transaction = session.BeginTransaction())
    {
        session.Insert(new LogEntry()
        {
            HostName = "host",
            UserName = "user",
            LogName = "log",
            Level = 0,
            Thread = "thread",
            Timestamp = DateTime.UtcNow,
            Message = "Warm up"
        });

        transaction.Commit();
    }

    var sw = Stopwatch.StartNew();

    using (var session = sessionFactory.OpenStatelessSession())
    using (var transaction = session.BeginTransaction())
    {
        for (var i = 0; i < numberItems; ++i)
        {
            session.Insert(new LogEntry()
            {
                HostName = "host",
                UserName = "user",
                LogName = "log",
                Level = 0,
                Thread = "thread",
                Timestamp = DateTime.UtcNow,
                Message = "Message " + i
            });
        }

        transaction.Commit();
    }

    return sw.Elapsed;
}

public TimeSpan WriteWithBC(ISessionFactory sessionFactory, int numberItems, int batchSize)
{
    using (var session = sessionFactory.OpenStatelessSession())
    using (var bulkCopy = new SqlBulkCopy((SqlConnection)session.Connection))
    {
        bulkCopy.BatchSize = batchSize;
        bulkCopy.DestinationTableName = "LogEntries";
        var table = new DataTable("LogEntries");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("HostName", typeof(string));
        table.Columns.Add("UserName", typeof(string));
        table.Columns.Add("LogName", typeof(string));
        table.Columns.Add("Timestamp", typeof(DateTime));
        table.Columns.Add("Level", typeof(int));
        table.Columns.Add("Thread", typeof(string));
        table.Columns.Add("Message", typeof(string));

        var row = table.NewRow();
        row["HostName"] = "host";
        row["UserName"] = "user";
        row["LogName"] = "log";
        row["Timestamp"] = DateTime.UtcNow;
        row["Level"] = 0L;
        row["Thread"] = "thread";
        row["Message"] = "Warm up";
        table.Rows.Add(row);

        bulkCopy.WriteToServer(table);
    }

    var sw = Stopwatch.StartNew();

    using (var session = sessionFactory.OpenStatelessSession())
    using (var bulkCopy = new SqlBulkCopy((SqlConnection)session.Connection))
    {
        bulkCopy.BatchSize = batchSize;
        bulkCopy.DestinationTableName = "LogEntries";
        var table = new DataTable("LogEntries");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("HostName", typeof(string));
        table.Columns.Add("UserName", typeof(string));
        table.Columns.Add("LogName", typeof(string));
        table.Columns.Add("Timestamp", typeof(DateTime));
        table.Columns.Add("Level", typeof(int));
        table.Columns.Add("Thread", typeof(string));
        table.Columns.Add("Message", typeof(string));

        for (var i = 0; i < numberItems; ++i)
        {
            var row = table.NewRow();
            row["HostName"] = "host";
            row["UserName"] = "user";
            row["LogName"] = "log";
            row["Timestamp"] = DateTime.UtcNow;
            row["Level"] = 0;
            row["Thread"] = "thread";
            row["Message"] = "Message " + i;
            table.Rows.Add(row);
        }

        bulkCopy.WriteToServer(table);
    }

    return sw.Elapsed;
}

Here is some sample output when using NHibernate to perform the inserts:

Saving 10000 items with batch size 500: 00:00:12.3064936
Saving 10000 items with batch size 100: 00:00:12.3600981
Saving 10000 items with batch size 1: 00:00:12.8102670

As a point of comparison, you'll see I also implemented a BCP-based solution. Here is some sample output:

Saving 10000 items with batch size 500: 00:00:00.3142613
Saving 10000 items with batch size 100: 00:00:00.6757417
Saving 10000 items with batch size 1: 00:00:26.2509605

Clearly, the BCP solution is miles faster than the NH one. Also evident is that batching is affecting the speed of the BCP solution, but not the NH one. When using NHibernate to do inserts, NHProf shows the following:

alt text http://img9.imageshack.us/img9/8407/screenshotac.png

There are only INSERTs, no SELECTs. Interestingly, at no point does NHProf give me this warning.

I have tried specifying adonet.batch_size in both my config file and in code as per the test case above.

Now, I'm not expecting the NH solution to ever reach the speed of the BCP solution, but I'd at least like to know why batching isn't working. If it's good enough with batching enabled, then I may use the NH solution over the BCP just to keep the code base simpler.

Can anyone explain why NH is refusing to honor ADO.NET batching, and what I can do to fix it? All the scattered NH "documentation" I've read states that all you need to do is specify adonet.batch_size and (preferably) use a stateless session, but I'm doing both those things.

Thanks

like image 894
Kent Boogaart Avatar asked Dec 07 '22 02:12

Kent Boogaart


2 Answers

Using identity breaks batching.

Explained by Fabio Maulo here

The best alternative is switching to a different generator (I always recommend hilo or guid.comb)

like image 88
Diego Mijelshon Avatar answered Feb 15 '23 08:02

Diego Mijelshon


Neither ISession or IStatelessSession will be able to batch inserts if you are using identity as primary key.

When Insert is made Nhibernate will put the correct value in your Id property. But when you are using Identity the only place where this Id can be taken is the data base. Use hilo to make batch inserts.

like image 29
Sly Avatar answered Feb 15 '23 09:02

Sly