Memory Optimized Tables - Slower INSERT than SSD

Tags:

I have observed that inserting data into a Memory-Optimized Table is much slower than an equivalent parallelized insert to a disk-based table on a 5-SSD stripe set.

--DDL for Memory-Optimized Table    
CREATE TABLE [MYSCHEMA].[WIDE_MEMORY_TABLE]
        (
        [TX_ID] BIGINT NOT NULL
        , [COLUMN_01] [NVARCHAR](10) NOT NULL
        , [COLUMN_02] [NVARCHAR] (10) NOT NULL
        --etc., about 100 columns
        --at least one index is required for Memory-Optimized Tables
        , INDEX IX_WIDE_MEMORY_TABLE_ENTITY_ID HASH (TX_ID) WITH (BUCKET_COUNT=10000000)
        )
        WITH (MEMORY_OPTIMIZED=ON, DURABILITY=SCHEMA_ONLY)

--DDL for Disk-Based Table
CREATE TABLE [MYSCHEMA].[WIDE_DISK_TABLE]
        (
        [TX_ID] BIGINT NOT NULL
        , [COLUMN_01] [NVARCHAR](10) NOT NULL
        , [COLUMN_02] [NVARCHAR] (10) NOT NULL
        --etc., about 100 columns
        --No indexes
        ) ON [PRIMARY]

For this particular test, I am batching 10,000,000 rows into this table in sets of 25,000. The statement looks something like this for the Memory-Optimized Table:

    --Insert to Memory-Optimized Table 
    INSERT INTO
        WIDE_MEMORY_TABLE
        (
        TX_ID
        , COLUMN_01
        , COLUMN_02
        --etc., about 100 columns
        )
    SELECT
        S.COLUMN_01
        , S.COLUMN_02
        --etc., about 100 columns
    FROM
        [MYSCHEMA].[SOURCE_TABLE] AS S WITH(TABLOCK)
    WHERE
        S.TX_ID >= 1
        AND S.TX_ID < 25001
    OPTION (MAXDOP 4)

This process continues to load 10,000,000 rows. Each iteration just retrieves the next 25,000 rows. The SELECT performs a seek on a covering index on [MY_SCHEMA].[SOURCE_TABLE]. The query plan shows a serialized insert to BIG_MEMORY_TABLE. Each set of 25,000 rows takes around 1400ms.

If I do this to a disk-based table, hosted on a 5-SSD stripe (5,000 IOPS per disk, 200MB/sec throughput), the inserts progress much faster, averaging around 700ms. In the disk-based case, the query performs a parallel insert to [MY_SCHEMA].[WIDE_DISK_TABLE]. Note the TABLOCK hint on [MYSCHEMA].[WIDE_DISK_TABLE].

    --Insert to Disk-Based Table 
    INSERT INTO
        WIDE_DISK_TABLE WITH(TABLOCK)
        (
        TX_ID
        , COLUMN_01
        , COLUMN_02
        --etc., about 100 columns
        )
    SELECT
        S.COLUMN_01
        , S.COLUMN_02
        --etc., about 100 columns
    FROM
        [MYSCHEMA].[SOURCE_TABLE] AS S WITH(TABLOCK)
    WHERE
        S.TX_ID >= 1
        AND S.TX_ID < 25001
    OPTION (MAXDOP 4)

Granted, the disk-based table does not have an index, and the TABLOCK hint enables parallel insert, but I expect way more from an INSERT to RAM.

Any ideas?

Thanks!

Here is a comparison of 100 batches run in 3 modes: Disk-based, with Deferred Index creation, Disk-based with Index, and Memory-Optimized with Index (at least one index is required on Memory-Optimized tables).

100 Batches, 25K rows each

473

asked Feb 03 '18 22:02

Pittsburgh DBA

1 Answers

UPDATE

After much testing and research, I believe this comes down to parallelism. At this time, SQL Server 2016, up to and including SP1 CU7, does not support parallel insert to Memory-Optimized Tables. This makes all INSERT statements to Memory-Optimized tables single-threaded.

Here is an insightful article from Niko Neugebauer regarding this issue: Niko Neugebauer - Parallelism in Hekaton (In-Memory OLTP)

This makes it much less useful for ETL/ELT ingestion. However, it is pretty amazing for OLTP DML (especially through natively compiled stored procedures), and stellar for aggregating data in BI queries. For ingestion, it is nearly impossible to beat SSD-based heaps without indexes, so long as you take the right steps to ensure that your INSERT will run in parallel.

Even if the database is in Full Recovery mode, the parallel INSERT to disk-based heap outperforms the INSERT to a Memory-Optimized table. This continues to remain true if comparable indexes are added to the disk-based table after the INSERT.

114

answered Oct 14 '22 17:10

Pittsburgh DBA

Related questions
                            
                                Solr DataImportHandler with SQL Server
                            
                                Is there a way to detect open readers on a SqlConnection?
                            
                                Why does sp_executesql run slower when parameters are passed as arguments
                            
                                SQL Server Management Studio: icon meaning
                            
                                unsafe assembly permission was denied on object 'server' database 'master' [closed]
                            
                                Sharing SQL Server Session State across Web Applications
                            
                                Select in LINQ with a strange value @p__linq__0
                            
                                SQL Server, ISABOUT, weighted terms
                            
                                SQL Azure - One session locking entire DB for Update and Insert
                            
                                How to optimize query that has the same subquery twice?
                            
                                The request failed with HTTP status 401: Unauthorized IN SSRS
                            
                                Why does Microsoft SQL Server 2012 query take minutes over JDBC 4.0 but second(s) in Management Studio?
                            
                                SQL Server 2014: SSISDB vs MSDB for package deployment
                            
                                SQL Number - Row_Number() - Allow Repeating Row Number
                            
                                Azure SQL frequent connection timeouts
                            
                                SQL won't connect after deploying
                            
                                SQL Server : ADO.Net GetUpdateCommand ok with PK, fails after adding another clustered unique index
                            
                                How can I use more than 2100 values in an IN clause using Dapper?
                            
                                How to join json list with a table in SQL Server using OPENJSON?
                            
                                SQL Server Management Studio crash on edit rows

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Memory Optimized Tables - Slower INSERT than SSD

Tags:

performance

sql-server

query-performance

memory-optimized-tables

Pittsburgh DBA

People also ask

1 Answers

Pittsburgh DBA

Recent Activity

Donate For Us