I have observed that inserting data into a Memory-Optimized Table is much slower than an equivalent parallelized insert to a disk-based table on a 5-SSD stripe set.
--DDL for Memory-Optimized Table
CREATE TABLE [MYSCHEMA].[WIDE_MEMORY_TABLE]
(
[TX_ID] BIGINT NOT NULL
, [COLUMN_01] [NVARCHAR](10) NOT NULL
, [COLUMN_02] [NVARCHAR] (10) NOT NULL
--etc., about 100 columns
--at least one index is required for Memory-Optimized Tables
, INDEX IX_WIDE_MEMORY_TABLE_ENTITY_ID HASH (TX_ID) WITH (BUCKET_COUNT=10000000)
)
WITH (MEMORY_OPTIMIZED=ON, DURABILITY=SCHEMA_ONLY)
--DDL for Disk-Based Table
CREATE TABLE [MYSCHEMA].[WIDE_DISK_TABLE]
(
[TX_ID] BIGINT NOT NULL
, [COLUMN_01] [NVARCHAR](10) NOT NULL
, [COLUMN_02] [NVARCHAR] (10) NOT NULL
--etc., about 100 columns
--No indexes
) ON [PRIMARY]
For this particular test, I am batching 10,000,000 rows into this table in sets of 25,000. The statement looks something like this for the Memory-Optimized Table:
--Insert to Memory-Optimized Table
INSERT INTO
WIDE_MEMORY_TABLE
(
TX_ID
, COLUMN_01
, COLUMN_02
--etc., about 100 columns
)
SELECT
S.COLUMN_01
, S.COLUMN_02
--etc., about 100 columns
FROM
[MYSCHEMA].[SOURCE_TABLE] AS S WITH(TABLOCK)
WHERE
S.TX_ID >= 1
AND S.TX_ID < 25001
OPTION (MAXDOP 4)
This process continues to load 10,000,000 rows. Each iteration just retrieves the next 25,000 rows. The SELECT performs a seek on a covering index on [MY_SCHEMA].[SOURCE_TABLE]. The query plan shows a serialized insert to BIG_MEMORY_TABLE. Each set of 25,000 rows takes around 1400ms.
If I do this to a disk-based table, hosted on a 5-SSD stripe (5,000 IOPS per disk, 200MB/sec throughput), the inserts progress much faster, averaging around 700ms. In the disk-based case, the query performs a parallel insert to [MY_SCHEMA].[WIDE_DISK_TABLE]. Note the TABLOCK hint on [MYSCHEMA].[WIDE_DISK_TABLE].
--Insert to Disk-Based Table
INSERT INTO
WIDE_DISK_TABLE WITH(TABLOCK)
(
TX_ID
, COLUMN_01
, COLUMN_02
--etc., about 100 columns
)
SELECT
S.COLUMN_01
, S.COLUMN_02
--etc., about 100 columns
FROM
[MYSCHEMA].[SOURCE_TABLE] AS S WITH(TABLOCK)
WHERE
S.TX_ID >= 1
AND S.TX_ID < 25001
OPTION (MAXDOP 4)
Granted, the disk-based table does not have an index, and the TABLOCK hint enables parallel insert, but I expect way more from an INSERT to RAM.
Any ideas?
Thanks!
Here is a comparison of 100 batches run in 3 modes: Disk-based, with Deferred Index creation, Disk-based with Index, and Memory-Optimized with Index (at least one index is required on Memory-Optimized tables).
The main benefit of memory-optimized tables are that rows in the table are read from and written to memory which results in non-blocking transactions at super-fast speed. The second copy of the data is stored on the disk and during database recovery, data is read from the disk-based table.
What are Memory Optimized Tables? A Memory Optimized Table, starting in SQL Server 2014, is simply a table that has two copies, one in active memory and one durable on disk whether that includes data or just Schema Only, which I will explain later.
Memory-optimized tables and natively compiled stored procedures cannot be created or dropped if there is a server or database event notification for that DDL operation. Remove the server and database event notifications on CREATE TABLE or DROP TABLE and CREATE PROCEDURE or DROP PROCEDURE . Save this answer.
UPDATE
After much testing and research, I believe this comes down to parallelism. At this time, SQL Server 2016, up to and including SP1 CU7, does not support parallel insert to Memory-Optimized Tables. This makes all INSERT statements to Memory-Optimized tables single-threaded.
Here is an insightful article from Niko Neugebauer regarding this issue: Niko Neugebauer - Parallelism in Hekaton (In-Memory OLTP)
This makes it much less useful for ETL/ELT ingestion. However, it is pretty amazing for OLTP DML (especially through natively compiled stored procedures), and stellar for aggregating data in BI queries. For ingestion, it is nearly impossible to beat SSD-based heaps without indexes, so long as you take the right steps to ensure that your INSERT will run in parallel.
Even if the database is in Full Recovery mode, the parallel INSERT to disk-based heap outperforms the INSERT to a Memory-Optimized table. This continues to remain true if comparable indexes are added to the disk-based table after the INSERT.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With