I started writing a program that copies data from an Oracle database to a SQL Server database and immediately ran into a memory issue. I am using the Oracle.ManagedDataAccess library (NuGet install command: "Install-Package Oracle.ManagedDataAccess"). The library is version 4.122.1.0.
Here is my function:
private static void LoadTable(TableToLoad table)
{
    DataTable loadBuffer = null;
    // Assume source is Oracle for now; assume destination is SQL Server.
    using (OracleConnection conn = new OracleConnection(table.SourceConnectionString))
    {
        OracleDataReader reader = OracleCommands.GetDataReader(string.Format("select * from \"{0}\".\"{1}\"", table.SourceSchema, table.SourceTable), conn);
        bool foundData = reader.Read();
        if (loadBuffer == null)
        {
            loadBuffer = InitializeBuffer(reader);
        }
        int recordsAffected;
        while (foundData)
        {
            object[] currentRowValues = new object[reader.FieldCount];
            int valueCount = reader.GetValues(currentRowValues);
            loadBuffer.Rows.Add(currentRowValues);
            if (loadBuffer.Rows.Count >= 15000)
            {
                // Flush the 15,000-row batch to SQL Server, then start a fresh buffer.
                SqlCommands.RunSqlCommandWithDataTableInput(string.Format("insert into {0}.{1} select * from @loadBufferTable", table.TargetSchema, table.TargetTable), table.TargetConnectionString, out recordsAffected, loadBuffer, "loadBufferTable");
                loadBuffer.Dispose();
                loadBuffer = null;
                loadBuffer = InitializeBuffer(reader);
            }
            foundData = reader.Read();
        }
        if (loadBuffer.Rows.Count > 0)
        {
            // Flush whatever is left in the final partial batch.
            SqlCommands.RunSqlCommandWithDataTableInput(string.Format("insert into {0}.{1} select * from @loadBufferTable", table.TargetSchema, table.TargetTable), table.TargetConnectionString, out recordsAffected, loadBuffer, "loadBufferTable");
            loadBuffer.Dispose();
            loadBuffer = null;
        }
        reader.Close();
        reader.Dispose();
        reader = null;
    }
}
When I run this, memory consumption shoots up and within a few minutes I get an out-of-memory error. I stopped the process and used the diagnostic tools to look at what was consuming so much memory. Almost all of it was taken up by tens of thousands of "OraBuf" objects, which are created by the DataReader.Read method.
I tried deallocating and recreating the DataTable object I am using as a buffer for the insert batch, thinking that the DataTable might somehow be holding onto references to the OraBuf objects, but that did not fix the issue (I was initially using the DataTable.Clear() method to reset the DataTable).
Why is this happening (and what can I do to fix it)?
Thank you for your help.
EDIT: My test table in Oracle has a CLOB column in it. The issue seems to be related to reading that CLOB value, as the problem does not manifest with other tables (no OutOfMemoryException). Is there a better library for accessing Oracle that I should be using?
EDIT 2: I should also mention that the table I am testing with (the one with the CLOB column) has about 2.9 million records, and it usually fails somewhere between row 500,000 and row 1,500,000 (the lowest row count before an out-of-memory failure was about 649,000 and the highest was about 1,390,000).
EDIT 3: I tried paring down this code to help identify the problem, and the one line that makes a difference is:
int valueCount = reader.GetValues(currentRowValues);
I also tried a version where I read only one column at a time; in that case, reading the value using an index causes the problem (only on the CLOB column). Here is the line in the alternate version that results in the exception:
newRow[columnIndex] = reader[columnIndex];
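For reference, the loop around that line looks roughly like this (a sketch reconstructed from the description above; everything except the indexer assignment is assumed):

    // Alternate version: build each row one column at a time instead of
    // calling GetValues. Only the indexer assignment is the original line.
    while (reader.Read())
    {
        DataRow newRow = loadBuffer.NewRow();
        for (int columnIndex = 0; columnIndex < reader.FieldCount; columnIndex++)
        {
            // Reading the CLOB column through the indexer is what eventually
            // triggers the OutOfMemoryException.
            newRow[columnIndex] = reader[columnIndex];
        }
        loadBuffer.Rows.Add(newRow);
    }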
I had the same issue with CLOB columns while reading 3 million rows in a batch application. When I set InitialLOBFetchSize to -1, it seemed to keep the memory usage under control:
command.InitialLOBFetchSize = -1;
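For context, here is roughly where that property fits, assuming a helper shaped like the question's OracleCommands.GetDataReader (the helper's body below is a sketch, not the actual code from the question):

    using System.Data;
    using Oracle.ManagedDataAccess.Client;

    public static class OracleCommands
    {
        // Hypothetical reconstruction of the question's helper, shown only
        // to illustrate where InitialLOBFetchSize would be set.
        public static OracleDataReader GetDataReader(string sql, OracleConnection conn)
        {
            if (conn.State != ConnectionState.Open)
            {
                conn.Open();
            }
            OracleCommand command = conn.CreateCommand();
            command.CommandText = sql;
            // -1 asks the provider to fetch the LOB data inline with the row
            // instead of returning a deferred LOB locator, which is what kept
            // the OraBuf allocations from piling up in my case.
            command.InitialLOBFetchSize = -1;
            return command.ExecuteReader();
        }
    }

As I understand it, with the default of 0 the LOB fetch is deferred through a locator, and those deferred buffers accumulate across millions of rows; -1 pulls the LOB contents back with the row so nothing is left behind.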