I'm trying to use Fluent NHibernate to migrate a database that needs some of the database 'massaged'. The source database is a MS Access database and the current table I'm stuck on is one with an OLE Object field. The target database is a MS SQL Server Express database.
In the entity I simply had this field defined as a byte[]
however when loading however even when just loading that single field for a single record I was hitting a System.OutOfMemoryException
byte[] test = aSession.Query<Entities.Access.Revision>().Where(x => x.Id == 5590).Select(x => x.FileData).SingleOrDefault<byte[]>();
I then tried implementing the blob type listed here but now when running that I receive an error of:
"Unable to cast object of type 'System.Byte[]' to type 'TestProg.DatabaseConverter.Entities.Blob'."}
I can't imagine the Ole Object is any larger than 100mb but haven't been able to check. Is there any good way using Fluent NHibernate to copy this out of the one database and save it to the other or will I need to look at other options?
My normal loop for processing these is:
IList<Entities.Access.Revision> result;
IList<int> recordIds = aSession.Query<Entities.Access.Revision>().Select(x => x.Id).ToList<int>();
foreach (int recordId in recordIds)
{
result = aSession.Query<Entities.Access.Revision>().Where(x => x.Id == recordId).ToList<Entities.Access.Revision>();
Save(sqlDb, result);
}
Save function just copies properties from one to another and for some entities is used to manipulate data or give feedback to user related to data problems. I'm using stateless sessions for both databases.
--
From further testing the objects it appears to be hanging on are about 60-70mb. I'm currently testing grabbing the data with an OleDbDataReader using GetBytes.
--
Update (Nov 24): I've yet to find a way to get this to work with NHibernate. I did get this working with regular db command objects. I've put the code for function I made below for anybody curious who finds this. This is code from my database converter so objects prefixed with 'a' are access database objects and 's' are sql ones.
public void MigrateBinaryField(int id, string tableName, string fieldName)
{
var aCmd = new OleDbCommand(String.Format(@"SELECT ID, {0} FROM {1} WHERE ID = {2}", fieldName, tableName, id), aConn);
using (var reader = aCmd.ExecuteReader(System.Data.CommandBehavior.SequentialAccess))
{
while (reader.Read())
{
if (reader[fieldName] == DBNull.Value)
return;
long read = 0;
long offset = 0;
// Can't .WRITE a NULL column so need to set an initial value
var sCmd = new SqlCommand(string.Format(@"UPDATE {0} SET {1} = @data WHERE OldId = @OldId", tableName, fieldName), sConn);
sCmd.Parameters.AddWithValue("@data", new byte[0]);
sCmd.Parameters.AddWithValue("@OldId", id);
sCmd.ExecuteNonQuery();
// Incrementally store binary field to avoid OutOfMemoryException from having entire field loaded in memory
sCmd = new SqlCommand(string.Format(@"UPDATE {0} SET {1}.WRITE(@data, @offset, @len) WHERE OldId = @OldId", tableName, fieldName), sConn);
while ((read = reader.GetBytes(reader.GetOrdinal(fieldName), offset, buffer, 0, buffer.Length)) > 0)
{
sCmd.Parameters.Clear();
sCmd.Parameters.AddWithValue("@data", buffer);
sCmd.Parameters.AddWithValue("@offset", offset);
sCmd.Parameters.AddWithValue("@len", read);
sCmd.Parameters.AddWithValue("@OldId", id);
sCmd.ExecuteNonQuery();
offset += read;
}
}
}
}
This sounds like the results I have seen with using .NET on top of other frameworks as well.
The native database driver beneath ADO.NET beneath NHibernate (two "beneaths" are intentional here) will require a pinned destination memory block that cannot be moved in memory while the driver fills it. Since the .NET garbage collector can randomly move blocks of memory on a separate thread in order to compact the heaps, NHibernate's underlying .NET database layer has to create a non-managed memory block to receive the data, which effectively doubles the amount of memory required to load a record.
Also, I have not verified this next point, but NHibernate should attempt to cache blocks of records, since it bypasses some of the relational database query operations. This allows NHibernate to make fewer database requests, which is optimal for smaller record sizes, but requires many records (including many blobs) to fit in memory at a time.
As a first step toward a resolution, make sure the process is really running the machine out of memory (or if it is 32-bit, make sure it is hitting the 2GB limit). If so, attempt to determine the baseline - if it is processing records with a variety of blob sizes, what is the minimum and maximum memory it uses? From that, you can estimate how much memory would be required for that large record (or the cache block that contains that record!)
64-bit and more physical memory may be a brute-force solution, if you aren't already running 64-bit, and if bigger hardware is even an option.
Another possible solution is to check whether NHibernate has configurable settings or properties for how it caches data. For example, check whether you can set a property that limits how many records are loaded at a time, or tell it to limit its cache to a certain size in bytes.
A more efficient solution is to use your ADO.NET code for the blobs; that might be the best solution, especially if you expect even larger blobs than this particular 60-70MB blob. MS Access will normally allow multiple read-only connections, so this should work as long as NHibernate doesn't set the database to block other connections.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With