Okay, this problem is indeed a challenge!
I am working on an arithmetic-based project involving larger-than-normal numbers. I knew I was going to be working with a worst-case scenario of 4 GB-capped file sizes (I was hoping to even extend that to a 5 GB cap, as I have seen file sizes greater than 4 GB before - specifically *.iso image files).
Now, the algorithm(s) to which I will apply computation do not matter at the moment, but the loading and handling of such large quantities of data - the numbers - do.
System.IO.File.ReadAllBytes(String) can only read a cap of 2 GB worth of file data, so this is my first problem: how will I go about loading, and accessing in memory, file sizes twice as large, if not larger? I found the System.Numerics.BigInteger class online, but given that there is no BigInteger.MaxValue, and that I can only load a maximum of 2 GB of data at a time, I don't know what the potential of BigInteger would be - even compared to the object I was writing, called Number (which does have my desired minimum potential). There were also issues with available memory and performance, though I do not care so much about speed, but rather about completing this experimental process successfully. So: should I use BigInteger, or finish my own Number class?
PS: I cannot reveal too much information about this project under a non-disclosure agreement. ;)
For those who would like to see a sample operator from my Number object - a per-byte array adder (C#):
public static Number operator +(Number n1, Number n2)
{
    // GB5_ARRAY is a cap constant for 5 GB - 5368709120L
    byte[] data = new byte[GB5_ARRAY];
    byte rem = 0x00, bA, bB, rm, dt;
    // Iterate through every byte; real data stops at the second-to-last index,
    // and the last byte receives the final carry, if any.
    // I tested this algorithm on smaller arrays provided by the `BitConverter` class,
    // then made a few tweaks to satisfy the larger arrays and the Number object.
    for (long iDx = 0; iDx < GB5_ARRAY; iDx++)
    {
        // bData is a byte[] with GB5_ARRAY number of bytes.
        // Perform a check - solves for unequal (or jagged) arrays.
        if (iDx < GB5_ARRAY - 1) { bA = n1.bData[iDx]; bB = n2.bData[iDx]; } else { bA = 0x00; bB = 0x00; }
        Add(bA, bB, rem, out dt, out rm);
        // Store the result byte and carry the remainder into the next interval.
        rem = rm; data[iDx] = dt;
    }
    return new Number(data);
}
private static void Add(byte a, byte b, byte r, out byte result, out byte remainder)
{
    int i = a + b + r;
    result = (byte)(i % 256);               // low byte via modulus arithmetic
    remainder = (byte)((i - result) / 256); // carry into the next byte (0 or 1)
}
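For reference, a tiny usage sketch (illustrative only - at the full GB5_ARRAY size the allocations themselves run into the memory issues I mention above; index 0 is the least significant byte, so the carry propagates upward):
byte[] aBytes = new byte[GB5_ARRAY];
byte[] bBytes = new byte[GB5_ARRAY];
aBytes[0] = 0xFF; bBytes[0] = 0x01; // 255 + 1 = 256
Number sum = new Number(aBytes) + new Number(bBytes);
// sum.bData[0] == 0x00 and sum.bData[1] == 0x01 - the carry moved up one byte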
Normally, you would process large files using a streaming API - either raw binary (Stream), or via some protocol reader (XmlReader, StreamReader, etc). This could also be done via memory-mapped files in some cases. The key point here is that you only look at a small portion of the file at a time (a moderately sized buffer of data, a logical "line", or "node", etc - depending on the scenario).
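To make that concrete, here is a minimal sketch of the chunked-read approach; the 80 KB buffer size is an arbitrary choice, and ProcessChunk is a hypothetical placeholder for whatever per-buffer work you need:
using System;
using System.IO;

static class ChunkedReader
{
    public static void ReadInChunks(string path)
    {
        // Fixed working buffer - the file itself is never fully in memory,
        // so this handles files of any size, including past 2 GB.
        byte[] buffer = new byte[81920];
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            int read;
            while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                ProcessChunk(buffer, read); // only the first 'read' bytes are valid this pass
            }
        }
    }

    static void ProcessChunk(byte[] buffer, int count)
    {
        // hypothetical: apply the per-byte processing here
    }
}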
Where this gets odd is your desire to map this somehow directly to some form of large number. Frankly, I don't know how we can help with that without more information, but if you are dealing with an actual number of this size, I think you're going to struggle unless the binary protocol makes that convenient. And "performing such arithmetic as division and multiplication" is meaningless on raw data; that only makes sense on parsed data with custom operations defined.
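If the file genuinely encodes a single integer, one option is to parse it into a BigInteger, which already defines multiplication and division. A minimal sketch - noting that it only works for files that fit in a single array (under the 2 GB limit), which is exactly the constraint in play here:
using System;
using System.IO;
using System.Numerics;

static class NumberParser
{
    public static BigInteger FromFile(string path)
    {
        byte[] raw = File.ReadAllBytes(path);       // fails for files past the 2 GB cap
        byte[] unsigned = new byte[raw.Length + 1]; // append a zero high byte so the
        Array.Copy(raw, unsigned, raw.Length);      // value is treated as non-negative
        return new BigInteger(unsigned);            // BigInteger expects little-endian bytes
    }
}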
Also: note that in .NET 4.5 you can flip a configuration switch to expand the maximum size of arrays, going over the 2 GB limit. It still has a limit, but it is a bit bigger. Unfortunately, the maximum number of elements is still the same, so if you are using a byte[] array it won't help. But if you are using a SomeCompositeStruct[] you should be able to get higher usage. See gcAllowVeryLargeObjects.
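For reference, the switch goes in the application's config file; per the documentation, the element looks like this:
<configuration>
  <runtime>
    <gcAllowVeryLargeObjects enabled="true" />
  </runtime>
</configuration>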