I have some 3d interpolation code that takes up 90% of my projects runtime and cannot be precomputed.
What are some techniques that I could use to speed this up? Algorithmic or Micro Optimization?
Here is the code for those interested.
It basically takes data that was placed across the 2 3d arrays and interpolates the rest of the data.
EDIT: Also I am already spliting this into threads at a higher level for increased performance, but this doesn't help on the windows phone as they are all single core...
I will probably do something like (Single[] DensityMap = new Single[128 * 128 * 128];) to remove the multi-D array hit. I access the array in 100's of places and was hoping to not have to do that (wrapping in a function doesn't help as the windows phone won't inline the function call and it doesn't help perf then...)
float[, ,] DensityMap = new float[128, 128, 128];
float[, ,] PressureMap = new float[128, 128, 128];
unchecked
{
for (int x = 0; x < g_CraftWorldConstants.RegionSizeX; x++)
{
int offsetX = (x / SAMPLE_RATE_3D_HOR) * SAMPLE_RATE_3D_HOR;
int plusOffsetX = SAMPLE_RATE_3D_HOR + offsetX;
int poxox = plusOffsetX - offsetX;
double poxxpoxox = ((plusOffsetX - x) / (double)poxox);
double xoxpoxox = ((x - offsetX) / (double)poxox);
for (int y = 0; y < g_CraftWorldSettings.GET.RegionSizeY; y++)
{
int offsetY = (y / SAMPLE_RATE_3D_VERT) * SAMPLE_RATE_3D_VERT;
int plusOffsetY = SAMPLE_RATE_3D_VERT + offsetY;
int poyoy = plusOffsetY - offsetY;
double poyypoyoy = ((plusOffsetY - y) / (double)poyoy);
double yoypoyoy = ((y - offsetY) / (double)poyoy);
for (int z = 0; z < g_CraftWorldConstants.RegionSizeZ; z++)
{
if (!(x % SAMPLE_RATE_3D_HOR == 0 && y % SAMPLE_RATE_3D_VERT == 0 && z % SAMPLE_RATE_3D_HOR == 0))
{
int offsetZ = (z / SAMPLE_RATE_3D_HOR) * SAMPLE_RATE_3D_HOR;
int plusOffsetZ = SAMPLE_RATE_3D_HOR + offsetZ;
int pozoz = plusOffsetZ - offsetZ;
double pozzpozoz = ((plusOffsetZ - z) / (double)pozoz);
double zozpozoz = ((z - offsetZ) / (double)pozoz);
double x00 = poxxpoxox * in_DensityMap[offsetX, offsetY, offsetZ] + xoxpoxox * in_DensityMap[plusOffsetX, offsetY, offsetZ];
double x10 = poxxpoxox * in_DensityMap[offsetX, offsetY, plusOffsetZ] + xoxpoxox * in_DensityMap[plusOffsetX, offsetY, plusOffsetZ];
double x01 = poxxpoxox * in_DensityMap[offsetX, plusOffsetY, offsetZ] + xoxpoxox * in_DensityMap[plusOffsetX, plusOffsetY, offsetZ];
double x11 = poxxpoxox * in_DensityMap[offsetX, plusOffsetY, plusOffsetZ] + xoxpoxox * in_DensityMap[plusOffsetX, plusOffsetY, plusOffsetZ];
double r0 = poyypoyoy * x00 + yoypoyoy * x01;
double r1 = poyypoyoy * x10 + yoypoyoy * x11;
in_DensityMap[x, y, z] = (float)(pozzpozoz * r0 + zozpozoz * r1);
double x02 = poxxpoxox * in_CaveDensity[offsetX, offsetY, offsetZ] + xoxpoxox * in_CaveDensity[plusOffsetX, offsetY, offsetZ];
double x12 = poxxpoxox * in_CaveDensity[offsetX, offsetY, plusOffsetZ] + xoxpoxox * in_CaveDensity[plusOffsetX, offsetY, plusOffsetZ];
double x03 = poxxpoxox * in_CaveDensity[offsetX, plusOffsetY, offsetZ] + xoxpoxox * in_CaveDensity[plusOffsetX, plusOffsetY, offsetZ];
double x13 = poxxpoxox * in_CaveDensity[offsetX, plusOffsetY, plusOffsetZ] + xoxpoxox * in_CaveDensity[plusOffsetX, plusOffsetY, plusOffsetZ];
double r2 = poyypoyoy * x02 + yoypoyoy * x03;
double r3 = poyypoyoy * x12 + yoypoyoy * x13;
in_CaveDensity[x, y, z] = (float)(pozzpozoz * r2 + zozpozoz * r3);
}
}
}
}
}
Performance-based on Nature Of Language C++ language is an object-oriented programming language, and it supports some important features like Polymorphism, Abstract Data Types, Encapsulation, etc. Since it supports object-orientation, speed is faster compared to the C language.
For the access to a an element you will have to do the index calculation yourself. But as one of the indices (j) is constant over the loop, you can compute the start element's index before the loop and inside just increment the index 1. That might get you some significant improvement.
C is not always faster. C is slower than, for example Modern Fortran. C lets pointer aliasing happen, which means some good optimizations are not possible.
It seems that you have a lot of opportunities to optimize your code. Your x loop executes 128 times, your y loop executes 128*128=16,384 times, and your z loop executes 128^3=2,097,152 times. There are a number of terms inside your z loop that are only dependent on the x, or the y iterations, but they are recalculated at every z iteration. For example,
int poxox = plusOffsetX - offsetX;
and
double poxxpoxox = ((plusOffsetX - x) / (double)poxox);
These two terms are being calculated more than 2 million times, but only need to be calculated 128 times if my cursory scan of your function is correct. Move terms to the loop level that is appropriate so you don't waste cycles recalculating the same values many multiple times.
Here is your code with basic optimizations made. I'm curious to know how this affects your run times. Several of the terms are only dependent on the iteration value, and are the same for x, y, and z. So I pulled them out entirely and precompute them once. I also have moved the outer mod operations out of the inner loop, and modified the logic to ensure short circuit of the evaluation, which should should remove the majority of mod operations that were previously being executed.
int[] offsets = new int[128];
int[] plusOffsets = new int[128];
double[] poii = new double[128];
double[] ioip = new double[128];
for (int i = 0; i < 128; i++) {
offsets[i] = (i / SAMPLE_RATE_3D_HOR) * SAMPLE_RATE_3D_HOR;
plusOffsets[i] = SAMPLE_RATE_3D_HOR + offsets[i];
double poioi = (double) (plusOffsets[i] - offsets[i]);
poii[i] = ((plusOffsets[i] - i) / poioi);
ioip[i] = ((i - offsets[i]) / poioi);
}
float[, ,] DensityMap = new float[128, 128, 128];
float[, ,] PressureMap = new float[128, 128, 128];
for (int x = 0; x < g_CraftWorldConstants.RegionSizeX; x++)
{
int offsetX = offsets[x];
int plusOffsetX = plusOffsets[x];
double poxxpoxox = poii[x];
double xoxpoxox = ioip[x];
bool xModNot0 = !(x % SAMPLE_RATE_3D_HOR == 0);
for (int y = 0; y < g_CraftWorldConstants.RegionSizeY; y++)
{
int offsetY = offsets[y];
int plusOffsetY = plusOffsets[y];
double poyypoyoy = poii[y];
double yoypoyoy = ioip[y];
bool yModNot0 = !(y % SAMPLE_RATE_3D_VERT == 0);
for (int z = 0; z < g_CraftWorldConstants.RegionSizeZ; z++)
{
//if (!(x % SAMPLE_RATE_3D_HOR == 0 && y % SAMPLE_RATE_3D_VERT == 0 && z % SAMPLE_RATE_3D_HOR == 0))
if (xModNot0 || yModNot0 || !(z % SAMPLE_RATE_3D_HOR == 0))
{
int offsetZ = offsets[z];
int plusOffsetZ = plusOffsets[z];
double pozzpozoz = poii[z];
double zozpozoz = ioip[z];
double x00 = poxxpoxox * DensityMap[offsetX, offsetY, offsetZ] + xoxpoxox * DensityMap[plusOffsetX, offsetY, offsetZ];
double x10 = poxxpoxox * DensityMap[offsetX, offsetY, plusOffsetZ] + xoxpoxox * DensityMap[plusOffsetX, offsetY, plusOffsetZ];
double x01 = poxxpoxox * DensityMap[offsetX, plusOffsetY, offsetZ] + xoxpoxox * DensityMap[plusOffsetX, plusOffsetY, offsetZ];
double x11 = poxxpoxox * DensityMap[offsetX, plusOffsetY, plusOffsetZ] + xoxpoxox * DensityMap[plusOffsetX, plusOffsetY, plusOffsetZ];
double r0 = poyypoyoy * x00 + yoypoyoy * x01;
double r1 = poyypoyoy * x10 + yoypoyoy * x11;
DensityMap[x, y, z] = (float)(pozzpozoz * r0 + zozpozoz * r1);
double x02 = poxxpoxox * PressureMap[offsetX, offsetY, offsetZ] + xoxpoxox * PressureMap[plusOffsetX, offsetY, offsetZ];
double x12 = poxxpoxox * PressureMap[offsetX, offsetY, plusOffsetZ] + xoxpoxox * PressureMap[plusOffsetX, offsetY, plusOffsetZ];
double x03 = poxxpoxox * PressureMap[offsetX, plusOffsetY, offsetZ] + xoxpoxox * PressureMap[plusOffsetX, plusOffsetY, offsetZ];
double x13 = poxxpoxox * PressureMap[offsetX, plusOffsetY, plusOffsetZ] + xoxpoxox * PressureMap[plusOffsetX, plusOffsetY, plusOffsetZ];
double r2 = poyypoyoy * x02 + yoypoyoy * x03;
double r3 = poyypoyoy * x12 + yoypoyoy * x13;
PressureMap[x, y, z] = (float)(pozzpozoz * r2 + zozpozoz * r3);
}
}
}
}
There are some things you can do to speed up your code:
To simulate the 3D-array, you just can do it this way:
Single[] DensityMap = new Single[128 * 128 * 128];
DensityMap[z + (y * 128) + (x * 128 * 128)] = ...;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With