Slicing a Span<T> row from a 2D matrix - not sure why this works

Tags: c#, .net, cil, c#-7.2

I've been looking for a way to extract slices from a 2D matrix without having to actually reallocate and copy the contents, and I came up with this extension method:

public static Span<float> Slice([NotNull] this float[,] m, int row)
{
    if (row < 0 || row > m.GetLength(0) - 1) throw new ArgumentOutOfRangeException(nameof(row), "The row index isn't valid");
    return Span<float>.DangerousCreate(m, ref m[row, 0], m.GetLength(1));
}

I've checked this method with a simple unit test, and it appears to work:

[TestMethod]
public void Foo()
{
    float[,] m =
    {
        { 1, 2, 3, 4 },
        { 5, 6, 7, 8 },
        { 9, 9.5f, 10, 11 },
        { 12, 13, 14.3f, 15 }
    };
    Span<float> s = m.Slice(2);
    var copy = s.ToArray();
    var check = new[] { 9, 9.5f, 10, 11 };
    Assert.IsTrue(copy.Select((n, i) => Math.Abs(n - check[i]) < 1e-6f).All(b => b));
}

This doesn't seem right to me, though. I'd like to understand exactly what's happening behind the scenes here, as that ref m[x, y] part doesn't convince me.

How is the runtime getting the actual reference to the value at that location inside the matrix, since the this[int x, int y] method in the 2D array is just returning a value and not a reference?

Shouldn't the ref modifier only get a reference to the local copy of that float value returned to the method, and not a reference to the actual value stored within the matrix? I mean, otherwise having methods/parameters with ref returns would be pointless, and that's not the case.

I took a peek into the IL for the test method and noticed this:

[image: IL of the test method, including a call to the array's Address method]

Now, I'm not 100% sure since I'm not so great at reading IL, but isn't the ref m[x, y] call being translated to a call to that other Address method, which I suppose just returns a ref value on its own?

If that's the case, is there a way to directly use that method from C# code?

And is there a way to discover methods like this one, when available?

I only noticed this by looking at the IL. Before that, I had no idea the method existed or why the code was working. At this point I wonder how much great stuff there is in the default libraries without any hint of it for the average dev.

Thanks!

asked Jan 03 '18 00:01 by Sergio0694


2 Answers

Standard 1D zero-based (SZ) arrays have three opcodes to work with them: ldelem, stelem, and ldelema. They represent the actions that can be performed on a variable: getting its value, setting its value, and obtaining a reference to it. The a[i] syntax is simply translated to whichever opcode represents what you do with the element. Other kinds of variables have similar opcodes (ldloc, stloc, ldloca; ldfld, stfld, ldflda; etc.).
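As a rough illustration of that mapping, here is a minimal sketch. The class and method names are made up for the example, and the opcode comments describe what I'd expect the C# compiler to emit for an int[] (the typed forms such as ldelem.i4 belong to the same families):

```csharp
using System;

static class SzArrayAccess
{
    // Hypothetical demo: one read, one write, one write through a reference.
    public static int[] Demo()
    {
        int[] a = { 1, 2, 3 };
        int value = a[0];        // read the element: ldelem family
        a[1] = value + 10;       // write the element: stelem family
        ref int slot = ref a[2]; // take a reference to the element: ldelema
        slot = 30;               // writes through to a[2], not to a copy
        return a;                // contents are now 1, 11, 30
    }
}
```

The ref local requires C# 7.0 or later.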

However, these opcodes cannot be used with multidimensional arrays. Quoting ECMA-335:

For one-dimensional arrays that aren’t zero-based and for multidimensional arrays, the array class provides a Get method.

For one-dimensional arrays that aren’t zero-based and for multidimensional arrays, the array class provides a StoreElement [sic] method

For one-dimensional arrays that aren’t zero-based and for multidimensional arrays, the array class provides an Address method.

The StoreElement method has since been renamed to Set, but this still holds. Accessing an element of a multidimensional array is translated to a call to whichever of these methods matches the action you perform on it.

These three methods have the following signatures (shown here for a two-dimensional int32 array):

instance int32 int32[0...,0...]::Get(int32, int32)
instance void int32[0...,0...]::Set(int32, int32, int32)
instance int32& int32[0...,0...]::Address(int32, int32)

These intrinsic methods are implemented by the CLR. Notice the reference (int32&) returned by the last one. While the ability to return a reference was added to C# only recently, the CLI has supported it from the beginning.
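From C#, you don't call these methods by name; the element-access syntax is how you reach them. A minimal sketch for a float[,] (the class and method names are hypothetical, and the comments describe the calls I'd expect in the IL, mirroring the int32 signatures above):

```csharp
using System;

static class MdArrayAccess
{
    // Hypothetical demo: write, read, then write through a reference.
    public static float Demo()
    {
        float[,] m = new float[2, 3];
        m[1, 2] = 4f;                 // write: call to float32[0...,0...]::Set
        float before = m[1, 2];       // read: call to float32[0...,0...]::Get
        ref float slot = ref m[1, 2]; // reference: call to float32[0...,0...]::Address
        slot = before + 5f;           // stores into the array itself, not into a copy
        return m[1, 2];               // 9
    }
}
```

So ref m[x, y] is, in effect, the C# spelling of a call to Address.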

Also notice that at no point is an indexer involved. In fact, arrays don't even have an indexer: an indexer is a C# construct, and it could not express every action available on a variable, because it has no "get reference" accessor.

To sum things up, a[x] on an array and a[x] on a non-array (any object with an indexer) are massively different things.
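A small sketch of the difference, using List<int> as the non-array type (the helper names are invented for the example):

```csharp
using System;
using System.Collections.Generic;

static class IndexerVersusArray
{
    static void SetTo17(ref int i) => i = 17;

    // Returns { array[1], list[1] } after trying to overwrite both.
    public static int[] Demo()
    {
        int[] array = { 1, 2, 3 };
        var list = new List<int> { 1, 2, 3 };

        SetTo17(ref array[1]);   // fine: array[1] denotes a variable

        // SetTo17(ref list[1]); // does not compile: list[1] is a call to the indexer getter
        int copy = list[1];      // the getter hands back a copy of the value
        SetTo17(ref copy);       // so only the local changes, not the list

        return new[] { array[1], list[1] }; // 17 and 2
    }
}
```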

By the way, DangerousCreate also works thanks to this statement (ECMA-335 again):

Array elements shall be laid out within the array object in row-major order (i.e., the elements associated with the rightmost array dimension shall be laid out contiguously from lowest to highest index). The actual storage allocated for each array element can include platform-specific padding.
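To see the row-major layout in practice, the sketch below wraps a whole zero-based float[,] in one span and copies it out. It uses MemoryMarshal.CreateSpan, which is not part of the question's code; as far as I know it is what later releases offer in place of DangerousCreate. The class and method names are hypothetical, and the code assumes the runtime adds no padding between float elements:

```csharp
using System;
using System.Runtime.InteropServices;

static class RowMajorLayout
{
    // Copies all elements of a 2D array out in their in-memory order.
    public static float[] Flatten(float[,] m)
    {
        // m[0, 0] does not exist in an empty matrix, so handle that first.
        if (m.Length == 0) return Array.Empty<float>();

        // One span over the whole matrix, starting at its first element.
        // Assumption: elements are contiguous, with no padding between them.
        Span<float> all = MemoryMarshal.CreateSpan(ref m[0, 0], m.Length);
        return all.ToArray();
    }
}
```

For a matrix { { 1, 2, 3 }, { 4, 5, 6 } }, the result comes back as 1, 2, 3, 4, 5, 6: each row is contiguous, and rows follow one another.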

answered Nov 17 '22 00:11 by IS4


It seems to me that the crux of your confusion is here:

Shouldn't the ref modifier only get a reference to the local copy of that float value returned to the method, and not a reference to the actual value stored within the matrix?

You seem to be under the mistaken impression that the indexing syntax for an array works exactly the same as for other types. But it doesn't. Element access on an array is a special case in .NET: the element is treated as a variable, not as a property or a pair of methods.

For example:

void M1()
{
    int[] a = { 1, 2, 3 };

    M2(ref a[1]);
    Console.WriteLine(string.Join(", ", a));
}

void M2(ref int i)
{
    i = 17;
}

yields:

1, 17, 3

This works because the expression a[1] is not a call to some indexer getter, but rather describes a variable that is physically located in the second element of the given array.

Likewise, when you call DangerousCreate() and pass ref m[row, 0], you are passing the reference to the variable that is exactly the element of the m array at [row, 0].

Since a reference to the actual memory location is what's being passed, the rest should be no surprise: the Span<T> struct can use that address to wrap a specific subset of the original array without allocating any extra memory.
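One way to convince yourself that no copy is made is to write through the span and read the matrix back. This is only a sketch: RowSpan is a hypothetical name, and it uses MemoryMarshal.CreateSpan rather than the DangerousCreate call from the question, on the assumption that you are on a runtime where the former is available:

```csharp
using System;
using System.Runtime.InteropServices;

static class MatrixExtensions
{
    // Wraps one row of a zero-based 2D array in a span, without copying.
    public static Span<float> RowSpan(this float[,] m, int row)
    {
        if (row < 0 || row >= m.GetLength(0))
            throw new ArgumentOutOfRangeException(nameof(row), "The row index isn't valid");

        // A matrix with zero columns has no m[row, 0] to take a reference to.
        if (m.GetLength(1) == 0) return Span<float>.Empty;

        return MemoryMarshal.CreateSpan(ref m[row, 0], m.GetLength(1));
    }

    // Hypothetical demo: a write through the span lands in the matrix.
    public static float WriteThroughDemo()
    {
        float[,] m = { { 1, 2, 3 }, { 4, 5, 6 } };
        Span<float> row = m.RowSpan(1);
        row[0] = 40f;   // stores into m[1, 0]
        return m[1, 0]; // 40
    }
}
```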

answered Nov 17 '22 00:11 by Peter Duniho