Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Should changing the contents of a string like this cause an exception?

Consider the following code:

using System;
using System.Runtime.InteropServices;

namespace Demo
{
    class Program
    {
        static void Main(string[] args)
        {
            const string test = "ABCDEF"; // Strings are immutable, right?
            char[] chars = new StringToChar{str=test}.chr;
            chars[0] = 'X';

            // On an x32 release or debug build or on an x64 debug build, 
            // the following prints "XBCDEF".
            // On an x64 release build, it prints "ABXDEF".
            // In both cases, we have changed the contents of 'test' without using
            // any 'unsafe' code...

            Console.WriteLine(test);
        }
    }

    [StructLayout(LayoutKind.Explicit)]
    public struct StringToChar
    {
        [FieldOffset(0)]
        public string str;
        [FieldOffset(0)]
        public char[] chr;
    }
}

By running this code, we are able to change the contents of a string without an exception occuring. We did not have to declare any unsafe code to do so. This code is clearly very dodgy!

My question is simply this: Do you think that an exception should be thrown by the code above?

[EDIT1: Note that other people have tried this for me, and some people get different results - which isn't too suprising given the nastyness of what I'm doing... ;)]

[EDIT2: Note that I'm using Visual Studio 2010 on Windows 7 Ultimate 64 bit]

[EDIT3: Made the test string const, just to make it even more dodgy!]

like image 215
Matthew Watson Avatar asked Dec 23 '10 14:12

Matthew Watson


People also ask

What are the reasons that cause an exception?

11.1 The Causes of Exceptionsevaluation of an expression violates the normal semantics of the language, such as an integer divide by zero, as summarized in §15.6. an error occurs in loading or linking part of the program (§12.2, §12.3) some limitation on a resource is exceeded, such as using too much memory.

Which keyword causes an exception to be created?

In Java programming, the throws keyword is used to declare an exception in the program. It notifies the programmer that an exception may occur. Hence the programmer should provide exception handling code to maintain the regular flow in the program execution.

In which of the following An exception error can occur?

Following are some scenarios where an exception occurs. A user has entered an invalid data. A file that needs to be opened cannot be found. A network connection has been lost in the middle of communications or the JVM has run out of memory.

What does exception ToString do?

ToString returns a representation of the current exception that is intended to be understood by humans. Where the exception contains culture-sensitive data, the string representation returned by ToString is required to take into account the current system culture.


2 Answers

The SSCLI20 source code for clr/src/vm/class.cpp, MethodTableBuilder::HandleExplicitLayout can provide some insight. It is unusually heavily commented, this comment describes the rules (edited for readability):

// go through each field and look for invalid layout
// (note that we are more permissive than what Ecma allows. We only disallow 
// the minimum set necessary to close security holes.)
//
// This is what we implement:
//
// 1. Verify that every OREF is on a valid alignment
// 2. Verify that OREFs only overlap with other OREFs.
// 3. If an OREF does overlap with another OREF, the class is marked unverifiable.
// 4. If an overlap of any kind occurs, the class will be marked NotTightlyPacked (affects ValueType.Equals()).

Rule 1 ensures that a reference assignment stays atomic. Rule 2 says why you can do what you did, any object type reference may overlap. Overlap with a value type value is not permitted, that screws up the garbage collector. Rule 3 states the consequence, it only makes the type non-verifiable.

It is otherwise not the only way to screw up a string without the unsafe keyword. Just pinvoke a function that stomps the string. It gets a pointer to the string content on the GC heap or loader heap (interned strings), no copy is made. That's unverifiable code as well and just as un-exploitable when running in a sandbox.

Driving the point home: the C# unsafe keyword is not at all related to what the CLR considers verifiable, and thus actually safe code. It takes care of the blatant cases, using pointers or custom value types (fixed). Whether that's a leak in the C# language spec is debatable. Pinvoke being the more obvious edge case. Pinvoking a operating system function is pretty doggone safe. Pinvoking some 3rd party C library is not.

But I have to agree with @fej, [FieldOffset] should have gotten the "are you sure" treatment. Too bad there's no syntax for that. Admittedly, I haven't figured out yet why this actually needed to affect the managed layout. It would make much more sense that this attribute would only apply to the marshaled layout. Weird, somebody holding a ace his sleeve in the early days, maybe.

like image 122
Hans Passant Avatar answered Sep 29 '22 02:09

Hans Passant


My vote's on making FieldOffset unsafe.

like image 29
fejesjoco Avatar answered Sep 29 '22 02:09

fejesjoco