Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does C++ allow private members to be modified using this approach?

Tags:

After seeing this question a few minutes ago, I wondered why the language designers allow it as it allows indirect modification of private data. As an example

 class TestClass {
   private:
    int cc;
   public:
     TestClass(int i) : cc(i) {};
 };

 TestClass cc(5);
 int* pp = (int*)&cc;
 *pp = 70;             // private member has been modified

I tested the above code and indeed the private data has been modified. Is there any explanation of why this is allowed to happen or this just an oversight in the language? It seems to directly undermine the use of private data members.

like image 549
mathematician1975 Avatar asked Aug 23 '12 13:08

mathematician1975


1 Answers

Because, as Bjarne puts it, C++ is designed to protect against Murphy, not Machiavelli.

In other words, it's supposed to protect you from accidents -- but if you go to any work at all to subvert it (such as using a cast) it's not even going to attempt to stop you.

When I think of it, I have a somewhat different analogy in mind: it's like the lock on a bathroom door. It gives you a warning that you probably don't want to walk in there right now, but it's trivial to unlock the door from the outside if you decide to.

Edit: as to the question @Xeo discusses, about why the standard says "have the same access control" instead of "have all public access control", the answer is long and a little tortuous.

Let's step back to the beginning and consider a struct like:

struct X {
    int a;
    int b;
};

C always had a few rules for a struct like this. One is that in an instance of the struct, the address of the struct itself has to equal the address of a, so you can cast a pointer to the struct to a pointer to int, and access a with well defined results. Another is that the members have to be arranged in the same order in memory as they are defined in the struct (though the compiler is free to insert padding between them).

For C++, there was an intent to maintain that, especially for existing C structs. At the same time, there was an apparent intent that if the compiler wanted to enforce private (and protected) at run-time, it should be easy to do that (reasonably efficiently).

Therefore, given something like:

struct Y { 
    int a;
    int b;
private:
    int c;
    int d;
public:
    int e;

    // code to use `c` and `d` goes here.
};

The compiler should be required to maintain the same rules as C with respect to Y.a and Y.b. At the same time, if it's going to enforce access at run time, it may want to move all the public variables together in memory, so the layout would be more like:

struct Z { 
    int a;
    int b;
    int e;
private:
    int c;
    int d;
    // code to use `c` and `d` goes here.
};

Then, when it's enforcing things at run-time, it can basically do something like if (offset > 3 * sizeof(int)) access_violation();

To my knowledge nobody's ever done this, and I'm not sure the rest of the standard really allows it, but there does seem to have been at least the half-formed germ of an idea along that line.

To enforce both of those, the C++98 said Y::a and Y::b had to be in that order in memory, and Y::a had to be at the beginning of the struct (i.e., C-like rules). But, because of the intervening access specifiers, Y::c and Y::e no longer had to be in order relative to each other. In other words, all the consecutive variables defined without an access specifier between them were grouped together, the compiler was free to rearrange those groups (but still had to keep the first one at the beginning).

That was fine until some jerk (i.e., me) pointed out that the way the rules were written had another little problem. If I wrote code like:

struct A { 
    int a;
public:
    int b;
public:
    int c;
public:
    int d;
};

...you ended up with a little bit of self contradition. On one hand, this was still officially a POD struct, so the C-like rules were supposed to apply -- but since you had (admittedly meaningless) access specifiers between the members, it also gave the compiler permission to rearrange the members, thus breaking the C-like rules they intended.

To cure that, they re-worded the standard a little so it would talk about the members all having the same access, rather than about whether or not there was an access specifier between them. Yes, they could have just decreed that the rules would only apply to public members, but it would appear that nobody saw anything to be gained from that. Given that this was modifying an existing standard with lots of code that had been in use for quite a while, the opted for the smallest change they could make that would still cure the problem.

like image 131
Jerry Coffin Avatar answered Oct 09 '22 16:10

Jerry Coffin