When reading through the OpenJDK code for class ArrayList, for instance, in JDK17
(https://github.com/openjdk/jdk17/blob/master/src/java.base/share/classes/java/util/ArrayList.java)
I stumbled upon the following constructor:
public ArrayList(Collection<? extends E> c) {
Object[] a = c.toArray();
if ((size = a.length) != 0) {
if (c.getClass() == ArrayList.class) {
elementData = a;
} else {
elementData = Arrays.copyOf(a, size, Object[].class);
}
} else {
// replace with empty array.
elementData = EMPTY_ELEMENTDATA;
}
}
What is the reason behind the distinction whether c.getClass() is an ArrayList.class or not? Is this case splitting necessary at all?
(I was just trying to understand Java code that is part of the OpenJDK distribution for class ArrayList.)
If you do a git blame on that source file, you will find that this code was introduced as part of the commit 343ecd806bb050, whose commit message
8231800: Better listing of arrays
references the issue JDK-8231800 in the JDK bugtracker. Alas, this issue is not visible to the public, but googling it reveals it has been shipped with security updates to all Java versions supported at the time, providing strong evidence that this was related to a security issue (which also explains why the details of the issue haven't been made public, and why the commit message is not descriptive).
However, googling also finds an enhancement request filed by the autor of that code, which does describe the challenges this piece of code is intended to solve:
Consider new ArrayList(Collection arg). This calls arg.toArray() and makes a defensive copy of it for use use as the internal array to the ArrayList. This copy is necessary in case the arg's toArray() implementation produces an array whose class is something other than Object[].class or if it retains a reference to the returned array. (Both of these are spec violations, but they're unenforceable.)
That is, ArrayList needs its array to
EImplementations of Collection.toArray are specified to satisfy this:
Object[] toArray()Returns an array containing all of the elements in this collection. If this collection makes any guarantees as to what order its elements are returned by its iterator, this method must return the elements in the same order. The returned array's runtime component type is Object.
The returned array will be "safe" in that no references to it are maintained by this collection. (In other words, this method must allocate a new array even if this collection is backed by an array). The caller is thus free to modify the returned array.
however, we can't be sure that an implementation actually does this. For instance, an implementation might do:
class StringList implements Collection<String> {
String[] elements = new String[10];
int size = 0;
// other methods omitted
@Override
Object[] toArray() {
return Arrays.copyOf(elements, size);
}
}
which looks totally innocent, right? But this implementation actually violates the spec, because copyOf will return a String[] rather than an Object[], so if somebody did:
var strings = new StringList("Hello", "World");
var objects = new ArrayList<Object>(strings);
objects.add(42);
That code is totally correct as far as the compile time type system is concerned, but would fail at runtime with an ArrayStoreException if ArrayList reused the array returned by StringList.toArray().
The second issue is more nefarious. Suppose you have code like this:
class AccessManager {
final List<Permission> permissions;
public AccessManager(List<Permission> requestedPermissions) {
permissions = new ArrayList<>(requestedPermissions);
verifyPermissions();
}
private verifyPermissions() {
for (var p : permissions) {
if (!currentUser.has(p)) {
throw new SecurityException();
}
}
}
boolean checkAccess(Permission p) {
return permissions.contains(p);
}
}
and some nefarious hacker declared:
class NefariousList implements Collection<Permission> {
Object[] permissions = { new InnocentPermission() };
@Override
public Object[] toArray() {
return permissions;
}
}
and then did:
var nefarious = new NefariousList();
var accessManager = new AccessManager(nefarious);
var service = new Service(accessManager);
nefarious.permissions[0] = new AdminPermission();
service.deleteDatabase();
they could successfully delete the database, because AccessManager finds an AdminPermission in the list of verified permissions ...
(if you believe such hand-written security code with reference sharing bugs to be contrived, behold this real world example from a recent stackoverflow question)
Since core classes of the Java Language are required to work correctly even if untrusted code executes within the same JVM, ArrayList can not, in general, rely on Collection.toArray to be implemented correctly unless it has verified that the implementation is, in fact, safe, because it comes from a trustworthy JDK class.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With