I would like to enumerate the strings that are in the string intern pool.
That is to say, I want to get the list of all the string instances s such that:
string.IsInterned(s) != null
Does anyone know if it's possible?
In Java, the method intern() returns the canonical copy of a String from the String constant pool. If a String with the same contents already exists in the pool, no new entry is created and the returned reference points to the existing String; otherwise the String is added to the pool and a reference to it is returned.
The distinct values are stored in a string intern pool. The single copy of each string is called its intern and is typically looked up by a method of the string class, for example String.intern() in Java. All compile-time constant strings in Java are automatically interned using this method.
String interning is a method of storing only one copy of each distinct string value, which must be immutable. Applying String.intern() to a set of strings ensures that all strings having the same contents share the same memory.
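The .NET counterparts of this behaviour are string.Intern and string.IsInterned, the latter being the method used in the question. Here is a minimal sketch of how they behave; the class name InternDemo and the helper Build are only illustrative:

```csharp
using System;

static class InternDemo
{
    // Builds a fresh string object at run time, so it is not a compile-time literal.
    public static string Build()
    {
        return new string(new[] { 'p', 'o', 'o', 'l', '!' });
    }

    static void Main()
    {
        string first = string.Intern(Build());   // adds the contents to the pool, or finds them there
        string second = string.Intern(Build());  // same contents, so the pooled instance is returned

        Console.WriteLine(object.ReferenceEquals(first, second));    // True
        Console.WriteLine(string.IsInterned(Build()) != null);       // True
        Console.WriteLine(object.ReferenceEquals(Build(), Build())); // False: two separate objects
    }
}
```

Note that string.IsInterned only answers the question for one candidate string at a time; it does not expose the pool itself, which is why enumerating it needs another approach.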
Thanks to the advice of @HansPassant, I managed to get the list of string literals in an assembly, which is extremely close to what I originally wanted.
You need to read the assembly metadata and enumerate the user strings. This can be done with these three methods of IMetaDataImport:
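using System;
using System.Runtime.InteropServices;

[ComImport, Guid("7DAC8207-D3AE-4C75-9B67-92801A497D44")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
public interface IMetaDataImport
{
    // PreserveSig keeps the native signature, so the uint return value is the raw HRESULT.
    [PreserveSig] void CloseEnum(IntPtr hEnum);
    // SizeParamIndex is zero-based, so 2 refers to cchString / cmax.
    [PreserveSig] uint GetUserString(uint stk, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 2)] char[] szString, uint cchString, out uint pchString);
    [PreserveSig] uint EnumUserStrings(ref IntPtr phEnum, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 2)] uint[] rStrings, uint cmax, out uint pcStrings);
    // The interface also contains 62 irrelevant methods, omitted here.
    // A real declaration must list every method in vtable order.
}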
To get the instance of IMetaDataImport, you need to get an IMetaDataDispenser:
[ComImport, Guid("809C652E-7396-11D2-9771-00A0C9B4D50C")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
[CoClass(typeof(CorMetaDataDispenser))]
interface IMetaDataDispenser
{
    // PreserveSig keeps the native signature, so the uint return value is the raw HRESULT.
    [PreserveSig] uint OpenScope([MarshalAs(UnmanagedType.LPWStr)] string szScope, uint dwOpenFlags, ref Guid riid, [MarshalAs(UnmanagedType.Interface)] out object ppIUnk);
    // The interface also contains 2 irrelevant methods (DefineScope and OpenScopeOnMemory), omitted here.
    // A real declaration must list all three in vtable order.
}

[ComImport, Guid("E5CB7A31-7512-11D2-89CE-0080C792E5D8")]
class CorMetaDataDispenser
{
}
Here is how it goes:
var dispenser = new IMetaDataDispenser();
var metaDataImportGuid = new Guid("7DAC8207-D3AE-4C75-9B67-92801A497D44");
object scope;
var hr = dispenser.OpenScope(location, 0, ref metaDataImportGuid, out scope);
// Throw if OpenScope returned a failure HRESULT.
Marshal.ThrowExceptionForHR(unchecked((int)hr));
var metaDataImport = (IMetaDataImport)scope;
where location is the path to the assembly file.
After that, calling EnumUserStrings() and GetUserString() is straightforward.
Here is a blog post with more detail, and a demo project on GitHub.
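For comparison, the same user-string heap can probably also be read without COM interop, through the System.Reflection.Metadata library. This is a different technique from the IMetaDataImport one described above, not part of it, and the sketch below is untested against the demo project. The names UserStringDump and ReadUserStrings are made up for illustration, and the result may contain empty entries.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection.Metadata;
using System.Reflection.Metadata.Ecma335;
using System.Reflection.PortableExecutable;

static class UserStringDump
{
    // Returns the string literals stored in the #US (user string) heap
    // of the assembly file at the given path.
    public static List<string> ReadUserStrings(string assemblyPath)
    {
        var result = new List<string>();
        using (var stream = File.OpenRead(assemblyPath))
        using (var peReader = new PEReader(stream))
        {
            MetadataReader reader = peReader.GetMetadataReader();
            // Offset 0 holds the heap's leading empty entry, so start from the one after it.
            UserStringHandle handle = reader.GetNextHandle(MetadataTokens.UserStringHandle(0));
            while (!handle.IsNil)
            {
                result.Add(reader.GetUserString(handle));
                handle = reader.GetNextHandle(handle);
            }
        }
        return result;
    }

    static void Main(string[] args)
    {
        // Hypothetical usage: pass an assembly path, or fall back to this program's own assembly.
        string path = args.Length > 0 ? args[0] : typeof(UserStringDump).Assembly.Location;
        foreach (string s in ReadUserStrings(path))
            Console.WriteLine(s);
    }
}
```

Like the COM approach, this lists the literals compiled into one assembly file. It says nothing about strings that were interned at run time with string.Intern.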
The SSCLI function that it's pointing to is:
STRINGREF *AppDomainStringLiteralMap::GetStringLiteral(EEStringData *pStringData)
{
    ...
    DWORD dwHash = m_StringToEntryHashTable->GetHash(pStringData);
    if (m_StringToEntryHashTable->GetValue(pStringData, &Data, dwHash))
    {
        STRINGREF *pStrObj = NULL;
        pStrObj = ((StringLiteralEntry*)Data)->GetStringObject();
        _ASSERTE(!bAddIfNotFound || pStrObj);
        return pStrObj;
    }
    else { ... }
    return NULL; // Here, if this returns, the string is not interned
}
If you manage to find the native address of m_StringToEntryHashTable, you can enumerate the strings that exist.