
Successive calls to RegGetValue return two different sizes for the same string

In some code I use the Win32 RegGetValue() API to read a string from the registry.

I call the aforementioned API twice:

  1. The purpose of the first call is to get the proper size to allocate a destination buffer for the string.

  2. The second call reads the string from the registry into that buffer.

What is odd is that RegGetValue() returns different size values in the two calls.

In particular, the size returned by the second call is two bytes (one wchar_t) smaller than the size returned by the first call.

The value returned by the second call is the one that matches the actual string length, including the terminating NUL.
What I don't understand is why the first call returns a size two bytes (one wchar_t) larger than that.

A screenshot of program output and Win32 C++ compilable repro code are attached.

[Screenshot: different size values returned by RegGetValue()]


Repro Source Code

#include <windows.h>
#include <iostream>
#include <string>
#include <vector>
using namespace std;


void PrintSize(const char* const message, const DWORD sizeBytes)
{
    cout << message << ": " << sizeBytes << " bytes (" 
         << (sizeBytes/sizeof(wchar_t)) << " wchar_t's)\n";
}


int main()
{
    const HKEY key = HKEY_LOCAL_MACHINE;
    const wchar_t* const subKey = L"SOFTWARE\\Microsoft\\Windows\\CurrentVersion";
    const wchar_t* const valueName = L"CommonFilesDir";

    //
    // Get string size
    //
    DWORD keyType = 0;
    DWORD dataSize = 0;
    const DWORD flags = RRF_RT_REG_SZ;
    // Call the wide-character version explicitly, since the arguments are wchar_t strings
    LONG result = ::RegGetValueW(
        key, 
        subKey,
        valueName, 
        flags, 
        &keyType, 
        nullptr, 
        &dataSize);
    if (result != ERROR_SUCCESS)
    {
        cout << "Error: " << result << '\n';
        return 1;
    }
    PrintSize("1st call size", dataSize);
    const DWORD dataSize1 = dataSize; // store for later use


    //
    // Allocate buffer and read string into it
    //
    vector<wchar_t> buffer(dataSize / sizeof(wchar_t));
    result = ::RegGetValueW(
        key, 
        subKey,
        valueName, 
        flags, 
        nullptr, 
        &buffer[0], 
        &dataSize);
    if (result != ERROR_SUCCESS)
    {
        cout << "Error: " << result << '\n';
        return 1;
    }
    PrintSize("2nd call size", dataSize);

    const wstring text(buffer.data());
    cout << "Read string:\n";
    wcout << text << '\n';
    wcout << wstring(dataSize/sizeof(wchar_t), L'*')  << "  <-- 2nd call size\n";
    wcout << wstring(dataSize1/sizeof(wchar_t), L'-') << "  <-- 1st call size\n"; 
}

Operating System: Windows 7 64-bit with SP1


EDIT

Some confusion seems to have arisen from the particular registry key I happened to read in the sample repro code.
So, let me clarify that I read that key from the registry just as a test. This is not production code, and I'm not interested in that particular key. Feel free to add a simple test key to the registry with some test string value.
Sorry for the confusion.

Mr.C64 asked Mar 24 '15 00:03



1 Answer

RegGetValue() is safer than RegQueryValueEx() because it artificially adds a null terminator to the output of a string value if it does not already have a null terminator.

The first call returns the data size plus room for an extra null terminator, in case the actual data is not already null terminated. I suspect RegGetValue() does not look at the real data at this stage and simply reports data size + sizeof(wchar_t) unconditionally, to be safe.

(36 * sizeof(wchar_t)) + (1 * sizeof(wchar_t)) = 74

The second call returns the real size of the actual data that was read. That size includes the extra null terminator only if one had to be artificially added. In this case, your data has 35 characters in the path plus a real null terminator stored with it (which well-behaved apps are supposed to write), so no extra null terminator needed to be added.

((35+1) * sizeof(wchar_t)) + (0 * sizeof(wchar_t)) = 72
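The arithmetic above can be checked with a small portable sketch. Note that FakeRegGetValue() and ReadString() below are hypothetical stand-ins written for illustration; they are not the real API, and they model only the size behavior suspected in this answer (an unconditional extra terminator in the size query, and the true size after the read). char16_t is used so that one character is 2 bytes on every platform, as wchar_t is on Windows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical model of the suspected RegGetValue() size behavior (NOT the real API).
// `stored` is the raw value data as it sits in the registry, with or without
// a trailing NUL. Returns false if the supplied buffer is too small.
bool FakeRegGetValue(const std::u16string& stored, char16_t* buffer, std::size_t* sizeBytes)
{
    assert(sizeBytes != nullptr);
    const std::size_t storedBytes = stored.size() * sizeof(char16_t);

    if (buffer == nullptr)
    {
        // Size query: do not inspect the data, just reserve room for one
        // extra terminator unconditionally.
        *sizeBytes = storedBytes + sizeof(char16_t);
        return true;
    }

    // Real read: add a terminator only if the stored data lacks one.
    const bool terminated = !stored.empty() && stored.back() == u'\0';
    const std::size_t needed = storedBytes + (terminated ? 0 : sizeof(char16_t));
    if (*sizeBytes < needed)
        return false;

    std::memcpy(buffer, stored.data(), storedBytes);
    if (!terminated)
        buffer[stored.size()] = u'\0';
    *sizeBytes = needed; // size actually written, terminator included
    return true;
}

// Two-call pattern: allocate using the first size, then trust the second size
// for the string length. Both sizes are reported back for inspection.
std::u16string ReadString(const std::u16string& stored,
                          std::size_t* firstSize, std::size_t* secondSize)
{
    std::size_t size = 0;
    FakeRegGetValue(stored, nullptr, &size);
    *firstSize = size;

    std::vector<char16_t> buffer(size / sizeof(char16_t));
    if (!FakeRegGetValue(stored, buffer.data(), &size))
        return {};
    *secondSize = size;

    // The second size includes exactly one terminator; drop it.
    return std::u16string(buffer.data(), size / sizeof(char16_t) - 1);
}
```

With a 35-character value stored together with its NUL, this model reports 74 bytes for the size query and 72 bytes after the read, matching the two figures above. The practical point is that the first size is only an upper bound for allocation, while the string length should be derived from the size returned by the call that actually reads the data.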

Now, with that said, you really should not be reading from the Registry directly to get the CommonFilesDir path (or any other system path) in the first place. You should be using SHGetFolderPath(CSIDL_PROGRAM_FILES_COMMON) or SHGetKnownFolderPath(FOLDERID_ProgramFilesCommon) instead. Let the Shell deal with the Registry for you. That approach works consistently across Windows versions, since Registry settings are subject to being moved around from one version to another, and it also accounts for per-user paths vs system-global paths. These are the main reasons why the CSIDL API was introduced in the first place.

Remy Lebeau answered Nov 02 '22 02:11
