Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why do EnumPrintersA and EnumPrintersW request the same amount of memory?

I call EnumPrintersA/EnumPrintersW functions using node-ffi to get list of local printers accessible from my PC.
You should create the buffer which will be filled with information by EnumPrinters function.
But you do not know the required size of the buffer.
In this case you need to execute EnumPrintersA/EnumPrintersW twice.
During the first call this function calculates the amount of memory for information about printers, during the second call this function fills the buffer with information about printers.
In case of Unicode version of EnumPrinters function, each letter in printers name will be encoded using two characters in Windows.

Why the first call to EnumPrintersW returns the same required amount of memory as the first call to EnumPrintersA?
Unicode strings are twice as long as not-unicode strings, but required buffer size is the same.

var ffi = require('ffi')
var ref = require('ref')
var Struct = require('ref-struct')
var wchar_t = require('ref-wchar')

var int = ref.types.int
var intPtr = ref.refType(ref.types.int)
var wchar_string = wchar_t.string

var getPrintersA =  function getPrinters() {
   var PRINTER_INFO_4A = Struct({
      'pPrinterName' : ref.types.CString,
      'pServerName' : ref.types.CString,
      'Attributes' : int
   });

   var printerInfoPtr = ref.refType(PRINTER_INFO_4A);

   var winspoolLib = new ffi.Library('winspool', {
      'EnumPrintersA': [ int, [ int, ref.types.CString, int, printerInfoPtr, int, intPtr, intPtr ] ]
   });

   var pcbNeeded = ref.alloc(int, 0);
   var pcReturned = ref.alloc(int, 0);

   //Get amount of memory for the buffer with information about printers
   var res = winspoolLib.EnumPrintersA(6, ref.NULL, 4, ref.NULL, 0, pcbNeeded, pcReturned);
   if (res != 0) {
      console.log("Cannot get list of printers. Error during first call to EnumPrintersA. Error: " + res);
      return;
   }

   var bufSize = pcbNeeded.deref();
   var buf = Buffer.alloc(bufSize);

   console.log(bufSize);

   //Fill buf with information about printers
   res = winspoolLib.EnumPrintersA(6, ref.NULL, 4, buf, bufSize, pcbNeeded, pcReturned);
   if (res == 0) {
      console.log("Cannot get list of printers. Eror: " + res);
      return;
   }

   var countOfPrinters = pcReturned.deref();

   var printers = Array(countOfPrinters);
   for (var i = 0; i < countOfPrinters; i++) {
      var pPrinterInfo = ref.get(buf, i*PRINTER_INFO_4A.size, PRINTER_INFO_4A);
      printers[i] = pPrinterInfo.pPrinterName;
   }

   return printers;
};

var getPrintersW =  function getPrinters() {
   var PRINTER_INFO_4W = Struct({
      'pPrinterName' : wchar_string,
      'pServerName' : wchar_string,
      'Attributes' : int
   });

   var printerInfoPtr = ref.refType(PRINTER_INFO_4W);

   var winspoolLib = new ffi.Library('winspool', {
      'EnumPrintersW': [ int, [ int, wchar_string, int, printerInfoPtr, int, intPtr, intPtr ] ]
   });

   var pcbNeeded = ref.alloc(int, 0);
   var pcReturned = ref.alloc(int, 0);

   //Get amount of memory for the buffer with information about printers
   var res = winspoolLib.EnumPrintersW(6, ref.NULL, 4, ref.NULL, 0, pcbNeeded, pcReturned);
   if (res != 0) {
      console.log("Cannot get list of printers. Error during first call to EnumPrintersW. Eror code: " + res);
      return;
   }

   var bufSize = pcbNeeded.deref();
   var buf = Buffer.alloc(bufSize);

   console.log(bufSize);

   //Fill buf with information about printers
   res = winspoolLib.EnumPrintersW(6, ref.NULL, 4, buf, pcbNeeded.deref(), pcbNeeded, pcReturned);
   if (res == 0) {
      console.log("Cannot get list of printers. Eror code: " + res);
      return;
   }

   var countOfPrinters = pcReturned.deref();
   var printers = new Array(countOfPrinters);
   for (var i = 0; i < countOfPrinters; i++) {
      var pPrinterInfo = ref.get(buf, i*PRINTER_INFO_4W.size, PRINTER_INFO_4W);
      printers[i] = pPrinterInfo.pPrinterName;
   }

   return printers;
};

https://msdn.microsoft.com/ru-ru/library/windows/desktop/dd162692(v=vs.85).aspx

BOOL EnumPrinters(
  _In_  DWORD   Flags,
  _In_  LPTSTR  Name,
  _In_  DWORD   Level,
  _Out_ LPBYTE  pPrinterEnum,
  _In_  DWORD   cbBuf,
  _Out_ LPDWORD pcbNeeded,
  _Out_ LPDWORD pcReturned
);

https://msdn.microsoft.com/ru-ru/library/windows/desktop/dd162847(v=vs.85).aspx

typedef struct _PRINTER_INFO_4 {
  LPTSTR pPrinterName;
  LPTSTR pServerName;
  DWORD  Attributes;
} PRINTER_INFO_4, *PPRINTER_INFO_4;
like image 767
Volodymyr Bezuglyy Avatar asked Dec 14 '16 16:12

Volodymyr Bezuglyy


2 Answers

I can confirm that what you found with EnumPrintersA and EnumPrintersW is reproducible. In my machine, they both require 240 bytes.

This got me curious, so I decided to allocate a separate buffer for each function and dump each buffer to a file and opened them with a hex editor. The interesting part of each file is of course the names of the printers.

To keep this short, I'll show you the first 3 names of the printers. The first line is from EnumPrintersA, the second is from EnumPrintersW:

Fax.x...FX DocuPrint C1110 PCL 6..C.1.1.1.0. .P.C.L. .6...Microsoft XPS Document Writer.o.c.u.m.e.n.t. .W.r.i.t.e.r...
F.a.x...F.X. .D.o.c.u.P.r.i.n.t. .C.1.1.1.0. .P.C.L. .6...M.i.c.r.o.s.o.f.t. .X.P.S. .D.o.c.u.m.e.n.t. .W.r.i.t.e.r...

From this result, it appears that EnumPrintersA calls EnumPrintersW for the actual work and then simply converts each string in the buffer to single byte characters and puts the resulting string in the same place. To confirm this, I decided to trace EnumPrintersA code and I found that it definitely calls EnumPrintersW at position winspool.EnumPrintersA + 0xA7. The actual position is likely different in a different Windows version.

This got me even more curious, so I decided to test other functions that have A and W versions. This is what I found:

EnumMonitorsA 280 bytes needed
EnumMonitorsW 280 bytes needed
EnumServicesStatusA 20954 bytes needed
EnumServicesStatusW 20954 bytes needed
EnumPortsA 2176 bytes needed
EnumPortsW 2176 bytes needed
EnumPrintProcessorsA 24 bytes needed
EnumPrintProcessorsW 24 bytes needed

From this result, my conclusion is that EnumPrintersA calls EnumPrintersW for the actual work and converts the string in the buffer and other functions that have A and W versions also do the same thing. This appears to be a common mechanism to avoid duplication of code in expense of larger buffers, maybe because buffers can be deallocated anyway.

like image 60
Rei Avatar answered Oct 06 '22 00:10

Rei


At the beginning I thought that there's something wrong with your code, so I kept looking for a mistake (introduced by the FFI or JS layers, or a typo or something similar), but I couldn't find anything.

Then, I started to write a program similar to yours in C (to eliminate any extra layers that could introduce errors).

main.c:

#include <stdio.h>
#include <Windows.h>
#include <conio.h>  // !!! Deprecated!!!


typedef BOOL (__stdcall *EnumPrintersAFuncPtr)(_In_ DWORD Flags, _In_ LPSTR Name, _In_ DWORD Level, _Out_ LPBYTE pPrinterEnum, _In_ DWORD cbBuf, _Out_ LPDWORD pcbNeeded, _Out_ LPDWORD pcReturned);
typedef BOOL (__stdcall *EnumPrintersWFuncPtr)(_In_ DWORD Flags, _In_ LPWSTR Name, _In_ DWORD Level, _Out_ LPBYTE pPrinterEnum, _In_ DWORD cbBuf, _Out_ LPDWORD pcbNeeded, _Out_ LPDWORD pcReturned);


void testFunc()
{
    PPRINTER_INFO_4A ppi4a = NULL;
    PPRINTER_INFO_4W ppi4w = NULL;
    BOOL resa, resw;
    DWORD neededa = 0, returneda = 0, neededw = 0, returnedw = 0, gle = 0, i = 0, flags = PRINTER_ENUM_LOCAL | PRINTER_ENUM_CONNECTIONS;
    LPBYTE bufa = NULL, bufw = NULL;
    resa = EnumPrintersA(flags, NULL, 4, NULL, 0, &neededa, &returneda);
    if (resa) {
        printf("EnumPrintersA(1) succeeded with NULL buffer. Exiting...\n");
        return;
    } else {
        gle = GetLastError();
        if (gle != ERROR_INSUFFICIENT_BUFFER) {
            printf("EnumPrintersA(1) failed with %d(0x%08X) which is different than %d. Exiting...\n", gle, gle, ERROR_INSUFFICIENT_BUFFER);
            return;
        } else {
            printf("EnumPrintersA(1) needs a %d(0x%08X) bytes long buffer.\n", neededa, neededa);
        }
    }
    resw = EnumPrintersW(flags, NULL, 4, NULL, 0, &neededw, &returnedw);
    if (resw) {
        printf("EnumPrintersW(1) succeeded with NULL buffer. Exiting...\n");
        return;
    } else {
        gle = GetLastError();
        if (gle != ERROR_INSUFFICIENT_BUFFER) {
            printf("EnumPrintersW(1) failed with %d(0x%08X) which is different than %d. Exiting...\n", gle, gle, ERROR_INSUFFICIENT_BUFFER);
            return;
        } else {
            printf("EnumPrintersW(1) needs a %d(0x%08X) bytes long buffer.\n", neededw, neededw);
        }
    }

    bufa = (LPBYTE)calloc(1, neededa);
    if (bufa == NULL) {
        printf("calloc failed with %d(0x%08X). Exiting...\n", errno, errno);
        return;
    } else {
        printf("buffera[0x%08X:0x%08X]\n", (long)bufa, (long)bufa + neededa - 1);
    }
    bufw = (LPBYTE)calloc(1, neededw);
    if (bufw == NULL) {
        printf("calloc failed with %d(0x%08X). Exiting...\n", errno, errno);
        free(bufa);
        return;
    } else {
        printf("bufferw[0x%08X:0x%08X]\n", (long)bufw, (long)bufw + neededw - 1);
    }

    resa = EnumPrintersA(flags, NULL, 4, bufa, neededa, &neededa, &returneda);
    if (!resa) {
        gle = GetLastError();
        printf("EnumPrintersA(2) failed with %d(0x%08X). Exiting...\n", gle, gle);
        free(bufa);
        free(bufw);
        return;
    }
    printf("EnumPrintersA(2) copied %d bytes in the buffer out of which the first %d(0x%08X) represent %d structures of size %d\n", neededa, returneda * sizeof(PRINTER_INFO_4A), returneda * sizeof(PRINTER_INFO_4A), returneda, sizeof(PRINTER_INFO_4A));
    resw = EnumPrintersW(flags, NULL, 4, bufw, neededw, &neededw, &returnedw);
    if (!resw) {
        gle = GetLastError();
        printf("EnumPrintersW(2) failed with %d(0x%08X). Exiting...\n", gle, gle);
        free(bufw);
        free(bufa);
        return;
    }
    printf("EnumPrintersW(2) copied %d bytes in the buffer out of which the first %d(0x%08X) represent %d structures of size %d\n", neededw, returnedw * sizeof(PRINTER_INFO_4W), returnedw * sizeof(PRINTER_INFO_4W), returnedw, sizeof(PRINTER_INFO_4W));

    ppi4a = (PPRINTER_INFO_4A)bufa;
    ppi4w = (PPRINTER_INFO_4W)bufw;
    printf("\nPrinting ASCII results:\n");
    for (i = 0; i < returneda; i++) {
        printf("  Item %d\n    pPrinterName: [%s]\n", i, ppi4a[i].pPrinterName ? ppi4a[i].pPrinterName : "NULL");
    }
    printf("\nPrinting WIDE results:\n");
    for (i = 0; i < returnedw; i++) {
        wprintf(L"  Item %d\n    pPrinterName: [%s]\n", i, ppi4w[i].pPrinterName ? ppi4w[i].pPrinterName : L"NULL");
    }

    free(bufa);
    free(bufw);
}


int main()
{
    testFunc();
    printf("\nPress a key to exit...\n");
    getch();
    return 0;
}

Note: in terms of variable names (I kept them short - and thus not very intuitive), the a or w at the end of their names means that they are used for ASCII / WIDE version.

Initially, I was afraid that EnumPrinters might not return anything, since I'm not connected to any printer at this point, but luckily I have some (7 to be more precise) "saved". Here's the output of the above program (thank you @qxz for correcting my initial (and kind of faulty) version):

EnumPrintersA(1) needs a 544(0x00000220) bytes long buffer.
EnumPrintersW(1) needs a 544(0x00000220) bytes long buffer.
buffera[0x03161B20:0x03161D3F]
bufferw[0x03165028:0x03165247]
EnumPrintersA(2) copied 544 bytes in the buffer out of which the first 84(0x00000054) represent 7 structures of size 12
EnumPrintersW(2) copied 544 bytes in the buffer out of which the first 84(0x00000054) represent 7 structures of size 12

Printing ASCII results:
  Item 0
    pPrinterName: [Send To OneNote 2013]
  Item 1
    pPrinterName: [NPI060BEF (HP LaserJet Professional M1217nfw MFP)]
  Item 2
    pPrinterName: [Microsoft XPS Document Writer]
  Item 3
    pPrinterName: [Microsoft Print to PDF]
  Item 4
    pPrinterName: [HP Universal Printing PCL 6]
  Item 5
    pPrinterName: [HP LaserJet M4345 MFP [7B63B6]]
  Item 6
    pPrinterName: [Fax]

Printing WIDE results:
  Item 0
    pPrinterName: [Send To OneNote 2013]
  Item 1
    pPrinterName: [NPI060BEF (HP LaserJet Professional M1217nfw MFP)]
  Item 2
    pPrinterName: [Microsoft XPS Document Writer]
  Item 3
    pPrinterName: [Microsoft Print to PDF]
  Item 4
    pPrinterName: [HP Universal Printing PCL 6]
  Item 5
    pPrinterName: [HP LaserJet M4345 MFP [7B63B6]]
  Item 6
    pPrinterName: [Fax]

Press a key to exit...

Amazingly (at least for me), the behavior you described could be reproduced.

Note that the above output is from the 032bit compiled version of the program (064bit pointers are harder to read :) ), but the behavior is reproducible when building for 064bit as well (I am using VStudio 10.0 on Win10).

Since there are for sure strings at the end of the buffer, I started debugging:

VStudio 10.0 Debug window

Above is a picture of VStudio 10.0 Debug window, with the program interrupted at the end of testFunc, just before freeing the 1st pointer. Now, I don't know how familiar are you with debugging on VStudio, so I'm going to walk through the (relevant) window areas:

  • At the bottom, there are 2 Watch windows (used to display variables while the program is running). As seen, the variable Name, Value and Type are displayed

    • At the right, (Watch 1): the 1st (0th) and the last (6th - as there are 7) of the structures at the beginning of each of the 2 buffers

    • At the left, (Watch 2): the addresses of the 2 buffers

  • Above the Watch windows, (Memory 2) is the memory content for bufw. A Memory window contains a series of rows and in each row there's the memory address (grayed, at the left), followed by its contents in hex (each byte corresponds to 2 hex digits - e.g. 1E), then at the right the same contents in char representation (each byte corresponds to 1 char - I'm going to come back on this), then the next row, and so on

  • Above Memory 2, (Memory 1): it's the memory content for bufa

Now, going back to the memory layout: not all the chars at the right are necessarily what they seem, some of them are just displayed like that for human readability. For example there are a lot of dots (.) on the right side, but they are not all dots. If you look for a dot at the corresponding hex representation, you'll notice that for many of them it's 00 or NULL (which is a non printable char, but it's displayed as a dot).

Regarding the buffer contents each of the 2 Memory windows (looking at the char representation), there are 3 zones:

  • The PRINTER_INFO_4* zone or the gibberish at the beginning: 544 bytes corresponding to approximately the 1st 3 rows

  • The funky chars from the last ~1.5 rows: they are outside of our buffers so we don't care about them

  • The mid zone: where the strings are stored

Let's look at the WIDE strings zone (Memory 2 - mid zone): as you mentioned, each character has 2 bytes: because in my case they're all ASCII chars, the MSB (or the codepage byte) is always 0 (that's why you see chars and dots interleaved: e.g. ".L.a.s.e.r.J.e.t" in row 4).

Since there are multiple strings in the buffer (or string, if you will) - or even better: multiple TCHAR*s in a TCHAR* - they must be separated: that is done by a NULL WIDE char (hex: 00 00, char: "..") at the end of each string; combined with the fact that the next string's 1st byte (char) is also 00 (.), you'll see a sequence of 3 NULL bytes (hex: 00 00 00, char: "...") and that is the separator between 2 (WIDE) strings in the mid zone.

Now, comparing the 2 mid parts (corresponding to the 2 buffers), you'll notice that the string separators are exactly in the same positions and more: the last parts of each string are the also same (the last halves of each string to be more precise).

Considering this, here's my theory:

I think EnumPrintersA calls EnumPrintersW, and then it iterates through each of the strings (at the end of the buffer), and calls wcstombs or even better: [MS.Docs]: WideCharToMultiByte function on them (converting them in place - and thus the resulting ASCII string only takes the 1st half of the WIDE string, leaving the 2nd half unmodified), without converting all the buffer. I'll have to verify this by looking with a disassembler in winspool.drv.

Personally (if I'm right) I think that it is a lame workaround (or a gainarie as I like to call it), but who knows, maybe all the *A, *W function pairs (at least those who return multiple char*s in a char*) work like this. Anyway, there are also pros for this approach (at least for these 2 funcs):

  • dev-wise: it's OK for one function to call the other and keep the implementation in 1 place (instead of duping it in both functions)

  • performance-wise: it's OK not to recreate the buffer since that would imply additional computation; after all, the buffer consumer doesn't normally reach the second halves of each ASCII string in the buffer

like image 35
CristiFati Avatar answered Oct 06 '22 01:10

CristiFati