Piping into SET /P fails due to uninitialised data pointer?

Supposing we have got a text file sample.txt:

one
two
...

Now we want to remove the first line:

two
...

A quick way to do that is to use input redirection, set /P and findstr¹ (I know there are other ways using more or for /F, but let us forget about them for now):

@echo off
< "sample.txt" (
    set /P =""
    findstr "^"
)

The output is going to be as expected.

However, why is the output empty when I replace the input redirection < by type and a pipe | :

@echo off
type "sample.txt" | (
    set /P =""
    findstr "^"
)

When I replace set /P ="" by pause > nul, the output is what I expect -- the input file is output but with the first character of the first line missing (as it is consumed by pause). But why does set /P seem to consume everything instead of only the first line like it does with the redirection < approach? Is that a bug?

To me it looks like set /P fails to adequately initialise the reading pointer to the piped data.

I observed this strange behaviour on Windows 7 and on Windows 10.


It gets even weirder: when the script containing the pipe is called many times, for instance by a loop like for /L %I in (1,1,1000) do @pipe.bat, and the input file contains about fifteen lines or more, then sometimes (a few times out of a thousand) a fragment of the input file is returned. That fragment is exactly the same each time; it seems that there are always 80 bytes missing at the beginning.


1) findstr hangs in case the last line is not terminated by a line-break, so let us assume one is present.

Asked by aschipfl on Dec 27 '16 at 20:12


1 Answer

When retrieving data, set /p tries to fill a 1023-character buffer (if that many characters are available) with data from stdin. Once this read operation has ended, it searches for the first end of line, and once that has been found (or the end of the buffer has been reached), it calls the SetFilePointer API to reposition the input stream pointer just after the end of the line that was read. This way the next read operation starts retrieving data after that line.

This works flawlessly when a disk file is associated with the input stream, but as Microsoft states in the SetFilePointer documentation:

The hFile parameter must refer to a file stored on a seeking device; for example, a disk volume. Calling the SetFilePointer function with a handle to a non-seeking device such as a pipe or a communications device is not supported, even though the SetFilePointer function may not return an error. The behavior of the SetFilePointer function in this case is undefined.

What happens is that, while no error is generated, the call to reposition the read pointer fails when stdin is associated with a pipe: the pointer is not moved back, and the 1023 bytes (or however many bytes were available) remain consumed.
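The effect can be reproduced outside cmd.exe with a small simulation. This is only a sketch of the mechanism described above, not the actual cmd.exe code: the helper names (read_first_line, rest_after_first_line, via_file, via_pipe) are made up for illustration, and a bare LF is treated as the line end for simplicity. On POSIX systems the seek on a pipe fails with an error, which the sketch deliberately ignores to mimic a SetFilePointer call that silently does nothing.

```python
import os
import tempfile

BUF_SIZE = 1023  # buffer size named in the answer


def read_first_line(fd):
    """Read up to BUF_SIZE bytes, then try to seek back to just after line 1."""
    try:
        start = os.lseek(fd, 0, os.SEEK_CUR)  # remember where the read began
    except OSError:
        start = 0  # a pipe has no position to remember
    buf = os.read(fd, BUF_SIZE)
    eol = buf.find(b"\n")
    line_len = len(buf) if eol < 0 else eol + 1
    try:
        os.lseek(fd, start + line_len, os.SEEK_SET)
    except OSError:
        # Assumption: ignoring the failure stands in for the "undefined"
        # behaviour of SetFilePointer on a non-seeking device.
        pass
    return buf[:line_len]


def rest_after_first_line(fd):
    """Consume the first line, then return whatever the next reader would see."""
    read_first_line(fd)
    chunks = []
    while True:
        chunk = os.read(fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


DATA = b"one\ntwo\nthree\n"


def via_file():
    """Input comes from a seekable disk file (like `< sample.txt`)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, DATA)
        os.lseek(fd, 0, os.SEEK_SET)
        return rest_after_first_line(fd)
    finally:
        os.close(fd)
        os.unlink(path)


def via_pipe():
    """Input comes from a pipe (like `type sample.txt |`)."""
    r, w = os.pipe()
    os.write(w, DATA)
    os.close(w)
    try:
        return rest_after_first_line(r)
    finally:
        os.close(r)


if __name__ == "__main__":
    print(via_file())  # b'two\nthree\n'
    print(via_pipe())  # b''
```

With the disk file the second reader sees everything after the first line; with the pipe the whole input has already been swallowed by the first read, which matches the empty findstr output in the question.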

Edited in response to Aacini's request:

The set command is processed by the eSet function, which calls SetWork to determine which type of set command will be executed.

As this is a set /p, the SetPromptUser function is called, and from this function the ReadBufFromInput function is called:

add     esp, 0Ch
lea     eax, [ebp+var_80C]
push    eax             ; int
push    3FFh            ; int
lea     eax, [ebp+Value]
push    eax             ; int
xor     esi, esi
push    0FFFFFFF6h      ; nStdHandle
mov     word ptr [ebp+Value], si
call    edi ; GetStdHandle(x)
push    eax             ; hFile
call    _ReadBufFromInput@16 ; ReadBufFromInput(x,x,x,x)

It requests 3FFh (1023) characters from the standard input handle (0FFFFFFF6h = -10 = STD_INPUT_HANDLE).
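The two constants can be checked with a few lines of arithmetic. The helper name as_signed32 is made up for this illustration; the value -10 for STD_INPUT_HANDLE is the one documented for GetStdHandle.

```python
def as_signed32(value):
    """Interpret an unsigned 32-bit value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


STD_INPUT_HANDLE = -10  # documented GetStdHandle constant

if __name__ == "__main__":
    print(as_signed32(0xFFFFFFF6))  # -10
    print(0x3FF)                    # 1023
```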

ReadBufFromInput uses the GetFileType API to determine whether it should read from the console or from a file:

; Attributes: bp-based frame

; int __stdcall ReadBufFromInput(HANDLE hFile, int, int, int)
_ReadBufFromInput@16 proc near

hFile= dword ptr  8

; FUNCTION CHUNK AT .text:4AD10D3D SIZE 00000006 BYTES

mov     edi, edi
push    ebp
mov     ebp, esp
push    [ebp+hFile]     ; hFile
call    ds:__imp__GetFileType@4 ; GetFileType(x)
and     eax, 0FFFF7FFFh
cmp     eax, 2
jz      loc_4AD10D3D

As in this case it is a pipe (GetFileType returns 3, not 2), control passes to the ReadBufFromFile function:
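The decision made by those three instructions can be written out as a short sketch. The FILE_TYPE_* values are the ones documented for GetFileType; the function name reader_for and the returned labels are made up for illustration, and the console branch at loc_4AD10D3D is not shown in the listing, so its label here is an assumption.

```python
FILE_TYPE_DISK = 0x0001
FILE_TYPE_CHAR = 0x0002    # character device, e.g. the console
FILE_TYPE_PIPE = 0x0003
FILE_TYPE_REMOTE = 0x8000  # flag bit cleared by the AND mask


def reader_for(file_type):
    """Mirror `and eax, 0FFFF7FFFh` / `cmp eax, 2` from the listing."""
    masked = file_type & 0xFFFF7FFF  # drop the FILE_TYPE_REMOTE flag
    if masked == FILE_TYPE_CHAR:
        return "console path"        # assumed meaning of loc_4AD10D3D
    return "ReadBufFromFile"


if __name__ == "__main__":
    for name, value in [("disk", FILE_TYPE_DISK),
                        ("console", FILE_TYPE_CHAR),
                        ("pipe", FILE_TYPE_PIPE)]:
        print(name, "->", reader_for(value))
```

Note that a disk file and a pipe take the same route, which is why the later SetFilePointer call is issued for both.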

; Attributes: bp-based frame

; int __stdcall ReadBufFromFile(HANDLE hFile, LPWSTR lpWideCharStr, DWORD cchWideChar, LPDWORD lpNumberOfBytesRead)
_ReadBufFromFile@16 proc near

var_C= dword ptr -0Ch
cchMultiByte= dword ptr -8
NumberOfBytesRead= dword ptr -4
hFile= dword ptr  8
lpWideCharStr= dword ptr  0Ch
cchWideChar= dword ptr  10h
lpNumberOfBytesRead= dword ptr  14h

This function calls the ReadFile API function to retrieve the indicated number of characters:

push    ebx             ; lpOverlapped
push    [ebp+lpNumberOfBytesRead] ; lpNumberOfBytesRead
mov     [ebp+var_C], eax
push    [ebp+cchWideChar] ; nNumberOfBytesToRead
push    edi             ; lpBuffer
push    [ebp+hFile]     ; hFile
call    ds:__imp__ReadFile@20 ; ReadFile(x,x,x,x,x)

The returned buffer is iterated in search of an end of line, and once it is found, the pointer in the input stream is moved to just after the found position:

.text:4AD06A15 loc_4AD06A15:                           
.text:4AD06A15                 cmp     [ebp+NumberOfBytesRead], 3
.text:4AD06A19                 jl      short loc_4AD06A2D
.text:4AD06A1B                 mov     al, [esi]
.text:4AD06A1D                 cmp     al, 0Ah
.text:4AD06A1F                 jz      loc_4AD06BCF
.text:4AD06A25
.text:4AD06A25 loc_4AD06A25:                           
.text:4AD06A25                 cmp     al, 0Dh
.text:4AD06A27                 jz      loc_4AD06D14
.text:4AD06A2D
.text:4AD06A2D loc_4AD06A2D:                           
.text:4AD06A2D                 movzx   eax, byte ptr [esi]
.text:4AD06A30                 cmp     byte ptr _DbcsLeadCharTable[eax], bl
.text:4AD06A36                 jnz     loc_4AD12018
.text:4AD06A3C                 dec     [ebp+NumberOfBytesRead]
.text:4AD06A3F                 inc     esi
.text:4AD06A40
.text:4AD06A40 loc_4AD06A40:                           
.text:4AD06A40                 cmp     [ebp+NumberOfBytesRead], ebx
.text:4AD06A43                 jg      short loc_4AD06A15

.text:4AD06BCF loc_4AD06BCF:                          
.text:4AD06BCF                 cmp     byte ptr [esi+1], 0Dh
.text:4AD06BD3                 jnz     loc_4AD06A25
.text:4AD06BD9                 jmp     loc_4AD06D1E

.text:4AD06D14 loc_4AD06D14:                           
.text:4AD06D14                 cmp     byte ptr [esi+1], 0Ah
.text:4AD06D18                 jnz     loc_4AD06A2D
.text:4AD06D1E
.text:4AD06D1E loc_4AD06D1E:                          
.text:4AD06D1E                 mov     eax, [ebp+var_C]
.text:4AD06D21                 mov     [esi+2], bl
.text:4AD06D24                 sub     esi, edi
.text:4AD06D26                 inc     esi
.text:4AD06D27                 inc     esi
.text:4AD06D28                 push    ebx             ; dwMoveMethod
.text:4AD06D29                 push    ebx             ; lpDistanceToMoveHigh
.text:4AD06D2A                 mov     [ebp+cchMultiByte], esi
.text:4AD06D2D                 add     esi, eax
.text:4AD06D2F                 push    esi             ; lDistanceToMove
.text:4AD06D30                 push    [ebp+hFile]     ; hFile
.text:4AD06D33                 call    ds:__imp__SetFilePointer@16 ; SetFilePointer(x,x,x,x)
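
As a reading aid, the scan loop in the listing can be paraphrased roughly as follows. This is only my reading of the visible instructions, not the actual source: the names find_line_end and seek_target are made up, the DBCS lead-byte branch (loc_4AD12018, not shown) is left out, and treating var_C as the stream offset at which the read started is an assumption.

```python
def find_line_end(buf, count):
    """Return the offset just past the first CR LF (or LF CR) pair, else None.

    Mirrors the loop at loc_4AD06A15: a pair is only tested while at
    least 3 bytes remain, and the result corresponds to `esi - edi + 2`.
    """
    i = 0
    remaining = count
    while remaining > 0:
        if remaining >= 3:
            c, nxt = buf[i], buf[i + 1]
            if (c == 0x0A and nxt == 0x0D) or (c == 0x0D and nxt == 0x0A):
                return i + 2
        remaining -= 1
        i += 1
    return None


def seek_target(start, buf, count):
    """Absolute offset passed as lDistanceToMove (dwMoveMethod 0, FILE_BEGIN)."""
    end = find_line_end(buf, count)
    return None if end is None else start + end


if __name__ == "__main__":
    data = b"one\r\ntwo\r\n"
    print(find_line_end(data, len(data)))   # 5
    print(seek_target(0, data, len(data)))  # 5
```

For a disk file, moving the pointer to that offset hands "two" and everything after it to the next reader; for a pipe the move has no effect, so nothing is left.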

Answered by MC ND on Nov 01 '22 at 09:11