
How to send EOF from command prompt *without newline*?

Sure, to send EOF from command prompt, Enter followed by Ctrl-Z does the trick.

C:\> type con > file.txt
line1
line2
^Z

This works, and file.txt contains line1\r\nline2\r\n. But how can you do the same without the last newline, so that file.txt contains line1\r\nline2?

In Linux, the solution is to hit Ctrl-D twice [1]. But what is the equivalent on Windows? Command prompt will happily print ^Zs at the end of a line without sending EOF. (And if you press Enter, then any ^Zs you typed get written to the file as literal control characters!)

If there is no way to do this on Windows, then why?


[1] https://askubuntu.com/questions/118548/how-do-i-end-standard-input-without-a-newline-character

mxxk asked May 09 '17 07:05


1 Answer

The command type con > file.txt doesn't have any special handling for ^Z in the cmd shell, since the target file isn't con and the type command wasn't run in Unicode (UTF-16LE) output mode. In this case, the only ^Z handling is in the ReadFile call itself, which for a console input buffer has an undocumented behavior to return 0 bytes read if a line starts with ^Z.

Let's examine this with a debugger attached, noting that the number of bytes read (lpNumberOfBytesRead) is the 4th argument (register r9 in x64), which is returned by reference as an output parameter.

C:\Temp>type con > file.txt
Breakpoint 1 hit
KERNELBASE!ReadFile:
00007ffc`fb573cc0 48895c2410      mov     qword ptr [rsp+10h],rbx
                                          ss:00000068`c5d1dfa8=000001e3000001e7
0:000> r r9
r9=00000068c5d1dfd0

0:000> pt
line1
KERNELBASE!ReadFile+0xa9:
00007ffc`fb573d69 c3              ret

0:000> dd 68c5d1dfd0 l1
00000068`c5d1dfd0  00000007

As you see above, reading "line1\r\n" is 7 characters, as expected. Next let's enter "\x1aline2\r\n" and see how many bytes ReadFile reportedly reads:

0:000> g
Breakpoint 1 hit
KERNELBASE!ReadFile:
00007ffc`fb573cc0 48895c2410      mov     qword ptr [rsp+10h],rbx
                                          ss:00000068`c5d1dfa8=0000000000000000
0:000> r r9
r9=00000068c5d1dfd0

0:000> pt
^Zline2
KERNELBASE!ReadFile+0xa9:
00007ffc`fb573d69 c3              ret

0:000> dd 68c5d1dfd0 l1
00000068`c5d1dfd0  00000000

As you see above, this time it reads 0 bytes, i.e. EOF. Everything typed after ^Z was simply ignored.
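The two reads traced above can be modeled in a few lines of Python. This is only a simulation of the observed behavior, not the console's actual implementation: console_read_file and type_con are hypothetical names, and the sketch assumes each read returns exactly one typed line, as in the debugger session.

```python
CTRL_Z = b"\x1a"


def console_read_file(line: bytes) -> bytes:
    """Model of the observed ReadFile behavior on console input:
    a line that starts with ^Z yields 0 bytes, any other line is
    returned unchanged (including a ^Z in the middle of the line)."""
    if line.startswith(CTRL_Z):
        return b""
    return line


def type_con(lines) -> bytes:
    """Model of `type con > file.txt`: copy until a read returns 0 bytes."""
    out = bytearray()
    for line in lines:
        data = console_read_file(line)
        if not data:  # 0 bytes read is treated as EOF
            break
        out += data
    return bytes(out)


typed = [b"line1\r\n", b"\x1aline2\r\n"]
print(len(console_read_file(typed[0])))  # 7
print(len(console_read_file(typed[1])))  # 0
print(type_con(typed))                   # b'line1\r\n'

# A ^Z that is not at the start of a line is passed through literally.
print(type_con([b"line1\r\n", b"line2\x1a\r\n", b"\x1a\r\n"]))
# b'line1\r\nline2\x1a\r\n'
```

The last call mirrors what the question describes: a ^Z typed at the end of a line ends up in the file as a literal character.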

However, what you want instead is to get this behavior in general, wherever ^Z appears in the input buffer. type will do this for you, but only if it's executed in Unicode mode, i.e. cmd /u /c type con > file.txt. In this case cmd does have special handling to scan the input for ^Z. But I bet you don't want a UTF-16LE file, especially since cmd doesn't write a BOM to allow editors to detect the UTF encoding.
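To make the encoding concern concrete, here is a small Python sketch that compares the desired 12-byte content with the same text encoded as UTF-16LE without a BOM. It only illustrates the encodings themselves; it does not run cmd, and the variable names are made up for the example.

```python
import codecs

text = "line1\r\nline2"

ansi = text.encode("ascii")       # what the question wants in file.txt
utf16 = text.encode("utf-16-le")  # two bytes per character, no BOM added

print(len(ansi), len(utf16))                 # 12 24
print(utf16.startswith(codecs.BOM_UTF16_LE))  # False

# A file that an editor could detect as UTF-16LE would begin with the BOM.
detectable = codecs.BOM_UTF16_LE + utf16
print(detectable[:2])                        # b'\xff\xfe'
```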

You're in luck, because it happens that copy con file.txt does exactly what you want. Internally it calls cmd!ZScanA to scan each line for a ^Z character. We can see this in action back in the debugger, but this time we're in completely undocumented territory. On inspection, it appears that this function's 3rd parameter (register r8 in x64) is the number of bytes read as an in-out argument.

Let's begin again by entering the 7 character string "line1\r\n":

C:\Temp>copy con file.txt
line1
Breakpoint 0 hit
cmd!ZScanA:
00007ff7`cf4c26d0 48895c2408      mov     qword ptr [rsp+8],rbx
                                          ss:00000068`c5d1e9d0=0000000000000000
0:000> r r8; dd @r8 l1
r8=00000068c5d1ea64
00000068`c5d1ea64  00000007

On output, the scanned length remains 7 characters:

0:000> pt
cmd!ZScanA+0x4f:
00007ff7`cf4c271f c3              ret
0:000> dd 68c5d1ea64 l1
00000068`c5d1ea64  00000007
0:000> g

Next enter the 23 (0x17) character string "line2\x1a Ignore this...\r\n":

line2^Z Ignore this...
Breakpoint 0 hit
cmd!ZScanA:
00007ff7`cf4c26d0 48895c2408      mov     qword ptr [rsp+8],rbx
                                          ss:00000068`c5d1e9d0=0000000000000000
0:000> r r8; dd @r8 l1
r8=00000068c5d1ea64
00000068`c5d1ea64  00000017

This time the scanned length is only the 5 characters that precede the ^Z:

0:000> pt
cmd!ZScanA+0x4f:
00007ff7`cf4c271f c3              ret
0:000> dd 68c5d1ea64 l1
00000068`c5d1ea64  00000005

We expect file.txt to be 12 bytes, which it is:

C:\Temp>for %a in (file.txt) do @echo %~za
12
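The scan traced above can be approximated in Python as follows. This is a model inferred from the debugger output, not the real cmd!ZScanA: zscan and copy_con are hypothetical names, and stopping the copy once a ^Z is found is an assumption that matches the 12-byte result.

```python
CTRL_Z = b"\x1a"


def zscan(buf: bytes):
    """Return (kept, hit_eof): keep only the bytes before the first ^Z."""
    idx = buf.find(CTRL_Z)
    if idx == -1:
        return buf, False
    return buf[:idx], True


def copy_con(lines) -> bytes:
    """Model of `copy con file.txt`: scan each line and stop at the first ^Z."""
    out = bytearray()
    for line in lines:
        kept, hit_eof = zscan(line)
        out += kept
        if hit_eof:  # assumed: nothing after ^Z is copied
            break
    return bytes(out)


second = b"line2\x1a Ignore this...\r\n"
print(len(second))            # 23
print(len(zscan(second)[0]))  # 5

data = copy_con([b"line1\r\n", second])
print(data)       # b'line1\r\nline2'
print(len(data))  # 12
```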

More generally, if a Windows console program wants to implement Ctrl+D handling that approximates the behavior of a Unix terminal, it can use the wide-character console function ReadConsoleW, passing a CONSOLE_READCONSOLE_CONTROL struct by reference as pInputControl. This struct's dwCtrlWakeupMask field is a bit mask that sets which control characters will immediately terminate the read. For example, bit 4 enables Ctrl+D. I wrote a simple test program that demonstrates this case:

C:\Temp>.\test
Enter some text: line1
You entered: line1\x04

You can't see this in the above example, but this read was immediately terminated by pressing Ctrl+D, without even pressing Enter. The ^D control character (i.e. '\x04') remains in the input buffer, which is useful in case you want different behavior for multiple control characters.
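The test program itself isn't shown, so the following is only a rough sketch of the same idea, written in Python with ctypes rather than C. ctrl_wakeup_mask and read_console_until are hypothetical helper names. The struct layout and the ReadConsoleW parameters follow the documented Win32 declarations, and the read only works on Windows with a real console attached.

```python
import ctypes
import sys

STD_INPUT_HANDLE = 0xFFFFFFF6  # (DWORD)-10


class CONSOLE_READCONSOLE_CONTROL(ctypes.Structure):
    _fields_ = [
        ("nLength", ctypes.c_uint32),
        ("nInitialChars", ctypes.c_uint32),
        ("dwCtrlWakeupMask", ctypes.c_uint32),
        ("dwControlKeyState", ctypes.c_uint32),
    ]


def ctrl_wakeup_mask(*letters: str) -> int:
    """Build a wakeup mask: bit N is set for control character N ('D' -> bit 4)."""
    mask = 0
    for letter in letters:
        code = ord(letter.upper()) - ord("@")
        if not 0 <= code < 32:
            raise ValueError(f"not a control-key letter: {letter!r}")
        mask |= 1 << code
    return mask


def read_console_until(mask: int, max_chars: int = 256) -> str:
    """Read from the console; the read ends on Enter or on any masked control key."""
    if sys.platform != "win32":
        raise OSError("ReadConsoleW exists only on Windows")
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetStdHandle.argtypes = [ctypes.c_uint32]
    kernel32.GetStdHandle.restype = ctypes.c_void_p
    kernel32.ReadConsoleW.argtypes = [
        ctypes.c_void_p,                              # hConsoleInput
        ctypes.c_wchar_p,                             # lpBuffer
        ctypes.c_uint32,                              # nNumberOfCharsToRead
        ctypes.POINTER(ctypes.c_uint32),              # lpNumberOfCharsRead
        ctypes.POINTER(CONSOLE_READCONSOLE_CONTROL),  # pInputControl
    ]
    kernel32.ReadConsoleW.restype = ctypes.c_int

    control = CONSOLE_READCONSOLE_CONTROL()
    control.nLength = ctypes.sizeof(control)
    control.dwCtrlWakeupMask = mask

    buf = ctypes.create_unicode_buffer(max_chars)
    count = ctypes.c_uint32(0)
    handle = kernel32.GetStdHandle(STD_INPUT_HANDLE)
    if not kernel32.ReadConsoleW(handle, buf, max_chars,
                                 ctypes.byref(count), ctypes.byref(control)):
        raise ctypes.WinError(ctypes.get_last_error())
    return buf[:count.value]


print(hex(ctrl_wakeup_mask("D")))  # 0x10

if __name__ == "__main__" and sys.platform == "win32":
    print("Enter some text: ", end="", flush=True)
    text = read_console_until(ctrl_wakeup_mask("D"))
    # Escape control characters so a trailing ^D shows up as \x04.
    print("\nYou entered:", text.encode("unicode_escape").decode("ascii"))
```

Adding more letters to ctrl_wakeup_mask, for example ctrl_wakeup_mask("D", "Z"), lets the caller check the last character returned to tell which control key ended the read.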

Eryk Sun answered Oct 22 '22 17:10