Sure, to send EOF from command prompt, Enter followed by Ctrl-Z does the trick.
C:\> type con > file.txt
line1
line2
^Z
This works, and file.txt
contains line1\r\nline2\r\n
. But how can you do the same without the last newline, so that file.txt
contains line1\r\nline2
?
In Linux, the solution is to hit Ctrl-D twice1. But what is the equivalent on Windows? Command prompt will happily print ^Z
s at the end of a line without doing sending EOF. (And if you press Enter, then any ^Z
s you typed get written to the file as literal escape characters!)
If there is no way to do this on Windows, then why?
1https://askubuntu.com/questions/118548/how-do-i-end-standard-input-without-a-newline-character
The command type con > file.txt
doesn't have any special handling for ^Z
in the cmd shell, since the target file isn't con
and the type
command wasn't run in Unicode (UTF-16LE) output mode. In this case, the only ^Z
handling is in the ReadFile
call itself, which for a console input buffer has an undocumented behavior to return 0 bytes read if a line starts with ^Z
.
Let's examine this with a debugger attached, noting that the number of bytes read (lpNumberOfBytesRead
) is the 4th argument (register r9 in x64), which is returned by reference as an output parameter.
C:\Temp>type con > file.txt
Breakpoint 1 hit
KERNELBASE!ReadFile:
00007ffc`fb573cc0 48895c2410 mov qword ptr [rsp+10h],rbx
ss:00000068`c5d1dfa8=000001e3000001e7
0:000> r r9
r9=00000068c5d1dfd0
0:000> pt
line1
KERNELBASE!ReadFile+0xa9:
00007ffc`fb573d69 c3 ret
0:000> dd 68c5d1dfd0 l1
00000068`c5d1dfd0 00000007
As you see above, reading "line1\r\n"
is 7 characters, as expected. Next let's enter "\x1aline2\r\n"
and see how many bytes ReadFile
reportedly reads:
0:000> g
Breakpoint 1 hit
KERNELBASE!ReadFile:
00007ffc`fb573cc0 48895c2410 mov qword ptr [rsp+10h],rbx
ss:00000068`c5d1dfa8=0000000000000000
0:000> r r9
r9=00000068c5d1dfd0
0:000> pt
^Zline2
KERNELBASE!ReadFile+0xa9:
00007ffc`fb573d69 c3 ret
0:000> dd 68c5d1dfd0 l1
00000068`c5d1dfd0 00000000
As you see above, this time it reads 0 bytes, i.e. EOF. Everything typed after ^Z
was simply ignored.
However, what you want instead is to get this behavior in general, wherever ^Z
appears in the input buffer. type
will do this for you, but only if it's executed in Unicode mode, i.e. cmd /u /c type con > file.txt
. In this case cmd does have special handling to scan the input for ^Z
. But I bet you don't want a UTF-16LE file, especially since cmd doesn't write a BOM to allow editors to detect the UTF encoding.
You're in luck, because it happens that copy con file.txt
does exactly what you want. Internally it calls cmd!ZScanA
to scan each line for a ^Z
character. We can see this in action back in the debugger, but this time we're in completely undocumented territory. On inspection, it appears that this function's 3rd parameter (register r8 in x64) is the number of bytes read as an in-out argument.
Let's begin again by entering the 7 character string "line1\r\n"
:
C:\Temp>copy con file.txt
line1
Breakpoint 0 hit
cmd!ZScanA:
00007ff7`cf4c26d0 48895c2408 mov qword ptr [rsp+8],rbx
ss:00000068`c5d1e9d0=0000000000000000
0:000> r r8; dd @r8 l1
r8=00000068c5d1ea64
00000068`c5d1ea64 00000007
On output, the scanned length remains 7 characters:
0:000> pt
cmd!ZScanA+0x4f:
00007ff7`cf4c271f c3 ret
0:000> dd 68c5d1ea64 l1
00000068`c5d1ea64 00000007
0:000> g
Next enter the 23 (0x17) character string "line2\x1a Ignore this...\r\n"
:
line2^Z Ignore this...
Breakpoint 0 hit
cmd!ZScanA:
00007ff7`cf4c26d0 48895c2408 mov qword ptr [rsp+8],rbx
ss:00000068`c5d1e9d0=0000000000000000
0:000> r r8; dd @r8 l1
r8=00000068c5d1ea64
00000068`c5d1ea64 00000017
This time the scanned length is only the 5 characters that precede the ^Z
:
0:000> pt
cmd!ZScanA+0x4f:
00007ff7`cf4c271f c3 ret
0:000> dd 68c5d1ea64 l1
00000068`c5d1ea64 00000005
We expect file.txt to be 12 bytes, which it is:
C:\Temp>for %a in (file.txt) do @echo %~za
12
More generally, if a Windows console program wants to implement Ctrl+D handling that approximates the behavior of a Unix terminal, it can use the wide-character console function ReadConsoleW
, passing a CONSOLE_READCONSOLE_CONTROL
struct by reference as pInputControl
. This struct's dwCtrlWakeupMask
field is a bit mask that sets which control characters will immediately terminate the read. For example, bit 4 enables Ctrl+D. I wrote a simple test program that demonstrates this case:
C:\Temp>.\test
Enter some text: line1
You entered: line1\x04
You can't see this in the above example, but this read was immediately terminated by pressing Ctrl+D, without even pressing enter. The ^D
control character (i.e. '\x04'
) remains in the input buffer, which is useful in case you want different behavior for multiple control characters.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With