I have a small PowerShell script that reads a UTF-8 encoded document, makes some replacements in it, and saves it back. It looks like this:
(Get-Content $path) -Replace "myregex","replacement" | Set-Content $path2 -Encoding utf8
This creates a new file with the right encoding and the right contents, but with additional newline characters at the end. According to this answer and many others, I should either add -NoNewLine to Set-Content, or use:
[System.IO.File]::WriteAllText($path2,$content,[System.Text.Encoding]::UTF8)
Both solutions remove the trailing newlines... and every other newline in the file.
Is there a way to remove the trailing newlines while keeping every other newline in the file?
To complement Ansgar Wiechers' helpful answer:
Using Set-Content -NoNewline (PSv5+) is an option, but only if you pass the output as a single string with embedded newlines, which Get-Content -Raw can do:
(Get-Content -Raw $path) -replace 'myregex', 'replacement' |
Set-Content -NoNewline $path2 -Encoding utf8
Note, however, that the semantics of -replace change with the use of -Raw: now a single -replace operation is performed on a multi-line string (the entire file contents), as opposed to line-individual operations with an array as the LHS.
Also note that -Raw will preserve the trailing-newline-or-not status of the input.
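To make the change in -replace semantics concrete, here is a small sketch using an anchored regex. The sample lines and variable names are made up for illustration, and "`n" is used as the newline for brevity:

```powershell
# Hypothetical three-line input.
$lines = 'foo 1', 'foo 2', 'bar 3'
$raw   = $lines -join "`n"

# Array LHS: -replace runs once per element, so ^ matches at the start of every line.
$perLine = $lines -replace '^foo', 'baz'
# $perLine is the array 'baz 1', 'baz 2', 'bar 3'

# Single-string LHS: ^ matches only at the very start of the whole text.
$whole = $raw -replace '^foo', 'baz'
# $whole is "baz 1`nfoo 2`nbar 3"

# The inline option (?m) makes ^ and $ match at line boundaries again.
$multiline = $raw -replace '(?m)^foo', 'baz'
# $multiline is "baz 1`nbaz 2`nbar 3"
```

Regexes without anchors usually behave the same either way; it is mainly ^, $, and patterns that can span line breaks that are affected.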
If you want the line-by-line semantics and/or want to ensure that the output's final line has no trailing newline (even if the input file had one), use Get-Content without -Raw, and then -join:
(Get-Content $path) -replace 'myregex', 'replacement' -join [Environment]::NewLine |
Set-Content -NoNewline $path2 -Encoding utf8
The above uses the platform-appropriate newline character(s) on output, but note that there's no guarantee that the input file used the same.
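If keeping the input file's own newline style matters, one possible approach is to detect it first and join with that. This is only a sketch: the detection rule (CRLF if any CRLF occurs, otherwise LF) is a simplifying assumption that ignores files with mixed newlines, and the temp-file names are hypothetical stand-ins for $path and $path2:

```powershell
# Temp files stand in for the question's $path and $path2 (hypothetical sample data).
$path  = Join-Path ([IO.Path]::GetTempPath()) 'newline-demo-in.txt'
$path2 = Join-Path ([IO.Path]::GetTempPath()) 'newline-demo-out.txt'
[IO.File]::WriteAllText($path, "myregex 1`nmyregex 2`n")   # LF-only input with a trailing newline

# Guess the input's newline style: CRLF if any CRLF sequence occurs, else LF.
$nl = if ((Get-Content -Raw $path) -match "`r`n") { "`r`n" } else { "`n" }

# Line-by-line replacement, re-joined with the detected newline and no trailing newline.
(Get-Content $path) -replace 'myregex', 'replacement' -join $nl |
  Set-Content -NoNewline $path2 -Encoding utf8

$result = Get-Content -Raw $path2
# $result is "replacement 1`nreplacement 2"
```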
As for what you tried:
As you've observed, Set-Content -NoNewline with an array of strings causes all strings to be concatenated without a separator. Contrary to what one might expect, -NoNewline doesn't just omit a trailing newline:
> 'one', 'two' | Set-Content -NoNewline t.txt; Get-Content -Raw t.txt
onetwo # Strings were directly concatenated.
Note: Newlines embedded in input strings are preserved, however.
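The contrast between separate strings and a single string with an embedded newline can be checked side by side. A minimal sketch, writing to a hypothetical scratch file in the temp directory:

```powershell
$file = Join-Path ([IO.Path]::GetTempPath()) 'nonewline-demo.txt'   # hypothetical scratch file

# Two separate strings: written back to back with no separator.
'one', 'two' | Set-Content -NoNewline $file
$concatenated = Get-Content -Raw $file      # 'onetwo'

# One string with an embedded newline: the newline survives, and nothing is appended.
"one`ntwo" | Set-Content -NoNewline $file
$embedded = Get-Content -Raw $file          # "one`ntwo"
```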
The reason for the [IO.File]::WriteAllText() approach not resulting in any newlines is different, as explained in Ansgar's answer.
[IO.File]::WriteAllText() assumes that $content is a single string, but Get-Content produces an array of strings (and removes the line breaks from the end of each line/string). Mangling that string array into a single string joins the strings using the $OFS character (see here), which is a single space by default.
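The array-to-string conversion can be observed directly with a cast. A short sketch, assuming $OFS has not already been set in the session; the sample array is made up:

```powershell
$lines = 'one', 'two', 'three'

# With $OFS unset, converting an array to a string joins the elements with a single space.
# The same conversion happens implicitly when the array is passed to a [string] parameter.
$default = [string] $lines          # 'one two three'

# Setting $OFS changes the separator; the child scope (& { ... }) keeps the change local.
$custom = & { $OFS = "`r`n"; [string] $lines }
# $custom is "one`r`ntwo`r`nthree"
```

Changing $OFS would work as a fix, but it affects every array-to-string conversion in scope, so building the single string explicitly is usually easier to reason about.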
To avoid this behavior you need to ensure that $content already is a single string when it's passed to WriteAllText(). There are various ways to do that, for instance:
Use Get-Content -Raw (PowerShell v3 or newer):
$content = (Get-Content $path -Raw) -replace 'myregex', 'replacement'
Pipe the output through Out-String:
$content = (Get-Content $path | Out-String) -replace 'myregex', 'replacement' -replace '\r\n$'
Note, however, that Out-String (just like Set-Content) adds a trailing line break, as was pointed out in the comments. You need to remove that with a second replacement operation.
Join the array with the -join operator:
$content = (Get-Content $path) -replace 'myregex', 'replacement' -join "`r`n"
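Putting the -join variant together with the write step might look like the sketch below. The temp-file names are hypothetical stand-ins for $path and $path2. Two details are additions not covered above: [System.Text.Encoding]::UTF8 writes a byte-order mark, so a BOM-less UTF8Encoding instance is used here on the assumption that no BOM is wanted, and .NET methods resolve relative paths against the process's working directory rather than PowerShell's current location, so full paths are safer:

```powershell
# Temp files stand in for the question's $path and $path2 (hypothetical sample data).
$path  = Join-Path ([IO.Path]::GetTempPath()) 'writealltext-demo-in.txt'
$path2 = Join-Path ([IO.Path]::GetTempPath()) 'writealltext-demo-out.txt'
[IO.File]::WriteAllText($path, "myregex 1`r`nmyregex 2`r`n")

# Build ONE string: per-line replacement, joined with CRLF, no trailing newline.
$content = (Get-Content $path) -replace 'myregex', 'replacement' -join "`r`n"

# UTF-8 without a BOM; use [System.Text.Encoding]::UTF8 instead if a BOM is wanted.
$utf8NoBom = New-Object System.Text.UTF8Encoding $false

# Pass full paths: .NET does not follow PowerShell's current location.
[IO.File]::WriteAllText($path2, $content, $utf8NoBom)
```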