
Replace CRLF using powershell

Editor's note: Judging by later comments by the OP, the gist of this question is: How can you convert a file with CRLF (Windows-style) line endings to an LF-only (Unix-style) file in PowerShell?

Here is my PowerShell script:

    $original_file ='C:\Users\abc\Desktop\File\abc.txt'
    (Get-Content $original_file) | Foreach-Object {
        $_ -replace "'", "2"`
           -replace '2', '3'`
           -replace '1', '7'`
           -replace '9', ''`
           -replace "`r`n",'`n'
    } | Set-Content "C:\Users\abc\Desktop\File\abc.txt" -Force

With this code I am able to replace 2 with 3, 1 with 7, and 9 with an empty string. However, I am unable to replace the carriage return + line feed with just a line feed; that last replacement doesn't work.

Angel_Boy asked Oct 01 '13 23:10



2 Answers

This is a state-of-the-union answer as of Windows PowerShell v5.1 / PowerShell Core v6.2.0:

  • Andrew Savinykh's ill-fated answer, despite being the accepted one, is, as of this writing, fundamentally flawed (I do hope it gets fixed - there's enough information in the comments - and in the edit history - to do so).

  • Ansgar Wiechers' helpful answer works well, but requires direct use of the .NET Framework (and reads the entire file into memory, though that could be changed). Direct use of the .NET Framework is not a problem per se, but is harder to master for novices and hard to remember in general.

  • A future version of PowerShell Core may get a Convert-TextFile cmdlet with a -LineEnding parameter to allow in-place updating of text files with a specific newline style, as is being discussed on GitHub.
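For orientation, the direct .NET approach mentioned in the second bullet might look roughly like the following. This is a minimal sketch, not necessarily the code from that answer; the temporary file name is hypothetical and exists only so the example is self-contained. Note that WriteAllText with two arguments writes UTF-8 without a BOM.

```powershell
# Hypothetical sample file, created only for this demonstration.
$file = Join-Path ([IO.Path]::GetTempPath()) 'crlf-demo-dotnet.txt'
[IO.File]::WriteAllText($file, "line 1`r`nline 2`r`n")

# .NET resolves relative paths against the process's working directory,
# not PowerShell's current location, so pass a full path.
$fullPath = Convert-Path $file

# Read the whole file as one string, replace every CRLF with LF, write it back.
$text = [IO.File]::ReadAllText($fullPath) -replace "`r`n", "`n"
[IO.File]::WriteAllText($fullPath, $text)
```

Like the other whole-file approaches here, this holds the entire content in memory.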

In PSv5+, PowerShell-native solutions are now possible, because Set-Content now supports the -NoNewline switch, which prevents undesired appending of a platform-native newline[1]:

    # Convert CRLFs to LFs only.
    # Note:
    #  * (...) around Get-Content ensures that $file is read *in full*
    #    up front, so that it is possible to write back the transformed content
    #    to the same file.
    #  * + "`n" ensures that the file has a *trailing LF*, which Unix platforms
    #     expect.
    ((Get-Content $file) -join "`n") + "`n" | Set-Content -NoNewline $file

The above relies on Get-Content's ability to read a text file that uses any combination of CR-only, CRLF, and LF-only newlines line by line.
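The line-by-line behavior also explains why a per-line -replace "`r`n" (as in the question) has nothing to match. A small sketch to illustrate, using a hypothetical temporary file with mixed newline styles:

```powershell
# Hypothetical sample file mixing CRLF, LF-only, and CR-only newlines.
$demo = Join-Path ([IO.Path]::GetTempPath()) 'mixed-newlines-demo.txt'
[IO.File]::WriteAllText($demo, "one`r`ntwo`nthree`rfour")

# Get-Content emits one string per line, with the newline characters removed.
$lines = Get-Content $demo
$lines.Count

# Count the lines that still contain a CR or LF; none should.
$withNewlines = @($lines | Where-Object { $_ -match "[`r`n]" }).Count

# Separately: inside single quotes the backtick is not an escape character,
# so '`n' is two literal characters, while "`n" is a single LF.
$singleQuotedLength = '`n'.Length
$doubleQuotedLength = "`n".Length

# Joining the lines with "`n" is what actually produces LF-only text.
$lfOnly = $lines -join "`n"
```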

Caveats:

  • You need to specify the output encoding to match the input file's in order to recreate it with the same encoding. The command above does NOT specify an output encoding; to do so, use -Encoding; without -Encoding:

    • In Windows PowerShell, you'll get "ANSI" encoding, your system's single-byte, 8-bit legacy encoding, such as Windows-1252 on US-English systems.
    • In PowerShell Core, you'll get UTF-8 encoding without a BOM.
  • The input file's content as well as its transformed copy must fit into memory as a whole, which can be problematic with large input files.

  • There's a risk of file corruption, if the process of writing back to the input file gets interrupted.
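One way to address the encoding and corruption caveats, sketched under assumptions: the input is taken to be UTF-8, and the file names are hypothetical. The converted text goes to a temporary file in the same directory first and is then moved over the original, which narrows (but may not fully eliminate) the window in which an interruption could damage the input.

```powershell
# Hypothetical sample file; replace with your own path.
$file = Join-Path ([IO.Path]::GetTempPath()) 'crlf-demo-encoding.txt'
[IO.File]::WriteAllText($file, "alpha`r`nbeta`r`n")

# Assumption: the input is UTF-8. Windows PowerShell writes a BOM for
# -Encoding utf8, whereas PowerShell Core does not.
$encoding = 'utf8'

# Write the converted text to a temporary file first, then swap it in.
$tmp = "$file.tmp"
((Get-Content $file -Encoding $encoding) -join "`n") + "`n" |
    Set-Content -NoNewline -Encoding $encoding $tmp
Move-Item -Force $tmp $file
```

The memory caveat still applies, since the whole file is read up front.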


[1] In fact, if there are multiple strings to write, -NoNewline also doesn't place a newline between them; in the case at hand, however, this is irrelevant, because only one string is written.
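A quick way to see the behavior described in this footnote, using a hypothetical temporary file:

```powershell
$demo = Join-Path ([IO.Path]::GetTempPath()) 'nonewline-demo.txt'

# Two input strings, written with no separator and no trailing newline.
'a', 'b' | Set-Content -NoNewline $demo

# Read the file back as a single string to inspect exactly what was written.
$written = [IO.File]::ReadAllText($demo)
$written.Length
```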

mklement0 answered Sep 22 '22 18:09


You have not specified the version; I'm assuming you are using PowerShell v3.

Try this:

    $path = "C:\Users\abc\Desktop\File\abc.txt"
    (Get-Content $path -Raw).Replace("`r`n","`n") | Set-Content $path -Force

Editor's note: As mike z points out in the comments, Set-Content appends a trailing CRLF, which is undesired. Verify with: 'hi' > t.txt; (Get-Content -Raw t.txt).Replace("`r`n","`n") | Set-Content t.txt; (Get-Content -Raw t.txt).EndsWith("`r`n"), which yields $True.
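In PSv5+, one possible way to avoid the trailing CRLF is to add -NoNewline to the Set-Content call. The following is a sketch of that variation, not the original author's code; the temporary file is hypothetical and only there to make the example self-contained.

```powershell
# Hypothetical sample file, created only for this demonstration.
$path = Join-Path ([IO.Path]::GetTempPath()) 'crlf-demo-raw.txt'
[IO.File]::WriteAllText($path, "hi`r`nthere`r`n")

# -Raw reads the file as one string, newlines included; -NoNewline (PSv5+)
# stops Set-Content from appending a platform-native newline.
(Get-Content $path -Raw).Replace("`r`n", "`n") | Set-Content $path -NoNewline -Force
```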

Note this loads the whole file in memory, so you might want a different solution if you want to process huge files.

UPDATE

This might work for v2 (sorry nowhere to test):

    $in = "C:\Users\abc\Desktop\File\abc.txt"
    $out = "C:\Users\abc\Desktop\File\abc-out.txt"
    (Get-Content $in) -join "`n" > $out

Editor's note: Note that this solution (now) writes to a different file and is therefore not equivalent to the (still flawed) v3 solution. (A different file is targeted to avoid the pitfall Ansgar Wiechers points out in the comments: using > truncates the target file before execution begins). More importantly, though: this solution too appends a trailing CRLF, which may be undesired. Verify with 'hi' > t.txt; (Get-Content t.txt) -join "`n" > t.NEW.txt; [io.file]::ReadAllText((Convert-Path t.NEW.txt)).endswith("`r`n"), which yields $True.

The same reservation about loading the whole file into memory applies, though.
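For huge files, a streaming variation that keeps only one line in memory at a time could look like this. It is a rough sketch using .NET's StreamReader and StreamWriter, with hypothetical temporary paths, and it writes to a separate output file. It assumes UTF-8 input (or input with a BOM) and always ends the output with an LF.

```powershell
# Hypothetical paths, for illustration only; the output goes to a separate file.
$in  = Join-Path ([IO.Path]::GetTempPath()) 'crlf-demo-stream-in.txt'
$out = Join-Path ([IO.Path]::GetTempPath()) 'crlf-demo-stream-out.txt'
[IO.File]::WriteAllText($in, "first`r`nsecond`r`nthird`r`n")

$reader = New-Object System.IO.StreamReader -ArgumentList $in
try {
    $writer = New-Object System.IO.StreamWriter -ArgumentList $out
    try {
        # Terminate each written line with LF only.
        $writer.NewLine = "`n"
        # ReadLine() strips the line terminator and returns $null at end of file.
        while ($null -ne ($line = $reader.ReadLine())) {
            $writer.WriteLine($line)
        }
    }
    finally { $writer.Dispose() }
}
finally { $reader.Dispose() }
```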

Andrew Savinykh answered Sep 19 '22 18:09