
How can I remove the BOM from a UTF-8 file? [duplicate]

I have a UTF-8 encoded file with a BOM and want to remove the BOM. Are there any Linux command-line tools that can do this?

$ file test.xml
test.xml: XML 1.0 document, UTF-8 Unicode (with BOM) text, with very long lines
m13r asked Jul 21 '17 14:07


People also ask

Does UTF-8 have BOM?

The Unicode Standard permits the BOM in UTF-8, but does not require or recommend its use. Byte order has no meaning in UTF-8, so its only use in UTF-8 is to signal at the start that the text stream is encoded in UTF-8, or that it was converted to UTF-8 from a stream that contained an optional BOM.
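In UTF-8 the BOM is the three-byte sequence EF BB BF at the very start of the file, so its presence can be checked by dumping those bytes. The following is a minimal sketch, not part of the original question or answer; the file name `bom_sample.xml` and the variable `first_bytes` are made up for illustration, and it assumes a `head` that supports `-c`.

```shell
# Create a hypothetical sample file that starts with the UTF-8 BOM
# (bytes EF BB BF, written here as octal escapes for portability).
printf '\357\273\277<?xml version="1.0"?>\n' > bom_sample.xml

# Dump the first three bytes as hex and strip the spacing od adds.
first_bytes=$(head -c 3 bom_sample.xml | od -An -tx1 | tr -d ' \n')

# A UTF-8 BOM shows up as "efbbbf".
if [ "$first_bytes" = "efbbbf" ]; then
    echo "bom_sample.xml starts with a UTF-8 BOM"
else
    echo "bom_sample.xml has no UTF-8 BOM"
fi
```

The same check can be pointed at any file by replacing `bom_sample.xml` with its path.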

How do I get rid of BOM?

To remove the byte order mark from a source file, you need a text editor that lets you choose whether the mark is saved. Open the file that contains the BOM in the editor, then save it again without the BOM. The mark should then no longer appear.

How do I save file in UTF-8 without BOM?

  1. Download and install Notepad++, a free text editor.
  2. Open the file you want to verify or fix in Notepad++.
  3. In the top menu, select Encoding > Convert to UTF-8 (the option without BOM).
  4. Save the file.


1 Answer

Using Vim

  1. Open the file in Vim:

     vi text.xml 
  2. Remove the BOM:

     :set nobomb 
  3. Save and quit:

     :wq 

For a non-interactive solution, try the following command line:

vi -c ":set nobomb" -c ":wq" text.xml 

That should remove the BOM, save the file and quit, all from the command line.
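If Vim is not available, a similar result can probably be had with sed. This is a sketch of a different technique from the Vim commands above, not something the answer itself proposes. It assumes GNU sed (for `-i` in-place editing and `\xHH` byte escapes), and the file name `sed_sample.xml` is invented for the example.

```shell
# Hypothetical sample file that begins with the UTF-8 BOM (bytes EF BB BF).
printf '\357\273\277<root/>\n' > sed_sample.xml

# On line 1 only, delete the three BOM bytes if they sit at the very start.
# Assumes GNU sed: -i edits the file in place, \xHH matches a raw byte.
# LC_ALL=C makes sed treat the input as plain bytes.
LC_ALL=C sed -i '1s/^\xEF\xBB\xBF//' sed_sample.xml

# Show the first bytes as hex; without a BOM the file starts with "<ro" (3c 72 6f).
head -c 3 sed_sample.xml | od -An -tx1
```

Because the substitution is anchored to the start of line 1, a file that has no BOM should be left unchanged.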

Joshua Pinter answered Sep 28 '22 00:09