How can I add a BOM (Unicode signature) when saving a file in Python?
file_old = open('old.txt', mode='r', encoding='utf-8')
file_new = open('new.txt', mode='w', encoding='utf-16-le')
file_new.write(file_old.read())
I need to convert the file to UTF-16-LE with a BOM. The script works fine, except that the output has no BOM.
The UTF-8 BOM is a sequence of three bytes at the start of a text stream (0xEF, 0xBB, 0xBF) that lets a reader identify the file as UTF-8 more reliably. Normally, a BOM is used to signal the endianness of an encoding, but since endianness is irrelevant to UTF-8, the BOM is unnecessary there.
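To see those three bytes in practice, here is a minimal sketch using Python's built-in utf-8-sig codec, which writes the BOM on output and strips it on input. The scratch file name sig_demo.txt is made up for the demonstration.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "sig_demo.txt")

# Writing with utf-8-sig prepends the BOM bytes EF BB BF.
with open(path, mode="w", encoding="utf-8-sig") as f:
    f.write("hello")

# Inspect the raw bytes on disk.
with open(path, mode="rb") as f:
    raw = f.read()

# Plain utf-8 leaves the BOM in the text as a leading U+FEFF character.
with open(path, mode="r", encoding="utf-8") as f:
    as_utf8 = f.read()

# utf-8-sig removes the BOM while decoding.
with open(path, mode="r", encoding="utf-8-sig") as f:
    as_sig = f.read()

print(raw)
print(repr(as_utf8))
print(repr(as_sig))
```

Reading with plain utf-8 is what usually produces a stray invisible character at the start of the first line.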
"sig" in "utf-8-sig" is the abbreviation of "signature" (i.e. signature utf-8 file). Using utf-8-sig to read a file will treat the BOM as metadata that explains how to interpret the file, instead of as part of the file contents.
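Write it directly at the beginning of the file, before any other content. With encoding='utf-16-le', the character U+FEFF is encoded as the bytes FF FE, which is the UTF-16-LE BOM: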
file_new.write('\ufeff')
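Putting this together with the code from the question, a complete conversion might look like the sketch below. The helper name convert_to_utf16_le_with_bom and the temporary directory are assumptions made for illustration; the file names follow the question.

```python
import codecs
import os
import tempfile


def convert_to_utf16_le_with_bom(src_path, dst_path):
    """Hypothetical helper: re-encode a UTF-8 text file as UTF-16-LE with a BOM."""
    with open(src_path, mode="r", encoding="utf-8") as src, \
         open(dst_path, mode="w", encoding="utf-16-le") as dst:
        # U+FEFF is encoded by the utf-16-le codec as the bytes FF FE.
        dst.write("\ufeff")
        dst.write(src.read())


# Demonstration in a scratch directory (an assumption, not part of the question).
workdir = tempfile.mkdtemp()
old_path = os.path.join(workdir, "old.txt")
new_path = os.path.join(workdir, "new.txt")

with open(old_path, mode="w", encoding="utf-8") as f:
    f.write("h\u00e9llo")

convert_to_utf16_le_with_bom(old_path, new_path)

# Read the result back as raw bytes to check the signature.
with open(new_path, mode="rb") as f:
    raw = f.read()

print(raw[:2] == codecs.BOM_UTF16_LE)
```

The with statement closes both files, so the output is flushed to disk before anything else reads it. If the source file may itself start with a UTF-8 BOM, reading it with encoding='utf-8-sig' instead avoids carrying a second U+FEFF into the output.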
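It's better to use the constants from the codecs module. They are bytes objects, so the file must be opened in binary mode and the text encoded explicitly (writing them to a file opened in text mode raises TypeError):
import codecs
with open('new.txt', mode='wb') as f:
    f.write(codecs.BOM_UTF16_LE)
    f.write(file_old.read().encode('utf-16-le'))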
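A self-contained sketch of the codecs approach follows. Because codecs.BOM_UTF16_LE is a bytes object (b'\xff\xfe'), the output file is opened in binary mode here and the text is encoded by hand; the scratch paths and sample content are assumptions for the demonstration.

```python
import codecs
import os
import tempfile

# Scratch files standing in for old.txt and new.txt from the question.
workdir = tempfile.mkdtemp()
old_path = os.path.join(workdir, "old.txt")
new_path = os.path.join(workdir, "new.txt")

with open(old_path, mode="w", encoding="utf-8") as f:
    f.write("data")

# Decode the source as UTF-8 text.
with open(old_path, mode="r", encoding="utf-8") as src:
    text = src.read()

# Binary mode: write the BOM constant, then the explicitly encoded payload.
with open(new_path, mode="wb") as dst:
    dst.write(codecs.BOM_UTF16_LE)
    dst.write(text.encode("utf-16-le"))

with open(new_path, mode="rb") as f:
    result = f.read()

# The generic utf-16 decoder uses the BOM to detect byte order and drops it.
print(result.decode("utf-16"))
```

Note that binary mode performs no newline translation, so line endings are written exactly as they appear in the decoded text.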