I am using a JavaScript file that is a concatenation of other JavaScript files.
Unfortunately, the person who concatenated these files did not use the proper encoding when reading them, so the BOM of every single source file was written into the concatenated file.
Does anyone know a simple way to search through the concatenated file and remove any/all BOM markers?
A solution using PHP or a bash script for Mac OS X would be great.
How to remove a BOM. To remove the byte order mark from a source file, you need a text editor that lets you choose whether the mark is saved. Open the file that has the BOM, then save it again with the BOM option turned off. The mark should then no longer appear.
Set the encoding to utf-8-sig to remove the BOM when reading from a file, e.g. with open('example.txt', 'r', encoding='utf-8-sig') as f:. The utf-8-sig encoding skips the BOM (the three bytes EF BB BF) if it appears at the very start of the file.
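Note that utf-8-sig only skips a BOM at the very start of the data, so it does not clean up a concatenated file where BOMs also sit in the middle. A minimal sketch of one way to handle that case is to work on the raw bytes and drop every occurrence of the UTF-8 BOM. The function names strip_all_boms and strip_boms_in_file are made up for this illustration, and the sketch assumes the file is UTF-8 and small enough to read into memory.

```python
import codecs


def strip_all_boms(data: bytes) -> bytes:
    """Return data with every UTF-8 BOM (EF BB BF) removed, not just a leading one."""
    return data.replace(codecs.BOM_UTF8, b"")


def strip_boms_in_file(src_path: str, dst_path: str) -> None:
    """Read src_path as raw bytes and write a BOM-free copy to dst_path."""
    with open(src_path, "rb") as src:
        cleaned = strip_all_boms(src.read())
    with open(dst_path, "wb") as dst:
        dst.write(cleaned)


if __name__ == "__main__":
    # Two "files" glued together, each with its own BOM.
    joined = codecs.BOM_UTF8 + b"var a;\n" + codecs.BOM_UTF8 + b"var b;\n"
    # utf-8-sig removes only the first BOM; the second survives as U+FEFF.
    print(repr(joined.decode("utf-8-sig")))
    # The byte-level replace removes both.
    print(repr(strip_all_boms(joined)))
```

Working on bytes avoids decoding and re-encoding the file. The replace also removes any U+FEFF that was placed mid-file on purpose, which is normally what you want for concatenated source code.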
The byte order mark (BOM) is a special marker, the character U+FEFF, placed at the very beginning of a Unicode file encoded in UTF-8, UTF-16 or UTF-32. In UTF-16 and UTF-32 it indicates whether the file uses big-endian or little-endian byte order. UTF-8 has no byte-order ambiguity, so there the BOM is optional and serves only as a signature identifying the encoding.
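To make the byte sequences concrete, here is a small sketch that checks which BOM, if any, a chunk of bytes starts with, using the constants from Python's codecs module. The helper name detect_bom and the returned labels are assumptions chosen for this example. The UTF-32 marks are tested before the UTF-16 ones because the UTF-32 little-endian BOM (FF FE 00 00) begins with the UTF-16 little-endian BOM (FF FE).

```python
import codecs

# Longest marks first, so UTF-32 LE is not mistaken for UTF-16 LE.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),  # FF FE 00 00
    (codecs.BOM_UTF32_BE, "utf-32-be"),  # 00 00 FE FF
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
]


def detect_bom(data: bytes):
    """Return a label for the BOM that data starts with, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


if __name__ == "__main__":
    print(detect_bom(codecs.BOM_UTF8 + b"var a;"))
    print(detect_bom(b"var a;"))
```

This is only a heuristic: UTF-16 little-endian text whose first character after the BOM is U+0000 would be reported as UTF-32 little-endian.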
Find files that contain a BOM:
grep -rIlo $'^\xEF\xBB\xBF' ./
Remove the BOM from those files:
grep -rIlo $'^\xEF\xBB\xBF' . | xargs sed --in-place -e 's/\xef\xbb\xbf//'
The same, excluding .svn directories:
grep -rIlo --exclude-dir=".svn" $'^\xEF\xBB\xBF' . | xargs sed --in-place -e 's/\xef\xbb\xbf//'
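For comparison, a rough Python analogue of the find-and-strip commands above might look like the sketch below. It is not an exact equivalent: it only checks for and removes a BOM at the very start of each file, whereas the grep pattern matches a BOM at the start of any line. The function name strip_leading_bom_in_tree and the skip_dirs parameter are hypothetical names for this illustration.

```python
import codecs
import os


def strip_leading_bom_in_tree(root, skip_dirs=(".svn",)):
    """Remove a leading UTF-8 BOM from every file under root.

    Directories whose name is in skip_dirs are not descended into.
    Returns the list of paths that were rewritten.
    """
    changed = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not enter them.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                data = f.read()
            if data.startswith(codecs.BOM_UTF8):
                with open(path, "wb") as f:
                    f.write(data[len(codecs.BOM_UTF8):])
                changed.append(path)
    return changed


if __name__ == "__main__":
    for path in strip_leading_bom_in_tree("."):
        print("stripped BOM from", path)
```

Unlike grep -I, this sketch does not skip binary files. It leaves a binary file untouched unless it happens to start with EF BB BF, so run it on a copy or restrict it to source directories.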
I normally do it using vim:
vim -c "set nobomb" -c wq! myfile