Get ant concat to ignore BOMs?

Question

I have an Ant build that concatenates my JavaScript into one file and then compresses it. The problem is that Visual Studio's default encoding attaches a BOM to every file. How do I configure Ant to strip out the BOMs that would otherwise appear in the middle of the concatenated file?

My googling turned up this discussion, which describes the exact problem I'm having but doesn't provide a solution: http://marc.info/?l=ant-user&m=118598847927096

McDowell · Accepted Answer

The Unicode byte order mark codepoint is U+FEFF. This concatenation command will strip out all BOM characters when concatenating two files:

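<!-- Decode the inputs as UTF-8 so each BOM is read as the single character U+FEFF -->
<concat encoding="UTF-8" outputencoding="UTF-8" destfile="nobom-concat.txt">
  <filelist dir="." files="bom1.txt,bom2.txt" />
  <filterchain>
    <!-- Remove every U+FEFF character from the text being concatenated -->
    <deletecharacters chars="&#xFEFF;" />
  </filterchain>
</concat>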

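The encoding attribute tells the concat task to decode the input files as UTF-8 character data, so each BOM is read as the single character U+FEFF, which the deletecharacters filter then removes. I'm assuming UTF-8, as this is usually where Java/BOM issues occur.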
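For the build described in the question, the same filter can be applied to a whole directory of scripts. The following is a minimal sketch rather than a tested build file: the project name, the target name, and the src/js and build/all.js paths are hypothetical placeholders.

```xml
<project name="js-build" default="concat-js" basedir=".">
  <target name="concat-js">
    <!-- Hypothetical output directory; created first in case it does not exist -->
    <mkdir dir="build" />
    <!-- fixlastline ensures each input ends with a newline before the next begins -->
    <concat encoding="UTF-8" outputencoding="UTF-8"
            destfile="build/all.js" fixlastline="yes">
      <!-- Hypothetical source directory holding the Visual Studio-saved scripts -->
      <fileset dir="src/js" includes="**/*.js" />
      <filterchain>
        <!-- Strip the U+FEFF that each file's BOM decodes to -->
        <deletecharacters chars="&#xFEFF;" />
      </filterchain>
    </concat>
  </target>
</project>
```

A fileset does not guarantee the order in which files are concatenated, so if the scripts depend on each other, a filelist with explicit names (as in the answer's command) is the safer choice.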

In UTF-8, the BOM is encoded as the bytes EF BB BF. If you needed it to appear at the start of the resultant file, you could use a subsequent concatenation to prefix the output file with a BOM again.
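One way to do that second step might look like the sketch below. It assumes nobom-concat.txt was produced by the command above; the output name with-bom.txt is a hypothetical placeholder. It relies on the concat task's nested header element to write U+FEFF ahead of the file contents, and since this second concat has no filterchain, nothing removes the character again.

```xml
<concat encoding="UTF-8" outputencoding="UTF-8" destfile="with-bom.txt">
  <!-- The header text is the single character U+FEFF, written out as EF BB BF -->
  <header>&#xFEFF;</header>
  <!-- The BOM-free result of the first concatenation -->
  <filelist dir="." files="nobom-concat.txt" />
</concat>
```

The header element sits on one line with no surrounding whitespace so that only the BOM character is written before the file contents.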

In the other UTF encodings, U+FEFF is encoded as FE FF (UTF-16BE), FF FE (UTF-16LE), 00 00 FE FF (UTF-32BE), and FF FE 00 00 (UTF-32LE).

Tags:

javascript

utf-8

ant

yui-compressor

Breck Fresen
