
Remove ÿþ from string

Tags: php, id3

I'm trying to read ID3 data in bulk. On some of the tracks, ÿþ appears at the start of the value. I could strip the first 2 characters, but that would damage the tracks that don't have it.

This is what I currently have:

$trackartist=str_replace("\0", "", $trackartist1);

Any suggestions would be greatly appreciated, thanks!

austinh asked Dec 25 '22 03:12


1 Answer

ÿþ is the byte pair 0xFF 0xFE displayed as ISO-8859-1 (Latin-1); those two bytes are the byte order mark (BOM) of little-endian UTF-16. You can convert your string to UTF-8 with iconv() or mb_convert_encoding():

$trackartist1 = iconv('UTF-16LE', 'UTF-8', $trackartist1);

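# Alternative to iconv() using the mbstring extension (use one or the other, not both).
# Note the argument order: string, target encoding, source encoding.
$trackartist1 = mb_convert_encoding($trackartist1, 'UTF-8', 'UTF-16LE');

# After conversion the BOM is U+FEFF, which is the bytes EF BB BF in UTF-8
$trackartist1 = str_replace("\xEF\xBB\xBF", '', $trackartist1);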

This assumes $trackartist1 is always in UTF-16LE; check the documentation of your ID3 tag library on how to get the encoding of each tag, since the encoding can differ from file to file (and from the sound of it, some of your tracks are not UTF-16 at all). You usually want to convert everything to UTF-8, since that is PHP's default character set (default_charset).
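If your library does not expose the encoding, one workaround is to look at the first two bytes yourself and convert only when a UTF-16 BOM is present. The following is a minimal sketch, not a definitive implementation: normalizeTagText() is a hypothetical helper name, and it assumes that values without a BOM are already in an encoding you can use as-is (they may in fact be ISO-8859-1, which this sketch does not handle).

```php
<?php
// Hypothetical helper (not part of any ID3 library): convert a raw tag value
// to UTF-8 only when it starts with a UTF-16 byte order mark.
function normalizeTagText(string $raw): string
{
    $bom = substr($raw, 0, 2);

    if ($bom === "\xFF\xFE") {
        // Little-endian UTF-16: drop the 2-byte BOM, then convert the rest.
        $converted = iconv('UTF-16LE', 'UTF-8', substr($raw, 2));
    } elseif ($bom === "\xFE\xFF") {
        // Big-endian UTF-16: same idea with the opposite byte order.
        $converted = iconv('UTF-16BE', 'UTF-8', substr($raw, 2));
    } else {
        // Assumption: no BOM means the value needs no conversion.
        $converted = $raw;
    }

    // iconv() returns false on invalid input; fall back to the raw value.
    $text = ($converted === false) ? $raw : $converted;

    // Tag text is often NUL-terminated or NUL-padded; trim trailing NULs.
    return rtrim($text, "\0");
}
```

You would then call it once per field, for example $trackartist = normalizeTagText($trackartist1);. Because the BOM is removed before the conversion, no str_replace() is needed afterwards, and tracks without the prefix are left alone.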

Martin Tournoij answered Dec 27 '22 18:12