PGError: ERROR: invalid byte sequence for encoding "UTF8"

I'm getting the following PGError while ingesting emails from Cloudmailin in a Rails app:

PGError: ERROR: invalid byte sequence for encoding "UTF8": 0xbb HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding". : INSERT INTO "comments" ("content") VALUES ('Reply with blah blah  ����������������������������������������������������� .....

So it seems pretty clear that some invalid UTF-8 bytes are getting into the email, right? I tried to clean that up, but something is still sneaking through. Here's what I have so far:

message_all_clean = params[:message]
Iconv.conv('UTF-8//IGNORE', 'UTF-8', message_all_clean)
message_plain_clean = params[:plain]
Iconv.conv('UTF-8//IGNORE', 'UTF-8', message_plain_clean)

@incoming_mail = IncomingMail.create(:message_all => Base64.encode64(message_all_clean), :message_plain => Base64.encode64(message_plain_clean))

Any ideas, thoughts or suggestions? Thanks

AnApprentice asked Jan 23 '11 02:01

1 Answer

When we ran into this issue on Heroku, we converted from US-ASCII to sanitize incoming data (e.g. text pasted from Word):

Iconv.conv("UTF-8//IGNORE", "US-ASCII", content)

With this, we had no more issues with character encoding.
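
For reference, Iconv was removed from Ruby's standard library in 2.0, so on newer Rubies roughly the same effect can be sketched with String#encode. This is a minimal sketch, not the original Heroku code, and the helper name ascii_only is made up for illustration. Like the US-ASCII conversion above, it is meant to throw away every non-ASCII byte, so accented characters are lost as well as the invalid ones.

```ruby
# Hypothetical helper (not from the original answer): keep only plain ASCII
# bytes and return the result as a UTF-8 string.
def ascii_only(content)
  content.to_s
         .dup                              # avoid mutating the caller's string
         .force_encoding(Encoding::BINARY) # treat the input as raw bytes
         .encode(Encoding::UTF_8,
                 invalid: :replace,        # drop malformed input
                 undef: :replace,          # drop bytes >= 0x80, which have no mapping
                 replace: '')
end

# The conversion returns a new string, so the result has to be assigned.
content = ascii_only("Reply \xBB above".b)
```

The returned string is always valid UTF-8, so it should be safe to pass to the INSERT.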

Also, double-check that there are no other fields that need the same conversion, as the error can affect any field that passes a block of text to the database.
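
To cover several fields at once, one option is a small helper applied to each text parameter before saving. The sketch below is an assumption, not part of the original setup: it uses String#scrub (Ruby 2.1+) instead of Iconv, and unlike the US-ASCII conversion it keeps valid non-ASCII UTF-8 characters and drops only the invalid bytes. The helper name scrub_fields is hypothetical; the :message and :plain keys are the ones from the question.

```ruby
# Hypothetical helper: return a hash of cleaned copies of the given fields.
def scrub_fields(params, keys)
  keys.each_with_object({}) do |key, clean|
    # Interpret the raw bytes as UTF-8 without changing them...
    value = params[key].to_s.dup.force_encoding(Encoding::UTF_8)
    # ...then remove any byte sequence that is not valid UTF-8.
    clean[key] = value.scrub('')
  end
end

params = { message: "caf\xC3\xA9 \xBB".b, plain: "ok" }
clean  = scrub_fields(params, [:message, :plain])
```

Whether to keep or discard valid non-ASCII characters depends on what the column is expected to hold; the US-ASCII approach is stricter, this one is more lenient.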

Dominic answered Oct 24 '22 19:10