
How do I solve: pandoc: ...:hGetContents: invalid argument (invalid byte sequence)

I'm running the following command in Terminal on Mac OS X 10.6.8:

find . -name \*.html -type f -exec pandoc -o {}.md {} \;

It converts some documents, but fails on quite a few with this error:

pandoc: ./Teaching/how_16825_make-lesson-book.html: hGetContents: invalid argument (invalid byte sequence)

Any idea how to fix this?

asked Apr 27 '12 22:04 by rev


1 Answer

I had the same problem. The Pandoc README.html file says this:

Pandoc uses the UTF-8 character encoding for both input and output. If your local character encoding is not UTF-8, you should pipe input and output through iconv:

iconv -t utf-8 input.txt | pandoc | iconv -f utf-8

Of course, you may need to install iconv first (I believe Mac OS X already has it). For Windows:

GnuWin32 libiconv: http://gnuwin32.sourceforge.net/packages/libiconv.htm

win-iconv (Google Code): https://code.google.com/p/win-iconv/
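
Applied to the find command from the question, the same idea might look like the sketch below. It assumes the failing files are encoded as ISO-8859-1, which is only a guess; set SRC_ENC to whatever encoding your files really use. The function names (to_utf8, html_to_md, convert_tree) are made up for this example.

```shell
#!/bin/sh
# Assumption: the failing files are ISO-8859-1. Override SRC_ENC if they are not.
SRC_ENC="${SRC_ENC:-ISO-8859-1}"

# Print file "$1" on stdout, re-encoded from SRC_ENC to UTF-8.
to_utf8() {
    iconv -f "$SRC_ENC" -t UTF-8 "$1"
}

# Convert one HTML file to Markdown, writing <file>.md next to it.
# pandoc reads from stdin here because no input file is given.
html_to_md() {
    to_utf8 "$1" | pandoc -f html -t markdown -o "$1.md"
}

# Convert every .html file under directory "$1" (default: current directory).
# Failures are reported on stderr and do not stop the loop.
convert_tree() {
    find "${1:-.}" -name '*.html' -type f | while IFS= read -r f; do
        html_to_md "$f" || echo "failed: $f" >&2
    done
}
```

After sourcing these functions, running convert_tree . from the top of the site does the conversion. Note that files which are already UTF-8 would be double-encoded by this, so it is safest to run it only on the files pandoc rejected.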

answered Sep 24 '22 17:09 by PaulANormanNZ