
Using Unicode in fancyvrb’s VerbatimOut

Problem

VerbatimOut from the “fancyvrb” package doesn’t play nicely with UTF-8 characters.

Minimal working example:

\documentclass{minimal}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{fancyvrb}

\begin{document}
\begin{VerbatimOut}{\jobname.test}
é
\end{VerbatimOut}

\input{\jobname.test}
\end{document}

Error message

When compiled using pdflatex mini, this gives the error

File ended while scanning use of \UTFviii@three@octets.

A different error occurs when the sole occurrence of é above is replaced by something else, e.g. é */:

Package inputenc Error: Unicode char \u8:### not set up for use with LaTeX.

This indicates that, in this case, LaTeX succeeds in reading a complete multi-byte UTF-8 character but does not know what to do with it (i.e. it is the wrong character).

In fact, when I open the produced .test file manually, it contains the character é, but in Latin-1 encoding!

Proof: when I open the files in a hex editor, I get the following:

  • Original file: C3 A9 (corresponds to LATIN SMALL LETTER E WITH ACUTE in UTF-8)
  • Written file: E9 (corresponds to é in Latin-1)

Question

How do I set up VerbatimOut correctly?

filecontents* (from “filecontents”) shows that it can work. Unfortunately, I don’t understand either code so I cannot fix fancyvrb’s code by replicating the logic from filecontents manually.
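For comparison, here is a minimal sketch of the filecontents* variant mentioned above. The file name unicode.test is a hypothetical placeholder, and the sketch assumes the “filecontents” package is loaded so that the environment may be used inside the document body:

```latex
\documentclass{minimal}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{filecontents} % lets filecontents* overwrite files and appear after \begin{document}

\begin{document}
% unicode.test is a hypothetical file name chosen for this sketch
\begin{filecontents*}{unicode.test}
é
\end{filecontents*}

\input{unicode.test}
\end{document}
```

The starred form writes the body without the comment header that the unstarred filecontents adds, so the written file should contain only the line with é.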

I also cannot use filecontents* instead of VerbatimOut because the former doesn’t work within a \newenvironment, while the latter does.

(Oh, by the way: vanilla Verbatim instead of VerbatimOut also works as expected. The error seems to occur when writing the file, not when reading the verbatim input)

Konrad Rudolph asked Jan 25 '10 14:01


5 Answers

This is still unfixed? I'll take another look. What exactly do you want: your package to use VerbatimOut, or for it not to interfere with it?

Tests

TeX Live 2009's XeLaTeX compiles the example fine. With pdflatex, version

This is pdfTeX, Version 3.1415926-1.40.10 (TeX Live 2009)

I get an error message that is rather more useful than the one you got:


! Argument of \UTFviii@three@octets has an extra }.
 
                \par 
l.8 é

? i \makeatletter\show\UTFviii@three@octets
! Undefined control sequence.
\GenericError  ...                                
                                                    #4  \errhelp \@err@     ...
l.8 é

If I were to make a wild guess, I'd say that inputenc with pdftex uses the pdftex primitives to do some hairy storing and restoring of character tables, and some table somewhere has a rarely encountered mistake in it.

Possibly related

I saw a post by Vladimir Volovich in the pdf-tex mailing list archives, all the way back from 2003, that discusses a conflict between inputenc & fancyvrb, and posts a patch to "solve the problem". Who knows, maybe he faced the same problem? It might be worth emailing him.

Charles Stewart answered Nov 15 '22 04:11


XeTeX has much better Unicode support. The following run through xelatex produces “é” both in \jobname.test and the output PDF.

\documentclass{minimal}
\usepackage{fontspec}
\tracingonline=1
\usepackage{fancyvrb}

\begin{document}
\begin{VerbatimOut}{\jobname.test}
é
\end{VerbatimOut}

\input{\jobname.test}
\end{document}

fontspec loads the Latin Modern fonts, which have Unicode support. The standard TeX Computer Modern fonts don’t have the right tables for Unicode support.

If you use a character that does not have a glyph in the current font, by default XeTeX writes a blank space to the PDF and prints a warning in the log but not on the terminal. \tracingonline=1 prints the warning to the terminal.

andrewdotn answered Nov 15 '22 03:11


Is your end goal to write symbols and accents in Verbatim? Because you can do that like this:

\documentclass{article}
\usepackage{fancyvrb}
\begin{document}
\begin{Verbatim}[commandchars=\\\{\}]
\'{e} \~{e} \`{e} \^{e}
\end{Verbatim}
\end{document}

The commandchars option allows the \ { } characters to work as they normally would.

Source: http://ctan.mirror.garr.it/mirrors/CTAN/macros/latex/contrib/fancyvrb/fancyvrb.pdf

Steve Tjoa answered Nov 15 '22 02:11


On http://wiki.portal.chalmers.se/agda/pmwiki.php?n=Main.LiterateAgda, they suggest that you should use

\usepackage{ucs}
\usepackage[utf8x]{inputenc}

in the preamble. I successfully used this to insert Unicode into a verbatim environment.
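As a complete document, that suggestion might look like the following minimal sketch. It assumes the plain Verbatim environment from fancyvrb and T1 font encoding, neither of which is stated in the linked page, and it does not show whether the same preamble also helps with VerbatimOut:

```latex
\documentclass{article}
\usepackage{ucs}
\usepackage[utf8x]{inputenc} % extended UTF-8 support provided by the ucs package
\usepackage[T1]{fontenc}     % assumption: T1 so that accented glyphs are available
\usepackage{fancyvrb}

\begin{document}
\begin{Verbatim}
é
\end{Verbatim}
\end{document}
```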

Alex answered Nov 15 '22 03:11


\documentclass{article}

\usepackage{fancyvrb}

\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\newenvironment{MonVerbatim}{%
\count0=128\relax %
\loop
   \catcode\count0=11\relax
   \advance\count0 by 1\relax 
   \ifnum\count0<256
   \repeat
   \VerbatimOut[commandchars=\\\{\}]{VerbatimText.tex}%
}{\endVerbatimOut}

\newcommand\test{A command producing accented characters éà}

\begin{document}
\begin{MonVerbatim}
     A little bit text in verbatim mode éà_].
     \test
\end{MonVerbatim}
Followed by some accented character éà.
\end{document}

This code works for me with TeX Live 2018 and pdflatex. You should probably avoid changing the catcodes if you are using a Unicode-aware TeX engine (lualatex or xelatex).

You can use the "iftex" package to check which TeX engine is in use.
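A minimal sketch of that engine check, assuming the \ifPDFTeX conditional from "iftex"; the helper name \HighBytesAsLetters is made up for this example, and the catcode loop is the same one as above:

```latex
\documentclass{article}
\usepackage{iftex}      % provides \ifPDFTeX, \ifXeTeX and \ifLuaTeX
\usepackage{fancyvrb}

\ifPDFTeX
  % 8-bit engine: declare the input and font encodings explicitly
  \usepackage[utf8]{inputenc}
  \usepackage[T1]{fontenc}
\fi

% Hypothetical helper: under pdfTeX, give bytes 128-255 catcode 11 (letter);
% under XeTeX or LuaTeX it does nothing.
\newcommand\HighBytesAsLetters{%
  \ifPDFTeX
    \count0=128\relax
    \loop
      \catcode\count0=11\relax
      \advance\count0 by 1\relax
    \ifnum\count0<256
    \repeat
  \fi
}

\newenvironment{MonVerbatim}{%
  \HighBytesAsLetters % changes stay local to the environment's group
  \VerbatimOut[commandchars=\\\{\}]{VerbatimText.tex}%
}{\endVerbatimOut}

\begin{document}
\begin{MonVerbatim}
Accented text: éà
\end{MonVerbatim}
\end{document}
```

With this arrangement the same source can be run through pdflatex, xelatex or lualatex, and the catcode changes are applied only where they are needed.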

Alan answered Nov 15 '22 02:11