
Insert non-breaking space in [R]Markdown math expression for HTML output

I am writing scientific reports in bookdown and I would like to use non-breaking spaces as thousands separators, following the SI/ISO 31-0 standard.

Actually, I would prefer the narrow no-break space (U+202F), but for simplicity let's consider the regular no-break space (U+00A0) here.

I set up a knitr hook to do this on the fly:

knitr::knit_hooks$set(inline=function(output)
                               ifelse(is.numeric(output),
                                      prettyNum(round(output, 1),
                                                big.mark=' '),
                                      output))

This works as intended as long as I don't use any inline R-expressions returning numerical output > 999 within math expressions.

The following bookdown MWE illustrates the problem:

---
output:
  bookdown::html_document2: default
---
```{r set-output-hook, include=FALSE}
knitr::knit_hooks$set(inline=function(output)
                               ifelse(is.numeric(output),
                                      prettyNum(round(output, 1),
                                                big.mark=' '),
                                      output))
```

This works:
The product of $\pi$ and `r 1000` is `r pi*1000`.

This fails to render: 
$\pi\cdot`r 1000`=`r pi*1000`$

This renders but is cumbersome as it requires me to know *a priori* which
values might exceed 999:
$\pi\cdot1000=`r as.character(round(pi*1000, 1))`$

I tried to track it down and came up with the following rmarkdown MWE:

---
output:
  rmarkdown::html_document:
    keep_md: true
---

| Rmarkdown    | Render     | HTML                                                | Markdown     |
|--------------|------------|-----------------------------------------------------|--------------|
| `1000`       | 1000       |`1000`                                               | `1000`       |
|`$1000$`      |$1000$      |`<span class="math inline">\(1000\)</span>`          |`$1000$`      |
|              |            |                                                     |              |
|  `100,0`     | 100,0      |`100,0`                                              | `100,0`      |
|`$100,0$`     |$100,0$     |`<span class="math inline">\(100,0\)</span>`         |`$100,0$`     |
|              |            |                                                     |              |
|  `100 0`     | 100 0      |`100 0`                                              | `100 0`      |
|`$100 0$`     |$100 0$     |`<span class="math inline">\(100 0\)</span>`         |`$100 0$`     |
|              |            |                                                     |              |
|  `100&nbsp;0`| 100&nbsp;0 |`100 0`                                              | `100&nbsp;0` |
|`$100&nbsp;0$`|$100&nbsp;0$|`<span class="math inline">\(100&amp;nbsp;0\)</span>`|`$100&nbsp;0$`|

The first two columns of the table are sufficient to see the problem: each pair of rows shows the number 1000 (1 000) in text and math context, without any separator, with a comma, with a simple space, and with a non-breaking space as thousands separator. The last of these fails to render in math context.

To track down the problem, I inspected the resulting HTML and Markdown (keep_md: true) output and added the corresponding code as columns three and four for a better overview of what's going on.

For clarity, here is an adjusted version of the above rmarkdown MWE replacing simple spaces by _ and non-breaking spaces by - in the HTML and Markdown output columns:

---
output:
  rmarkdown::html_document:
    keep_md: true
---

| Rmarkdown    | Render     | HTML                                                | Markdown     |
|--------------|------------|-----------------------------------------------------|--------------|
| `1000`       | 1000       |`1000`                                               | `1000`       |
|`$1000$`      |$1000$      |`<span_class="math_inline">\(1000\)</span>`          |`$1000$`      |
|              |            |                                                     |              |
|  `100,0`     | 100,0      |`100,0`                                              | `100,0`      |
|`$100,0$`     |$100,0$     |`<span_class="math_inline">\(100,0\)</span>`         |`$100,0$`     |
|              |            |                                                     |              |
|  `100 0`     | 100 0      |`100_0`                                              | `100_0`      |
|`$100 0$`     |$100 0$     |`<span_class="math_inline">\(100_0\)</span>`         |`$100_0$`     |
|              |            |                                                     |              |
|  `100&nbsp;0`| 100&nbsp;0 |`100-0`                                              | `100&nbsp;0` |
|`$100&nbsp;0$`|$100&nbsp;0$|`<span_class="math_inline">\(100&amp;nbsp;0\)</span>`|`$100&nbsp;0$`|

So from what I can tell

  1. This is not a bookdown issue as it can be reproduced by plain rmarkdown.
    • I'm just mentioning bookdown as I would be happy with a bookdown-specific work-around.
  2. This is not an rmarkdown issue, as the generated Markdown looks exactly as I would expect.
    • I'm just mentioning rmarkdown as I would be happy with an rmarkdown-specific work-around.
  3. This is not a MathJax issue, as the HTML code has the plain & replaced by &amp; and I would not expect that to render properly.
    • Anyway, I would be happy with a MathJax-related work-around.
  4. I suspect it's pandoc that replaces & by &amp; in code and math context but not in text context.
    • I'm sure if there is a way to convince pandoc not to do this, it will be easy to configure this through the rmarkdown YAML header.

Any idea on how to get the &nbsp; transferred literally from Markdown to HTML in math context would probably help me to figure out the rest.


Addendum:

As pointed out by @tarleb, $100&nbsp;0$ is not valid LaTeX. However, manually modifying the HTML to contain \(100&nbsp;0\) works just fine, as MathJax treats non-breaking spaces as spaces. As I am not concerned about PDF output via LaTeX, all I need is for the Markdown-to-HTML conversion to turn $100&nbsp;0$ into \(100&nbsp;0\) rather than \(100&amp;nbsp;0\) (just as 100&nbsp;0 in text context is not converted to 100&amp;nbsp;0 either).

mschilli asked Oct 17 '22 00:10



1 Answer

Pandoc expects math environments to contain LaTeX math markup, not HTML. It treats the content of $100&nbsp;000$ as LaTeX and escapes the & when writing HTML, which gives \(100&amp;nbsp;000\) instead of what you intended.

As a solution, you could try to use the literal narrow no-break space Unicode character (U+202F, written "\u202f" in an R string) in your hook.
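A minimal sketch of that first suggestion, assuming a UTF-8 locale. The helper name `format_inline` and the variable `nnbsp` are made up for illustration; the formatting logic mirrors the hook from the question, only with U+202F as `big.mark`. Whether MathJax then renders the character as desired is not verified here.

```r
# Hypothetical names (format_inline, nnbsp); logic follows the question's hook.
nnbsp <- "\u202f"  # narrow no-break space (U+202F)

format_inline <- function(output) {
  if (is.numeric(output)) {
    # Round to one decimal and group thousands with U+202F.
    prettyNum(round(output, 1), big.mark = nnbsp)
  } else {
    # Non-numeric inline output is passed through unchanged.
    output
  }
}

# Register it as the inline hook when knitr is available (as in the question).
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::knit_hooks$set(inline = format_inline)
}
```

The sketch uses a plain `if`/`else` instead of `ifelse()` because `is.numeric()` returns a single value, and `ifelse()` with a length-one test would keep only the first element of a longer result.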

Alternatively, one could use a pandoc Lua filter (or possibly an R pandoc filter) to force pandoc to pass math content through unaltered:

-- filename: force-plain-math.lua
function Math (el)
  if el.mathtype == 'DisplayMath' then
    return pandoc.RawInline('html', '\\[' .. el.text .. '\\]')
  else -- InlineMath
    return pandoc.RawInline('html', '\\(' .. el.text .. '\\)')
  end
end

Save to a file and use it by adding

output:
  bookdown::html_document2:
    pandoc_args: --lua-filter=force-plain-math.lua

to your document's YAML header.
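To check the filter outside of a full bookdown build, something like the following sketch could be used. It assumes the rmarkdown package is installed and can find a pandoc version with Lua filter support; the temporary file names are arbitrary. It writes the filter from above to a file, converts a one-line Markdown document, and prints the resulting HTML so the math span can be inspected.

```r
# Sketch only: assumes rmarkdown is installed and pandoc supports Lua filters.
filter <- tempfile(fileext = ".lua")
# Backslashes are doubled once more here because they sit inside R strings.
writeLines(c(
  "function Math (el)",
  "  if el.mathtype == 'DisplayMath' then",
  "    return pandoc.RawInline('html', '\\\\[' .. el.text .. '\\\\]')",
  "  else -- InlineMath",
  "    return pandoc.RawInline('html', '\\\\(' .. el.text .. '\\\\)')",
  "  end",
  "end"
), filter)

# One-line test document with an HTML entity inside inline math.
input <- tempfile(fileext = ".md")
output <- tempfile(fileext = ".html")
writeLines("$100&nbsp;0$", input)

rmarkdown::pandoc_convert(
  input, from = "markdown", to = "html", output = output,
  options = paste0("--lua-filter=", filter)
)

html <- readLines(output)
print(html)
```

If the filter is applied, the printed HTML should contain the entity unescaped inside the `\(` and `\)` delimiters rather than `&amp;nbsp;`.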

tarleb answered Oct 21 '22 07:10
