I'm generating some odt/docx reports via markdown using knitr and pandoc and am now wondering how you'd go about formatting tables. Primarily I'm interested in adding rules (at least top, bottom and one below the header, but being able to add arbitrary ones inside the table would be nice too).
Running the following example from the pandoc documentation through pandoc (without any special parameters) just yields a "plain" table without any kind of rules/colours/guides (in either -t odt or -t docx).

    +---------------+---------------+--------------------+
    | Fruit         | Price         | Advantages         |
    +===============+===============+====================+
    | Bananas       | $1.34         | - built-in wrapper |
    |               |               | - bright color     |
    +---------------+---------------+--------------------+
    | Oranges       | $2.10         | - cures scurvy     |
    |               |               | - tasty            |
    +---------------+---------------+--------------------+

I've looked through the "styles" for the possibility of specifying table formatting in a reference .docx/.odt but found nothing obvious beyond "table header" and "table contents" styles, both of which seem to concern only the formatting of text within the table.
Being rather unfamiliar with WYSIWYG-style document processors I'm lost as to how to continue.
Here's how I worked out how to do this:
Tables in a docx file are written with the <w:tbl> tag, so I searched for it in the pandoc GitHub repository and found it in this file (called Writers/Docx.hs, so it's not a big surprise):

    blockToOpenXML opts (Table caption aligns widths headers rows) = do
      let captionStr = stringify caption
      caption' <- if null caption
                     then return []
                     else withParaProp (pStyle "TableCaption")
                          $ blockToOpenXML opts (Para caption)
      let alignmentFor al = mknode "w:jc" [("w:val",alignmentToString al)] ()
      let cellToOpenXML (al, cell) = withParaProp (alignmentFor al)
                                        $ blocksToOpenXML opts cell
      headers' <- mapM cellToOpenXML $ zip aligns headers
      rows' <- mapM (\cells -> mapM cellToOpenXML $ zip aligns cells)
               $ rows
      let borderProps = mknode "w:tcPr" []
                        [ mknode "w:tcBorders" []
                          $ mknode "w:bottom" [("w:val","single")] ()
                        , mknode "w:vAlign" [("w:val","bottom")] () ]
      let mkcell border contents = mknode "w:tc" []
                                $ [ borderProps | border ] ++
                                if null contents
                                   then [mknode "w:p" [] ()]
                                   else contents
      let mkrow border cells = mknode "w:tr" [] $ map (mkcell border) cells
      let textwidth = 7920  -- 5.5 in in twips, 1/20 pt
      let mkgridcol w = mknode "w:gridCol"
                           [("w:w", show $ (floor (textwidth * w) :: Integer))] ()
      return $
        [ mknode "w:tbl" []
          ( mknode "w:tblPr" []
            ( [ mknode "w:tblStyle" [("w:val","TableNormal")] () ] ++
              [ mknode "w:tblCaption" [("w:val", captionStr)] ()
              | not (null caption) ] )
          : mknode "w:tblGrid" []
            (if all (==0) widths
                then []
                else map mkgridcol widths)
          : [ mkrow True headers' | not (all null headers) ] ++
          map (mkrow False) rows'
          )
        ] ++ caption'

I'm not familiar at all with Haskell, but I can see that the border-style is hardcoded, since there is no variable in it:

    let borderProps = mknode "w:tcPr" []
                      [ mknode "w:tcBorders" []
                        $ mknode "w:bottom" [("w:val","single")] ()
                      , mknode "w:vAlign" [("w:val","bottom")] () ]

That means you can't change the style of docx tables with the current version of pandoc. However, there is a way to get your own style: create a .docx containing a table styled the way you want, unzip it, open word/document.xml and search for the <w:tbl> element.
Here's a test with a border-style I created:
And here is the corresponding XML:

    <w:tblBorders>
      <w:top w:val="dotted" w:sz="18" w:space="0" w:color="C0504D" w:themeColor="accent2"/>
      <w:left w:val="dotted" w:sz="18" w:space="0" w:color="C0504D" w:themeColor="accent2"/>
      <w:bottom w:val="dotted" w:sz="18" w:space="0" w:color="C0504D" w:themeColor="accent2"/>
      <w:right w:val="dotted" w:sz="18" w:space="0" w:color="C0504D" w:themeColor="accent2"/>
      <w:insideH w:val="dotted" w:sz="18" w:space="0" w:color="C0504D" w:themeColor="accent2"/>
      <w:insideV w:val="dotted" w:sz="18" w:space="0" w:color="C0504D" w:themeColor="accent2"/>
    </w:tblBorders>

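The lookup described above (unzip the .docx, open word/document.xml, find the table markup) could also be scripted. The following is only a rough sketch in R using the XML package: the function name find.table.borders and the file name styled.docx are made up for illustration, and it assumes the borders were applied directly to the table. Borders that come from a named table style live in word/styles.xml and would not show up here.

```r
library(XML)

## WordprocessingML namespace, bound to the usual "w" prefix
w.ns <- c(w = "http://schemas.openxmlformats.org/wordprocessingml/2006/main")

## Hypothetical helper: return every <w:tblBorders> node that sits in the
## <w:tblPr> of a table. `xml.source` is a path to word/document.xml, or
## the XML text itself when as.text = TRUE.
find.table.borders <- function(xml.source, as.text = FALSE) {
  doc <- xmlParse(xml.source, asText = as.text)
  getNodeSet(doc, "//w:tbl/w:tblPr/w:tblBorders", namespaces = w.ns)
}

## Hypothetical input: a .docx whose table was styled by hand in Word
docx.file <- "styled.docx"
if (file.exists(docx.file)) {
  tmp <- tempfile()
  ## extract only the main document part into a temporary directory
  unzip(docx.file, files = "word/document.xml", exdir = tmp)
  borders <- find.table.borders(file.path(tmp, "word", "document.xml"))
  for (node in borders) print(node)
}
```

Each printed node is a candidate for pasting into the converted document's tables.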
I haven't looked at it yet; ask if you can't work it out yourself using a similar method.
Hope this helps, and don't hesitate to ask if you need anything more.
Same suggestion as edi9999: hack the XML content of the converted docx. The following is my R code for doing that.
The tblPr variable contains the definition of the style to be added to the tables in the docx. You can modify the string to suit your own needs.

    library(XML)

    docx.file <- "report.docx"

    ## table properties (style, width, borders, alignment) applied to every table
    tblPr <- paste0(
        '<w:tblPr xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">',
        '<w:tblStyle w:val="a8"/>',
        '<w:tblW w:w="0" w:type="auto"/>',
        '<w:tblBorders>',
        '<w:top w:val="single" w:sz="4" w:space="0" w:color="000000" w:themeColor="text1"/>',
        '<w:left w:val="single" w:sz="4" w:space="0" w:color="000000" w:themeColor="text1"/>',
        '<w:bottom w:val="single" w:sz="4" w:space="0" w:color="000000" w:themeColor="text1"/>',
        '<w:right w:val="single" w:sz="4" w:space="0" w:color="000000" w:themeColor="text1"/>',
        '<w:insideH w:val="single" w:sz="4" w:space="0" w:color="000000" w:themeColor="text1"/>',
        '<w:insideV w:val="single" w:sz="4" w:space="0" w:color="000000" w:themeColor="text1"/>',
        '</w:tblBorders>',
        '<w:jc w:val="center"/>',
        '</w:tblPr>')

    ## unzip the docx converted by Pandoc
    system(paste("unzip", docx.file, "-d temp_dir"))
    document.xml <- "temp_dir/word/document.xml"

    doc <- xmlParse(document.xml)
    tbl <- getNodeSet(xmlRoot(doc), "//w:tbl")
    ## one freshly parsed copy of the tblPr node per table
    tblPr.node <- lapply(seq_along(tbl), function (i) xmlRoot(xmlParse(tblPr)))
    ## names of the properties we are going to add or replace
    added.Pr <- names(xmlChildren(xmlRoot(xmlParse(tblPr))))
    for (i in seq_along(tbl)) {
        tbl.node <- tbl[[i]]
        if ('tblPr' %in% names(xmlChildren(tbl.node))) {
            ## the table already has a tblPr: replace or add each property
            children.Pr <- xmlChildren(xmlChildren(tbl.node)$tblPr)
            for (j in length(added.Pr):1) {
                if (added.Pr[j] %in% names(children.Pr)) {
                    replaceNodes(children.Pr[[added.Pr[j]]],
                                 xmlChildren(tblPr.node[[i]])[[added.Pr[j]]])
                } else {
                    ## first.child <- children.Pr[[1]]
                    addSibling(children.Pr[['tblStyle']],
                               xmlChildren(tblPr.node[[i]])[[added.Pr[j]]],
                               after=TRUE)
                }
            }
        } else {
            ## no tblPr yet: insert the whole node before the table's first child
            addSibling(xmlChildren(tbl.node)[[1]], tblPr.node[[i]], after=FALSE)
        }
    }

    ## save hacked xml back to docx
    saveXML(doc, document.xml, indent = FALSE)
    setwd("temp_dir")
    system(paste("zip -r ../", docx.file, " *", sep=""))
    setwd("..")
    system("rm -fr temp_dir")

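Rather than editing the long tblPr string by hand, a small helper could generate it. This is only an illustrative sketch, not part of the script above: make.tblPr is a hypothetical name, and it emits nothing but a <w:tblBorders> block (no tblStyle, tblW or jc), so those existing properties would be left alone. The w:sz attribute is measured in eighths of a point. Edges should be listed in the order top, left, bottom, right, insideH, insideV, as in the XML shown earlier.

```r
## Hypothetical helper: build a <w:tblPr> string that sets the same border
## style on the chosen table edges.
make.tblPr <- function(edges = c("top", "left", "bottom", "right",
                                 "insideH", "insideV"),
                       val = "single", sz = 4, color = "000000") {
  ## sprintf() is vectorised, so this yields one element per edge
  border.xml <- sprintf('<w:%s w:val="%s" w:sz="%d" w:space="0" w:color="%s"/>',
                        edges, val, as.integer(sz), color)
  paste0('<w:tblPr xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">',
         '<w:tblBorders>', paste(border.xml, collapse = ""), '</w:tblBorders>',
         '</w:tblPr>')
}

## Rules above and below the table only, 1 pt thick (8 eighths of a point)
tblPr <- make.tblPr(edges = c("top", "bottom"), sz = 8)
```

Because the pandoc code quoted earlier already puts a bottom border on the header cells, limiting the table borders to top and bottom like this may be enough to get the three rules asked for in the question, though that depends on the pandoc version in use.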