Convert between Rmd and Rnw files with R

Tired of making two copies of markup documents for your R scripts so that you can generate exactly the kind of html you want and exactly the kind of pdf you want? Me too. I would like some quasi-bidirectional quasi-conversion between Rmd and Rnw files. I have a messy solution of sorts that is tentatively “good enough” for my own needs.

My rpm code now includes an early and ugly R function, convertDocs, and a suite of support functions which assist in this type of file conversion.

I personally use Rmd files to mark up my R code for the resulting md and html documents much more often than I use Rnw files to yield pdf documents. However, I use enough of both, and often for the same code, that I find it frustrating to duplicate my documentation in cases where I have many files containing significant amounts of code, text, and/or graphics.

It is difficult to make a function which will seamlessly convert between both types of files so that I only have to write one of them. I would go so far as to say it is impossible to do generally. In fact, often the very reason someone uses both types of files is that they wish to highlight different aspects of code and project documentation in a web page vs. in a pdf file.

The goal is not to make a perfect conversion, but rather to minimize the human labor involved in making two versions of the same thing. I write many of these files for my projects, and update them often. I love knitr and rmarkdown, packages which do all the real work, in conjunction with other libraries and software. I just happen to have much to do and never enough time in a day, so I need this last bit of legwork ironed out.

So I began a function to assist me in this task. It has a very minimalist and unrefined rule set for determining how to replace content from Rmd and Rnw files in each direction during file conversion. I won’t list the rules right here because they will continue to be refined, but they can be viewed along with the code.
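To give a flavor of what one such replacement rule might look like, here is a minimal sketch covering only the code chunk delimiters. It is not the convertDocs code, and the function names rmd_chunks_to_rnw and rnw_chunks_to_rmd are made up for this illustration. It assumes knitr-style chunks, where an Rmd chunk opens with a backtick fence plus {r ...} and an Rnw chunk opens with <<...>>= and closes with @. Headings, YAML front matter, inline code, and LaTeX preamble are all ignored here.

```r
# Hypothetical helpers for illustration only; not part of convertDocs.
# Both take a character vector of document lines and return a new one.

rmd_chunks_to_rnw <- function(lines) {
  in_chunk <- FALSE
  out <- lines
  for (i in seq_along(lines)) {
    if (!in_chunk && grepl("^```\\{r[ ,}]", lines[i])) {
      # Opening fence: keep the label and options, drop the leading "r"
      opts <- sub("^```\\{r[ ,]*(.*)\\}[[:space:]]*$", "\\1", lines[i])
      out[i] <- paste0("<<", opts, ">>=")
      in_chunk <- TRUE
    } else if (in_chunk && grepl("^```[[:space:]]*$", lines[i])) {
      # Closing fence becomes the Rnw chunk terminator
      out[i] <- "@"
      in_chunk <- FALSE
    }
  }
  out
}

rnw_chunks_to_rmd <- function(lines) {
  in_chunk <- FALSE
  out <- lines
  for (i in seq_along(lines)) {
    if (!in_chunk && grepl("^<<.*>>=[[:space:]]*$", lines[i])) {
      # Chunk header: move the label and options into a fenced {r ...} header
      opts <- sub("^<<(.*)>>=[[:space:]]*$", "\\1", lines[i])
      out[i] <- if (nzchar(opts)) paste0("```{r ", opts, "}") else "```{r}"
      in_chunk <- TRUE
    } else if (in_chunk && grepl("^@[[:space:]]*$", lines[i])) {
      # Chunk terminator becomes a closing fence
      out[i] <- "```"
      in_chunk <- FALSE
    }
  }
  out
}

# Example: convert a tiny Rmd document to Rnw-style chunks and back
rmd <- c("Some text",
         "```{r setup, echo=FALSE}",
         "x <- 1",
         "```",
         "More text")
rnw <- rmd_chunks_to_rnw(rmd)
back <- rnw_chunks_to_rmd(rnw)
```

Tracking whether we are inside a chunk is what keeps a bare fence or a lone @ elsewhere in the text from being rewritten by mistake. Real documents would need many more rules than this, which is exactly why the conversion can only ever be "quasi".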

Also, note that when I speak of updating Rmd or Rnw documents, I do not mean updating R code from the scripts these documents parse. If you are making your documentation in the spirit of dynamic report generation then your documentation should not require hands-on alterations following source code changes. I am referring to changes to content that is original to the documents themselves such as all the encompassing text. If you have two copies, one Rmd and one Rnw, you have to edit both analogously every time, and make sure they never diverge. The more that can be converted automatically the better.

This code has a long way to go even if currently functional to some degree. I also don’t LaTeX a ton. So any feedback is welcome. I imagine there are others who would be great at this type of conversion, being highly versed in the syntax of both sides. Perhaps it’s even been done already to some extent.

This entry was posted by Matt Leonawicz.
