Question

I'm a great fan of R Markdown and find it even easier than weaving LaTeX for quick project documentation (less than 15 pages). However, I sometimes also have to support other statistics packages (SPSS, Stata and SAS) and was wondering about equivalent solutions for these.

To some extent this might mean going back to some kind of original Noweb code plus a markdown file, compiled from the command line. I guess calling the other packages from R is another option.

I have had a look at this example by John Muschelli: http://rpubs.com/muschellij2/3888 and it looks as though he knitted Stata code into an R markdown file.

Can someone provide specific examples of how this can be done in Stata, SAS or SPSS?

I do know of SASweave and StatWeave (the latter is apparently broken???), but think that a markdown solution would be far more advantageous in our case.


Solution 2

John Muschelli pointed me to this Stata program:

https://github.com/amarder/stata-tutorial/blob/master/knitr.do

It parses a .domd file which contains markdown and Stata code and produces a .md file with executed Stata code. The name of the file to be parsed is at the end of the knitr.do file.

More specifically:

  1. Download the knitr.do file from https://github.com/amarder/stata-tutorial/blob/master/knitr.do

  2. Download the clustered-standard-errors.domd file from https://github.com/amarder/stata-tutorial/blob/master/clustered-standard-errors.domd

  3. Save them both in some directory.

  4. Modify the last line of knitr.do so that it gives the complete path to the .domd file (e.g. D:\Desktop\knit_example\clustered-standard-errors.domd).

  5. Run knitr.do to get your markdown (.md) file (and an intermediate .md1 file).

Note that knitr.do contains the programs that do the work and a line (the last one):

knit "whatever-file.domd"

that calls the program.

So you basically write a .domd file [that of step (2) is only an example] containing Markdown syntax and Stata commands, run knitr.do adjusting the file name, and get a Markdown file with executed Stata commands.

There are several caveats:

  • Only one-liner Stata commands are allowed. A loop, for example, won't work.
  • ".domd" can't be part of the file name.
  • If there is an error with a Stata command, the user gets no return code.
  • File handles need to be manually closed if user hits the Break button when the program is running or if there is a Stata command error.
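
To make the idea concrete outside Stata, here is a minimal Python sketch. It is not a port of knitr.do and does not execute anything: it only separates prose from commands. It assumes, purely for illustration, that a Stata command is any non-blank line indented by four spaces (as in a Markdown code block), and it treats each such line as an independent one-liner, which mirrors the first caveat above. The names `split_domd` and `to_do_file` are made up for this example.

```python
def split_domd(text):
    """Split markdown-plus-Stata text into ("text", line) / ("code", line) pairs.

    Assumption for this sketch: a Stata command is any non-blank line
    indented by four spaces; everything else is treated as Markdown prose.
    """
    chunks = []
    for line in text.splitlines():
        if line.startswith("    ") and line.strip():
            chunks.append(("code", line.strip()))
        else:
            chunks.append(("text", line))
    return chunks


def to_do_file(chunks):
    """Collect the one-line commands, in order, as the body of a do-file."""
    return "\n".join(cmd for kind, cmd in chunks if kind == "code")


# Small made-up document: two prose lines and two Stata one-liners.
example = "# Demo\n\nLoad the data:\n\n    sysuse auto\n    summarize price\n\nDone."
chunks = split_domd(example)
do_file = to_do_file(chunks)
```

Running the extracted commands and weaving their output back between the prose lines is the part that knitr.do handles inside Stata.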

OTHER TIPS

Stata has its own SMCL for annotation of logs, the M standing for mark-up. The main reason for a different language is that SMCL has to be created and interpreted line by line in situations where no end of document is in sight, namely within interactive sessions. This is created by Stata automatically as annotation when you ask for it and can be stipulated by users or programmers as a way of tuning Stata's display choices.

The possible connection to your question is that SMCL can be translated to HTML, which opens various doors. So, something that is easy in Stata is to do some work, keep a log file in SMCL and then translate the log file to HTML. You would not get anything really nice without further work, but the further work is easy and amounts to doing what you would have done anyway, only in your favourite text editor or text processor rather than within Stata.

This is made easier by log2html which Stata users can install using ssc inst log2html. It exploits a feature undocumented in Stata.

Stata's help files can also be translated to HTML in the same way (but consider copyright issues if doing this with official help files; it's fair play with your own help files).

I'm not sure if this is what you want, but if you're looking to create .html files in SAS that contain statistical reports within them, then you can use the Output Delivery System (ODS).

Example syntax is:

ods html file='pathofdirectory\filename.html' <additional options>;
    proc print...  (SAS code that generates output)
    proc means...
    proc freq...
    proc gchart...
    proc gplot...
    ...        
ods html close;

SPSS (and SAS, I presume) carries some overhead from the need to write everything to disk, which makes compilation in one fell swoop less appealing. Similar to what Yick mentioned, SPSS has an output system to which one can write automated reports and then export them to HTML, PDF or Word. It isn't the easiest thing to make look nice, but it is possible, and additions to ease automated editing (mainly via Python scripts) are being rolled out on a regular basis.

Basically the automated reports I write now using SPSS and R have html shells. The code then just updates or inserts the needed tables and graphs. They are entirely self-contained, reproducible, and run on weekly or monthly timers without human intervention. They just don't have inline code blocks exactly defining how the tables are produced (you would have to trace the code slightly further back to figure it out - but that isn't too onerous IMO).
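
As a rough illustration of the "HTML shell" approach, the sketch below fills placeholders in a shell with a freshly rendered table and a figure path. It is an assumption-laden example, not the actual setup described here: the placeholder names (`$title`, `$table`, `$figure`), the helper functions and the sample rows are all invented, and in practice the table would come from SPSS or R output rather than a Python list.

```python
from string import Template

# Hypothetical shell; in practice this would live in its own .html file.
SHELL = Template("""<html>
<body>
<h1>$title</h1>
$table
<img src="$figure" alt="Figure 1">
</body>
</html>""")


def rows_to_html(header, rows):
    """Render a header and data rows as a plain HTML table."""
    head = "".join(f"<th>{h}</th>" for h in header)
    body = "".join(
        "<tr>" + "".join(f"<td>{cell}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


def fill_shell(title, header, rows, figure_path):
    """Insert the freshly produced table and figure path into the shell."""
    return SHELL.substitute(
        title=title,
        table=rows_to_html(header, rows),
        figure=figure_path,
    )


# Made-up values, only to show the call.
report = fill_shell("Weekly report", ["group", "n"], [["A", 12], ["B", 7]], "figure1.png")
```

Because the shell never changes, a scheduled job only has to regenerate the table and figure and call something like `fill_shell` again.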

Because SPSS allows you to run SPSS code from the Python command prompt, you could theoretically knit a document with Python code calling SPSS. I'm not quite sure I see the advantage of this over having more segmented code in separate places, though. Do you really want to read 100 lines of SPSS code that begins with an SQL query, does some transformations and produces a table and a graph? Wouldn't you rather see the table and graph, and then, if interested in the nitty gritty, go back to DataPrep.sps, which prepares all of the data, and then to Table1.sps, Figure1.sps etc. to see exactly how each was produced?
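
For what it's worth, driving such segmented syntax files from Python might look roughly like the sketch below. It assumes the SPSS Python integration plug-in, whose `spss` module provides `spss.Submit` for running command syntax, and it uses the SPSS `INSERT FILE=` command to run each file. The function names and the injectable `submit` argument are invented for this example so that the logic can be dry-run without an SPSS installation.

```python
def build_syntax(files):
    """Build SPSS INSERT commands that run each syntax file in order."""
    return [f"INSERT FILE='{name}'." for name in files]


def run_report(files, submit=None):
    """Send each command to SPSS; `submit` is injectable for dry runs."""
    if submit is None:
        # Only importable inside an SPSS Python installation (assumption:
        # the integration plug-in is installed and licensed).
        import spss
        submit = spss.Submit
    commands = build_syntax(files)
    for command in commands:
        submit(command)
    return commands


# Dry run without SPSS: collect the commands instead of executing them.
sent = []
run_report(["DataPrep.sps", "Table1.sps", "Figure1.sps"], submit=sent.append)
```

Inside SPSS, calling `run_report` without the `submit` argument would hand the same commands to `spss.Submit`.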

Licensed under: CC-BY-SA with attribution