Question

I'm sure this is an issue anyone who uses Stata for publications or reports has run into:

How do you conveniently export your output to something that can be parsed by a scripting language or Excel?

There are a few ado files that do this for specific commands. For example:

  • findit tabout
  • findit outreg2

But what about exporting the output of the table command? Or the results of an anova?

I would love to hear about how Stata users address this problem for either specific commands or in general.

Was it helpful?

Solution

This is an old question and a lot has happened since it was posted.

Stata now has several built-in commands and functions that allow anyone to export customized output fairly easily:

There are also equivalent Mata functions / classes, which offer greater flexibility:

From my experience, there aren't 100% general solutions. Community-contributed commands such as estout are now mature enough to handle most basic operations. That said, if you have something that deviates even slightly from the template you will have to program this yourself.

OTHER TIPS

After experimenting with this for a while, I've found a solution that works for me.

There are a variety of ADOs that handle exporting specific functions. I've made use of outreg2 for regressions and tabout for summary statistics.

For more simple commands, it's easy to write your own programs to save results automatically to plaintext in a standard format. Here are a few I wrote...note that these both display results (to be saved to a log file) and export them into text files – if you wanted to just save to text you could get rid of the di's and qui the sum, tab, etc. commands:

cap program drop sumout
program define sumout
    di ""
    di ""
    di "Summary of `1'"
    di ""
    sum `1', d
    qui matrix X = (r(mean), r(sd), r(p50), r(min), r(max))
    qui matrix colnames X = mean sd median min max
    qui mat2txt, matrix(X) saving("`2'") replace
end

cap program drop tab2_chi_out
program define tab2_chi_out
    di ""
    di ""
    di "Tabulation of `1' and `2'"
    di ""
    tab `1' `2', chi2
    qui matrix X = (r(p), r(chi2))
    qui matrix colnames X = chi2p chi2
    qui mat2txt, matrix(X) saving("`3'") replace
end


cap program drop oneway_out
program define oneway_out
    di ""
    di ""
    di "Oneway anova with dv = `1' and iv = `2'"
    di ""
    oneway `1' `2'
    qui matrix X = (r(F), r(df_r), r(df_m), Ftail(r(df_m), r(df_r), r(F)))
    qui matrix colnames X = anova_between_groups_F within_groups_df between_groups_df P
    qui mat2txt, matrix(X) saving("`3'") replace
end

cap program drop anova_out
program define anova_out
    di ""
    di ""
    di "Anova command: anova `1'"
    di ""
    anova `1'
    qui matrix X = (e(F), e(df_r), e(df_m), Ftail(e(df_m), e(df_r), e(F)), e(r2_a))
    qui matrix colnames X = anova_between_groups_F within_groups_df between_groups_df P RsquaredAdj
    qui mat2txt, matrix(X) saving("`2'") replace
end

The question is then how to get the output into Excel and format it. I found that the best way to import the text output files from Stata into Excel is to concatenate them into one big text file and then import that single file using the Import Text File... feature in Excel.

I concatenate the files by placing this Ruby code in the output folder and then running int from my Do file with qui shell cd path/to/output/folder/ && ruby table.rb:

output = ""
Dir.new(".").entries.each do |file|
  next if file =~/\A\./ || file == "table.rb" || file == "out.txt"
  if file =~ /.*xml/
    system "rm #{file}"
    next
  end

  contents = File.open(file, "rb").read

  output << "\n\n#{file}\n\n" << contents
end


File.open("out.txt", 'w') {|f| f.write(output)}

Once I import out.txt into its own sheet in Excel, I use a bunch of Excel's built-in functions to pull the data together into nice, pretty tables.

I use a combination of vlookup, offset, match, iferror, and hidden columns with cell numbers and filenames to do this. The source .txt file is included in out.txt just above the contents of that file, which lets you look up the contents of the file using these functions and then reference specific cells using vlookup and offset.

This Excel business is actually the most complicated part of this system and there's really no good way to explain it without showing you the file, though hopefully you can get enough of an idea to figure it out for yourself. If not, feel free to contact me through http://maxmasnick.com and I can get you more info.

I have found that the estout package is the most developed and has good documentation.

Most tutorials throw in several packages where it would indeed nice to have only one exporting everything, which is what Max suggests above with his interesting method.

I personally use tabout for summary statistics and frequencies, estout for regression output, and am trying out mkcorr for correlation matrixes.

It's been a while, but I believe you can issue a log command to capture the output.
log using c:\data\anova_analysis.log, text
[commands]
log close

I use estpost-- part of the estout package-- to tabulate results from non-estimation commands. You can then store them and export easily.

Here's an example:

estpost corr varA varB varC varD, matrix
est store corrs
esttab corrs using corrs.rtf, replace 

You can then add options to change formatting, etc.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top