  • Generating docx documents with markstat

    I thought it would be better to start a new thread to answer questions about the -markstat- command available from SSC, rather than keep bumping an old thread. I'll start with the last post #24 from here.

    @eric_a_booth noted that -markstat- could not find -markstat.docx- and confirmed that
    "there is not a markstat.docx file in my ado plus folder."
    The -markstat.docx- file is essential for generating docx documents and should have been copied to the ado plus folder when you installed the command. Perhaps reinstall? As a check, there should be four files and a folder in ~/Library/Application Support/Stata/ado/plus/m: markstat.ado, markstat.css, markstat.docx, markstat.sthlp and s5/.

    Eric also provided a useful trace showing that the command is looking for the file in the right place. It's just not there.
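
    A quick way to check for the ancillary files from within Stata is sketched below. It assumes the standard -findfile- command, which searches the ado-path, and a reinstall from SSC; it is an illustration, not part of -markstat- itself.

    ```stata
    * Look for each ancillary file along the ado-path
    foreach f in markstat.ado markstat.css markstat.docx markstat.sthlp {
        capture findfile `f'
        if _rc {
            display as error "`f' not found"
        }
        else {
            display as text "`f' found at `r(fn)'"
        }
    }

    * Reinstalling from SSC should restore any missing files
    ssc install markstat, replace
    ```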

  • #2
    See the discussion here for more on the problem with markstat installation (I didn't want to hijack this thread with a long post not directly about the markstat command itself).


    • #3
      Reinstalling -markstat- worked. The explanation that William points to in the other thread makes sense of why those ancillary files (markstat.css, markstat.docx) were missing, and after reinstalling everything is working well.

      Another question on inserting images. When I run the commands:

      Code:
      copy http://data.princeton.edu/stata/markdown/fuel.stmd fuel.stmd , replace
      markstat using fuel,  
      markstat using fuel, pdf
      markstat using fuel, docx
      These all work except for this line in the fuel.stmd file:

      ![Fuel Efficiency](auto.png){width="4.5in"}

      leaves the sizing attribute as literal text in the docx, html, and pdf files. It looks like this underneath the scatterplot:


      [Screenshot: Screen Shot 2017-12-13 at 8.23.17 AM.png]



      I cannot get rid of the text {width="4.5in"} in any of the documents (or rather, I cannot get markstat to interpret it).
      I tried changing it to {width=4.5in} and {width=50%}, and, thinking it might be using something like Kramdown, I also tried {:width=4.5in} and {width=4.5in;}, but couldn't get any of them to work.

      Eric A. Booth
      Senior Research Scientist, Gibson Consulting Group

      eric.a.booth@gmail.com | http://www.eric-a-booth.com


      • #4
        Thanks William, I commented on installation and ancillary files on your thread here. There was an SSC package issue that affected recent markstat installations, but it was quickly resolved by the amazing Kit Baum, as noted in that thread.

        I was going to say this should explain why Eric was missing markstat.docx, but I see now he has already been there, done that. Regarding the problem with images, I just tried
        Code:
        % Troubleshooting
        
            sysuse auto, clear
            scatter mpg weight
            graph export see.png, width(800) replace
            
        ![Test](see.png){width="4.5in"}
        generating HTML, pdf and docx, and all three worked. The problem is that my code looks exactly the same as what you typed, so I would check the pandoc version. I am using 2.0.1.1. The latest is 2.0.5, released yesterday.
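
        To check which pandoc Stata is actually calling, something like the following may help. It is a sketch that assumes pandoc was registered with -whereis- (from SSC) and that -whereis- returns the location in r(pandoc); if that assumption fails, running pandoc --version in a terminal gives the same information.

        ```stata
        * Retrieve the registered location of pandoc (assumes -whereis- from SSC)
        whereis pandoc
        * Ask that executable for its version
        shell "`r(pandoc)'" --version
        ```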


        • #5
          German Rodriguez I updated to pandoc 2.0.5 and now it works perfectly - that was it! Thanks again for all the help.
          Eric A. Booth
          Senior Research Scientist, Gibson Consulting Group

          eric.a.booth@gmail.com | http://www.eric-a-booth.com


          • #6
            First of all, happy new year to everybody.
            I hope it is not bad form to hijack this thread, but at the moment I have one issue that fits exactly under the thread's title.

            German Rodriguez Thanks a lot for this very useful tool, which I have started using. I have to say that I am still quite puzzled by the multitude of options within Stata for literate programming; for someone who wants to start with it, it is really hard to get set up and to properly evaluate all the options.
            -markstat- looks like the best all-round option to me at the moment. Nonetheless, I currently have one issue with the following setup: Stata 15.1, markstat 2.1.1, pandoc 2.06, miktex 2.9.6515, sjlatex 1.2.2, tabout 3.0.3

            I would like to use the docx option so that I can share the results with my colleagues without puzzling them too much.

            I used your tabout example and added a tab/ prefix to both -using tab/table34.html- and -.include tab/table34.html-. This works when I am using
            Code:
            markstat using simplescript, strict
            but not with the -docx- option.

            I am rather sure that the problem is caused by the .include line.

            See this code, which is tweaked to store files in different folders.
            Code:
            Let us read the fuel efficiency data that is shipped with Stata
            ```s/
                qui sysuse auto, clear
                
            *    qui fre foreign
                
                list make price headroom trunk weight length turn displacement gear_ratio if foreign==1
            ```
            To study how fuel efficiency depends on weight it is useful to
            transform the dependent variable from "miles per gallon" to
            "gallons per 100 miles"
            ```s
                gen gphm = 100/mpg
            ```
            We then obtain a more linear relationship
            ```s
                twoway scatter gphm weight || lfit gphm weight ///
                    , ytitle(Gallons per 100 Miles) legend(off)
                graph export tab/auto.png, width(500) replace
            ```
            ![Fuel Efficiency](tab/auto.png)
            
            That's all for now!
            
            
            ```s/
                quietly sysuse auto, clear
                quietly sum weight
                local meanwt = r(mean)
                quietly sum length
                local meanlen = r(mean)
                local obs = r(N)
                quietly tabout rep78 foreign using tab/table34.html, replace ///
                    style(htm) font(italic) c(mean weight) f(0c) sum ///
                    twidth(9)  ///
                    title(Table 34: Short report example) fn(auto.dta)
            ```
            
            The title of my short report
            ----------------------------
            
            This is an excerpt of the report to focus on the dynamic parts.
            And now we have an important result: __`s %3.2f `meanwt'`__ is
            the average weight of all vehicles. And a second important
            result: __`s %3.2f `meanlen'`__ is the average length.
            
            And now we have the first table.
            
            <center>
            
            .include tab/table34.html
            
            </center>


            • #7
              As a side note, I am not happy with the output of the -list- command. It is not really readable in Word.
              I tried to use -listtab- (from SSC), but somehow it did not produce a readable table.
              Code:
              listtab make price headroom trunk weight length turn displacement gear_ratio using tab/listtab.htm if foreign==1, rstyle(html) replace
              If someone knows a good way to include the output of a -list- or -tab- command in literate programming (converting to docx), I would be really happy to read it.


              • #8
                Happy New Year!

                Marc Kaulisch I am glad you like -markstat- so far. Your problem is related to the way the command works, weaving your Markdown narrative with Stata output. As long as the Stata commands generate standard SMCL output, things should work for all output formats.

                Some commands, such as -tabout-, can generate html or LaTeX output, and .include lets you include it directly in the output file. For this to work the formats have to match; the command has to generate html for a web page or LaTeX for a pdf document.

                Unfortunately this strategy cannot be applied to docx documents at the moment. In particular, we can't include a bit of html in a docx document. There is, however, a request for Pandoc to allow html chunks in a docx document. If/when that is implemented -markstat- will follow suit.

                I think your second question may reflect the same issue.


                • #9
                  Marc Kaulisch - In addition to the suggestions from German Rodriguez, the strategies I've been using to quickly include summary info in docx files are (1) exporting the -tabout- table to .csv/.txt, re-insheeting the data and listing it, or (2) using -eststo-/-estpost-/-esttab- to produce descriptive/summary results. I've included some examples of both in the edited version of your code below (I removed the 'tab/' references since they make the code choke for those without this subdirectory).

                  Code:
                  Let us read the fuel efficiency data that is shipped with Stata
                  ```s
                      qui sysuse auto, clear
                      
                      *qui fre foreign
                      
                      list make price headroom trunk weight length turn displacement gear_ratio if foreign==1
                  ```
                  
                  To study how fuel efficiency depends on weight it is useful to
                  transform the dependent variable from "miles per gallon" to
                  "gallons per 100 miles"
                  ```s
                      gen gphm = 100/mpg
                  ```
                  We then obtain a more linear relationship
                  ```s/
                      su
                      twoway scatter gphm weight || lfit gphm weight ///
                          , ytitle(Gallons per 100 Miles) legend(off)
                      graph export auto.png, width(500) replace
                  ```
                  ![Fuel Efficiency](auto.png)
                  
                  That's all for now!
                  
                  
                  ```s/
                      quietly sysuse auto, clear
                      quietly sum weight
                      local meanwt = r(mean)
                      quietly sum length
                      local meanlen = r(mean)
                      local obs = r(N)
                      
                      
                        qui  tabout2 rep78 foreign using table35.docx, replace ///
                           style(docx) font(italic) c(mean weight) f(0c) sum ///
                          twidth(9)  ///
                          title(Table 34: Short report example) fn(auto.dta)
                          
                  ```
                  
                  ```s/
                      qui tabout rep78 foreign using table34.txt, replace ///
                          c(mean weight) f(0c) sum ///
                          title(Table 34: Short report example) fn(auto.dta)

                      qui sa temp.dta, replace
                      qui insheet using table34.txt, nonames clear
                  ```
                  ```s
                      list *, noobs t div noh
                  ```    
                  ```s/
                   qui  u temp.dta, clear
                  ```        
                          
                  Here is a table with -eststo/estpost-
                  ```s/        
                      eststo clear
                    estpost tabstat price mpg rep78, listwise  statistics(mean sd)
                    esttab
                  ```
                  
                  The title of my short report
                  ----------------------------
                  
                  This is an excerpt of the report to focus on the dynamic parts.
                  And now we have an important result: __`s %3.2f `meanwt'`__ is
                  the average weight of all vehicles. And a second important
                  result: __`s %3.2f `meanlen'`__ is the average length.
                  This produces tables like the following in docx, which are not as clean/nice as html/tex tables but get the job done, for me at least.


                  [Screenshot: Screen Shot 2018-01-04 at 7.35.49 AM.png]

                  Eric A. Booth
                  Senior Research Scientist, Gibson Consulting Group

                  eric.a.booth@gmail.com | http://www.eric-a-booth.com


                  • #10
                    German Rodriguez I hoped that there is a good way to produce the HTML-doc first and then automatically convert it to Word.

                    eric_a_booth I am not yet familiar with the estout/esttab family, so there is something new to learn ;-) But at the moment I am dealing more with descriptive statistics than with regression results. (I fear it is no coincidence that most examples presented for these commands are regression or summary examples, with fewer examples for categorical data.)

                    About your script, I wonder why you do not use preserve and restore. Something like
                    Code:
                    ```s/
                    qui tabout rep78 foreign using table34.txt, replace ///
                               c(mean weight) f(0c) sum ///
                            title(Table 34: Short report example) fn(auto.dta)
                    
                    *  qui sa temp.dta, replace
                    qui preserve
                       qui insheet using table34.txt, nonames clear
                        list *, noobs t div noh
                    * qui  u temp.dta, clear
                    qui restore
                    ```
                    At least this works for me, and it saves me from creating another file.

                    Speaking of having another file: I started my endeavour yesterday by using -putdocx-. I wonder if another strategy would be to use the -putdocx append- command to bring all those documents together.
                    This might destroy the whole literate programming idea, because I would have to write a master do-file that calls all the do-files/stmd files for each part of the text, so you could no longer read straight down the text. But anyway, I am not convinced that my audience should be bothered too much with the actual commands I used (which they don't understand anyway), so I have to run the commands quietly in any case.
                    The other strategy might be, as suggested in Ian Watson's tabout tutorial, to link the doc files into my report(s).
                    Or I will see whether the new -putwrap- (from SSC) helps me make better use of the built-in -putdocx- functionality.

                    I guess I will report back to you when I am more advanced.
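
                    The -putdocx append- idea might look like the sketch below. The file names part1.docx, part2.docx and report.docx are hypothetical, and the syntax is the one documented for Stata 15.1; check -help putdocx- for your version.

                    ```stata
                    * Combine two existing Word documents into a new one
                    * (part1, part2 and report are hypothetical file names)
                    putdocx append part1 part2, saving(report, replace)
                    ```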


                    • #11
                      Marc Kaulisch Yes - I should have used preserve/restore in this example.
                      (I adapted the example from another, more complicated file where I had already preserved/restored around that code block, so I couldn't use a second instance here. When I moved it over I should have changed it - sorry for the confusion.)
                      Eric A. Booth
                      Senior Research Scientist, Gibson Consulting Group

                      eric.a.booth@gmail.com | http://www.eric-a-booth.com


                      • #12
                        Originally posted by Marc Kaulisch:
                        German Rodriguez I hoped that there is a good way to produce the HTML-doc first and then automatically convert it to Word.
                        This was going to be my suggestion. Use -markstat- to get to HTML, then use one of the several user-written -pandoc- utilities to convert from HTML to Word. You ought to be able to open an HTML file manually from Word, too, right?
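
                        A minimal sketch of this route, assuming pandoc is on the system path and using a hypothetical script name report.stmd, calls pandoc directly rather than through a user-written wrapper:

                        ```stata
                        * Weave the script to HTML (report.stmd is a hypothetical file name)
                        markstat using report
                        * Convert the HTML to Word; pandoc infers formats from the extensions
                        shell pandoc report.html -o report.docx
                        ```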


                        • #13
                          In my experience Word does a pretty good job opening HTML files, so that's certainly a way to go. And of course you can then save the Word document as pdf if that's what you want. Where the docx option of markstat may have an advantage is when you have mathematical equations authored in LaTeX notation, because thanks to Pandoc these are converted into native Word equation objects.


                          • #14
                            eric_a_booth Regarding preserve - I assumed you did it intentionally and would have liked to know whether there is a trick I do not know - so thanks for clarifying.

                            Doug Hemken Do you have a suggestion?

                            German Rodriguez I retested it, and it does not work out like that, because the table look-alikes in the Stata code areas (<pre class='stata'>) are not formatted as HTML tables. As a result, I get worse results when importing the HTML file into Word than when constructing the Word file directly with your -docx- option.
                            So I think solving this problem would require getting the SMCL into HTML. It might be that -log2html- (from SSC) could help here, but it looks more complicated.


                            • #15
                              Maybe I am getting to where I want to be (kind of...):
                              I use -listtab- (from SSC) to generate an HTML file that can be included in the stmd file.
                              Code:
                              Let us read the fuel efficiency data that is shipped with Stata
                              ```s/
                                  qui sysuse auto, clear
                                  
                                  qui fre foreign
                                  
                                  *list make price headroom trunk weight length turn displacement gear_ratio if foreign==1, noobs clean
                              
                                  qui listtab make price mpg using listtab_html.html if foreign==1, ///
                                      rstyle(html) replace ///
                                      head(`"<table border="1">"' "<tr><th>Make and Model</th><th>Price</th><th>Mileage (mpg)</th></tr>") ///
                                      foot("</table>")
                              ```
                              
                              .include listtab_html.html
                              I will test further whether I can get -fre- (from SSC) to output HTML.
