  • Originally posted by Hua Peng (StataCorp) View Post
    Maarten Buis,

    Code:
    help putdocx paragraph
    in the text_options table, see hyperlink(link) option. Does this meet your need?
    I read this request as wanting to embed hyperlinks to locations within the document, e.g., to bookmarks. This option appears to be only for external URLs.


    • Originally posted by Hua Peng (StataCorp) View Post
      Code:
      help putdocx paragraph
      in the text_options table, see hyperlink(link) option. Does this meet your need?
      Hua Peng (StataCorp),

      Thank you for the suggestion. If I understand the help file correctly, this option lets one include a link to an external webpage. What I was looking for was a way to link to another spot in the same document.

      To give you an idea of what I am looking for, here is a .do file for creating a codebook for the auto dataset. It starts with a list of variables on the first page, followed by page after page of more detailed descriptions of each variable. Someone using the codebook looks at the list of variables, sees whether a variable looks interesting based on its name or label, then goes to the page with the detailed description of that variable, decides whether that variable is really useful for her/his purposes, goes back to the list of variables, finds the next potentially useful variable, and so on. So there is a lot of back and forth between the list of variables and the detailed descriptions of the variables (all in the same document). With only 12 variables that is manageable, but in a dataset with hundreds of variables, repeatedly having to find the right page gets really annoying really quickly. What I would like is for the name of each variable in the list of variables (the entries in the table vars) to be a link to the page that contains the descriptives for that variable.

      As an aside, I know I could have used the new table command for some of the tables in the code below, but I am preparing a course for an organization that has Stata 16, not 17.

      Code:
      clear all
      cd "c:\temp"
      
      sysuse auto, clear
      
      putdocx begin, footer(footer_pn) pagesize(A4)
      
      //footer
      putdocx paragraph, tofooter(footer_pn)
      putdocx pagenumber
      putdocx text ("/")
      putdocx pagenumber, totalpages
      
      //title
      putdocx paragraph, style(Heading1)
      mata: st_local("fn",pathbasename(`"`c(filename)'"'))
      local fl : data label
      putdocx text ("Codebook for `fn'")
      if "`fl'" != "" {
          putdocx text (": `fl'"), italic
      }
      
      //data properties
      putdocx paragraph, style(Heading2)
      putdocx text ("Properties of file")
      
      putdocx table file = (3,2), layout(autofitcontents)
      putdocx table file(1,1) = ("no. of variables"), bold
      putdocx table file(1,2) = ("`c(k)'")
      putdocx table file(2,1) = ("no. of observations"), bold
      putdocx table file(2,2) = ("`=_N'"), nformat(%9.0gc) trim
      putdocx table file(3,1) = ("last saved"), bold
      putdocx table file(3,2) = ("`c(filedate)'")
      
      putdocx table file(.,.), border(all,nil)
      putdocx table file(1,.), border(top,single)
      putdocx table file(3,.), border(bottom, single)
      
      //list of variables
      putdocx paragraph, style(Heading2)
      putdocx text ("List of variables")
      
      putdocx table vars = (`=`c(k)'+1',2), layout(autofitcontents)
      putdocx table vars(1,1) = ("variable name"), bold
      putdocx table vars(1,2) = ("label"), bold
      local i = 2
      foreach var of varlist * {
          putdocx table vars(`i'  , 1) = ("`var'") ,
          putdocx table vars(`i++', 2) = (`"`: variable label `var''"'),
      }
      putdocx table vars(.,.), border(all, nil)
      putdocx table vars(1,.) , border(bottom, single) border(top, single)
      putdocx table vars(`=`c(k)'+1',.) , border(bottom, single)
      
      //properties of variables
      
      foreach var of varlist * {
          // title
          putdocx pagebreak
          putdocx paragraph, style(Heading2)
          putdocx text ("`var'")
          if `"`: variable label `var''"' != "" {
              putdocx text (`": `: variable label `var''"'), italic
          }
          
          // collect info and frequency table or summary statistics
          capture confirm string variable `var'
          local tab = _rc == 0
          local rawtype = cond(`tab', "string", "numeric")
          local type : type `var'
          tempvar mark
          bys `var' : gen byte `mark' = _n == 1 if !missing(`var')
          count if `mark' == 1
          local tab = (r(N) <= 10) | `tab'
          local n_distinct = r(N)
          
          // variable properties
          putdocx paragraph, style(Heading3)
          putdocx text ("Properties of variable")
      
          putdocx table desc_`var' = (4,2), layout(autofitcontents)
          putdocx table desc_`var'(1,1) = ("type"), bold
      
          putdocx table desc_`var'(1,2) = ("`rawtype' (`type')")
      
          putdocx table desc_`var'(2,1) = ("missing values"), bold
          count if missing(`var')
          putdocx table desc_`var'(2,2) = (r(N)), nformat(%9.0gc) trim
       
          putdocx table desc_`var'(3,1) = ("non-missing values"), bold
          putdocx table desc_`var'(3,2) = (_N-r(N)), nformat(%9.0gc) trim
          
          putdocx table desc_`var'(4,1) = ("distinct non-missing values"), bold
          putdocx table desc_`var'(4,2) = (`n_distinct'), nformat(%9.0gc) trim
      
          putdocx table desc_`var'(.,.), border(all, nil)
          putdocx table desc_`var'(1,.), border(top, single)
          putdocx table desc_`var'(4,.), border(bottom, single)
      
          if `tab' {
              putdocx paragraph , style(Heading3)
              putdocx text ("Table")
              
              frame
              local data = r(currentframe)
              frame copy `data' table, replace
              frame change table
      
              tempvar freq
              bysort `var' : gen `freq' = _N
              by     `var' : keep if _n == 1
      
              local val_lab : value label `var'
              if "`val_lab'" != "" {
                  tempvar lab
                  decode `var', gen(`lab')
                  local label "label"
              }
      
              keep `lab' `var' `freq'
              order `var' `lab' `freq'
              rename `var' value
              if "`val_lab'" != "" {
                  rename `lab' label
                  label values value .
              }
              rename `freq' frequency
          
              putdocx table tab_`var' = data(value `label' frequency), varnames layout(autofitcontents)
              putdocx table tab_`var'(.,.), border(all, nil)
              putdocx table tab_`var'(1,.), border(top, single) bold
              putdocx table tab_`var'(1,.), border(bottom, single)
              putdocx describe tab_`var'
              putdocx table tab_`var'(`r(nrows)',.), border(bottom, single)
      
              frame change `data'        
          }
          else {
              putdocx paragraph , style(Heading3)
              putdocx text ("Summary")
          
              sum `var', detail
      
              putdocx table tab_`var' = (5,2), layout(autofitcontents)
              putdocx table tab_`var'(1,1) = ("minimum"), bold
              putdocx table tab_`var'(2,1) = ("25th percentile"), bold
              putdocx table tab_`var'(3,1) = ("50th percentile"), bold
              putdocx table tab_`var'(4,1) = ("75th percentile"), bold
              putdocx table tab_`var'(5,1) = ("maximum"), bold
              putdocx table tab_`var'(1,2) = (r(min))
              putdocx table tab_`var'(2,2) = (r(p25))
              putdocx table tab_`var'(3,2) = (r(p50))
              putdocx table tab_`var'(4,2) = (r(p75))
              putdocx table tab_`var'(5,2) = (r(max))
      
              putdocx table tab_`var'(.,.), border(all, nil)
              putdocx table tab_`var'(1,.), border(top, single)
              putdocx table tab_`var'(5,.), border(bottom, single)        
          }
      }
      
      //close
      putdocx save cb, replace
      Last edited by Maarten Buis; 12 Jan 2022, 04:15.
      ---------------------------------
      Maarten L. Buis
      University of Konstanz
      Department of history and sociology
      box 40
      78457 Konstanz
      Germany
      http://www.maartenbuis.nl
      ---------------------------------


      • Latent class growth curve models. If we have a k-class solution for a model estimated by gsem (where k is the number of latent classes, e.g., 2 or 3), Stata does not allow the categorical latent classes to have varying slopes. In Mplus that is possible, as implemented in this paper. I have attached the main figure for context, where k = number of latent classes; age, vas, etc. are potential covariates for class membership; groups are randomised (0 = control, 1 = treatment); u = observed compliance with treatment (treatment group: 0 = no, 1 = yes; control group: missing); and F_D* are the outcomes observed over six time periods.

        [Attached figure: lc.jpg]
        Roman


        • Maarten Buis

          You can use the hyperlink() option to insert bookmark links. Replace your line 48 with:

          Code:
          putdocx table vars(`i'  , 1) = ("`var'"),hyperlink(`"cb.docx#`var': `: variable label `var''"') ,


          • Originally posted by Ali Atia View Post
            Maarten Buis

            You can use the hyperlink() option to insert bookmark links. Replace your line 48 with:

            Code:
            putdocx table vars(`i' , 1) = ("`var'"),hyperlink(`"cb.docx#`var': `: variable label `var''"') ,
            That may be a good solution for a document intended to remain a docx file. I often convert these from Word to PDF afterwards, in which case those hard-coded links would fail. Compare this with making a hyperlink in Word that points to an existing location in the document: on conversion to PDF, it becomes a link within the new file. This latter functionality is what I hope Stata could provide.


            • Ali Atia thank you very much

              Leonardo Guizzetti I see the problem. I don't know if this is a Stata problem or a Word problem. In essence, Word needs to know that the link refers to the document itself, and carry that over to the PDF document it creates. I am not saying you are wrong; I really mean that I don't know where the solution should lie.
              ---------------------------------
              Maarten L. Buis
              University of Konstanz
              Department of history and sociology
              box 40
              78457 Konstanz
              Germany
              http://www.maartenbuis.nl
              ---------------------------------


              • Maarten Buis I'm not sure either, but I think it may be something that Stata could handle. At least, Word represents links using field codes, and there is an option on that field code that directs Word to point the link at an internal location rather than at the outside world. I'll keep my fingers crossed that this is something StataCorp is willing and able to tackle.


                • Ali Atia got me halfway to a solution. If one creates a hyperlink starting with a hash mark (#), Word automatically recognizes it as a hyperlink to a bookmark inside the document. Note that the rules for valid bookmark names are that they must start with a letter, and may then be a mix of numbers, letters, or underscores. E.g.,

                  Code:
                  putdocx text ("link to bookmark") , hyperlink("#Name")
                  Then one would have to insert/create those bookmarks in the final document, but the links remain and will be respected upon export to PDF.
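
A minimal sketch of how this could be applied to the list of variables in the codebook above. The `bm_` prefix and the file name `cb_links` are assumptions made for illustration; the prefix is there only so that every bookmark name starts with a letter, even for variable names that begin with an underscore. The matching bookmarks would still have to be created in Word afterwards.

```stata
sysuse auto, clear

putdocx begin
putdocx table vars = (`=c(k)', 1), layout(autofitcontents)

local i = 1
foreach var of varlist * {
    // hypothetical naming scheme: "bm_" + variable name,
    // so that the bookmark name always starts with a letter
    local bm "bm_`var'"
    // a link starting with # is treated by Word as an internal bookmark link
    putdocx table vars(`i++', 1) = ("`var'"), hyperlink("#`bm'")
}

putdocx save cb_links, replace
```

Because the link target contains no file name, it does not depend on what the document is called or on whether it is later exported to PDF.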


                  • I do not know whether Stata 17 has this feature: is there any way for Stata to report, along with the error code, the number of the line in the do-file where the error occurred? If not, I think this would be useful, especially for very long do-files.


                    • re: #249, see
                      Code:
                      help trace
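
A small sketch of what tracing looks like in practice. The program `demo_fail` is made up for this illustration and fails on purpose. Tracing does not print do-file line numbers; it echoes each command of a program as it runs, so the last command shown before the error message is the one that failed.

```stata
capture program drop demo_fail
program define demo_fail
    sysuse auto, clear
    summarize price
    summarize no_such_variable   // fails on purpose: the variable does not exist
end

set tracedepth 1                 // trace only demo_fail, not the programs it calls
set trace on
capture noisily demo_fail
local rc = _rc                   // keep the return code of the failed call
set trace off

display "return code: `rc'"
```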


                      • Fairly recently, Stata acquired the ability to access the value of a matrix cell by identifying the cell through its row and column names. So, for example, it became possible to say things like -local cell = M["rowname", "colname"]-. Apparently this works only when reading a matrix. If you try -matrix M["rowname", "colname"] = expression-, you get a type mismatch error. It would be helpful to be able to use this syntax to also write to a matrix, indexing it by row and column names instead of row and column numbers.
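
Until then, one possible workaround is to translate the names into positions with rownumb() and colnumb() before assigning. A minimal sketch; the matrix M and its row and column names are made up for illustration, and the name-based read assumes a Stata version recent enough to support it, as described in the post.

```stata
matrix M = (1, 2 \ 3, 4)
matrix rownames M = r1 r2
matrix colnames M = c1 c2

// reading by name, as described in the post (needs a recent Stata)
display M["r2", "c1"]

// writing: look up the numeric positions of the names first
matrix M[rownumb(M, "r2"), colnumb(M, "c1")] = 99
matrix list M
```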


                        • Finally introducing a proper missing number that is NA. Missing = infinity is probably the most common cause of coding errors.
                          For instance, I just found out that the sorts I've been doing are wrong: https://github.com/matthieugomez/stata-pitfalls/wiki


                          • Henry Strawforrd #252 -

                            1) If sorting Stata missing values is a problem that NA solves, given this dataset
                            Code:
                                 +--------------+
                                 |  ID   profit |
                                 |--------------|
                              1. | 101        0 |
                              2. | 102       42 |
                              3. | 103       NA |
                              4. | 104     -666 |
                                 +--------------+
                            what is the expected output of
                            Code:
                            sort profit
                            2) My point in (1) is that different users will have different expectations, and so the behavior of NA, whatever it might be, will need to be documented. But the output of help sort begins with
                            Code:
                            Description
                            
                                sort arranges the observations of the current data into ascending order
                                based on the values of the variables in varlist.  There is no limit to the
                                number of variables in varlist.  Missing numeric values (see missing) are
                                interpreted as being larger than any other number, so they are placed last
                                with . < .a < .b < ... < .z.  When you sort on a string variable, however,
                                null strings are placed first and uppercase letters come before lowercase
                                letters.
                            If that is somehow insufficient, it is difficult to see what could be improved in the description of NA to avoid similar problems.

                            3) My point in (2) is that while the behavior of Stata missing values may be a common cause of coding errors, having followed Statalist for a few years, my reading suggests that it is just one of many manifestations of the underlying cause, lack of proficiency with Stata. Using Stata is like using any unfamiliar language. Reference to similar languages you already know will only take you so far. Ultimately, the grammar, syntax, and idioms of a language matter, and the effort to systematically learn those features, rather than relying on a mix of guesswork, assumption, and reference to Google, is what leads to proficiency.

                            4) As someone familiar with Stata's missing values, I would be reluctant to give up on the current implementation, which includes 26 "special" missing values that can be used to indicate the reason a value is missing, knowledge of which can be crucial in survey-based data, where we may need to distinguish "not applicable" from "refused to answer" from ... . Losing the ability to encode those outcomes within a missing value would require creating, for every variable with NA values, a second variable to track the reason for the NA.
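
The points above can be tried out on a variant of the small dataset in (1). This is only a sketch: the NA is replaced by the extended missing value .a, a fifth observation with .b is added, and the value label `whymiss` is a made-up name.

```stata
clear
input id profit
101    0
102   42
103   .a
104 -666
105   .b
end

// extended missing values can carry the reason a value is missing
label define whymiss .a "not applicable" .b "refused to answer"
label values profit whymiss

sort profit                // missing values go last, with .a before .b
list, clean

count if profit > 0                      // counts 42 and both missing values
count if profit > 0 & !missing(profit)   // counts 42 only
```

After the sort, the ids appear in the order 104, 101, 102, 103, 105, which is what the quoted description of sort implies.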


                            • I haven't read it carefully, but this article

                              https://www.academia.edu/s/79074d3221

                              criticizes Stata's handling of missing data and champions the use of a user-written routine called validly instead. validly has been around since 2013 and I don't remember seeing it mentioned before, but if you don't like how Stata handles missing data see if you like validly any better.
                              -------------------------------------------
                              Richard Williams, Notre Dame Dept of Sociology
                              Stata Version: 17.0 MP (2 processor)

                              EMAIL: [email protected]
                              WWW: https://www3.nd.edu/~rwilliam


                              • Originally posted by Henry Strawforrd View Post
                                Finally introducing a proper missing number that is NA. Missing = infinity is probably the most common cause of coding errors.
                                For instance, I just found out that the sorts I've been doing are wrong: https://github.com/matthieugomez/stata-pitfalls/wiki
                                Originally posted by William Lisowski View Post
                                ... my reading suggests that it is just one of many manifestations of the underlying cause, lack of proficiency with Stata ...
                                I read the posted link, and many of the stated "pitfalls" boil down to William's point about not being proficient in Stata (or failing to read the documentation). The list, I think, is intended for people with little or no experience coming to Stata from other languages, who may not have the same expectations about what code ought or ought not to do.

                                For example, the egen functions min() and max() mirror their namesake functions min() and max(): by default they ignore missing values. They do offer a switch that makes them *not* ignore missing values, but you have to choose that behaviour deliberately, because it would not make sense in every application. Those proficient with Stata may also expect the default behaviour to be consistent with the functions of the same name.
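
A short sketch of the default behaviour, using made-up data (the variables g and x are hypothetical): both the max() function and egen's max() skip missing values, and return missing only when there is nothing else to use.

```stata
clear
input g x
1 5
1 .
2 .
2 .
end

display max(5, .)               // the function ignores the missing argument

bysort g: egen xmax = max(x)    // group 1: 5; group 2: all missing, so missing
list, clean
```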

                                Likewise, the claim "egen sum, and rowsum transform missing values to zero" refers to outdated -egen- functions. -egen rowtotal()- also has a switch to handle the case where all values are missing. Those references are roughly 15 or more years old, and even then, were addressed here.

                                The choice of treating missing values as infinitely large is just that, a convention. Some Alternative Software defaults to making them infinitely small, and many similar issues bite inexperienced programmers in the opposite direction, yet you don't see their clients tossing the baby out with the bathwater.
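
For illustration, a minimal sketch of that switch on made-up data (the variables a and b are hypothetical). Without the option, a row in which every value is missing gets a total of zero; with the missing option, it gets a missing total.

```stata
clear
input a b
1 2
. 3
. .
end

egen t1 = rowtotal(a b)             // third row: 0
egen t2 = rowtotal(a b), missing    // third row: missing
list, clean
```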

                                The advice on comparing floats vs doubles is just wrong. Either value can be compared with implicit or explicit casting to float, or else created as doubles and compared directly. Conversion to strings should never happen in any context.
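
A small sketch of the two approaches, on a one-observation dataset created only for this illustration (the variable names xf and xd are made up). The literal 0.1 is evaluated in double precision, so it does not equal a float that stores 0.1 unless it is cast with float().

```stata
clear
set obs 1
generate float  xf = 0.1
generate double xd = 0.1

count if xf == 0.1           // no match: float value compared with a double literal
count if xf == float(0.1)    // match: both sides rounded to float precision
count if xd == 0.1           // match: stored and compared as double
```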
