  • Multilevel Zero-One Inflated Beta Regression Model

    • I think it's really important to update the differencing operator to make it easy on researchers to estimate panel data models by first differencing. Not allowing factor notation with D.() and replacing the difference in the interaction with the interaction of the differences are both shortcomings that are easy to fix. I think it contributes to the confusion of what is a model and what is an estimating equation. In panel data applications especially, differencing is used to eliminate heterogeneity in the levels equation. That is, FD is an alternative to FE, and so any model that can be estimated using xtreg, fe should be estimable by differencing the entire equation. It's cumbersome to have to create interactions "by hand" and it means that one cannot use the margins options. Not allowing something like i.year is also inconvenient. This is fundamental stuff, and it should be allowed for both OLS and instrumental variables commands.

      • Originally posted by Jeff Wooldridge
        I think it's really important to update the differencing operator to make it easy on researchers to estimate panel data models by first differencing. Not allowing factor notation with D.() and replacing the difference in the interaction with the interaction of the differences are both shortcomings that are easy to fix. I think it contributes to the confusion of what is a model and what is an estimating equation. In panel data applications especially, differencing is used to eliminate heterogeneity in the levels equation. That is, FD is an alternative to FE, and so any model that can be estimated using xtreg, fe should be estimable by differencing the entire equation. It's cumbersome to have to create interactions "by hand" and it means that one cannot use the margins options. Not allowing something like i.year is also inconvenient. This is fundamental stuff, and it should be allowed for both OLS and instrumental variables commands.
        I agree that the way Stata deals with situations such as D.(c.x1#c.x2), which is expanded to cD.x1#cD.x2, is unfortunate. As I said in this other thread, I doubt that StataCorp will do anything about it.

        A particular problem arises in this context in combination with macros: even though it feels natural to do so, you should never code D.`var' or D.(`var') if `var' might contain interaction terms such as the one above. This often requires replacing the variable list in `var' with temporary variables to avoid unintended consequences.
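A minimal sketch of the workaround described above, using simulated panel data: the product is generated in levels first and then differenced as an ordinary variable, so the regression uses the difference of the interaction rather than the interaction of the differences. The variable names (id, t, x1, x2, x1x2) and the data-generating process are assumptions made purely for illustration.

```stata
* Simulated panel: 50 units observed over 4 periods (illustrative data only)
clear
set seed 12345
set obs 200
generate int id = ceil(_n/4)
bysort id: generate int t = _n
generate double x1 = rnormal()
generate double x2 = rnormal()
generate double y  = 1 + x1 + x2 + 0.5*x1*x2 + rnormal()
xtset id t

* Build the interaction in levels, then difference it like any other variable
generate double x1x2 = x1*x2
regress D.y D.x1 D.x2 D.x1x2, vce(cluster id)
```

Inside a program, the same idea would apply to a macro: generate the product into a temporary variable (tempvar) and difference that variable, rather than applying D. to the macro contents directly.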
        Last edited by Sebastian Kripfganz; 05 Jan 2022, 09:44.
        https://twitter.com/Kripfganz

        • I would like to suggest a few changes to the way the RESET test (estat ovtest) is implemented:

          1 - The most important one is that the test should be based on the same type of covariance matrix used in the estimation of the main model. It does not make sense to run a model with some form of robust standard errors, and then perform the RESET with plain-vanilla standard errors.

          2 - This is more a question of taste, but personally I would prefer if the number of powers included by default could be reduced to just one or two (or that we have an option to choose the number of powers to include).

          3 - Finally, it would be great if the misleading name of the command could be changed. I know that this may be asking too much, but at least perhaps we could have estat reset as a synonym for estat ovtest.
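Until something like this is built in, points 1 and 2 can be approximated by hand. The following is only a sketch under stated assumptions: it uses the auto dataset shipped with Stata, an arbitrary illustrative model, and a single power (the square) of the fitted values, with the same vce(robust) in both the main and the auxiliary regression. The variable names xb and xb2 are introduced here for illustration.

```stata
sysuse auto, clear

* Main model with heteroskedasticity-robust standard errors
regress price mpg weight, vce(robust)

* Fitted values and their square (one power only);
* dividing by 1000 is a rescaling for numerical convenience
* and does not affect the test
predict double xb, xb
generate double xb2 = (xb/1000)^2

* Auxiliary regression with the same VCE, then a robust Wald test
regress price mpg weight xb2, vce(robust)
test xb2
```

Adding further powers would only require generating them and listing them in both the auxiliary regression and the test command.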

          • Originally posted by Joao Santos Silva
            I would like to suggest a few changes to the way the RESET test (estat ovtest) is implemented:

            1 - The most important one is that the test should be based on the same type of covariance matrix used in the estimation of the main model. It does not make sense to run a model with some form of robust standard errors, and then perform the RESET with plain-vanilla standard errors.

            2 - This is more a question of taste, but personally I would prefer if the number of powers included by default could be reduced to just one or two (or that we have an option to choose the number of powers to include).

            3 - Finally, it would be great if the misleading name of the command could be changed. I know that this may be asking too much, but at least perhaps we could have estat reset as a synonym for estat ovtest.
            I agree with Joao about RESET. In fact, I had a Twitter thread on this back in March: https://twitter.com/jmwooldridge/sta...12169036201985

            • Have the missing values tables store all results (not just the last one, as with mdesc) in a table.

              • I expect that an earlier post in this topic has repeated the continual request that Stata address the problems created by the merge m:m command. In answering a question today I happened to review the help merge documentation and saw that it includes no reference to the problems with merge m:m.

                If the underlying issues cannot be addressed directly, I suggest that the output of help merge be expanded to include a warning derived from the warning that appears in the PDF documentation, since it's often an uphill battle to get new users to read more than the help output, if even that.

                At the same time, there's a particular problem in both the PDF and the help output: the introduction includes

                merge can perform match merges (one-to-one, one-to-many, many-to-one, and many-to-many), which are often called 'joins' by database people.
                No database person anywhere ever used the SQL join command to accomplish what is produced by Stata's "many-to-many" merge command. But a database person might interpret the quoted statement as equating the Stata "many-to-many" merge with the SQL m-by-m join (I did when I was new to Stata). The m-by-m join is equivalent not to Stata's merge m:m command but to Stata's joinby command, which is mentioned nowhere in the output of help merge.

                If this pothole can't be fixed, at least make a better effort to steer new users around it.

                • I completely agree with the spirit of William's post #232. Users should be steered away from -merge m:m- since what they wish to accomplish is better handled by -joinby-.

                  To make things equivalent, -merge m:m- is precisely an SQL full join on a common, coalesced ID. That is, observations with unmatched identifiers in either dataset are retained by default with -merge m:m-. In contrast, -joinby- defaults to removing those unmatched observations, which is usually what is desired. This behaviour can be counteracted by adding the -unmatched(both)- option to -joinby-. That said, if you know you don't have unmatched identifiers, -merge m:m- wouldn't necessarily be wrong, but it would be comparatively inefficient, and that is easily avoided nevertheless.

                  Toy example

                  Code:
                  tempfile a b    // tempfile (not tempname) gives temporary file paths for save/use
                  input byte(a b)
                  1 4
                  1 6
                  2 9
                  3 3
                  5 .
                  end
                  sort a
                  save `a', replace
                  list
                  
                  drop _all
                  input byte(a c)
                  1 2
                  2 8
                  2 3
                  3 5
                  3 6
                  4 .
                  end
                  sort a
                  save `b', replace
                  list
                  
                  use `a', clear
                  merge m:m a using `b'
                  sort a b c
                  list
                  
                  use `a', clear
                  joinby a using `b', unmatched(both)
                  sort a b c
                  list a b c _merge
                  Equivalent join using SQL (with a SAS accent)

                  Code:
                  select coalesce(a.a, b.a) as a,
                          a.b, b.c
                    from one as a full join two as b
                    on a.a=b.a
                    order by a,b,c;

                  • @Leonardo Guizzeti #233 -

                    I will start by confessing that the last SQL I wrote was in 2014, so my memories of SQL have passed their half-life several times over by now.

                    You write

                    -merge m:m- is precisely an SQL full join on a common, coalesced ID
                    but that does not agree with my understanding, which is why I wrote the snarky comment beginning "No database person ..."

                    From the PDF documentation for merge we read

                    In an m:m merge, observations are matched within equal values of the key variable(s), with the first observation being matched to the first; the second, to the second; and so on. If the master and using have an unequal number of observations within the group, then the last observation of the shorter group is used repeatedly to match with subsequent observations of the longer group.
                    I do not recall any SQL that does matching in that fashion. To the limits of my memory it seems to me that every SQL join I did where key K appeared in I rows in the left table and J rows in the right table produced I*J rows in the resulting table with key K. Stata merge m:m describes a procedure that produces max(I,J) observations in the resulting dataset.

                    Starting with your example, if we replace datasets `a' and `b' with
                    Code:
                    input byte(a b)
                    1 1
                    1 2
                    end
                    
                    input byte(a c)
                    1 3
                    1 4
                    1 5
                    end

                    we achieve the following result from merge m:m
                    Code:
                    . list, clean
                    
                           a   b   c        _merge  
                      1.   1   1   3   Matched (3)  
                      2.   1   2   4   Matched (3)  
                      3.   1   2   5   Matched (3)
                    and from joinby
                    Code:
                    . list a b c _merge, clean
                    
                           a   b   c                          _merge  
                      1.   1   1   3   both in master and using data  
                      2.   1   1   4   both in master and using data  
                      3.   1   1   5   both in master and using data  
                      4.   1   2   3   both in master and using data  
                      5.   1   2   4   both in master and using data  
                      6.   1   2   5   both in master and using data
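The comparison above can be reproduced end to end with the following sketch. The tempfile names left and right and the locals n_merge and n_joinby are introduced here for illustration only; the data are the two small datasets shown above.

```stata
tempfile left right

clear
input byte(a b)
1 1
1 2
end
save `left'

clear
input byte(a c)
1 3
1 4
1 5
end
save `right'

* merge m:m pairs observations sequentially within the key: max(I,J) rows
use `left', clear
merge m:m a using `right'
count
local n_merge = r(N)

* joinby forms all pairwise combinations within the key: I*J rows
use `left', clear
joinby a using `right'
count
local n_joinby = r(N)

display "merge m:m: `n_merge' observations; joinby: `n_joinby' observations"
```

With I = 2 and J = 3 observations for key 1, this matches the listings above: three observations from merge m:m and six from joinby.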
                    Last edited by William Lisowski; 07 Jan 2022, 10:56. Reason: Corrected J*K to I*J

                    • Originally posted by William Lisowski
                      I do not recall any SQL that does matching in that fashion. To the limits of my memory it seems to me that every SQL join I did where key K appeared in I rows in the left table and J rows in the right table produced J*K rows in the resulting table with key K. Stata merge m:m describes a procedure that produces max(I,J) observations in the resulting dataset.
                      Ah, well spotted, and this clearly demonstrates the deeper issue with -merge m:m-. I too would have expected the result to be J*K in size, so I evidently picked a toy problem that didn't properly test the two programs. My previous post can be disregarded, and thank you for the clear elucidation of the issue.

                      • A final followup to #236, the -merge m:m- behaviour is akin to a data step match-merge join in SAS, for those familiar. The two can only be made to coincide when, within id variables, the join operation involves a one-to-many or one-to-one (either direction) relationship.

                        • Pay no mind to #225, apparently state space models are the same thing (essentially) as Bayesian Structural time series models.

                          I really should take a class about Bayesian stats one of these days.

                          • I would like it if putdocx were able to include hyperlinks. I use putdocx to automatically generate a codebook from a dataset. Since a codebook is not intended to be read cover to cover (unless you suffer from a really bad case of insomnia), you want to allow the user to jump back and forth between a list of variables and the detailed descriptions of the individual variables, which is what I want to use hyperlinks for.
                            ---------------------------------
                            Maarten L. Buis
                            University of Konstanz
                            Department of history and sociology
                            box 40
                            78457 Konstanz
                            Germany
                            http://www.maartenbuis.nl
                            ---------------------------------

                            • Still on the subject of putdocx, I would like to see Scalable Vector Graphics (.svg) added to the list of file formats that can be embedded by putdocx image. And it might be even better if this was done later in Stata 17.

                              • Maarten Buis,

                                Code:
                                help putdocx paragraph
                                in the text_options table, see hyperlink(link) option. Does this meet your need?
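A minimal sketch of that option, assuming a Stata version in which putdocx text accepts hyperlink() as listed in the text_options table mentioned above. The file name codebook_demo.docx and the link target are placeholders chosen for illustration; whether the option can point to locations inside the same document is not established here.

```stata
* Start a fresh document (capture: harmless if no document is open)
capture putdocx clear
putdocx begin

putdocx paragraph
putdocx text ("More information: ")
* hyperlink() turns the text into a clickable link
putdocx text ("the Stata website"), hyperlink("https://www.stata.com")

putdocx save codebook_demo.docx, replace
```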
                                Last edited by Hua Peng (StataCorp); 11 Jan 2022, 11:22.
