  • How to run regressions from varlist depending on levels of each item in varlist? (thoughts and attached halfway approach)

    Dear Statalisters,
    I've attached a do-file that I hope clarifies this question, using one of the sysuse datasets. Following forum rules, I will try to be as specific as possible:

    What I've been doing (see lines 22-44 of attached do-file):
    - Typically, when I run a large number of regressions, I specify a varlist local and loop over the appropriate regressions based on the number of levels of those variables.
    E.g., for binary y variables, I write a local and loop logistic regressions; for unordered categorical y variables, I run mlogit; and for continuous variables, I run regress.

    What I'd like to do (see the half-baked attempts and notes):
    - Rather than specify each kind of variable manually, I want to create locals for varlists of binary (2 levels, e.g. male/female), categorical (3/6 levels), and continuous (e.g. > 10 levels).
    - From these locals I want to then run the appropriate regression model based on the functional form of the y variable.

    My intuition (again see lines 44+ of attached do-file):
    - Use the "ds" command or the "cond()" function to specify the locals; I searched Statalist for this and could not find an example.

    Main motivation:
    - I'm working on a large, collaborative project with limited resources and a short timeline, and I need flexible syntax because I *know* we will change our dependent variables. I want modular syntax to save us money (and me time).
    - Project specifics: The data are confidential and I cannot share them, so the do-file uses one of the built-in Stata datasets. Details, however: I'm looking at perhaps 100 outcomes, perhaps more (we are trying to produce different groups of analysis). Obviously, the above strategy of automating the forms of each outcome variable is not perfect, but it will help us expedite exploratory data analysis; the specifications I identified for outcome variables fit 99% of our variables.

    Thanks much!!

    - Nate
    Nathan E. Fosse, PhD
    [email protected]

  • #2
    Nathan,

    How about using levelsof and wordcount()? Here is an example of their use, which could easily be adapted for your purposes.

    Code:
    sysuse auto
    levelsof rep78, local(a)
    di wordcount("`a'")
    You may need to be careful with truly continuous variables with many levels, which might exceed the maximum macro size.

    Regards,
    Joe
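
    A minimal sketch of how the levelsof idea might be extended to build the three locals asked for in #1, assuming numeric outcome variables and the cutoffs used in this thread (2 levels = binary, 3 to 6 = categorical, more than 6 = continuous). The local names binary, categorical, and continuous are hypothetical, and the `word count` extended macro function is swapped in for wordcount(); neither comes from the posts above.

```stata
sysuse nlsw88, clear

* start with empty lists (names are illustrative only)
local binary
local categorical
local continuous

foreach v of varlist married collgrad union age wage tenure race {
    quietly levelsof `v', local(lv)      // distinct nonmissing values
    local k : word count `lv'            // how many distinct values
    if `k' == 2 {
        local binary `binary' `v'
    }
    else if inrange(`k', 3, 6) {
        local categorical `categorical' `v'
    }
    else if `k' > 6 {
        local continuous `continuous' `v'
    }
}

display "binary:      `binary'"
display "categorical: `categorical'"
display "continuous:  `continuous'"
```

    Each list can then drive its own loop, for example foreach y of local binary with logit inside. Variables with fewer than 2 levels fall into none of the lists, which seems the safest default for constants.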

    • #3
      An alternative is

      Code:
      sysuse auto
      qui tab rep78 
      di r(r)
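
      One caveat, sketched under the assumption that some outcomes have very many distinct values: tabulate stops with error r(134) ("too many values") when a variable exceeds its row limit, which would halt a loop. A possible guard is to capture the error and flag such a variable as having many levels; the local name k is hypothetical.

```stata
sysuse auto, clear

capture tabulate price          // capture suppresses output and traps errors
if _rc == 0 {
    local k = r(r)              // number of distinct nonmissing values
}
else if _rc == 134 {
    local k = .                 // too many values to tabulate: flag as "many"
}
else {
    error _rc                   // re-raise anything unexpected
}
display "levels of price: `k'"
```

      With this convention a missing k would need to be routed to the continuous case explicitly, since a condition such as k < . excludes it.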

      • #4
        Dear Nick and Joe,
        Thanks so much for your help! I think I'm most of the way there. Nick, I used your bit of code to do something similar, and included Ben Jann's esttab features. You may also see the commented-out "janky" code.

        One question is how we might aggregate findings so that the estimates are "binned". The output, as it is, spits out separate tables for each regression... I know I'm overlooking something simple... ugh.

        Thanks for helping me thus far.

        Best,
        - Nate

        Revised code (also see attached with older version commented out):

        Code:
        ********************************************************************************
        * My stab at the new approach, thanks to your help! :-)

        * preliminaries
        set more off
        cap clear
        cap log close

        * use example dataset
        sysuse nlsw88.dta, clear

        * Define local macros

        loc yvars ///
            married collgrad union /// these should be logits
            age wage tenure /// these should be ols
            race // this should be mlogit

        local xvars grade


        * loop over yvars
        * for each y var, determine the number of levels and pick the model

        foreach y of varlist `yvars' { // loop over yvars
            qui tab `y' // determine levels
            loc `y'_r = r(r) // store the number of levels as a value
            di ``y'_r'
            if (``y'_r' == 2) { // if num levels is 2, use logit
                est clear
                eststo: qui logit `y' `xvars'
                esttab, title(Logits)
            }
            else if (``y'_r' >= 3) & (``y'_r' <= 6) { // if num levels between 3 and 6 inclusive, use mlogit
                est clear
                eststo: qui mlogit `y' `xvars'
                esttab, unstack title(Mlogits)
            }
            else if (``y'_r' > 6) & (``y'_r' < .) { // if num levels is more than 6 and not missing, use ols
                est clear
                eststo: qui reg `y' `xvars'
                esttab, title(OLS)
            }
        }
        ********************************************************************************
        * Wrap it up
        beep
        Last edited by Nathan E. Fosse; 16 Dec 2014, 00:28.
        Nathan E. Fosse, PhD
        [email protected]
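
        On the question of getting one table per model type rather than one per regression, here is a possible sketch. It assumes the estout package (eststo, esttab) is installed and uses hypothetical stored-estimate names such as lg_married. The idea is to clear estimates once before the loop, store each fit under its own name, collect the names by model type, and call esttab once per group afterwards.

```stata
sysuse nlsw88, clear

local yvars married collgrad union age wage tenure race
local xvars grade

eststo clear                         // clear once, not inside each branch
local logits
local mlogits
local ols

foreach y of varlist `yvars' {
    quietly tabulate `y'
    local k = r(r)                   // number of distinct nonmissing values
    if `k' == 2 {
        eststo lg_`y': quietly logit `y' `xvars'
        local logits `logits' lg_`y'
    }
    else if inrange(`k', 3, 6) {
        eststo ml_`y': quietly mlogit `y' `xvars'
        local mlogits `mlogits' ml_`y'
    }
    else if `k' > 6 & `k' < . {
        eststo ols_`y': quietly regress `y' `xvars'
        local ols `ols' ols_`y'
    }
}

* one table per model type, skipping any empty group
if "`logits'"  != "" esttab `logits', title(Logits)
if "`mlogits'" != "" esttab `mlogits', unstack title(Mlogits)
if "`ols'"     != "" esttab `ols', title(OLS)
```

        The prefixes lg_, ml_, and ols_ are only there to keep stored names unique; with long variable names they may need shortening to stay within the length limit on estimate names.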
