  • How to make a subset of data?

    Good morning all,

    I am working with a dataset on transition teams and acquisition success.
    To test one hypothesis, I only need the data from respondents who answered 'yes' to the question of whether they had a transition team.
    However, I do not want to drop the rest of the data, since it is needed to answer other questions and to see the difference between the groups.
    Could someone explain to me how to create this subset without losing any data?

    Thanks a lot in advance!

  • #2
    Mirte:
    welcome to this forum.
    It's always wise to avoid dropping data and to use an -if- qualifier instead:
    Code:
    . sysuse auto.dta
    (1978 Automobile Data)
    
    . regress price mpg if foreign==0
    
          Source |       SS           df       MS      Number of obs   =        52
    -------------+----------------------------------   F(1, 50)        =     17.05
           Model |   124392956         1   124392956   Prob > F        =    0.0001
        Residual |   364801844        50  7296036.89   R-squared       =    0.2543
    -------------+----------------------------------   Adj R-squared   =    0.2394
           Total |   489194801        51  9592054.92   Root MSE        =    2701.1
    
    ------------------------------------------------------------------------------
           price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             mpg |  -329.2551   79.74034    -4.13   0.000    -489.4183   -169.0919
           _cons |   12600.54   1624.773     7.76   0.000     9337.085    15863.99
    ------------------------------------------------------------------------------
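
    Applied to the original question, the same idea might look like the sketch below. The variable names (transition_team, success, integration) and the toy values are hypothetical and stand in for the real survey variables; the sketch also assumes the 'yes' answer is coded numerically as 1. If the answer is stored as a string, the condition would instead be something like -if transition_team == "yes"-.

    ```stata
    * Toy data with hypothetical variable names (not the poster's real data)
    clear
    input id transition_team success integration
    1 1 1 3
    2 1 0 2
    3 0 1 4
    4 1 1 5
    5 0 0 1
    6 1 0 2
    7 0 1 3
    8 1 1 4
    end

    * Restrict the estimation sample only; nothing is dropped from memory
    regress success integration if transition_team == 1

    * All 8 observations are still in the dataset
    count

    * Observations actually used by the last estimation command
    count if e(sample)
    ```

    Because the restriction lives in the command rather than in the data, the next command can use the full sample or the complementary group (-if transition_team == 0-) without reloading anything.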
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Hello Carlo,

      Thank you for your reply!
      This solution (quite easy, in the end) worked!
      However, Stata now reports that the independent variable and the moderating variable are omitted because of collinearity.
      Do you perhaps also know how to solve this?
      Greatly appreciated!



      • #4
        Mirte:
        unfortunately, extreme multicollinearity is a property of the predictors already in your model; hence, you can get rid of it only by changing the specification of your regression model.
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          Mirte,

          You'll increase our ability to help you by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stata output, and sample data using dataex.

          If you really have variables omitted due to collinearity, you might look into why they are omitted. There may be a reason they don't vary in the subsample. For example, if you have a variable that only respondents without a transition team can answer, and you code "not answered" as 0, then you could get collinearity in the subsample with transition teams. You might find it helpful to regress the omitted variable on the other right-hand-side variables in the subsample.
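
          A minimal sketch of those two checks, using the auto dataset from earlier in the thread rather than the poster's data. The variable mpg2 is constructed here purely for illustration, as an exact linear function of mpg, and foreign plays the role of a variable that does not vary within the subsample.

          ```stata
          sysuse auto, clear

          * Hypothetical predictor that is an exact linear function of mpg
          generate mpg2 = 2*mpg + 1

          * In the domestic subsample, foreign is constant and mpg2 is redundant,
          * so Stata omits variables because of collinearity
          regress price mpg mpg2 foreign if foreign == 0

          * Check 1: does each predictor vary in the subsample?
          * A standard deviation of 0 means the variable is constant there
          summarize mpg mpg2 foreign if foreign == 0

          * Check 2: regress a suspect variable on the other right-hand-side
          * variables; an R-squared of (essentially) 1 means it is a linear
          * combination of them in this subsample
          regress mpg2 mpg if foreign == 0
          ```

          A variable that fails the first check cannot be estimated in that subsample at all; one that fails the second can only be kept by dropping or recombining the predictors it duplicates.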
