How to make a subset of data?

Mirte Roos

Join Date: May 2019

Posts: 3
#1


16 May 2019, 02:41

Good morning all,

I am working on a database about transition teams and acquisition success.
To test one hypothesis, I only need the data from the respondents who answered 'yes' to the question of whether they had a transition team.
However, I do not want to drop the rest of the data, since it is needed to answer other questions and to see the difference.
Could someone explain to me how to create this subset without losing any data?

Thanks a lot in advance!

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17714

16 May 2019, 02:59

Mirte:
welcome to this forum.
It's always wise to avoid dropping data and to use an -if- qualifier instead:

Code:

. sysuse auto.dta
(1978 Automobile Data)

. regress price mpg if foreign==0

      Source |       SS           df       MS      Number of obs   =        52
-------------+----------------------------------   F(1, 50)        =     17.05
       Model |   124392956         1   124392956   Prob > F        =    0.0001
    Residual |   364801844        50  7296036.89   R-squared       =    0.2543
-------------+----------------------------------   Adj R-squared   =    0.2394
       Total |   489194801        51  9592054.92   Root MSE        =    2701.1

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -329.2551   79.74034    -4.13   0.000    -489.4183   -169.0919
       _cons |   12600.54   1624.773     7.76   0.000     9337.085    15863.99
------------------------------------------------------------------------------
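Applied to the question above, the same idea might look like the sketch below. It runs on auto.dta so that it is reproducible; the names transition_team, success and x in the last comment are hypothetical stand-ins for the poster's own variables, and the coding 1 = yes is an assumption. The preserve/restore variant is an alternative to the -if- qualifier, not something suggested in the reply above.

```stata
* Sketch: restrict one analysis to a subsample without dropping anything.
sysuse auto, clear

* Option 1: -if- qualifier on the estimation command itself
regress price mpg if foreign == 0

* Option 2: keep the subsample temporarily, then bring the full data back
preserve
keep if foreign == 0
regress price mpg
restore

* The full dataset is intact again (74 observations in auto.dta)
count

* Hypothetical equivalent for the survey data, assuming 1 = yes:
*   regress success x if transition_team == 1
```

Writing the condition as == 1 rather than != 0 matters when the variable has missing values: Stata treats numeric missing as larger than any number, so != 0 would also select respondents who did not answer.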

Kind regards,
Carlo
(Stata 19.0)


Mirte Roos

Join Date: May 2019

Posts: 3
#3

16 May 2019, 03:25

Hello Carlo,

Thank you for your reply!
This (in the end quite easy) solution worked!
However, Stata now tells me that the independent variable and the moderating variable are omitted because of collinearity.
Do you perhaps also know how to solve this?
Greatly appreciated!
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17714
#4

16 May 2019, 03:50

Mirte:
unfortunately, extreme multicollinearity is a property of the predictors already in your model; hence, you can get rid of it only by changing the specification of your regression model.

Kind regards,
Carlo
(Stata 19.0)
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#5

17 May 2019, 11:15

Mirte,

You'll increase our ability to help you by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stata output, and sample data using dataex.
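For the sample-data part, a minimal sketch of a dataex call is shown here on auto.dta; the variable list and the range of observations are only illustrative. As far as I know, dataex ships with recent Stata releases, and on older installations it may need to be installed first with ssc install dataex.

```stata
* Sketch: produce a small, copy-and-paste-ready data example for a forum post.
sysuse auto, clear

* List the first 10 observations of the variables used in the problem command
dataex price mpg foreign in 1/10
```

The output can then be pasted into the post between code delimiters, together with the exact command that was run and the output Stata returned.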

If you really have variables omitted due to collinearity, you might look into why they are omitted. There may be a reason they don't vary in the subsample. For example, if you have a variable that only folks without a transition team can answer, and you code 'not answered' as 0, then you could get collinearity in the subsample with transition teams. You might find it helpful to regress the omitted variable on the other rhs variables in the subsample.
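The two checks described above can be sketched on auto.dta as follows. The variables only_domestic and z are constructed purely for illustration (they are not in the original data or the thread), and foreign == 1 plays the role of the subsample with transition teams.

```stata
* Sketch: why a predictor can be omitted in a subsample but not in the full data.
sysuse auto, clear

* Case A (hypothetical): "not answered" coded as 0, so the variable is
* constant within the subsample of interest
generate only_domestic = cond(foreign == 0, weight, 0)
summarize only_domestic if foreign == 1          // no variation: min equals max
regress price mpg only_domestic if foreign == 1  // only_domestic is omitted

* Case B (hypothetical): an exact linear function of another predictor,
* but only within the subsample
generate z = cond(foreign == 1, 2*mpg, weight)
regress price mpg z if foreign == 1              // one of mpg and z is omitted
regress z mpg if foreign == 1                    // auxiliary regression: perfect fit
```

In Case A, summarize already shows the problem. In Case B, the auxiliary regression fits perfectly, which indicates that the omitted variable is a linear combination of the other right-hand-side variables within that subsample.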