Dear all,
I am struggling to understand which observations Stata uses in the estimation below, especially in the margins output: I do not see where the 42,890 observations it reports come from.
I am working with survey data and want to run a logit regression on a subsample. My dataset contains 130,000 firms (identified by the variable idstd), each with its corresponding sampling weight. I am interested in firms that have a loan that was approved within the last 4 years. The variable loan takes the value 1 if the firm has a loan and 0 otherwise. The variable loan_duration records the number of years since the loan was granted. My dependent variable, fin11, is binary and records whether the loan required collateral. This is my code:
Code:
svyset, clear
svyset idstd [pweight=wt], strata(strata) singleunit(scaled)
svy, subpop(loan if loan_duration<=4): logit fin11 n_outcome i.k9 lnemployees
margins, dydx(*)
One of the explanatory variables, k9, records the type of financial institution that granted the loan. Given all this, my question is whether I have defined my subpopulation properly, bearing in mind that loan_duration, k9, and fin11 are all missing (.) when loan is 0 or missing.
Here is the output of my code:
Code:
. svy, subpop(loan if loan_duration<=4): logit fin11 n_outcome i.k9 lnemployees
(running logit on estimation sample)

Survey: Logistic regression

Number of strata   =     1,049                  Number of obs     =    123,517
Number of PSUs     =   123,517                  Population size   =  8,216,935
                                                Subpop. no. obs   =     31,993
                                                Subpop. size      =  1,828,643
                                                Design df         =    122,468
                                                F(   5, 122464)   =      15.01
                                                Prob > F          =     0.0000

---------------------------------------------------------------------------------------------------------
                                        |             Linearized
                                  fin11 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------------------------------+----------------------------------------------------------------
                              n_outcome |  -.4922472   .1218816    -4.04   0.000    -.7311332   -.2533613
                                        |
                                     k9 |
 State-owned banks or government agency |   .4004102   .1653038     2.42   0.015     .0764176    .7244028
        Non-bank financial institutions |  -.0593137   .2005896    -0.30   0.767    -.4524659    .3338386
                                  Other |  -.6965148   .3219102    -2.16   0.030    -1.327453   -.0655761
                                        |
                            lnemployees |   .2936962   .0494761     5.94   0.000      .196724    .3906685
                                  _cons |  -.1388981   .1671363    -0.83   0.406    -.4664825    .1886863
---------------------------------------------------------------------------------------------------------
Note: 196 strata omitted because they contain no subpopulation members.
Note: Variance scaled to handle strata with a single sampling unit.

. margins, dydx(*)

Average marginal effects                        Number of obs     =     42,890
Model VCE    : Linearized

Expression   : Pr(fin11), predict()
dy/dx w.r.t. : n_outcome 2.k9 3.k9 4.k9 lnemployees

---------------------------------------------------------------------------------------------------------
                                        |            Delta-method
                                        |      dy/dx   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------------------------------+----------------------------------------------------------------
                              n_outcome |  -.1081822    .027127    -3.99   0.000    -.1613506   -.0550138
                                        |
                                     k9 |
 State-owned banks or government agency |   .0866893   .0347437     2.50   0.013     .0185923    .1547864
        Non-bank financial institutions |  -.0135546   .0460373    -0.29   0.768     -.103787    .0766778
                                  Other |  -.1634197   .0752488    -2.17   0.030    -.3109061   -.0159332
                                        |
                            lnemployees |   .0645462   .0107651     6.00   0.000     .0434468    .0856456
---------------------------------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.
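To see where each observation count might come from, one option is to flag the estimation sample and the subpopulation explicitly and then ask margins for the same subpopulation. This is only a minimal sketch and has not been run on these data. It assumes the variable names used in this post and that loan is coded 0/1/missing; in_sample and in_subpop are hypothetical helper names. It has to be run immediately after the svy: logit command so that e(sample) is still available.

```stata
* Run right after: svy, subpop(loan if loan_duration<=4): logit ...
* in_sample and in_subpop are hypothetical helper variables.
generate byte in_sample = e(sample)
generate byte in_subpop = in_sample & loan == 1 & loan_duration <= 4

* Compare these with "Number of obs" and "Subpop. no. obs" in the logit header.
count if in_sample
count if in_subpop

* Observations in the estimation sample with nonmissing predictors.
* If this matches the margins "Number of obs", margins is averaging over
* every such firm, not only over the subpopulation.
count if in_sample & !missing(n_outcome, k9, lnemployees)

* Restrict margins to the same subpopulation as the logit.
* vce(unconditional) is assumed to be appropriate here because the
* covariates come from a survey sample rather than being fixed.
margins, subpop(loan if loan_duration<=4) dydx(*) vce(unconditional)
```

If the margins header then reports the same number of subpopulation observations as the logit header, both commands are using the same firms.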
I hope you can help me.
Thank you in advance!