Rank Variable for Income, then further for region

Clyde Schechter

Join Date: Apr 2014

Posts: 30194
#16

03 Aug 2019, 10:37

That is what the code in #4 does.
Comment

Em Arham

Join Date: Jul 2019
Posts: 11

#17

08 Aug 2019, 15:40

Clyde Schechter Hi,

I used the code to create rel_rank. I also included controls in the regression, such as:

bq_lfsat1	Satisfaction with: health
ln_ann_inc	Log of Annual Income/ bq_fiyr
bq_lfsat5	Satisfaction with: job
bq_lfsat6	Satisfaction with: social life
bq_lfsat7	Satisfaction with: amount of leisure time

I have pasted the results below (dataex did not recognise reg).
I put the coefficients of most regions that give negative values in bold.
As you can see the R-squared is satisfactory, however my results that are in bold do not make sense in line with previous literature.

I would appreciate advice on how to proceed with these results.
Is it possible to have more of an income framed result for these region coefficients?
Please feel free to direct me to a better approach for such results.

      Source |       SS           df       MS      Number of obs   =     8,003
-------------+----------------------------------   F(25, 7977)     =    349.18
       Model |  5085.82144        25  203.432858   Prob > F        =    0.0000
    Residual |  4647.47782     7,977   .58260973   R-squared       =    0.5225
-------------+----------------------------------   Adj R-squared   =    0.5210
       Total |  9733.29926     8,002  1.21635832   Root MSE        =    .76329

------------------------------------------------------------------------------
   bq_lfsato |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    rel_rank |   .0175328   .0688033     0.25   0.799    -.1173396    .1524052
   bq_lfsat1 |   .1688925   .0069769    24.21   0.000     .1552159    .1825691
   bq_lfsat2 |   .1020794   .0068235    14.96   0.000     .0887035    .1154552
   bq_lfsat5 |   .1505373   .0071263    21.12   0.000     .1365679    .1645067
   bq_lfsat6 |   .2775043   .0089664    30.95   0.000     .2599278    .2950809
   bq_lfsat7 |   .0913944    .008289    11.03   0.000     .0751457     .107643
  ln_ann_inc |   .0012425   .0159535     0.08   0.938    -.0300306    .0325155
             |
   bq_region |
 Outer Lon.. |  -.0007561   .0861392    -0.01   0.993    -.1696114    .1680993
 R. of Sou.. |  -.0486639   .0746536    -0.65   0.515    -.1950045    .0976767
  South West |   .0324929   .0788431     0.41   0.680    -.1220601     .187046
 East Anglia |   .0300825   .0873702     0.34   0.731    -.1411859     .201351
 East Midl.. |   .0241474   .0797239     0.30   0.762    -.1321323    .1804272
 West Midl.. |  -.0540362   .0962531    -0.56   0.575    -.2427174    .1346449
 R. of Wes.. |  -.0336221   .0844893    -0.40   0.691    -.1992432    .1319991
 Greater M.. |   .0894747   .0886921     1.01   0.313    -.0843849    .2633344
  Merseyside |   .0181766   .1043024     0.17   0.862    -.1862833    .2226365
 R. of Nor.. |  -.0341868   .0876984    -0.39   0.697    -.2060985    .1377249
 South Yor.. |  -.0814491    .095367    -0.85   0.393    -.2683934    .1054952
 West York.. |  -.1627926   .0939151    -1.73   0.083    -.3468907    .0213055
 R. of Yor.. |   .0695254   .0920188     0.76   0.450    -.1108555    .2499062
 Tyne & Wear |   -.065404    .104975    -0.62   0.533    -.2711824    .1403744
 R. of North |   .0425326   .0898474     0.47   0.636    -.1335918     .218657
       Wales |  -.0387855   .0739758    -0.52   0.600    -.1837975    .1062265
    Scotland |  -.0545115    .073571    -0.74   0.459    -.1987299     .089707
 Northern .. |  -.0180789   .0748584    -0.24   0.809    -.1648209    .1286631
             |
       _cons |   1.384614   .1512606     9.15   0.000     1.088104    1.681125
------------------------------------------------------------------------------

Any help or advice would be greatly appreciated.

Comment

Clyde Schechter

Join Date: Apr 2014

Posts: 30194
#18

08 Aug 2019, 17:36

As you can see the R-squared is satisfactory, however my results that are in bold do not make sense in line with previous literature.

I can't give you any specific advice. Unless you are running the exact same commands on the exact same data as the previous literature you are looking at, you can't expect to get the same results. If your data are from a similar population, sampled in a similar way, are free of errors, and if your variables are defined in the same way and measured in the same way, and if your regression is using the same variables as the ones in the literature, then normally you would expect the results to be reasonably similar. But I would take a look at each of the "ifs" I mentioned in the previous sentence rather carefully: you may find that one or several of them are not true of your project.
Comment
Em Arham

Join Date: Jul 2019

Posts: 11
#19

09 Aug 2019, 03:22

I understand. I will have to work out which of those ifs are not satisfied and report them.
However, as for the coefficient of the 'rel_rank' variable, created using the code in #4, it is my understanding that rel_rank is a position between 0 and 1.
If I were to multiply the coefficient by 100 and read the result as "a one percentage point increase in relative income rank is associated with such and such an increment in life satisfaction", would this be correct?
That is, is this how 'rel_rank' can be interpreted?
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30194
#20

09 Aug 2019, 10:09

If you were to multiply rel_rank by 100 and re-run the regressions, the coefficient of rel_rank would decrease by a factor of 100, and nothing else would change. And in that situation it would be reasonable to refer to differences in rel_rank as being denominated in percentage points.
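A minimal sketch of that rescaling, run on Stata's bundled auto data rather than the data in this thread. Here mpg stands in for rel_rank, and the names mpg100, b_orig and b_scaled are made up for illustration:

```stata
* Illustration only: auto.dta ships with Stata; mpg plays the role of rel_rank
sysuse auto, clear

regress price mpg weight
scalar b_orig = _b[mpg]

* Rescale the regressor by 100 and refit the same model
generate double mpg100 = 100 * mpg
regress price mpg100 weight
scalar b_scaled = _b[mpg100]

* The ratio of the two coefficients should be 100, up to rounding error
display b_orig / b_scaled
```

The coefficient on weight, the R-squared, and the t statistic on the rescaled variable should be the same in both runs.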
Comment
Em Arham

Join Date: Jul 2019

Posts: 11
#21

15 Aug 2019, 06:22

Clyde Schechter Hi,

I've made significant progress with your relative rank variable, and I am now creating the same kind of variable for Age instead of Region.

I just wanted to check my code and ask whether you think it is fine:

bysort bq_fiyr : gen Age_25_29_Rank = (_n-1)/(_N-1) if bq_age_dv>=25 & bq_age_dv<=29

where bq_age_dv is the age variable and bq_fiyr is the income variable.
I have also taken into account the missing-value codes (-9) and (-7) in these variables.

Will the above mentioned code create the same rank variable for the Age group of 25 years - 29 years of age?
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30194
#22

15 Aug 2019, 07:06

No, that will not be correct.

I don't know what you mean when you say you have "taken account" of missing values in the variables: the code you show says nothing at all about them. Unless those observations have been eliminated, you have to code around them. In addition, adding the -if- condition in the way you have restricts the observations for which a rank will be generated, but it does not restrict the rank calculation to those observations. You need something like this:

Code:

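* missing() only flags Stata missing values, so codes such as -9 and -7
* must be recoded to missing before this is run
gen missing_income = missing(bq_fiyr)
gen byte included = inrange(bq_age_dv, 25, 29)
by missing_income included (bq_fiyr), sort: gen age_25_29_rel_rank = (_n-1)/(_N-1) ///
    if !missing_income & included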
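To see the difference between restricting which observations receive a value and restricting the calculation itself, here is a minimal sketch on Stata's bundled auto data (not the data in this thread). The variable names r_wrong and r_right are illustrative, with foreign standing in for the age-group indicator and mpg for income:

```stata
* Illustration only: auto.dta ships with Stata
sysuse auto, clear
sort mpg

* -if- only controls which observations get a value;
* _n and _N still refer to all 74 cars
generate double r_wrong = (_n - 1) / (_N - 1) if foreign

* Putting foreign in the by-list makes _n and _N count within the subgroup
bysort foreign (mpg): generate double r_right = (_n - 1) / (_N - 1) if foreign

summarize r_wrong r_right
```

In this sketch only r_right is guaranteed to run from 0 to 1 across the foreign cars, because its positions are counted within that subgroup alone.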
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35811
#23

15 Aug 2019, 08:01

Clyde Schechter is right on the main points but the code still assumes no ties, as again pointed out earlier. The FAQ cited in #9 says much more.
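One possible way to handle ties is sketched below with egen's rank() function, which by default gives tied values the average of the ranks they span. This is not necessarily what the FAQ recommends. It runs on Stata's bundled auto data, with mpg standing in for income and foreign for the grouping variable, and all new variable names are illustrative:

```stata
* Illustration only: auto.dta ships with Stata; mpg has many tied values
sysuse auto, clear

* Average ranks within each group; tied mpg values share one rank
bysort foreign: egen double mpg_rank = rank(mpg)

* Number of nonmissing mpg values in each group
bysort foreign: egen n_ok = count(mpg)

* Rescale so that ranks lie between 0 and 1 within each group
generate double mpg_rel_rank = (mpg_rank - 1) / (n_ok - 1)
```

Because tied observations share an average rank, the lowest and highest values within a group need not land exactly on 0 and 1 when there are ties at the extremes.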
Comment
Em Arham

Join Date: Jul 2019

Posts: 11
#24

17 Aug 2019, 06:28

Clyde Schechter & Nick Cox,
Thank you both for the help and for pointing me to the FAQ.

I have finished the income and rank coding and have now moved on to another section of my paper.

My question is about the -nlcom- command in Stata. I have an age variable (bq_age_dv) and I created a quadratic control for it, labelled Age Squared (AgeSQ).

I used nlcom -_b[bq_age_dv]/(2*_b[AgeSQ]) to find the 'turning point' and got the value 52, which I understand to be the turning point of the U-shaped relationship.

However, I would like to plot this and present this neatly in my paper.

Would you have any suggestions on what command would work best for this?

Any help is greatly appreciated,
Thank you
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30194
#25

17 Aug 2019, 07:17

First get rid of your AgeSQ variable, and re-run the regression using factor variable notation to represent the quadratic term. Then run -margins- and use -marginsplot-

Code:

regress ....c.bq_age_dv##c.bq_age_dv....
local turning_point = -_b[bq_age_dv]/(2*_b[c.bq_age_dv#c.bq_age_dv])
margins, at(bq_age_dv = (25(5)75))
marginsplot, xline(`turning_point')

Note: I assumed that 25 to 75 was a reasonable range of ages to show on the horizontal axis, with points on the curve separated by 5 years. If you have different preferences in that regard, just change those numbers.

-marginsplot- also accepts -graph twoway- options, so you can customize the appearance of the graph to your liking.
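For anyone who wants to try the whole sequence end to end, here is a minimal sketch on simulated data (not the data in this thread). The variables age and y, the seed, and the coefficients are all made up for illustration, chosen so that the true turning point is 50:

```stata
* Simulated data, for illustration only
* True turning point: 0.4 / (2 * 0.004) = 50
clear
set seed 12345
set obs 500
generate age = runiformint(25, 75)
generate y = 10 - 0.4*age + 0.004*age^2 + rnormal()

regress y c.age##c.age

* Point estimate and delta-method confidence interval for the turning point
nlcom (turning_point: -_b[age] / (2 * _b[c.age#c.age]))

* Predicted outcome over the age range, with the turning point marked
local turning_point = -_b[age] / (2 * _b[c.age#c.age])
margins, at(age = (25(5)75))
marginsplot, xline(`turning_point') xtitle("Age") ytitle("Predicted outcome")
```

The xtitle() and ytitle() options here are ordinary -graph twoway- options passed through -marginsplot-.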

IMPORTANT: This question really has nothing to do with the topic of this thread. The only connection is that it relates to what is, for you, the same project. But it is important to keep threads on topic. These are not dialogs between you and somebody else. These are public discussions, and other people come looking for help by searching on topics. If somebody has a question about graphing the output of a quadratic model, they will be unable to find this thread in a search on that topic. And if somebody else has questions about rank variables and searches this topic, they will waste their time reading this post and the preceding one. So, in the future, when the topic of the question changes, please start a new thread.
Comment
