Matrix for saving results of logistic regression

anne jagdberg

Join Date: Feb 2018

Posts: 28
#1

24 Feb 2023, 00:58

Dear Stata Users,

I run logistic regressions with the dependent variable HED and 8 independent variables (v1-v8) in 35 countries (COU).
What I want to do is save the odds ratios for each variable and each country, i.e., every participant from a given country gets the same value for each independent variable, namely that country's odds ratio.
This can also be in a new dataset.
Based on that I would like to run a cluster analysis to cluster the countries based on the results of the logistic regression.

Can anyone help me with saving the coefficients? I learnt about the matrix command and tried to read in the forum, something like

Code:

matrix list r(table)
matrix p_`x' = r(table)
matrix f_`x' = p_`x'[1, 1...]
matrix l_`x' = (`x')

... but I really do not know how to apply it to my analyses.

My command for the logistic regression is:

Code:

levelsof COU, local(levels)
foreach j of local levels {
    display `j'
    *forval j = 100/`r(max)' {
    *di "{title:`: label (nation) `j''}"
    logistic HED v1 v2 v3 v4 v5 v6 v7 v8 if COU == `j'
}

I would be really happy to get your help. It does not necessarily have to be the matrix command; I am happy to hear about other ways of saving the results as well!

Best
Anne

Maarten Buis

Join Date: Mar 2014
Posts: 3456

24 Feb 2023, 01:25

In this example the occupational categories play the role of country

Code:

// open example data
frames reset
sysuse nlsw88, clear

// prepare the data

gen byte occat = cond(occupation < 3, 1,                    ///
                 cond(inlist(occupation,5, 6, 8, 9), 2, 3)) ///
                 if !missing(occupation)
label variable occat "occupational category"
label define occat 1 "white collar" ///
                   2 "skilled"      ///
                   3 "unskilled"
label value occat occat

// make a copy of the data and use that to create a "dataset" of odds ratios
frame copy default or
frame change or
statsby _b _se, by(occat) clear : logit union i.south grade, or

// this stores the coefficients, to make them into odds ratios
// we use or = exp(coefficient)
gen or_south   = exp(_stat_2)
gen or_grade   = exp(union_b_grade)
gen odds_cons  = exp(union_b_cons)

// use the delta method to compute the standard errors:
// se_or = exp(coefficient)*se_coefficient
gen se_south = exp(_stat_2)*_stat_6
gen se_grade = exp(union_b_grade)*union_se_grade
gen se_cons  = exp(union_b_cons)*union_se_cons

// the confidence interval is computed using the confidence interval
// of the coefficients and then exponentiating the upper and lower bounds
// see: https://www.stata.com/support/faqs/statistics/delta-rule/
gen lb_south = exp(_stat_2       - invnormal(0.975)*_stat_6)
gen lb_grade = exp(union_b_grade - invnormal(0.975)*union_se_grade)
gen lb_cons  = exp(union_b_cons  - invnormal(0.975)*union_se_cons)
gen ub_south = exp(_stat_2       + invnormal(0.975)*_stat_6)
gen ub_grade = exp(union_b_grade + invnormal(0.975)*union_se_grade)
gen ub_cons  = exp(union_b_cons  + invnormal(0.975)*union_se_cons)
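
The two computations in the comments above can be written out as follows. This is only a sketch in standard notation: \hat\beta stands for an estimated coefficient (a log odds ratio), se(\hat\beta) for its standard error, and z_{0.975} for invnormal(0.975), which is about 1.96.

```latex
% delta method: first-order approximation of the standard error of exp(beta-hat)
\widehat{\mathrm{OR}} = \exp(\hat\beta), \qquad
\mathrm{se}\bigl(\widehat{\mathrm{OR}}\bigr)
  \approx \left.\frac{d\,\exp(\beta)}{d\beta}\right|_{\beta=\hat\beta}
          \mathrm{se}(\hat\beta)
  = \exp(\hat\beta)\,\mathrm{se}(\hat\beta)

% confidence interval: build it on the coefficient scale, then exponentiate the bounds
\Bigl[\,\exp\bigl(\hat\beta - z_{0.975}\,\mathrm{se}(\hat\beta)\bigr),\;
       \exp\bigl(\hat\beta + z_{0.975}\,\mathrm{se}(\hat\beta)\bigr)\Bigr]
```

Because the bounds are exponentiated rather than built as OR plus or minus a multiple of its standard error, the interval is not symmetric around the odds ratio and can never include negative values.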

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------


anne jagdberg

Join Date: Feb 2018

Posts: 28
#3

24 Feb 2023, 02:27

Thank you very much for the fast reply, this was indeed very helpful.
What I used was

statsby _b _se, by(COU) clear : logistic HED v1 v2 v3 v4 v5 v6 v7 v8

... and somehow it worked, but I think I did not express my goal clearly enough: I would like to have the value for each participant of each country, and now there is only one observation per country.
What do I have to change?

Thank you so much!

Best
Anne

Maarten Buis

Join Date: Mar 2014
Posts: 3456

24 Feb 2023, 02:39

Code:

// open example data
frames reset
sysuse nlsw88, clear

// prepare the data

gen byte occat = cond(occupation < 3, 1,                    ///
                 cond(inlist(occupation,5, 6, 8, 9), 2, 3)) ///
                 if !missing(occupation)
label variable occat "occupational category"
label define occat 1 "white collar" ///
                   2 "skilled"      ///
                   3 "unskilled"
label value occat occat

// make a copy of the data and use that to create a "dataset" of odds ratios
frame copy default or
frame change or
statsby _b, by(occat) clear : logit union i.south grade, or

// this stores the coefficients, to make them into odds ratios
// we use or = exp(coefficient)
gen or_south   = exp(_stat_2)
gen or_grade   = exp(union_b_grade)
gen odds_cons  = exp(union_b_cons)

// move back to the original data
frame change default

// establish a link between or and default
// 9 observations aren't matched because they have missing values on occat
frlink m:1 occat , frame(or)

// move the or variables to the original data
frget or_south or_grade odds_cons, from(or)


anne jagdberg

Join Date: Feb 2018

Posts: 28
#5

24 Feb 2023, 06:05

Sorry, I do not really get it.

Firstly,

frame copy default or
frame change or

this does not work; Stata does not know a command frame?

Then, I do not understand why the coefficients have to be converted to ORs, because with my logistic command I already get ORs, right?

Sorry for my stupid questions.

Maybe

statsby _b _se, by(COU) clear : logistic HED v1 v2 v3 v4 v5 v6 v7 v8

this (or something similar) is already the solution. Here I have one value per country. Is there a possibility to add it to my previous dataset, filled in for each participant? It should be matched via the country variable or so (I think you tried to show me, but it did not work. Sorry!)
Maarten Buis

Join Date: Mar 2014

Posts: 3456
#6

24 Feb 2023, 06:22

As to the first question: what version of Stata are you using?

As to the second question: logistic displays the odds ratios, but under the hood it is still a regular logistic regression. This means that the coefficients that logistic leaves behind, and that are collected by statsby, are the log(odds ratios) and not the odds ratios.
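
A quick way to check this is sketched below on the nlsw88 example data (not the original poster's data). The matrix name t is arbitrary; the sketch relies on r(table) holding the results as displayed, so its first row contains the odds ratios after logistic.

```stata
// fit the model with logistic, which displays odds ratios
sysuse nlsw88, clear
logistic union south grade

// r(table) holds the results as displayed; copy it before anything else
matrix t = r(table)

// the stored coefficient is on the log-odds scale ...
display "stored _b[grade]       = " _b[grade]
// ... so it has to be exponentiated to get the odds ratio
display "exp(_b[grade])         = " exp(_b[grade])
// which matches the first row of r(table)
display "odds ratio in r(table) = " t[1, colnumb(t, "grade")]
```

The second and third numbers should agree, while the first is their natural logarithm.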

anne jagdberg

Join Date: Feb 2018

Posts: 28
#7

24 Feb 2023, 06:24

To the first question: it is Stata 15.1.

Second one: Thanks, that's an important aspect. Got it!
anne jagdberg

Join Date: Feb 2018

Posts: 28
#8

24 Feb 2023, 06:46

What I tried is to use

foreach x of numlist 100 191 196 203 208 233 246 300 352 380 428 440 470 528 578 616 620 642 703 705 752 804 {
    display `x'
    logistic HED v1 v2 v3 v4 v5 v6 v7 v8 [pweight=WEIGHT] if COUNTRY==`x'
    matrix p_`x' = r(table)
    matrix f_`x' = p_`x'[1, 1...]
    matrix l_`x' = (`x')
    matrix p_`x' = f_`x', l_`x'
}

but I don't know how to go on.

Maarten Buis

Join Date: Mar 2014
Posts: 3456

24 Feb 2023, 07:55

Originally posted by anne jagdberg

To the first question: it is Stata 15.1.

That is important information, so important that the Statalist FAQ asks you to mention it at the very beginning. There is nothing wrong with using an older version of Stata, but if you tell us nothing, we have to make an assumption about which version you are using, and that assumption is that you are using the latest one. So the next time you ask a question on Statalist, mention your version of Stata.

Adapting my example to that older version gives:

Code:

// open example data
sysuse nlsw88, clear

// prepare the data

gen byte occat = cond(occupation < 3, 1,                    ///
                 cond(inlist(occupation,5, 6, 8, 9), 2, 3)) ///
                 if !missing(occupation)
label variable occat "occupational category"
label define occat 1 "white collar" ///
                   2 "skilled"      ///
                   3 "unskilled"
label value occat occat

// make and save temporary copy of the data
tempfile tofill
save `tofill'

// create a "dataset" of odds ratios
statsby _b, by(occat) clear : logit union i.south grade, or

// this stores the coefficients, to make them into odds ratios
// we use or = exp(coefficient)
gen or_south   = exp(_stat_2)
gen or_grade   = exp(union_b_grade)
gen odds_cons  = exp(union_b_cons)

// keep only the relevant variables
keep occat or_south or_grade odds_cons

// merge those variables to the original data
merge 1:m occat using `tofill'
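
For completeness, the matrix loop from #8 could be continued along the following lines. This is a sketch on the same example data, not on the original poster's data: the names results, row, id and ors are made up for illustration, and it relies on the first row of r(table) holding the odds ratios as displayed by logistic. It only uses commands that should be available in Stata 15.1.

```stata
// open example data and build the grouping variable used above
sysuse nlsw88, clear
gen byte occat = cond(occupation < 3, 1,                    ///
                 cond(inlist(occupation,5, 6, 8, 9), 2, 3)) ///
                 if !missing(occupation)

// one row of odds ratios per group, stacked into the matrix "results"
capture matrix drop results
levelsof occat, local(levels)
foreach x of local levels {
    quietly logistic union south grade if occat == `x'
    matrix t   = r(table)       // results as displayed
    matrix row = t[1, 1...]     // first row: the odds ratios
    matrix id  = (`x')          // 1 x 1 matrix holding the group code
    matrix row = row, id
    matrix results = nullmat(results) \ row
}
matrix colnames results = or_south or_grade odds_cons occat
matrix list results

// turn the matrix into a small dataset and merge it back on occat
preserve
clear
svmat results, names(col)
tempfile ors
save `ors'
restore
merge m:1 occat using `ors'
```

Observations with a missing occat stay unmatched, as in the statsby version. Unlike statsby, this route stores the odds ratios directly, so no exp() step is needed afterwards.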


anne jagdberg

Join Date: Feb 2018

Posts: 28
#10

24 Feb 2023, 10:29

Thank you very much, and my apologies for not mentioning the version. Next time I will do so for sure.
Again, thanks a lot for your patience and for giving advice so quickly.

Best
Anne