Hi All,
I'm hoping to check that my interpretation of Poisson regression models in Stata 17 makes sense.
I have a set of binary outcomes and several continuous and categorical variables from a cross-sectional study.
I have used Poisson regression with robust variance instead of log-binomial regression, because of the low prevalence and because age is a continuous covariate.
My data looks like this:
Code:
clear
input long(sex VitD_Summer) byte AGE long(FP educ)
2 1 71 2 2
1 1 76 2 2
2 1 81 2 0
2 1 62 3 1
2 0 47 2 1
2 1 55 2 0
1 0 36 2 1
1 0 53 2 1
1 1 26 2 2
2 1 58 3 1
2 0 65 2 2
1 . 60 3 0
1 0 36 2 1
1 1 19 2 0
1 0 37 2 2
1 1 19 2 1
1 1 51 3 2
2 0 55 2 1
2 1 65 2 0
1 1 48 2 1
1 1 34 2 2
2 1 76 2 0
2 1 74 2 0
1 1 62 1 2
1 0 37 2 0
2 0 63 1 1
1 0 80 1 1
1 1 37 3 2
1 1 83 2 1
2 1 75 1 2
1 1 31 2 2
1 1 61 2 1
1 0 75 1 2
1 . 43 2 2
2 1 66 1 1
2 0 67 2 1
1 0 56 2 1
1 0 54 2 1
1 1 71 2 1
2 1 60 2 0
2 1 73 2 0
2 1 38 2 2
1 0 64 2 1
1 1 55 3 2
2 1 46 2 1
1 0 36 2 1
1 1 50 4 1
2 0 44 2 1
2 . 73 2 0
1 0 72 2 1
1 1 71 2 1
2 1 69 2 0
1 0 82 2 1
2 1 64 3 0
2 1 78 1 0
2 1 82 1 0
1 1 19 2 2
1 1 37 1 1
2 1 71 2 1
1 1 20 1 0
1 0 20 2 1
2 . 82 2 1
2 1 81 2 1
1 0 77 2 0
2 0 62 1 1
1 . 51 2 0
1 1 54 3 0
1 1 76 1 0
1 1 84 1 0
2 1 40 2 1
1 0 64 1 1
1 1 75 2 0
1 1 46 2 1
1 0 62 1 1
2 1 38 2 0
1 1 31 2 1
2 1 59 2 1
1 0 78 1 0
1 1 72 2 1
1 1 48 2 1
1 1 45 2 2
2 1 82 2 0
1 1 18 2 0
1 1 69 2 1
1 1 39 2 2
1 0 18 2 0
1 1 28 2 2
2 1 19 2 0
2 1 28 3 0
1 1 76 2 0
1 1 54 2 2
2 1 67 2 2
2 1 88 1 0
1 0 71 2 1
1 0 40 2 2
1 0 29 2 2
2 1 60 3 0
2 1 62 2 0
1 1 79 1 1
1 1 62 2 0
end
label values sex sex
label def sex 1 "FEMALE", modify
label def sex 2 "MALE", modify
label values VitD_Summer VitD_Summer
label def VitD_Summer 0 "Correct <10mins", modify
label def VitD_Summer 1 "Over-estimated", modify
label values FP FP
label def FP 1 "FP I", modify
label def FP 2 "FP II", modify
label def FP 3 "FP III", modify
label def FP 4 "FP IV", modify
label values educ educ
label def educ 0 "No post-school qual", modify
label def educ 1 "VET", modify
label def educ 2 "Higher Education", modify
I then create a two-way table:
Code:
tabulate sex VitD_Summer, row
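As a cross-check on the table, the crude prevalence ratio for females versus males can be computed directly. With sex as the only covariate, the exponentiated coefficient from the Poisson model should give the same point estimate. This is only a sketch: the indicator variable `female` is a hypothetical name and is not part of my dataset.

```stata
* Hypothetical 0/1 indicator (not in the original data): 1 = FEMALE, 0 = MALE
generate byte female = (sex == 1) if !missing(sex)

* Crude risk (prevalence) ratio of over-estimating, females vs males
cs VitD_Summer female
```

The ratio reported by `cs` compares females with males, which is the same direction as the `1.sex` row when the model is fitted with `ib2.sex`.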
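To determine whether being female is associated with correctly identifying vitamin D requirements, I then perform step-wise modified Poisson regression. I also run a model with the base level switched to check the results for males.
Code:
glm VitD_Summer ib2.sex, fam(poisson) link(log) vce(robust) eform
glm VitD_Summer ib2.sex c.AGE, fam(poisson) link(log) vce(robust) eform
glm VitD_Summer ib2.sex c.AGE i.FP, fam(poisson) link(log) vce(robust) eform
glm VitD_Summer ib2.sex c.AGE i.FP i.educ, fam(poisson) link(log) vce(robust) eform
glm VitD_Summer ib1.sex c.AGE i.FP, fam(poisson) link(log) vce(robust) eform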
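Based on the results from the above, I would infer that males are more likely than females to over-estimate the vitamin D requirements in summer.
I think I can also infer that skin types 3 (p = 0.032) and 4 (p < 0.001) and VET education (p = 0.007) affect whether females are likely to over-estimate the vitamin D requirements in summer.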
Does this make sense?
Where I am a little confused is that when I run my model, the first level of each categorical variable is 'dropped' or not shown in the output unless I manually add it in. Is there a reason for this? Or is there a way I can get both presented using only one line of code?
For example, I have to run:
Code:
glm VitD_Summer ib2.sex, fam(poisson) link(log) vce(robust) eform
instead of the following, otherwise I get the computation for males automatically:
Code:
glm VitD_Summer i.sex, fam(poisson) link(log) vce(robust) eform
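One possibility for avoiding the `ib2.` prefix each time is to store the preferred base level with the variable itself. The following is a sketch only, assuming `glm` honours `fvset` settings in the same way as other estimation commands that accept factor variables:

```stata
* Store MALE (coded 2) as the default base level for sex
fvset base 2 sex

* With that setting, i.sex should behave like ib2.sex
glm VitD_Summer i.sex, fam(poisson) link(log) vce(robust) eform

* Remove the stored setting so the lowest value is the base again
fvset clear sex
```

This changes which level is treated as the reference; it does not by itself add a row for the reference level to the output.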
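Similarly, when running the above Poisson regression, the results exclude the first level of each categorical variable.
For example, education has three options: no post-school qualification, VET or higher education.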
But the model only shows us VET and higher education.
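My understanding is that the missing level is the reference category: every reported ratio is relative to it, so it has no estimate of its own. If the goal is just to see that category listed in the table, the `baselevels` display option may be enough. A sketch, assuming the option is accepted alongside `eform`:

```stata
* baselevels adds a row for the reference category of each factor variable
glm VitD_Summer ib2.sex c.AGE i.FP i.educ, fam(poisson) link(log) vce(robust) eform baselevels

* Or turn this on for every model in the current session
set showbaselevels on
```

This only changes what is displayed; the estimates for VET and higher education are still relative to no post-school qualification.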
Thanks in advance and sorry if this is very basic!
I hope I used the dataex command correctly!
