Hello everyone,
I have prepared a country-level panel dataset for my research (67 countries). There are missing values in both my dependent and independent variables. My independent variables are 6 factors extracted from 6 questions in the WVS surveys, which are conducted every 4-5 years. There are gaps in the WVS country data: some countries were surveyed in just one wave, while others were surveyed in an earlier wave, dropped from subsequent waves, and then resurveyed later. The WVS questions are also not the same in every wave; for instance, one of these 6 questions was dropped in Wave 4 but reappeared in later waves. Hence, the panel is unbalanced. (In addition to a PCA of these factors, I entered the individual independent variables separately in the regressions in another table.)
The same applies to my control variables, as I have some missing values for some of the country controls in some years.
My questions are:
1- I’m not sure whether my data can be analysed with panel data methods. How can I address the data attrition problem?
2- If I provide summary statistics for the full sample using only observations with no missing values on any variable (dependent, independent, and control variables), the number of observations falls to 365 for every variable. I think this might be misleading, because the number of observations differs across regressions depending on how many observations are available for the six factors.
tabstat dependent independent1 independent2 independent3 independent4 independent5 independent6 control1 control2 control3 ......control10 if !missing(dependent, independent1, independent2, independent3, independent4, independent5, independent6, control1, control2, control3, ......control10), stat(N mean sd min max) columns(statistics)
3- I’ve been asked to report basic panel data statistics using the xtsum command in Stata, but I run into the same problem described in point 2.
xtsum dependent independent1 independent2 independent3 independent4 independent5 independent6 control1 control2 control3 ......control10 if !missing(dependent, independent1, independent2, independent3, independent4, independent5, independent6, control1, control2, control3, ......control10)
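To make the two sample definitions in points 2 and 3 concrete, here is a minimal sketch of how they could be compared. It assumes the variable names from my data extract (Dependent, Indep1 to Indep6, Ln_GDP, FM_Education) stored in that order; country_id, n_missing, and complete are hypothetical names introduced only for this illustration, and the remaining controls would have to be added to each variable list. It is an illustration of the comparison, not a recommendation of either sample.

```stata
* Sketch only: country_id, n_missing and complete are hypothetical names.
* Drop the empty first row that appears in the data extract.
drop if CountryName == ""

* xtset needs a numeric panel identifier, so encode the country name.
encode CountryName, generate(country_id)
xtset country_id Year

* Count missing values per country-year and flag complete cases.
egen byte n_missing = rowmiss(Dependent Indep1-Indep6 Ln_GDP FM_Education)
generate byte complete = (n_missing == 0)

* Available-case statistics: N can differ from variable to variable.
tabstat Dependent Indep1-Indep6 Ln_GDP FM_Education, ///
    statistics(n mean sd min max) columns(statistics)

* Complete-case statistics: N is the same for every variable.
tabstat Dependent Indep1-Indep6 Ln_GDP FM_Education if complete, ///
    statistics(n mean sd min max) columns(statistics)

* Panel (overall/between/within) statistics on the complete-case sample.
* xtsum takes no stat() or columns() options.
xtsum Dependent Indep1-Indep6 Ln_GDP FM_Education if complete

* Participation pattern: which years are observed for each country.
xtdescribe if complete
```

Comparing the two tabstat tables shows, variable by variable, how many country-years the complete-case restriction discards.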
Part of the data is as below:
* Example generated by -dataex-. For more info, type help dataex
clear
input str52 CountryName int Year double Dependent float(Indep1 Indep2 Indep3 Indep4 Indep5 Indep6 PCA) double(Ln_GDP FM_Education)
"" . . . . . . . . . .
"Argentina" 2003 . 78.125 95.3125 . 15.865385 . . . 25.57206411 1.34669005870819
"Argentina" 2004 . 78.125 95.3125 . 15.865385 . . . 25.82713601 1.30524003505707
"Argentina" 2005 . 85.52895 98.00399 98.60279 16.887081 63.0752 30.09307 .56931937 26.01524866 1.32123994827271
"Argentina" 2006 . 85.52895 98.00399 98.60279 16.887081 63.0752 30.09307 .56931937 26.17240231 1.35178005695343
"Argentina" 2007 . 85.52895 98.00399 98.60279 16.887081 63.0752 30.09307 .56931937 26.38459481 1.3522200584411599
"Argentina" 2008 . 85.52895 98.00399 98.60279 16.887081 63.0752 30.09307 .56931937 26.61368841 1.35339999198914
"Argentina" 2009 . 85.52895 98.00399 98.60279 16.887081 63.0752 30.09307 .56931937 26.53133771 1.34807002544403
"Argentina" 2010 . 90.7767 99.12621 97.37864 23.19277 59.46809 28.17734 .4454222 26.77212018 1.34685003757477
"Argentina" 2011 . 90.7767 99.12621 97.37864 23.19277 59.46809 28.17734 .4454222 26.99644125 1.36926996707916
"Argentina" 2012 . 90.7767 99.12621 97.37864 23.19277 59.46809 28.17734 .4454222 27.02585253 1.37328004837036
"Argentina" 2013 . 90.7767 99.12621 97.37864 23.19277 59.46809 28.17734 .4454222 27.03685943 1.38708996772766
"Argentina" 2014 . 90.7767 99.12621 97.37864 23.19277 59.46809 28.17734 .4454222 26.98917461 1.3909900188446
"Argentina" 2015 . 91.79912 98.51625 97.74216 21.943024 60.24916 29.886215 .51180804 27.11140578 1.39732003211975
"Argentina" 2016 . 91.79912 98.51625 97.74216 21.943024 60.24916 29.886215 .51180804 27.04678631 1.40629005432129
"Argentina" 2017 11.100000381469727 92.82153 97.90628 98.10568 20.69328 61.03024 31.59509 .5781939 27.19038737 1.40893995761871
"Argentina" 2018 11.100000381469727 92.82153 97.90628 98.10568 20.69328 61.03024 31.59509 .5781939 26.98632099 1.40596997737885
"Argentina" 2019 6.400000095367432 92.82153 97.90628 98.10568 20.69328 61.03024 31.59509 .5781939 26.82751135 1.41640996932983
"Argentina" 2020 9.4 92.82153 97.90628 98.10568 20.69328 61.03024 31.59509 .5781939 26.67791184 1.42865002155304
"Argentina" 2021 10 92.82153 97.90628 98.10568 20.69328 61.03024 31.59509 .5781939 26.91199623 1.46200001239777
"Argentina" 2022 . 92.82153 97.90628 98.10568 20.69328 61.03024 31.59509 .5781939 27.17337329 .
"Argentina" 2023 . 92.82153 97.90628 98.10568 20.69328 61.03024 31.59509 .5781939 . .
"Australia" 2003 . . . . . . . . 26.87066083 .
"Australia" 2004 . . . . . . . . 27.14379216 .
"Australia" 2005 . 79.80296 95.77762 97.25546 48.18246 76.238335 50.10707 2.493911 27.26761644 .
"Australia" 2006 . 79.80296 95.77762 97.25546 48.18246 76.238335 50.10707 2.493911 27.34054505 .
"Australia" 2007 . 79.80296 95.77762 97.25546 48.18246 76.238335 50.10707 2.493911 27.47369721 .
"Australia" 2008 . 79.80296 95.77762 97.25546 48.18246 76.238335 50.10707 2.493911 27.68521201 .
"Australia" 2009 8.399999618530273 79.80296 95.77762 97.25546 48.18246 76.238335 50.10707 2.493911 27.55697597 .
"Australia" 2010 10.199999809265137 85.98511 95.26067 96.20853 54.43383 73.809525 48.84534 2.455766 27.7695384 .
"Australia" 2011 13.800000190734863 85.98511 95.26067 96.20853 54.43383 73.809525 48.84534 2.455766 27.96639 .
"Australia" 2012 13.900000095367432 85.98511 95.26067 96.20853 54.43383 73.809525 48.84534 2.455766 28.06730811 .
"Australia" 2013 14 85.98511 95.26067 96.20853 54.43383 73.809525 48.84534 2.455766 28.08612028 .
"Australia" 2014 15.300000190734863 85.98511 95.26067 96.20853 54.43383 73.809525 48.84534 2.455766 28.01464265 .
"Australia" 2015 23.100000381469727 87.7802 95.94804 96.67018 54.22585 77.43997 51.93383 2.744394 27.93155549 1.30043005943298
"Australia" 2016 26 87.7802 95.94804 96.67018 54.22585 77.43997 51.93383 2.744394 27.81879686 1.2977499961853
"Australia" 2017 28.700000762939453 89.57529 96.63541 97.13183 54.01786 81.07042 55.02232 3.033022 27.9135405 1.29990994930267
"Australia" 2018 31.5 89.57529 96.63541 97.13183 54.01786 81.07042 55.02232 3.033022 27.9874826 1.27081000804901
"Australia" 2019 31.200000762939453 89.57529 96.63541 97.13183 54.01786 81.07042 55.02232 3.033022 27.96191965 1.27968001365662
"Australia" 2020 34 89.57529 96.63541 97.13183 54.01786 81.07042 55.02232 3.033022 27.91390014 1.29891002178192
"Australia" 2021 34.79999923706055 89.57529 96.63541 97.13183 54.01786 81.07042 55.02232 3.033022 28.0710185 1.32912003993988
"Australia" 2022 37.20000076293945 89.57529 96.63541 97.13183 54.01786 81.07042 55.02232 3.033022 28.1470842 .
"Australia" 2023 . 89.57529 96.63541 97.13183 54.01786 81.07042 55.02232 3.033022 . .
"Austria" 2003 5.6 73.25887 92.11564 . 33.427563 . . . 26.29265419 .
"Austria" 2004 6 73.25887 92.11564 . 33.427563 . . . 26.43189509 .
"Austria" 2005 7.1 76.26433 82.45258 . 36.77686 . . . 26.47930001 .
"Austria" 2006 5.9 76.26433 82.45258 . 36.77686 . . . 26.54121017 .
"Austria" 2007 5 76.26433 82.45258 . 36.77686 . . . 26.68732211 .
"Austria" 2008 6.4 76.26433 82.45258 . 36.77686 . . . 26.79181164 .
"Austria" 2009 7 76.26433 82.45258 . 36.77686 . . . 26.71911759 .
"Austria" 2010 8.7 . . . . . . . 26.69522923 .
"Austria" 2011 11.1 . . . . . . . 26.7909625 .
"Austria" 2012 11.9 . . . . . . . 26.73796295 .
"Austria" 2013 12.6 . . . . . . . 26.78749509 1.16927003860474
"Austria" 2014 17.1 . . . . . . . 26.81589796 1.16376996040344
"Austria" 2015 20 88.16121 91.1209 . . . . . 26.66861092 1.16585004329681
"Austria" 2016 18.1 88.16121 91.1209 . . . . . 26.70426924 1.17218995094299
"Austria" 2017 19.2 88.16121 91.1209 . 48.46395 57.65766 39.13596 1.1843925 26.75697813 1.17297005653381
"Austria" 2018 26.1 88.16121 91.1209 . 48.46395 57.65766 39.13596 1.1843925 26.84354386 1.17596995830536
"Austria" 2019 31.3 88.16121 91.1209 . 48.46395 57.65766 39.13596 1.1843925 26.82048847 1.18473994731903
"Austria" 2020 31.5 88.16121 91.1209 . 48.46395 57.65766 39.13596 1.1843925 26.79912952 1.19034004211426
"Austria" 2021 31.4 88.16121 91.1209 . 48.46395 57.65766 39.13596 1.1843925 26.89781915 1.20816004276276
"Austria" 2022 34.6 88.16121 91.1209 . 48.46395 57.65766 39.13596 1.1843925 26.87897297 .
"Austria" 2023 . 88.16121 91.1209 . 48.46395 57.65766 39.13596 1.1843925 . .
"Belgium" 2003 6 82.53138 86.08787 . 29.22149 . . . 26.48557671 1.14858996868134
"Belgium" 2004 6.8 82.53138 86.08787 . 29.22149 . . . 26.63464419 1.16425001621246
"Belgium" 2005 6.2 93.27116 94.6036 . 34.646194 . . . 26.67836398 1.18326997756958
"Belgium" 2006 5.8 93.27116 94.6036 . 34.646194 . . . 26.73516967 1.19087994098663
"Belgium" 2007 6.4 93.27116 94.6036 . 34.646194 . . . 26.87795864 1.19723999500275
"Belgium" 2008 7.2 93.27116 94.6036 . 34.646194 . . . 26.97194311 1.20142996311188
"Belgium" 2009 7.6 93.27116 94.6036 . 34.646194 . . . 26.90