
  • #16
    I ran an LR test after excluding meanyearsedu (because it is insignificant in the linear regression results). Based on the result, the model with fpaidemployed fits better than the one without it.

    Code:
    quietly reghdfe pop65 lntfr lifeexp lngdpcap fpaidemployed, absorb(prov_id year)
    estimates store model1
    quietly reghdfe pop65 lntfr lifeexp lngdpcap, absorb(prov_id year)
    estimates store model0
    lrtest model0 model1
    Code:
    . lrtest model0 model1
    
    Likelihood-ratio test
    Assumption: model0 nested within model1
    
     LR chi2(1) =   9.03
    Prob > chi2 = 0.0027
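
    A possible cross-check, sketched with the same variables as above: because only one regressor is excluded, the LR result can be compared with a Wald test on fpaidemployed in the full model. The Wald test also remains usable if robust or clustered standard errors are requested, where lrtest is not appropriate.

    Code:
    * Refit the full model (same specification as model1 above)
    reghdfe pop65 lntfr lifeexp lngdpcap fpaidemployed, absorb(prov_id year)
    * Wald test of H0: the coefficient on fpaidemployed is zero
    test fpaidemployed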


    • #17
      Carlo: Yes, I am a bit confused here. Even if I ignore the significance of the individual variables, the model does not fit. I have tried different models (with 4, 3, and 2 variables), but none of them fits.


      • #18
        Should I try Lasso?


        • #19
          Niara:
          just take a deep breath and think about the data generating process reported in the literature.
          Then do your panel data regression following a zero-expectation approach.
          Finally look at the results, whatever they are.
          Unsolicited strategic advice: discuss with your supervisor (whom you pay with your tuition fees) all your doubts along the way.
          Last but not least: save time to relax. Research should be an adventure, not torture (even though sometimes the distinction between the two is not that clear)!
          Kind regards,
          Carlo
          (Stata 19.0)


          • #20
            Carlo: thank you, I needed that!
            I would love to relax, but my supervisor goes on leave for several weeks in a few days and I am behind schedule, so I am a bit panicked.

            Another question, if I may: do you think reducing the number of years would help? What is the minimum number of observations needed for panel data analysis? I talked to a lecturer and he said 11 years is too long, so I am wondering whether it is wise to reduce the years.


            • #21
              Sorry, another question. Why absorb both year and prov_id instead of using xtreg with i.year? Thank you!
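
              For comparison, a minimal sketch of the two specifications being asked about, assuming prov_id is numeric and the variables are those used earlier in the thread. Both fit the same two-way fixed-effects model, so the slope coefficients should agree. The practical difference is that reghdfe absorbs the year effects instead of estimating and listing one dummy per year.

              Code:
              * Option 1: province fixed effects via xtreg, year effects as explicit dummies
              xtset prov_id year
              xtreg pop65 lntfr lifeexp lngdpcap fpaidemployed i.year, fe

              * Option 2: reghdfe (community-contributed) absorbs both sets of effects
              reghdfe pop65 lntfr lifeexp lngdpcap fpaidemployed, absorb(prov_id year)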


              • #22
                I wouldn't log anything that's a share.

                Your main problem is that you are trying to explain the population share aged 65-plus with variables that aren't expected to explain it very well. Its value is also unlikely to change much over a few years, so there is little variation within a country (and regression needs variation to work well).

                I'm not sure you want the within estimator if you are trying to explain the variation in pop65 across countries. You'd drop the countryid as a fixed effect if you want the between effects. The within model is asking how pop65 changes over time within a country, not among them, and that number doesn't change a whole lot within a country.
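
                To make the within/between distinction concrete, here is a rough sketch using the variable names from earlier in the thread (prov_id stands in for the country identifier mentioned above, and is assumed to be numeric). xtsum shows how much of the variation in pop65 lies between units and how much within them, and the two xtreg calls fit the corresponding estimators.

                Code:
                xtset prov_id year
                * Split the standard deviation of pop65 into between and within components
                xtsum pop65
                * Within (fixed-effects) estimator: changes over time inside each unit
                xtreg pop65 lntfr lifeexp lngdpcap fpaidemployed, fe
                * Between estimator: regression on unit means, one observation per unit
                xtreg pop65 lntfr lifeexp lngdpcap fpaidemployed, be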

                I'd spend some time thinking about what would explain pop65 better than what you have: healthcare expenditures, past wars, famine, disease, social security/retirement income programs, general health conditions, food security, birthrates in the mid-50s (?), i(e)mmigration, a lag of the population share for the pop65 cohort 20/30 years ago, tax policies (sometimes blamed for the baby boom in the US), etc. There is a lot of history embedded in pop65. Think about Russia: what's its pop65 going to look like in 20/30 years, with hundreds of thousands of young/middle-aged men dying and many younger persons fleeing the country? In the US, pop65 is the consequence of the baby boom and good health care. Then there is AIDS in Africa.

                And keep in mind that pop65 + pop<65 sums to 1 (for however many age groups you use). Thus, the share of pop65 depends on what determines the share of pop20-35, and so forth.

                Also, there's a lot of research about how an aging population affects economic activity, so gdpcap could be taken to be endogenous.

                I'm not sure about including life expectancy in the regression (it's too much like pop65, I worry).

                Here's one paper that does what you are trying to do. I'm not convinced it is correct, but it may help, and at least gives you a hook to the literature.
                https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9782275/
                It appears you are new to regression analysis and haven't really thought this through yet. I'd put Stata away for a bit, and really think long and hard about what explains pop65 (search the literature, even if anecdotal) and what type of analysis would give you the answer you're after.

                The first question is: what are you trying to explain? And it's not just pop65. Are you interested in the variation across countries, or within countries?


                • #23
                  George: thank you so much for your insight. This really helps me a lot. I will spend more time on the literature review.

                  Again, thank you so much everyone for the input.


                  • #24
                    I pulled some data from WDI. No problem getting sensible and significant results, though a lot of missing data.

                    Code:
                    
                    HDFE Linear regression                            Number of obs   =      1,880
                    Absorbing 2 HDFE groups                           F(   7,   1747) =      75.60
                                                                      Prob > F        =     0.0000
                                                                      R-squared       =     0.9704
                                                                      Adj R-squared   =     0.9682
                                                                      Within R-sq.    =     0.2325
                                                                      Root MSE        =     0.3866
                    
                    ------------------------------------------------------------------------------
                           pop65 | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
                    -------------+----------------------------------------------------------------
                       healthexp |   .0018397    .000144    12.78   0.000     .0015573     .002122
                      basicwater |  -.0208504   .0033265    -6.27   0.000    -.0273747   -.0143261
                         malaria |   .0016348   .0002687     6.09   0.000     .0011079    .0021617
                          popden |  -.0045125   .0007723    -5.84   0.000    -.0060273   -.0029976
                           rural |  -.0101165   .0064765    -1.56   0.118    -.0228191     .002586
                          femlab |   .0195784   .0041595     4.71   0.000     .0114203    .0277365
                             hiv |   .0544103   .0134394     4.05   0.000     .0280513    .0807693
                           _cons |   5.338596   .5320328    10.03   0.000     4.295108    6.382085
                    ------------------------------------------------------------------------------
                    
                    Absorbed degrees of freedom:
                    -----------------------------------------------------+
                     Absorbed FE | Categories  - Redundant  = Num. Coefs |
                    -------------+---------------------------------------|
                     countryname |       109           0         109     |
                            year |        18           1          17     |
                    Code:
                    HDFE Linear regression                            Number of obs   =      1,880
                    Absorbing 1 HDFE group                            F(   7,   1855) =     360.39
                                                                      Prob > F        =     0.0000
                                                                      R-squared       =     0.5816
                                                                      Adj R-squared   =     0.5762
                                                                      Within R-sq.    =     0.5763
                                                                      Root MSE        =     1.4109
                    
                    ------------------------------------------------------------------------------
                           pop65 | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
                    -------------+----------------------------------------------------------------
                       healthexp |   .0029498   .0002043    14.44   0.000     .0025492    .0033505
                      basicwater |   .0612915   .0032465    18.88   0.000     .0549243    .0676587
                         malaria |  -.0025988    .000326    -7.97   0.000    -.0032381   -.0019595
                          popden |   .0002936   .0002564     1.14   0.252    -.0002094    .0007965
                           rural |   .0096246   .0026447     3.64   0.000     .0044377    .0148115
                          femlab |   .0381236   .0022861    16.68   0.000     .0336399    .0426073
                             hiv |  -.1772176   .0172126   -10.30   0.000    -.2109757   -.1434596
                           _cons |  -2.690234   .3750397    -7.17   0.000    -3.425778   -1.954689
                    ------------------------------------------------------------------------------
                    
                    Absorbed degrees of freedom:
                    -----------------------------------------------------+
                     Absorbed FE | Categories  - Redundant  = Num. Coefs |
                    -------------+---------------------------------------|
                            year |        18           0          18     |


                    • #25
                      George: thank you so much

                      I think I might have problems with my data. The sources are:
                      1. Pop65+: the country's census-based population projection from the national office of statistics
                      2. lifeexp: national office of statistics
                      3. GDP per capita: national office of statistics
                      4. mean years of education: GlobalDataLab (interpolated from the Demographic and Health Survey (DHS), which is collected every 5 years)
                      5. tfr: GlobalDataLab (interpolated from the DHS, which is collected every 5 years)
                      6. female paid employment: GlobalDataLab (interpolated from the DHS, which is collected every 5 years)

                      The country has 34 provinces, but GlobalDataLab reports only 29 (it merges some of them). For those merged provinces, I calculated the average value.


                      • #26
                        I was going to suggest a log-log model so that you estimate elasticities. It seems that at least the elasticity with respect to GDP is significant. I'm not sure what that means, but it's something.
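
                        A minimal sketch of such a log-log specification, assuming the variable names used earlier in the thread. lnpop65 is a hypothetical name introduced here, and pop65 must be strictly positive for the log to be defined. With both sides in logs, the coefficient on lngdpcap reads as the elasticity of pop65 with respect to GDP per capita; regressors left in levels give semi-elasticities instead.

                        Code:
                        * Log of the dependent variable (hypothetical name; requires pop65 > 0)
                        generate lnpop65 = ln(pop65)
                        * lntfr and lngdpcap are already in logs; lifeexp and fpaidemployed stay in levels
                        reghdfe lnpop65 lntfr lngdpcap lifeexp fpaidemployed, absorb(prov_id year)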


                        • #27
                          Yes, I think I am going to take it up with my supervisor, show her some alternatives, and ask for suggestions. A lecturer also talked about elasticity. Even if my final model is not the log-log one, I will still discuss it in the results.
