  • Filling in the Time Series by Panel

    I'd like to (try to) study the effect of a policy on maternal mortality from abortions. The policy begins in 1966, but unfortunately my data only go back to 1969. I do have data for 1969-2019, so my idea is to linearly/quadratically interpolate the missing values back to 1955, at least.

    My question is this: how do I, by panel, extend the time series for each unit back to the year 1955? Here is a subsample of my current dataset:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int year float(mmr id)
    1987 15.06 1
    1988 19.96 1
    1989 24.09 1
    1969 32.13 2
    1970 24.93 2
    1971 30.41 2
    1972 24.03 2
    1973 22.44 2
    1974 18.47 2
    1975 17.07 2
    1976 21.73 2
    1977 18.69 2
    1978 15.22 2
    1979 12.73 2
    1980   3.3 2
    1981     0 2
    1982  6.33 2
    1983  3.33 2
    1984  1.12 2
    1985  2.29 2
    1986  1.15 2
    1987     0 2
    1988     0 2
    1989  1.13 2
    1969 20.45 3
    1970  20.4 3
    1971 20.49 3
    1972 13.21 3
    1973 12.36 3
    1974 16.98 3
    1975 12.58 3
    1976  5.79 3
    1977 10.67 3
    1978 13.12 3
    1979   .81 3
    1980    .8 3
    1981  1.62 3
    1982     0 3
    1983     0 3
    1984   .86 3
    1985  1.75 3
    1986   .85 3
    1987     0 3
    1988   2.5 3
    1989     0 3
    1969 33.55 4
    1970 44.69 4
    1971 33.23 4
    1972  29.7 4
    1973 34.36 4
    1974 29.49 4
    1975 27.65 4
    1976 23.45 4
    1977 34.51 4
    1978 27.85 4
    1979 15.33 4
    1980  6.24 4
    1981  6.43 4
    1982  4.83 4
    1983  7.32 4
    1984  4.91 4
    1985  3.32 4
    1986 10.83 4
    1987  7.71 4
    1988  2.55 4
    1989  5.34 4
    1985  8.25 5
    1986  7.02 5
    1987  4.23 5
    1988  1.41 5
    1989  7.48 5
    1986   4.5 6
    1987  1.53 6
    1988  1.51 6
    1989  1.56 6
    1981  8.72 7
    1982  8.65 7
    1985  4.23 7
    1986  4.15 7
    1987 11.96 7
    1985     0 8
    1986  3.32 8
    1987     0 8
    1988     0 8
    1989  3.59 8
    1969 53.79 9
    1970 42.16 9
    1971 35.18 9
    1972 43.06 9
    1973 37.77 9
    1974 38.11 9
    1975  26.8 9
    1976 18.34 9
    1977 19.71 9
    1978 21.41 9
    1979  3.74 9
    1980  5.38 9
    1981   2.8 9
    1982  9.73 9
    1983  2.36 9
    end
    format %ty year
    label values id country
    label def country 1 "ALB", modify
    label def country 2 "AUT", modify
    label def country 3 "BEL", modify
    label def country 4 "BGR", modify
    label def country 5 "BIH", modify
    label def country 6 "CZE", modify
    label def country 7 "EST", modify
    label def country 8 "HRV", modify
    label def country 9 "HUN", modify
    
    cls
    
    
    xtset id year, y
    Presumably, tsfill or tsappend would be ideal for this situation?

  • #2
    I am conflicted here. I have enjoyed interpolation as a programming problem as a search will indicate, but I am sceptical about how often it is the best way to deal with missing data.

    ipolate offers linear extrapolation as an option.

    I'd advise extreme caution here. The grounds are almost certainly obvious to you but deserve mention. Generally, this could be criticised as "making up data" and there would be some truth in that regardless of the rationale. Particularly, it's all too easy to get nonsensical values from extrapolation, as when a value that can't be negative is predicted to be negative.
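
    To make that last risk concrete, here is a minimal sketch that uses only Bulgaria's 1968 and 1969 values from the data example in #5. The variable name mmr_lin is made up for illustration. As the listing in #6 suggests, ipolate with epolate continues the straight line through the nearest observed points, so a series that rose from 1968 to 1969 is projected below zero within a few years.

```stata
clear
input int year float mmr
1965 .
1966 .
1967 .
1968 23.33
1969 33.55
end
* linear interpolation plus linear extrapolation beyond the observed years
ipolate mmr year, epolate generate(mmr_lin)
list
* flag extrapolated values that are impossible for a mortality rate
count if mmr_lin < 0
```

    A count above zero is a signal to look for real data for those years rather than to trust the projection.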

    Last edited by Nick Cox; 26 May 2022, 09:25.



    • #3
      Yes, what I really meant was extrapolation.

      My advisor once told me that extrapolation is only as bad (usually) as the number of non-missing values you have. However, since I want to extrapolate back to the 1950s, this does give me quite a lot of pause.

      Using this data source is a start, but I suspect my best bet would be to look at individual nations' mortality rates for the years in question. I hate manual data collection/entry.

      For example, papers like this one actually do give maternal mortality data for Romania in 1965, which, while better than nothing, isn't exactly a very precise estimate. Alternatively, I could (and likely should) consult publications like this one (see Figure 17, if you'd like), which have pre-1900 data on the subject.

      EDIT: Apparently in the Brill Publication, the real source for maternal mortality data comes from this obscure volume. Now I must extract (by hand!) the datapoints given here in the tables they present. Curse computer scientists and engineers for not inventing GitHub and other repositories in the 80s.
      Last edited by Jared Greathouse; 26 May 2022, 10:38.



      • #4
        Originally posted by Jared Greathouse
        Apparently in the Brill Publication, the real source for maternal mortality data comes from this obscure volume. Now I must extract (by hand!) the datapoints given here in the tables they present. Curse computer scientists and engineers for not inventing GitHub and other repositories in the 80s.
        You may want to try out https://smallpdf.com/pdf-to-excel. I was amazed that it could capture the data from an image converted to PDF.



        • #5
          Okay, this would be good to use. But even now that I have a better basis for interpolation, I'll still need to expand my dataset back to 1951.

          I've reduced my dataset to this:
          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input int year float(mmr id)
          1969 32.13  2
          1970 24.93  2
          1971 30.41  2
          1972 24.03  2
          1973 22.44  2
          1974 18.47  2
          1975 17.07  2
          1976 21.73  2
          1977 18.69  2
          1978 15.22  2
          1979 12.73  2
          1980   3.3  2
          1981     0  2
          1982  6.33  2
          1983  3.33  2
          1984  1.12  2
          1985  2.29  2
          1986  1.15  2
          1987     0  2
          1988     0  2
          1989  1.13  2
          1968 21.83  3
          1969 20.45  3
          1970  20.4  3
          1971 20.49  3
          1972 13.21  3
          1973 12.36  3
          1974 16.98  3
          1975 12.58  3
          1976  5.79  3
          1977 10.67  3
          1978 13.12  3
          1979   .81  3
          1980    .8  3
          1981  1.62  3
          1982     0  3
          1983     0  3
          1984   .86  3
          1985  1.75  3
          1986   .85  3
          1987     0  3
          1988   2.5  3
          1989     0  3
          1968 23.33  4
          1969 33.55  4
          1970 44.69  4
          1971 33.23  4
          1972  29.7  4
          1973 34.36  4
          1974 29.49  4
          1975 27.65  4
          1976 23.45  4
          1977 34.51  4
          1978 27.85  4
          1979 15.33  4
          1980  6.24  4
          1981  6.43  4
          1982  4.83  4
          1983  7.32  4
          1984  4.91  4
          1985  3.32  4
          1986 10.83  4
          1987  7.71  4
          1988  2.55  4
          1989  5.34  4
          1969 53.79  9
          1970 42.16  9
          1971 35.18  9
          1972 43.06  9
          1973 37.77  9
          1974 38.11  9
          1975  26.8  9
          1976 18.34  9
          1977 19.71  9
          1978 21.41  9
          1979  3.74  9
          1980  5.38  9
          1981   2.8  9
          1982  9.73  9
          1983  2.36  9
          1984  4.79  9
          1985  9.22  9
          1986  2.34  9
          1987  3.18  9
          1988  5.63  9
          1989  2.43  9
          1969 32.76 12
          1970 29.49 12
          1971 22.41 12
          1972 17.89 12
          1973 19.05 12
          1974 17.39 12
          1975 14.76 12
          1976 14.18 12
          1977 12.83 12
          1978 16.21 12
          1979 14.53 12
          1980  1.15 12
          1981  2.21 12
          1982  1.42 12
          end
          format %ty year
          label values id country
          label def country 2 "AUT", modify
          label def country 3 "BEL", modify
          label def country 4 "BGR", modify
          label def country 9 "HUN", modify
          label def country 12 "POL", modify
          Aside from creating another dataset of years from 1951-1989 and merging it with this one, how would I create observations for the additional years from 1951 to 1968 in the current data example? Andrew Musau

          I imagine egen would be involved somehow. Now that I have 5-year data from the PDF table, I'm much more comfortable with interpolation.



          • #6
            Here is a way to do linear extrapolation. For other methods, see mipolate from SSC.

            ADDED IN EDIT: If the zero values are missing values and not true zeros, e.g., Austria in 1981, then set them to missing. As you can see, linear extrapolation leads to negative values in some instances (Bulgaria before 1966), which cannot be right for a mortality rate. You can explore other methods to see if they do a better job, or fill in data for some earlier years if you have access to them.
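
            If the zeros did turn out to be placeholders for missing values, the recoding could look like the sketch below. It uses Austria's 1980 to 1982 values from the example. The names mmr_m and mmr_filled are made up for illustration, and treating a zero as missing is an assumption that only the data source can settle.

```stata
clear
input int year float mmr
1980 3.3
1981 0
1982 6.33
end
* work on a copy so the original series is left untouched
generate mmr_m = mmr
* recode zeros to system missing (assumes the zeros are NOT true zeros)
replace mmr_m = . if mmr_m == 0
* linear interpolation now treats 1981 as a gap between 1980 and 1982
ipolate mmr_m year, generate(mmr_filled)
list
```

            With the full panel, by(id) would be added to ipolate as in the code that follows. mvdecode mmr_m, mv(0) should do the same recoding in one command.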

            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input int year float(mmr id)
            1969 32.13  2
            1970 24.93  2
            1971 30.41  2
            1972 24.03  2
            1973 22.44  2
            1974 18.47  2
            1975 17.07  2
            1976 21.73  2
            1977 18.69  2
            1978 15.22  2
            1979 12.73  2
            1980   3.3  2
            1981     0  2
            1982  6.33  2
            1983  3.33  2
            1984  1.12  2
            1985  2.29  2
            1986  1.15  2
            1987     0  2
            1988     0  2
            1989  1.13  2
            1968 21.83  3
            1969 20.45  3
            1970  20.4  3
            1971 20.49  3
            1972 13.21  3
            1973 12.36  3
            1974 16.98  3
            1975 12.58  3
            1976  5.79  3
            1977 10.67  3
            1978 13.12  3
            1979   .81  3
            1980    .8  3
            1981  1.62  3
            1982     0  3
            1983     0  3
            1984   .86  3
            1985  1.75  3
            1986   .85  3
            1987     0  3
            1988   2.5  3
            1989     0  3
            1968 23.33  4
            1969 33.55  4
            1970 44.69  4
            1971 33.23  4
            1972  29.7  4
            1973 34.36  4
            1974 29.49  4
            1975 27.65  4
            1976 23.45  4
            1977 34.51  4
            1978 27.85  4
            1979 15.33  4
            1980  6.24  4
            1981  6.43  4
            1982  4.83  4
            1983  7.32  4
            1984  4.91  4
            1985  3.32  4
            1986 10.83  4
            1987  7.71  4
            1988  2.55  4
            1989  5.34  4
            1969 53.79  9
            1970 42.16  9
            1971 35.18  9
            1972 43.06  9
            1973 37.77  9
            1974 38.11  9
            1975  26.8  9
            1976 18.34  9
            1977 19.71  9
            1978 21.41  9
            1979  3.74  9
            1980  5.38  9
            1981   2.8  9
            1982  9.73  9
            1983  2.36  9
            1984  4.79  9
            1985  9.22  9
            1986  2.34  9
            1987  3.18  9
            1988  5.63  9
            1989  2.43  9
            1969 32.76 12
            1970 29.49 12
            1971 22.41 12
            1972 17.89 12
            1973 19.05 12
            1974 17.39 12
            1975 14.76 12
            1976 14.18 12
            1977 12.83 12
            1978 16.21 12
            1979 14.53 12
            1980  1.15 12
            1981  2.21 12
            1982  1.42 12
            end
            format %ty year
            label values id country
            label def country 2 "AUT", modify
            label def country 3 "BEL", modify
            label def country 4 "BGR", modify
            label def country 9 "HUN", modify
            label def country 12 "POL", modify
            
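            * declare the panel structure and fill any internal gaps in each series
            xtset id year
            tsfill
            * make 18 extra copies of each panel's 1969 row; new==1 marks the copies
            expand 19 if year==1969, g(new)
            * renumber the copies as 1968, 1967, ..., 1951
            bys id year (new): replace year= 1970-_n if new
            * drop a copy where that year was already observed (1968 for BEL and BGR)
            bys id year (new): drop if new & _N==2
            * blank out every variable except id, year and new in the added rows
            ds id year new, not
            foreach var in `r(varlist)'{
                cap replace `var'= . if new
            }
            * interpolate linearly within each panel and extrapolate beyond the observed years
            ipolate mmr year, by(id) epolate g(wanted)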
            Result:

            Code:
            . ipolate mmr year, by(id) epolate g(wanted)
            
            . sort id year
            
            . l, sepby(id)
            
                 +---------------------------------------+
                 | year     mmr    id   new       wanted |
                 |---------------------------------------|
              1. | 1951       .   AUT     1    161.73001 |
              2. | 1952       .   AUT     1    154.53001 |
              3. | 1953       .   AUT     1    147.33001 |
              4. | 1954       .   AUT     1    140.13001 |
              5. | 1955       .   AUT     1    132.93001 |
              6. | 1956       .   AUT     1    125.73001 |
              7. | 1957       .   AUT     1    118.53001 |
              8. | 1958       .   AUT     1    111.33001 |
              9. | 1959       .   AUT     1    104.13001 |
             10. | 1960       .   AUT     1    96.930008 |
             11. | 1961       .   AUT     1    89.730007 |
             12. | 1962       .   AUT     1    82.530006 |
             13. | 1963       .   AUT     1    75.330006 |
             14. | 1964       .   AUT     1    68.130005 |
             15. | 1965       .   AUT     1    60.930004 |
             16. | 1966       .   AUT     1    53.730003 |
             17. | 1967       .   AUT     1    46.530003 |
             18. | 1968       .   AUT     1    39.330002 |
             19. | 1969   32.13   AUT     0    32.130001 |
             20. | 1970   24.93   AUT     0        24.93 |
             21. | 1971   30.41   AUT     0        30.41 |
             22. | 1972   24.03   AUT     0    24.030001 |
             23. | 1973   22.44   AUT     0    22.440001 |
             24. | 1974   18.47   AUT     0    18.469999 |
             25. | 1975   17.07   AUT     0        17.07 |
             26. | 1976   21.73   AUT     0        21.73 |
             27. | 1977   18.69   AUT     0    18.690001 |
             28. | 1978   15.22   AUT     0        15.22 |
             29. | 1979   12.73   AUT     0        12.73 |
             30. | 1980     3.3   AUT     0          3.3 |
             31. | 1981       0   AUT     0            0 |
             32. | 1982    6.33   AUT     0    6.3299999 |
             33. | 1983    3.33   AUT     0    3.3299999 |
             34. | 1984    1.12   AUT     0         1.12 |
             35. | 1985    2.29   AUT     0         2.29 |
             36. | 1986    1.15   AUT     0         1.15 |
             37. | 1987       0   AUT     0            0 |
             38. | 1988       0   AUT     0            0 |
             39. | 1989    1.13   AUT     0         1.13 |
                 |---------------------------------------|
             40. | 1951       .   BEL     1    45.289986 |
             41. | 1952       .   BEL     1    43.909986 |
             42. | 1953       .   BEL     1    42.529987 |
             43. | 1954       .   BEL     1    41.149988 |
             44. | 1955       .   BEL     1    39.769989 |
             45. | 1956       .   BEL     1     38.38999 |
             46. | 1957       .   BEL     1    37.009991 |
             47. | 1958       .   BEL     1    35.629992 |
             48. | 1959       .   BEL     1    34.249992 |
             49. | 1960       .   BEL     1    32.869993 |
             50. | 1961       .   BEL     1    31.489994 |
             51. | 1962       .   BEL     1    30.109995 |
             52. | 1963       .   BEL     1    28.729996 |
             53. | 1964       .   BEL     1    27.349997 |
             54. | 1965       .   BEL     1    25.969997 |
             55. | 1966       .   BEL     1    24.589998 |
             56. | 1967       .   BEL     1    23.209999 |
             57. | 1968   21.83   BEL     0        21.83 |
             58. | 1969   20.45   BEL     0    20.450001 |
             59. | 1970    20.4   BEL     0         20.4 |
             60. | 1971   20.49   BEL     0        20.49 |
             61. | 1972   13.21   BEL     0        13.21 |
             62. | 1973   12.36   BEL     0        12.36 |
             63. | 1974   16.98   BEL     0        16.98 |
             64. | 1975   12.58   BEL     0        12.58 |
             65. | 1976    5.79   BEL     0         5.79 |
             66. | 1977   10.67   BEL     0        10.67 |
             67. | 1978   13.12   BEL     0        13.12 |
             68. | 1979     .81   BEL     0          .81 |
             69. | 1980      .8   BEL     0    .80000001 |
             70. | 1981    1.62   BEL     0         1.62 |
             71. | 1982       0   BEL     0            0 |
             72. | 1983       0   BEL     0            0 |
             73. | 1984     .86   BEL     0    .86000001 |
             74. | 1985    1.75   BEL     0         1.75 |
             75. | 1986     .85   BEL     0    .85000002 |
             76. | 1987       0   BEL     0            0 |
             77. | 1988     2.5   BEL     0          2.5 |
             78. | 1989       0   BEL     0            0 |
                 |---------------------------------------|
             79. | 1951       .   BGR     1   -150.40999 |
             80. | 1952       .   BGR     1   -140.18999 |
             81. | 1953       .   BGR     1   -129.96999 |
             82. | 1954       .   BGR     1   -119.74999 |
             83. | 1955       .   BGR     1   -109.52999 |
             84. | 1956       .   BGR     1   -99.309992 |
             85. | 1957       .   BGR     1   -89.089993 |
             86. | 1958       .   BGR     1   -78.869993 |
             87. | 1959       .   BGR     1   -68.649994 |
             88. | 1960       .   BGR     1   -58.429995 |
             89. | 1961       .   BGR     1   -48.209995 |
             90. | 1962       .   BGR     1   -37.989996 |
             91. | 1963       .   BGR     1   -27.769997 |
             92. | 1964       .   BGR     1   -17.549997 |
             93. | 1965       .   BGR     1    -7.329998 |
             94. | 1966       .   BGR     1    2.8900013 |
             95. | 1967       .   BGR     1    13.110001 |
             96. | 1968   23.33   BGR     0        23.33 |
             97. | 1969   33.55   BGR     0    33.549999 |
             98. | 1970   44.69   BGR     0    44.689999 |
             99. | 1971   33.23   BGR     0        33.23 |
            100. | 1972    29.7   BGR     0    29.700001 |
            101. | 1973   34.36   BGR     0    34.360001 |
            102. | 1974   29.49   BGR     0        29.49 |
            103. | 1975   27.65   BGR     0        27.65 |
            104. | 1976   23.45   BGR     0    23.450001 |
            105. | 1977   34.51   BGR     0    34.509998 |
            106. | 1978   27.85   BGR     0        27.85 |
            107. | 1979   15.33   BGR     0        15.33 |
            108. | 1980    6.24   BGR     0    6.2399998 |
            109. | 1981    6.43   BGR     0    6.4299998 |
            110. | 1982    4.83   BGR     0    4.8299999 |
            111. | 1983    7.32   BGR     0    7.3200002 |
            112. | 1984    4.91   BGR     0    4.9099998 |
            113. | 1985    3.32   BGR     0    3.3199999 |
            114. | 1986   10.83   BGR     0        10.83 |
            115. | 1987    7.71   BGR     0         7.71 |
            116. | 1988    2.55   BGR     0         2.55 |
            117. | 1989    5.34   BGR     0    5.3400002 |
                 |---------------------------------------|
            118. | 1951       .   HUN     1    263.13002 |
            119. | 1952       .   HUN     1    251.50002 |
            120. | 1953       .   HUN     1    239.87002 |
            121. | 1954       .   HUN     1    228.24002 |
            122. | 1955       .   HUN     1    216.61002 |
            123. | 1956       .   HUN     1    204.98001 |
            124. | 1957       .   HUN     1    193.35001 |
            125. | 1958       .   HUN     1    181.72001 |
            126. | 1959       .   HUN     1    170.09001 |
            127. | 1960       .   HUN     1    158.46001 |
            128. | 1961       .   HUN     1    146.83001 |
            129. | 1962       .   HUN     1    135.20001 |
            130. | 1963       .   HUN     1    123.57001 |
            131. | 1964       .   HUN     1    111.94001 |
            132. | 1965       .   HUN     1    100.31001 |
            133. | 1966       .   HUN     1    88.680004 |
            134. | 1967       .   HUN     1    77.050003 |
            135. | 1968       .   HUN     1    65.420002 |
            136. | 1969   53.79   HUN     0    53.790001 |
            137. | 1970   42.16   HUN     0        42.16 |
            138. | 1971   35.18   HUN     0        35.18 |
            139. | 1972   43.06   HUN     0    43.060001 |
            140. | 1973   37.77   HUN     0        37.77 |
            141. | 1974   38.11   HUN     0    38.110001 |
            142. | 1975    26.8   HUN     0    26.799999 |
            143. | 1976   18.34   HUN     0        18.34 |
            144. | 1977   19.71   HUN     0    19.709999 |
            145. | 1978   21.41   HUN     0        21.41 |
            146. | 1979    3.74   HUN     0         3.74 |
            147. | 1980    5.38   HUN     0    5.3800001 |
            148. | 1981     2.8   HUN     0          2.8 |
            149. | 1982    9.73   HUN     0    9.7299995 |
            150. | 1983    2.36   HUN     0    2.3599999 |
            151. | 1984    4.79   HUN     0         4.79 |
            152. | 1985    9.22   HUN     0    9.2200003 |
            153. | 1986    2.34   HUN     0    2.3399999 |
            154. | 1987    3.18   HUN     0    3.1800001 |
            155. | 1988    5.63   HUN     0    5.6300001 |
            156. | 1989    2.43   HUN     0    2.4300001 |
                 |---------------------------------------|
            157. | 1951       .   POL     1    91.619972 |
            158. | 1952       .   POL     1    88.349974 |
            159. | 1953       .   POL     1    85.079975 |
            160. | 1954       .   POL     1    81.809977 |
            161. | 1955       .   POL     1    78.539978 |
            162. | 1956       .   POL     1    75.269979 |
            163. | 1957       .   POL     1    71.999981 |
            164. | 1958       .   POL     1    68.729982 |
            165. | 1959       .   POL     1    65.459984 |
            166. | 1960       .   POL     1    62.189985 |
            167. | 1961       .   POL     1    58.919987 |
            168. | 1962       .   POL     1    55.649988 |
            169. | 1963       .   POL     1     52.37999 |
            170. | 1964       .   POL     1    49.109991 |
            171. | 1965       .   POL     1    45.839993 |
            172. | 1966       .   POL     1    42.569994 |
            173. | 1967       .   POL     1    39.299995 |
            174. | 1968       .   POL     1    36.029997 |
            175. | 1969   32.76   POL     0    32.759998 |
            176. | 1970   29.49   POL     0        29.49 |
            177. | 1971   22.41   POL     0        22.41 |
            178. | 1972   17.89   POL     0    17.889999 |
            179. | 1973   19.05   POL     0    19.049999 |
            180. | 1974   17.39   POL     0    17.389999 |
            181. | 1975   14.76   POL     0        14.76 |
            182. | 1976   14.18   POL     0        14.18 |
            183. | 1977   12.83   POL     0        12.83 |
            184. | 1978   16.21   POL     0    16.209999 |
            185. | 1979   14.53   POL     0        14.53 |
            186. | 1980    1.15   POL     0         1.15 |
            187. | 1981    2.21   POL     0         2.21 |
            188. | 1982    1.42   POL     0         1.42 |
                 +---------------------------------------+
            Last edited by Andrew Musau; 26 May 2022, 13:35.
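
            A possible alternative to the expand step, sketched here on a few rows of the example: add a single placeholder observation at the earliest target year for any one panel, then let tsfill with the full option balance every panel over the overall range of years. Which panel receives the placeholder (id 2 here) is an arbitrary choice. tsappend is not used because it only adds observations after the end of a series.

```stata
clear
input int year float mmr byte id
1969 32.13 2
1970 24.93 2
1968 21.83 3
1969 20.45 3
1970 20.4  3
end
* add one empty row and make it the 1951 observation of panel 2
local newobs = _N + 1
set obs `newobs'
replace id = 2 if missing(id)
replace year = 1951 if missing(year)
* full: every panel gets every year from the overall minimum to the overall maximum
xtset id year
tsfill, full
* same interpolation/extrapolation call as above
ipolate mmr year, by(id) epolate generate(wanted)
list, sepby(id)
```

            Both panels end up with one row per year from 1951 to 1970, and the extrapolated values should match those from the expand approach.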



            • #7
              Thanks so much for this.

              Yes, the zeros we see are in fact true zeros. I now have data (on 7 countries, I think: 1 treated, 6 not) from 1945, 1951, to 1965 in 5-year increments, so hopefully once I input the truly missing data and interpolate (and not extrapolate!) I won't have any monkey business.
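
              The difference can be sketched with Poland's 1970 and 1975 values from the example standing in for two 5-year benchmarks (the real benchmark figures are not shown in this thread). Without epolate, ipolate only fills years that lie between observed points, so filled values stay within the range of the neighbouring observations and years outside that range stay missing. The name mmr_i is made up for illustration.

```stata
clear
input int year float mmr
1969 .
1970 29.49
1971 .
1972 .
1973 .
1974 .
1975 14.76
end
* no epolate: interpolate between benchmarks only, never beyond them
ipolate mmr year, generate(mmr_i)
list
* 1969 lies before the first benchmark, so it is left missing
count if missing(mmr_i)
```

              With the full panel, adding by(id) as in #6 would do the same country by country.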
