  • Mipolate with only nonnegative values

    Dear Stata users,
    I am using census data and I want to fill in the missing years. I am thinking of doing this with the command mipolate, using the following code:
    by commune: mipolate menemployed year, spline gen(abc) epolate
    However, when menemployed is 0 at two successive observed dates, mipolate generates negative values for the missing years in between, which does not make sense because the number of men employed cannot be negative. I think it would be better to first replace with 0 the missing values of menemployed that lie between two dates at which the value is 0, as I did in menemployed2, because mipolate then gives different values. Could anyone explain to me how to write that code?

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str5 commune float year double(menemployed abc) float menemployed2 double def
    "01019" 1982                0                   0         0                  0
    "01019" 1983                .   .0881996741774331         0                  0
    "01019" 1984                .  .16799937938558684         0                  0
    "01019" 1985                .  .23099914665518195         0                  0
    "01019" 1986                .    .268799007016939         0                  0
    "01019" 1987                .  .27299899150157864         0                  0
    "01019" 1988                .   .2351991311398216         0                  0
    "01019" 1989                .  .14699945696238848         0                  0
    "01019" 1990                0                   0         0                  0
    "01019" 1991                . -.20685355684019582         0                  0
    "01019" 1992                .  -.4452329231453413         0                  0
    "01019" 1993                .  -.6794641566261512         0                  0
    "01019" 1994                .  -.8738733149933406         0                  0
    "01019" 1995                .  -.9927864559576244         0                  0
    "01019" 1996                . -1.0005296372297174         0                  0
    "01019" 1997                .  -.8614289165203342         0                  0
    "01019" 1998                .  -.5398103515401902         0                  0
    "01019" 1999                0                   0         0                  0
    "01019" 2000                .   .7680357459056831         . .37718994515519205
    "01019" 2001                .  1.6717691560469523         . 1.1011913235808415
    "01019" 2002                .  2.5930321658100626         . 2.0053369648332655
    "01019" 2003                .  3.4136567105812685         .  2.922959698468782
    "01019" 2004                .   4.015474725746825         .  3.687392354043708
    "01019" 2005                .   4.280318146692987         .   4.13196776111436
    "01019" 2006 4.09001890880601    4.09001890880601 4.0900187 4.0900187492370605
    "01019" 2007                .   3.389004942869004         .  3.468885696447302
    "01019" 2008                .  2.3720881612525067         .  2.471939174697311
    "01019" 2009                .  1.2966764717239112         .  1.376557304418494
    "01019" 2010                .  .42017778205061074         . .46011820604225484
    "01019" 2011                0                   0         0                  0
    "01019" 2012                .   .2268587548652355         . .20289448622572165
    "01019" 2013                .  1.0247005620425367         .  .9927481826637552
    "01019" 2014                .  2.2507796584538866         . 2.2228212867610226
    "01019" 2015                .    3.76235028102127         .  3.746373995964446
    "01019" 2016 5.41666666666667    5.41666666666667  5.416667  5.416666507720947
    "01019" 2017                .   7.070983052311931         .   7.08695901947749
    "01019" 2018                .   8.725299437957347         .  8.757251531234033
    "01019" 2019                .  10.379615823602762         . 10.427544042990576
    end

  • #2
    Interpolation can help when a fairly smooth series contains a few gaps and there is good qualitative understanding of the kinds of change likely. In fact some books say nothing about interpolation except as applied within small intervals for defined deterministic functions (as I remember being taught in secondary school).

    Here your dataset is more gaps than data, and the series looks likely to be capricious. Also, interpolation ignores what you should know in terms of global, national and regional context: growth, recession, whatever. Conversely, explaining unemployment by economic drivers runs the risk of circularity.

    So interpolation of any kind looks dubious with these data. I don't think you have any plausible scope for anything but linear interpolation. The usual way of ensuring positive estimates is to interpolate on a logarithmic scale, but that won't work here either, because the series contains zeros and the logarithm of zero is undefined. Don't blame splines here: they just do what they are told.
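
    For illustration only, linear interpolation could be sketched as follows. This assumes the variable names from the example above (commune, year, menemployed); menemployed_lin is a made-up name for the new variable. It uses the official command ipolate without the epolate option, so years after the last observed value stay missing rather than being extrapolated.

    ```stata
    * Sketch only: assumes commune, year and menemployed as in the example.
    * menemployed_lin is a hypothetical name for the interpolated variable.
    * Without the epolate option, ipolate interpolates but does not extrapolate.
    bysort commune (year): ipolate menemployed year, gen(menemployed_lin)

    * A straight line between two nonnegative values cannot go below zero,
    * so this check should pass whenever the observed values are nonnegative.
    assert menemployed_lin >= 0 if !missing(menemployed_lin)
    ```

    Between two observed zeros this simply returns 0 for every year in the gap, which is what the question asks for in that case.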

    mipolate is from SSC, as you are asked to explain (FAQ Advice #12).
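
    On the literal question of how to set to 0 the missing values that lie between two observed zeros, one possible sketch is below. It is not a recommendation to interpolate afterwards. It assumes the variable names from the example; prev, next, negyear and menemployed0 are names invented here.

    ```stata
    * Sketch only: assumes commune, year and menemployed as in the example.
    * prev = last observed value at or before this year, within commune
    bysort commune (year): gen double prev = menemployed
    by commune: replace prev = prev[_n-1] if missing(prev)

    * next = first observed value at or after this year, within commune
    gen negyear = -year
    bysort commune (negyear): gen double next = menemployed
    by commune: replace next = next[_n-1] if missing(next)

    * Fill with 0 only where the gap is bounded by 0 on both sides
    gen double menemployed0 = menemployed
    replace menemployed0 = 0 if missing(menemployed) & prev == 0 & next == 0

    sort commune year
    drop prev next negyear
    ```

    With the example data this fills 1983 to 1989 and 1991 to 1998 with 0 and leaves all other gaps missing, in line with menemployed2.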
