
  • Dropping observations for hotdeck

    Hi,

    I need to impute some values, and I would like to do it using the hot deck method. I use the hotdeck command, but the problem is that some groups simply have too few observations, and hence the imputation does not go through.

    I use the following code:

    Code:
    hotdeck $varlist using imp, store by(Identifier date) noise
    I have panel data; an example is below:


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int(date Identifier) double(h1 h2 h3 h4 h5 h6 h7)
     7726   1 7.270529125451405  6.712280282199131    3.5 5.526000000000001             3.7342  2.719483784343309 4.592668024439919
     7885   2  4.00909090909091  6.321466419562403    . 4.4433333333333325              2.221 3.4676878234314086 5.618421052631579
     7934   4 7.270529125451405  6.712280282199131    .  5.526000000000001             3.7342  2.719483784343309 4.592668024439919
     7936   5 7.270529125451405  6.712280282199131    .  5.526000000000001             3.7342  2.719483784343309 4.592668024439919
     7943   6 7.270529125451405  6.712280282199131    .  5.526000000000001             3.7342  2.719483784343309 4.592668024439919
     8153   7 7.270529125451405  6.712280282199131    .  5.526000000000001             3.7342  2.719483784343309 4.592668024439919
     8214   8 7.270529125451405  6.712280282199131    .  5.526000000000001             3.7342  2.719483784343309 4.592668024439919
     8228   9 7.577272727272728  8.203945949999529    .                  5             3.6865  2.630800314614003                 .
     8285  10 7.270529125451405  6.712280282199131    .  5.526000000000001             3.7342  2.719483784343309 4.592668024439919
     8945  11 7.270529125451405  6.712280282199131    .  5.526000000000001             3.7342  2.719483784343309 4.592668024439919
     8965  12                 .  8.883038989532837    .                2.7             5.3065 3.1522136780790344                 .
     9007  13 7.270529125451405  6.712280282199131    .  5.526000000000001             3.7342  2.719483784343309 4.592668024439919
     9359  14                 .  8.883038989532837    .                2.7             5.3065 3.1522136780790344                 .
     9369  15 7.270529125451405  6.712280282199131    .  5.526000000000001             3.7342  2.719483784343309 4.592668024439919
     9447  16 7.270529125451405  6.712280282199131    .  5.526000000000001             3.7342  2.719483784343309 4.592668024439919
     9517  17 7.270529125451405  6.712280282199131    .  5.526000000000001             3.7342  2.719483784343309 4.592668024439919
     9532  18 7.270529125451405  6.712280282199131    .  5.526000000000001             3.7342  2.719483784343309 4.592668024439919
                     .
    end
    format %td date
    Is there any systematic way of getting rid of the firm-year observations that prevent the code from imputing the values?

    Thanks in advance for help.
    Last edited by Jozef Patrnciak; 15 Oct 2018, 11:19. Reason: dropping observations imputation

  • #2
    Jozef:
    what you experienced is well documented in the help file of the user-written command -hotdeck- (for the future, whenever you use an unofficial Stata command, please say so, as requested and explained in the FAQ. Thanks):
    If a dataset contains many variables with missing values then it is
    possible that many of the rows of data will contain at least one
    missing value. The hotdeck procedure will not work very well in such
    circumstances. There are more elaborate methods that only replace
    missing values, rather than the whole row, for imputed values. These
    multivariate multiple imputation methods are discussed by
    Schafer(1997).
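
    To see how much of the data is affected by the issue the help file describes, one could count the complete rows available as donors within each stratum. This is only a minimal sketch: it assumes the variables to impute are h1-h7 (the contents of $varlist are not shown in the question) and that the strata are those given in by(Identifier date). The names nmiss, complete and ndonors are made up for illustration.

    ```stata
    * Assumption: h1-h7 are the variables passed to -hotdeck- and are stored consecutively
    egen nmiss = rowmiss(h1-h7)                 // number of missing values in each row
    gen byte complete = (nmiss == 0)            // 1 if the row has no missing values
    * Count complete rows (potential donors) within each Identifier-date stratum
    bysort Identifier date: egen ndonors = total(complete)
    * Inspect the strata that have no complete row to draw from
    list Identifier date nmiss ndonors if ndonors == 0, sepby(Identifier)
    * drop if ndonors == 0    // uncomment only after checking what would be lost
    ```

    In the posted example every Identifier-date combination appears only once, so a stratum defined this way would hold a single row; if the full data look the same, a coarser by() may be worth considering before dropping anything.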
    You may want to try -ipolate- or the -mi- suite of commands.
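
    As a rough sketch of what these two alternatives could look like on the example data: the choice of h1 and h7 as the variables to impute, of h2, h4, h5 and h6 as predictors, of linear regression as the imputation model, and the new variable name h1_ip are all assumptions made for illustration, not recommendations.

    ```stata
    * Option 1: linear interpolation of h1 over time within each Identifier
    * (only fills gaps that lie between two observed values of the same Identifier)
    bysort Identifier (date): ipolate h1 date, generate(h1_ip)

    * Option 2: multiple imputation by chained equations
    mi set wide
    mi register imputed h1 h7
    * Assumption: h2 h4 h5 h6 are complete and are sensible predictors
    mi impute chained (regress) h1 h7 = h2 h4 h5 h6, add(5) rseed(12345)
    ```

    Unlike hotdeck, -mi impute- replaces only the missing values rather than the whole row, which is the distinction drawn in the help file excerpt above.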

    Please also note that it has been a while since I last read about hotdeck imputation methods in peer-reviewed journals (obviously my experience is not statistically significant, but my gut feeling is that, due to the availability of -mi- suites in many statistical software packages, other methods are progressively being sidelined and/or considered old-fashioned).
    Kind regards,
    Carlo
    (Stata 19.0)
