  • Panel data replace all values

    Hello!

    I'm fairly new to Stata 17, and to Stata in general, and I have a question about replacing all values of certain variables. I have a panel data set (time-series cross-sectional data) with country-years as the unit of observation, spanning 26 countries from 2019 to 2021. For most of the variables there are no observations in 2021 (I guess you could call that missing data). Since I consider most of these variables to be constant, I want to replace the missing 2021 values of every variable with the corresponding 2020 values. Here is a snapshot of my data:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str32 country_name double(year v2x_cspart v2x_egaldem)
    "Sweden"                   2020 .943 .815
    "Sweden"                   2021    .    .
    "Switzerland"              2020 .969 .825
    "Switzerland"              2021    .    .
    "Japan"                    2020 .724 .741
    "Japan"                    2021    .    .
    "United States of America" 2020 .979  .63
    "United States of America" 2021    .    .
    "Portugal"                 2020 .785 .764
    "Portugal"                 2021    .    .
    "Canada"                   2020 .956 .764
    "Canada"                   2021    .    .
    "Australia"                2020 .864 .716
    "Australia"                2021    .    .
    "France"                   2020 .884 .766
    "France"                   2021    .    .
    "Germany"                  2020 .982 .809
    "Germany"                  2021    .    .
    "Ireland"                  2020  .97 .788
    "Ireland"                  2021    .    .
    "Italy"                    2020 .921 .785
    "Italy"                    2021    .    .
    "Netherlands"              2020 .904 .781
    "Netherlands"              2021    .    .
    "Spain"                    2020 .907 .811
    "Spain"                    2021    .    .
    "United Kingdom"           2020 .957 .746
    "United Kingdom"           2021    .    .
    "Austria"                  2020 .937 .764
    "Austria"                  2021    .    .
    "Belgium"                  2020 .951 .824
    "Belgium"                  2021    .    .
    "Denmark"                  2020 .987  .87
    "Denmark"                  2021    .    .
    "Finland"                  2020 .972 .802
    "Finland"                  2021    .    .
    "Greece"                   2020 .891 .715
    "Greece"                   2021    .    .
    end
    Essentially, for every country, I want to copy the 2020 data into 2021 for all variables.

    The following code successfully copies one variable's values into the other year:
    Code:
    by country_id (year), sort: replace v2x_egaldem = v2x_egaldem[1]
    But I have over 3000 variables in the data set that need this replacement, and writing 3000+ lines of code is obviously inefficient. Is there a quicker way to do this? Thanks in advance, guys!

    Best,
    Nathan
    Last edited by Nathan Brophy; 15 Nov 2021, 20:10.

  • #2
    Welcome to Statalist.

    If your 3000 variables sit next to each other in the data set (that is, they are contiguous in variable order), you can refer to all of them at once as a variable list (varlist) written "firstVarName-lastVarName", and then use that varlist in a loop:

    Code:
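    * Loop over every variable from v2x_cspart through v2x_egaldem (dataset order)
    foreach x of varlist v2x_cspart-v2x_egaldem {
        * Within each country, copy the previous year's value into the 2021 row
        by country_name (year), sort: replace `x' = `x'[_n-1] if year == 2021
    }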
    To learn more, check out -help foreach- and -help varlist-.
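
    Before running the loop, it may be worth checking which variables the range actually covers. The following is only a sketch on toy data: the variable v2x_hypothetical is made up to stand in for whatever sits between the first and last variable in your data, and -unab- is used here simply to expand the range into its full list of names.

```stata
* Toy data: the two variables from the example plus a made-up one in between
clear
input str10 country_name double(year v2x_cspart v2x_hypothetical v2x_egaldem)
"Sweden" 2020 .943 .5 .815
"Sweden" 2021    .  .    .
end

* Expand the range into the names it covers, in dataset order
unab vars : v2x_cspart-v2x_egaldem
display "`vars'"
display "Number of variables in range: `: word count `vars''"
```

    If the list contains variables you do not want to overwrite, or misses some you do, the range needs adjusting before it goes into the loop.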

    Note that I also changed the core of your replace command slightly. With only 2020 and 2021 in the data, the index [1] would work. But if the data go back to 2019, [1] copies the 2019 value rather than the 2020 one, so copying from the previous row with [_n-1] is probably safer. I also added "if year == 2021" so that the copying only happens in the 2021 rows. You can take that out if you feel it is not needed.
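
    As a minimal sketch of the difference between [1] and [_n-1], here is a toy panel with three years for one country. The 2019 value (.900) and the helper variables from_first and from_prev are made up for illustration; only the 2020 value comes from the example data.

```stata
* Toy data: one country, three years; the 2019 value is hypothetical
clear
input str10 country_name double(year v2x_cspart)
"Sweden" 2019 .900
"Sweden" 2020 .943
"Sweden" 2021    .
end

* Two copies of the variable so both approaches can be compared side by side
generate double from_first = v2x_cspart
generate double from_prev  = v2x_cspart

sort country_name year
* [1] picks the first row within the country (2019 here)
by country_name: replace from_first = from_first[1]   if year == 2021
* [_n-1] picks the row just before the current one (2020 here)
by country_name: replace from_prev  = from_prev[_n-1] if year == 2021

list country_name year v2x_cspart from_first from_prev, sepby(country_name)
```

    In the 2021 row, from_first ends up holding the 2019 value, while from_prev holds the 2020 value, which is what you want here.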


    • #3
      Thank you, and sorry about the late reply. This method worked absolutely perfectly. Thank you so much! I am still familiarizing myself with loops in Stata.
