  • expand data?

    Dear All, I found this question here (in Chinese). Suppose we have the following data:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long id float year1 str12 industrycode str60 industryname
    1 1991 "J66" "货币金融服务"                                          
    2 1991 "K70" "房地产业"                                                
    3 1991 "S90" "综合"                                                      
    4 1991 "G54" "道路运输业"                                             
    4 2010 "C27" "医药制造业"                                             
    4 2020 "I65" "软件和信息技术服务业"                              
    5 1990 "S90" "综合"                                                      
    5 2012 "K70" "房地产业"                                                
    5 2016 "N77" "生态保护和环境治理业"                              
    6 1992 "K70" "房地产业"                                                
    7 1992 "K70" "房地产业"                                                
    7 2010 "H61" "住宿业"                                                   
    7 2017 "K70" "房地产业"                                                
    8 1992 "I64" "互联网和相关服务"                                    
    8 2010 "F51" "批发业"                                                   
    8 2012 "H61" "住宿业"                                                   
    8 2015 "C37" "铁路、船舶、航空航天和其它运输设备制造业"
    9 1991 "S90" "综合"                                                      
    end
    For each `id', I wish to expand the data from 1991 to 2022. Taking id=4 as an example, I'd like to have the following result.
    [Image: desired result for id = 4]


    Any suggestion is highly appreciated.
    Ho-Chuan (River) Huang
    Stata 19.0, MP(4)

  • #2
    Does your year1 variable go from 1991 to 2022 in your full dataset? If so, the command -fillin id year1- might help.
    Last edited by Yanis Rahmouni; 14 Sep 2023, 01:19.
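
    A minimal sketch of what -fillin- does, using four rows taken from the example data in #1 (ids 4 and 5 only). The point to notice is that -fillin- only creates combinations of values that already occur somewhere in the data, so a year that no id has will not be added:

    ```stata
    clear
    input long id float year1 str3 industrycode
    4 1991 "G54"
    4 2010 "C27"
    5 1990 "S90"
    5 2012 "K70"
    end

    * add every id-by-year1 combination that is not yet present;
    * -fillin- marks the added observations with _fillin == 1
    fillin id year1

    sort id year1
    list, sepby(id)
    ```

    Here the years present are 1990, 1991, 2010 and 2012, so each id ends up with those four years only. This is why the approach depends on every year from 1991 to 2022 appearing for at least one id in the full dataset.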



    • #3
      Why 1991? id 5 starts at 1990.

      The assumption of constant industry in missing data might not be plausible; I cannot tell.

      That said:

      Code:
      preserve
      
      * build a balanced skeleton: one row per id for each year 1991-2022
      keep id
      sort id
      by id : keep if _n == 1
      expand 32
      sort id
      by id : generate year1 = 1990+_n
      
      tempfile balanced
      save "`balanced'"
      
      restore
      
      * add the skeleton rows to the original data
      merge 1:1 id year1 using "`balanced'"
      
      * carry the last known industry forward within id
      sort id year1
      by id (year1) : replace industrycode = industrycode[_n-1] if mi(industrycode)
      by id (year1) : replace industryname = industryname[_n-1] if mi(industryname)
      * fill a leading gap from the following observation
      by id (year1) : replace industrycode = industrycode[_n+1] if mi(industrycode)
      by id (year1) : replace industryname = industryname[_n+1] if mi(industryname)
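
      One detail of the fill step may matter for other data. -replace- works through the observations in order, so the [_n-1] fill cascades forward, but the [_n+1] fill only reaches the single observation just before the first nonmissing one. That is enough for the example data, where ids 6, 7 and 8 lack only 1991. A sketch for longer leading gaps, using made-up rows and a hypothetical helper variable negyear, is to reverse the order within id and reuse the [_n-1] logic:

      ```stata
      clear
      input long id float year1 str3 industrycode
      6 1990 ""
      6 1991 ""
      6 1992 "K70"
      6 1993 ""
      end

      * forward fill: each filled value is available to the next observation
      bysort id (year1) : replace industrycode = industrycode[_n-1] if mi(industrycode)

      * backward fill: sort years in descending order within id (via negyear),
      * so that the same top-down replace now cascades toward earlier years
      generate double negyear = -year1
      bysort id (negyear) : replace industrycode = industrycode[_n-1] if mi(industrycode)
      drop negyear

      sort id year1
      list, sepby(id)
      ```

      Whether filling backward over several years is sensible is a separate question, for the same reason as the caveat about constant industry above.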



      • #4
        Dear Daniel, thanks a lot.
        Ho-Chuan (River) Huang
        Stata 19.0, MP(4)



        • #5
          Dear Yanis, Daniel's suggestion above works.
          Ho-Chuan (River) Huang
          Stata 19.0, MP(4)
