Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Generate a new variable adding observations of a different column

    Hello,

    I have the following dataset:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long gid int year str4 actor_id1 str3 assoc_actor1 int ccode_GID double ELP_L1
    62356    . "."    "."   216                  .
    62357    . "."    "."   216                  .
    79599    . "."    "."   216  .1778533116282972
    79600    . "."    "."   216 .17785357331615614
    79601    . "."    "."   216  .1778536186029953
    80317 2012 "2082" "."   216 .15909446787289028
    80317 2012 "2082" "."   216 .15909446787289028
    80317 2012 "2082" "."   216 .15909446787289028
    80317 2012 "2082" "."   216 .15909446787289028
    80317 2012 "2082" "."   216 .15909446787289028
    80317 2013 "1795" "."   216 .15909446787289028
    80317 2014 "2082" "."   216 .15909446787289028
    80317 2015 "2515" "."   216 .15909446787289028
    80317 2016 "2515" "747" 216 .15909446787289028
    80317 2016 "2515" "."   216 .15909446787289028
    80317 2017 "2515" "."   216 .15909446787289028
    80318 1997 "2515" "."   216 .17610458395941464
    80318 1997 "2515" "."   216 .17610458395941464
    80318 1998 "2515" "."   216 .17610458395941464
    80318 1998 "2515" "."   216 .17610458395941464
    80318 2002 "2515" "."   216 .17610458395941464
    80318 2003 "2515" "."   216 .17610458395941464
    80318 2003 "2515" "."   216 .17610458395941464
    80318 2004 "2515" "."   216 .17610458395941464
    80318 2007 "2082" "."   216 .17610458395941464
    80318 2009 "2082" "."   216 .17610458395941464
    80318 2012 "2515" "."   216 .17610458395941464
    80318 2012 "2082" "."   216 .17610458395941464
    80318 2012 "2082" "."   216 .17610458395941464
    80318 2012 "32"   "."   216 .17610458395941464
    80318 2012 "2082" "."   216 .17610458395941464
    80318 2012 "2082" "."   216 .17610458395941464
    80318 2012 "2082" "."   216 .17610458395941464
    80318 2012 "2082" "."   216 .17610458395941464
    80318 2012 "2082" "."   216 .17610458395941464
    80318 2012 "2082" "."   216 .17610458395941464
    80318 2012 "2082" "."   216 .17610458395941464
    80318 2012 "2082" "."   216 .17610458395941464
    80318 2012 "2082" "."   216 .17610458395941464
    80318 2013 "1795" "."   216 .17610458395941464
    80318 2013 "2515" "."   216 .17610458395941464
    80318 2013 "2622" "."   216 .17610458395941464
    80318 2013 "2515" "."   216 .17610458395941464
    80318 2014 "2515" "."   216 .17610458395941464
    80318 2014 "2515" "747" 216 .17610458395941464
    80318 2014 "2515" "."   216 .17610458395941464
    80318 2015 "1258" "."   216 .17610458395941464
    80318 2015 "2082" "."   216 .17610458395941464
    80318 2015 "2082" "."   216 .17610458395941464
    80318 2015 "2515" "."   216 .17610458395941464
    80318 2017 "2515" "."   216 .17610458395941464
    80318 2017 "2515" "."   216 .17610458395941464
    80319 2012 "2082" "."   216  .1778535061320099
    80319 2012 "2082" "."   216  .1778535061320099
    80320    . "."    "."   216  .1778534511499288
    80321    . "."    "."   216 .17785336784015726
    80322    . "."    "."   216    .17785364358528
    80323    . "."    "."   216 .17785352006697508
    80324    . "."    "."   216 .17785340046719966
    80325    . "."    "."   216  .1778538689661957
    80326    . "."    "."   216 .17785426020441264
    80327 2012 "2082" "."   216 .17785365978238588
    80327 2012 "2082" "."   216 .17785365978238588
    80327 2012 "2082" "."   216 .17785365978238588
    80328    . "."    "."   216 .17785336676075758
    80329    . "."    "."   216 .17785305873016455
    80330 2004 "2515" "."   216 .17785384947953276
    80331    . "."    "."   216                  0
    80332    . "."    "."   216                  0
    81037 1997 "1909" "."   216 .14923540560909032
    81037 1998 "1909" "."   216 .14923540560909032
    81037 1998 "1909" "."   216 .14923540560909032
    81037 1998 "1909" "."   216 .14923540560909032
    81037 1998 "2515" "."   216 .14923540560909032
    81037 1998 "2515" "."   216 .14923540560909032
    81037 1998 "1909" "."   216 .14923540560909032
    81037 1998 "1147" "548" 216 .14923540560909032
    81037 1999 "2515" "."   216 .14923540560909032
    81037 1999 "2515" "."   216 .14923540560909032
    81037 1999 "2515" "."   216 .14923540560909032
    81037 1999 "33"   "."   216 .14923540560909032
    81037 1999 "2515" "."   216 .14923540560909032
    81037 1999 "2515" "."   216 .14923540560909032
    81037 1999 "2515" "."   216 .14923540560909032
    81037 1999 "2515" "."   216 .14923540560909032
    81037 2000 "2515" "."   216 .14923540560909032
    81037 2000 "2515" "."   216 .14923540560909032
    81037 2000 "2515" "."   216 .14923540560909032
    81037 2000 "2515" "."   216 .14923540560909032
    81037 2000 "2515" "."   216 .14923540560909032
    81037 2000 "2515" "."   216 .14923540560909032
    81037 2000 "2515" "."   216 .14923540560909032
    81037 2000 "2515" "."   216 .14923540560909032
    81037 2000 "2515" "."   216 .14923540560909032
    81037 2000 "2515" "."   216 .14923540560909032
    81037 2001 "2515" "."   216 .14923540560909032
    81037 2002 "2515" "."   216 .14923540560909032
    81037 2002 "2082" "."   216 .14923540560909032
    81037 2003 "2515" "."   216 .14923540560909032
    81037 2003 "2515" "."   216 .14923540560909032
    end
    Column assoc_actor1 the most of the times appear missing. However, there are some values that are not missing. I would like to generate a new variable with the same values than actor_id1 but adding the observations that are no missing in assoc_actor1. In other words, I would like to have a new variable with the number of observations (and value) than actor_id1 but adding new observations with the value of assoc_actor1 when this variable is not missing. I would also like that these new observations maintain the value that they have in variables ccode and ELP. These variables are time invariant, that is, they are fixed by gid. If my question is not clear, please let me know.

    Is it possible to obtain what I want? Any ideas how to do it?

    Best,

    Diego.
    Last edited by Diego Malo; 09 Nov 2021, 02:31.

  • #2
    Diego:
    do you mean something along the following lines?
    Code:
    . destring actor_id1, g(num_actor_id1)
    actor_id1: all characters numeric; num_actor_id1 generated as int
    (16 missing values generated)
    
    . destring assoc_actor1 , g(num_assoc_actor1 )
    assoc_actor1: all characters numeric; num_assoc_actor1 generated as int
    (97 missing values generated)
    
    . egen wanted=rowtotal( num_actor_id1 num_assoc_actor1 )
    
    . replace wanted=. if wanted==0
    (16 real changes made, 16 to missing)
    
    .
    Kind regards,
    Carlo
    (Stata 19.0)

    Comment


    • #3
      Thank you for your code Carlo Lazzaro . However, I think it is not working well. For example, for grid 80317 and year 2016, there are two observations for actor_id1. One of them has a non missing assoc_actor1. So, in the final dataset that I would like to have, grid 80317 and year 2016 should have three observations: the original two of actor_id1 and a new one of the non missing value of assoc_actor1. With your code, grid 80317 and year 2016 maintain the two original observations because I think rowtotal is summing columns, not adding new observations.
      Last edited by Diego Malo; 09 Nov 2021, 03:05.

      Comment


      • #4
        Diego:
        you may want to take a look at -collapse- then.
        Kind regards,
        Carlo
        (Stata 19.0)

        Comment

        Working...
        X