  • making a number of variables based on values of other variables

    Hello, I have received great help from the Stata Forum by searching it, but today I ran into a problem for which I don't even know what keywords to search. I would appreciate it very much if you could help me with this.

    My data looks like this
    id  ts1  ts2  ts3  …  ts65  tact1  tact2  tact3  …  tact65
    1   1    4    5    …  75    111    121    234    …  111
    2   1    3    6    …  80    111    131    313    …  111
    3   1    4    5    …  90    111    141    123    …  111
    ts* is the time slot: 1 means the first slot, so ts2=4 means the 4th slot.
    tact* is the activity that was performed in that slot.

    I need to create variables called act* based on the values of ts* and tact*, so that each act variable holds the activity for the corresponding time slot.

    For instance, this is what I need to make from the above dataset:
    id  act1  act2  act3  act4  act5  act6  …  act144
    1   111   111   111   121   234   234   …
    2   111   111   131   131   131   313   …
    3   111   111   111   141   123   123   …
    But I don't know how to create act* based on the values of ts*.

    I tried forvalues, but couldn't figure out how to write a proper loop.
    I don't even know how to explain this in a sentence, so I am sorry if my explanation doesn't make much sense.
    I would appreciate any help you could give. Thank you so much.

    Last edited by Jiweon Jun; 12 Nov 2021, 13:42.

  • #2
    Welcome to Statalist.

    In future, it would be very helpful if you posted example data using -dataex- (please read the FAQ at http://www.statalist.org/forums/help to learn more). That way, users don't have to recreate the data and can jump straight into drafting code.
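
    As a rough illustration, assuming the variable names shown in #1, a call along these lines should print a copy-and-paste-ready extract of the first three observations. The variable list and the `in 1/3` range are only an example; in older Stata versions -dataex- may first need to be installed from SSC.

    ```stata
    * Sketch only: print the first three observations of a few variables
    * in a form that can be pasted into a forum post.
    * The variable list and the observation range are illustrative choices.
    dataex id ts1 ts2 ts3 tact1 tact2 tact3 in 1/3
    ```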

    Here are some suggestions that may work.

    Code:
    * Create fake data, since no dataex example was posted:
    clear
    input id ts1 ts2 ts3 tact1 tact2 tact3
    1 1 4 5 111 121 234
    2 1 3 6 111 131 313
    3 1 4 5 111 141 123
    end
    
    * Reshape to long
    reshape long ts tact, i(id) j(instance)
    
    * Declare the data as panel time series so that tsfill can be used
    tsset id ts
    * Fill the gaps with tsfill
    tsfill
    
    * Fill in missing tact by carrying the previous value forward
    bysort id (ts): replace tact = tact[_n-1] if tact==.
    
    * Rename to get the desired variable name
    rename tact act
    keep id ts act
    * Reshape back to wide
    reshape wide act, i(id) j(ts)
    
    list, sep(0)
    Results:
    Code:
         +----------------------------------------------+
         | id   act1   act2   act3   act4   act5   act6 |
         |----------------------------------------------|
      1. |  1    111    111    111    121    234      . |
      2. |  2    111    111    131    131    131    313 |
      3. |  3    111    111    111    141    123      . |
         +----------------------------------------------+
    Also, check again whether you really want the data in this column-based "wide format". It is a clunky way to store longitudinal data; the "long format" is usually much preferred for both analysis and data management.
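
    One thing to note in the results: act6 is missing for ids 1 and 3, because -tsfill- only fills gaps between each id's first and last observed slot. If the last activity should run through the final slot, as in the desired output in #1, something like the sketch below might work. It assumes the final slot number is known (6 for the fake data, presumably 144 for the real data, judging from act144) and that rows with a missing ts can be dropped. The names `maxslot`, `islast` and `added` are made up for this illustration.

    ```stata
    * Sketch only: extend each id's last activity through the final slot.
    * Assumption: the final slot is 6 in the fake data (likely 144 in the real data).
    clear
    input id ts1 ts2 ts3 tact1 tact2 tact3
    1 1 4 5 111 121 234
    2 1 3 6 111 131 313
    3 1 4 5 111 141 123
    end
    local maxslot 6

    reshape long ts tact, i(id) j(instance)
    * Guard for real data: unused instances would have missing ts
    drop if missing(ts)

    * Add one empty row at the final slot for ids whose last slot is earlier
    bysort id (ts): gen byte islast = (_n == _N)
    expand 2 if islast & ts < `maxslot', generate(added)
    replace ts = `maxslot' if added
    replace tact = . if added

    * Fill the gaps, then carry the activity forward as before
    tsset id ts
    tsfill
    bysort id (ts): replace tact = tact[_n-1] if missing(tact)

    rename tact act
    keep id ts act
    reshape wide act, i(id) j(ts)
    list, sep(0)
    ```

    With the fake data, this should give act6 = 234 for id 1 and act6 = 123 for id 3, in line with the desired output in #1. Because it uses a local macro, run it as one block from a do-file.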


    • #3
      Thank you so much! It worked perfectly -- I'm so grateful that I don't know what to say!
      And your kind advice on the use of -dataex- and the long format is also very much appreciated. Thank you!
