  • Variable Management


    I have the following two variables in my dataset:

    Number1   Number2
              000752
              000752
    48239P    000752
              000752
              000752
              000752
    89351Q    893895
              893895
              893895
              893895
              893895
    ....

    I want to fill in all the empty cells so that it looks like the following:

    Number1   Number2
    48239P    000752
    48239P    000752
    48239P    000752
    48239P    000752
    48239P    000752
    48239P    000752
    89351Q    893895
    89351Q    893895
    89351Q    893895
    89351Q    893895
    89351Q    893895

    Each Number1 has a corresponding Number2.

  • #2
    Welcome to Statalist.

    Did you perhaps not take a look at your post after it was posted? You are asking other members to put in a lot of effort to work out what you presented.

    With regard to the example data below, I believe you have variables number1 and number2 and wish to replace the contents of number1 with new1. You seem to show that for any given value of number2, the same value of number1 will apply to all observations.
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str6(number2 number1 new1)
    "000752" ""       "48239P"
    "000752" ""       "48239P"
    "000752" "48239P" "48239P"
    "000752" ""       "48239P"
    "000752" ""       "48239P"
    "000752" ""       "48239P"
    "893895" "89351Q" "89351Q"
    "893895" ""       "89351Q"
    "893895" ""       "89351Q"
    "893895" ""       "89351Q"
    "893895" ""       "89351Q"
    end
    The following seems to do what you want.
    Code:
    . generate seq = _n
    
    . by number2 (number1), sort: replace number1 = number1[_N]
    (9 real changes made)
    
    . sort seq
    
    . list seq number1 number2, sepby(number1)
    
         +-------------------------+
         | seq   number1   number2 |
         |-------------------------|
      1. |   1    48239P    000752 |
      2. |   2    48239P    000752 |
      3. |   3    48239P    000752 |
      4. |   4    48239P    000752 |
      5. |   5    48239P    000752 |
      6. |   6    48239P    000752 |
         |-------------------------|
      7. |   7    89351Q    893895 |
      8. |   8    89351Q    893895 |
      9. |   9    89351Q    893895 |
     10. |  10    89351Q    893895 |
     11. |  11    89351Q    893895 |
         +-------------------------+
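The reason number1[_N] works is that empty strings sort before any non-empty string, so after sorting on number1 within number2 the filled-in value is the last observation of each group. This relies on each number2 having only one distinct non-empty number1. Below is a minimal sketch, on a cut-down copy of the example data, that checks this assumption before filling; I have not run it on your full dataset.

```stata
clear
input str6(number2 number1)
"000752" ""
"000752" "48239P"
"000752" ""
"893895" "89351Q"
"893895" ""
end

* remember the original order so it can be restored afterwards
generate seq = _n

* within each number2 group, empty strings sort first, so the last
* observation holds the non-empty value; the assert stops with an error
* if a group contains two different non-empty values
bysort number2 (number1): assert number1 == number1[_N] | number1 == ""

* fill every observation in the group from the last one
bysort number2 (number1): replace number1 = number1[_N]

sort seq
list seq number1 number2, sepby(number1)
```

If the assert fails, some number2 has more than one number1, and you would need to decide which value should win before filling.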
    Now, a word of advice to improve your future posts. We want to help you solve your problems, but you need to help us understand your problems. Please take a few moments to review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. Note especially sections 9-12 on how to best pose your question. It's particularly helpful to copy commands and output from your Stata Results window and paste them into your Statalist post using code delimiters [CODE] and [/CODE], and to use the dataex command to provide sample data, as described in section 12 of the FAQ.



    • #3
      Thank you, William Lisowski, for your kind reply and help. I am sorry for not posting my data clearly. Here is the data again, in tabular form.

      I have 4 variables. Each subject (Subject ID) has multiple entries for Heart Rate, but their demographic information (Age, Gender) appears only in the first row. When I analyze Heart Rate, I cannot get results in the correct format with respect to the subject ID. Is there a way in Stata to fill in each subject's information on every row? I am struggling with this because I have a very large dataset. I would really appreciate your help.
      Subject ID   Age   Gender   Heart Rate
      1            54    F        67
      1                           75
      2            34    F        56
      3            57    M        69
      3                           90
      3                           111
      3                           67
      3                           56
      3                           76
      4                           67
      4                           94
      4                           68
      5            21    F        56
      5                           74
      6            39    M        73
      6                           79
      6                           84
      6                           75
      6                           48
      7            45    M        67
      7                           59
      7                           62
      8            31    M        68
      9            19    F        74
      9                           71
      9                           104
      10           11    M        96
      10                          67



      • #4
        Here is example code. For clarity, I repost your example data as prepared by the dataex command.
        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input byte(subjectid age) str1 gender int heartrate
         1 54 "F"  67
         1  . ""   75
         2 34 "F"  56
         3 57 "M"  69
         3  . ""   90
         3  . ""  111
         3  . ""   67
         3  . ""   56
         3  . ""   76
         4  . ""   67
         4  . ""   94
         4  . ""   68
         5 21 "F"  56
         5  . ""   74
         6 39 "M"  73
         6  . ""   79
         6  . ""   84
         6  . ""   75
         6  . ""   48
         7 45 "M"  67
         7  . ""   59
         7  . ""   62
         8 31 "M"  68
         9 19 "F"  74
         9  . ""   71
         9  . ""  104
        10 11 "M"  96
        10  . ""   67
        end
        
        generate seq = _n
        bysort subjectid (seq): replace age = age[1]
        bysort subjectid (seq): replace gender = gender[1]
        list, sepby(subjectid) abbreviate(12) noobs
        Code:
        . list, sepby(subjectid) abbreviate(12) noobs
        
          +--------------------------------------------+
          | subjectid   age   gender   heartrate   seq |
          |--------------------------------------------|
          |         1    54        F          67     1 |
          |         1    54        F          75     2 |
          |--------------------------------------------|
          |         2    34        F          56     3 |
          |--------------------------------------------|
          |         3    57        M          69     4 |
          |         3    57        M          90     5 |
          |         3    57        M         111     6 |
          |         3    57        M          67     7 |
          |         3    57        M          56     8 |
          |         3    57        M          76     9 |
          |--------------------------------------------|
          |         4     .                   67    10 |
          |         4     .                   94    11 |
          |         4     .                   68    12 |
          |--------------------------------------------|
          |         5    21        F          56    13 |
          |         5    21        F          74    14 |
          |--------------------------------------------|
          |         6    39        M          73    15 |
          |         6    39        M          79    16 |
          |         6    39        M          84    17 |
          |         6    39        M          75    18 |
          |         6    39        M          48    19 |
          |--------------------------------------------|
          |         7    45        M          67    20 |
          |         7    45        M          59    21 |
          |         7    45        M          62    22 |
          |--------------------------------------------|
          |         8    31        M          68    23 |
          |--------------------------------------------|
          |         9    19        F          74    24 |
          |         9    19        F          71    25 |
          |         9    19        F         104    26 |
          |--------------------------------------------|
          |        10    11        M          96    27 |
          |        10    11        M          67    28 |
          +--------------------------------------------+
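Note that subject 4 has no age or gender in any row of your example, so there is nothing to copy and those values stay missing. In a large dataset you may want to find such subjects. Here is a minimal sketch on a small subset of the example data; the variable names nodemo and first are ones I made up for illustration.

```stata
clear
input byte(subjectid age) str1 gender int heartrate
3 57 "M" 69
3  . ""  90
4  . ""  67
4  . ""  94
end

generate seq = _n
bysort subjectid (seq): replace age = age[1]
bysort subjectid (seq): replace gender = gender[1]

* nodemo (hypothetical name) is 1 when a subject still has neither
* age nor gender after the fill
generate byte nodemo = missing(age) & gender == ""

* first (hypothetical name) marks one observation per subject
bysort subjectid (seq): generate byte first = _n == 1

* one line per subject lacking demographic information
list subjectid if nodemo & first, noobs
```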
        If in your data there is another variable that tells you what order the observations belong in (like some sort of date, perhaps, or a visit number) you can use it instead of the sequence number I created for that purpose.
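As an illustration, suppose the data had a visit number. The variable visit below is hypothetical, not something in your example, and the sketch assumes the demographic information is recorded on the earliest visit of each subject.

```stata
clear
* visit is a hypothetical ordering variable added for illustration
input byte(subjectid visit age) str1 gender int heartrate
1 2  . ""   75
1 1 54 "F"  67
2 1 34 "F"  56
3 3  . ""  111
3 1 57 "M"  69
3 2  . ""   90
end

* order each subject's observations by visit, then copy the values
* from the first visit to every observation of that subject
bysort subjectid (visit): replace age = age[1]
bysort subjectid (visit): replace gender = gender[1]

list, sepby(subjectid) abbreviate(12) noobs
```

Because the sort is on visit rather than on the order of the observations in memory, this gives the same result even if the rows arrive shuffled.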



        • #5
          WOW. Thank you! This is very helpful.
