Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to move values up in columns

    I have a table with many missing values. There are many columns with only missing values. The other columns have 1 non-missing value each, but it situates in different rows. I want to shift this value up to the first row. For the column with only missing variables, I do not change anything. I have not figured out how to do. I tried some idea such as creating an indicator of row that the value is not missing within each colum of variable, then set the values of all variables in the first column to be equal to the values of variables indicated by the number of row before, and then get rid of all the following rows to keep only 1 row. But I do not think this way will work.

    I hope the pic below is clear enough to illustrate my case.

    Many thanks.
    Attached Files

  • #2
    Code:
    collapse y1 y2 y3, by(x) 
    will solve your example.

    Code:
    help collapse

    Comment


    • #3
      It works perfectly. Collapse is magic. Thank you!

      Comment


      • #4
        Hi,

        I have tried to use collapse by for my problem but it is not working. Each subject ("identifier") has 12 rows. I would want to collapse all the other variables so that each subject has only 3 rows and everything is moved up when the cell is empty. I have tried using collapse
        Code:
        initial_base y2 y3 y4 y5 , by(identifier)
        but it is collapsing everything to one row, deleting non-missing observations.


        Does anyone know why this is happening and how to solve this problem? Many thanks !

        Comment


        • #5
          the implication, to me, of your wanting to end with 3 rows/subject is that you need at least one more variable in the "by()" part of your command; if you provide a -dataex- example within CODE blocks (please read the FAQ), someone might provide more help; it appears you have loaded some form of image in #4, but it is not readable on my machine

          Comment


          • #6
            Hi Rich,

            Thanks for your response and sorry for the picture. Let me know if you can access this dataset. Thanks for helping!

            Code:
            clear
            input float(initial_rating_score_temperament final_rating_score initial_rating_score
            final_rating_score_temperament initial_base_line initial_base_line_temperament
            final_base_line final_base_line_temperament identifier)
             .   . 100  .  1  .  .  . 61
             .   .   0  . -1  .  .  . 61
             .   .  60  .  0  .  .  . 61
            89   .   .  .  .  0  .  . 61
            15   .   .  .  . -1  .  . 61
            93   .   .  .  .  1  .  . 61
             . 100   .  .  .  .  1  . 61
             .   0   .  .  .  . -1  . 61
             .  57   .  .  .  .  0  . 61
             .   .   . 89  .  .  .  0 61
             .   .   . 10  .  .  . -1 61
             .   .   . 97  .  .  .  1 61
             .   .  69  .  0  .  .  . 31
             .   . 100  .  1  .  .  . 31
             .   .   9  . -1  .  .  . 31
            65   .   .  .  .  0  .  . 31
            87   .   .  .  .  1  .  . 31
            33   .   .  .  . -1  .  . 31
             .  67   .  .  .  .  0  . 31
             .  10   .  .  .  . -1  . 31
             .  99   .  .  .  .  1  . 31
             .   .   . 36  .  .  . -1 31
             .   .   . 57  .  .  .  0 31
             .   .   . 77  .  .  .  1 31
             .   .  24  . -1  .  .  . 30
             .   .  93  .  1  .  .  . 30
             .   .  50  .  0  .  .  . 30
            20   .   .  .  . -1  .  . 30
            96   .   .  .  .  1  .  . 30
            50   .   .  .  .  0  .  . 30
            end

            Comment


            • #7
              Code:
              collapse (firstnm) init* fin*, by(identifier)

              Comment


              • #8
                thanks for the data example, in #4, you said, "I would want to collapse all the other variables so that each subject has only 3 rows and everything is moved up when the cell is empty" (note that the reply in #7 is not consistent with this); try
                Code:
                bys identifier: gen byte seq=mod(_n,3)
                collapse (firstnm) init* fin*, by(identifier seq)
                you may want to -drop- the variable I called "seq" here after doing the -collapse-
                Last edited by Rich Goldstein; 05 Oct 2022, 06:26.

                Comment


                • #9
                  Thanks both - unfortunately neither codes are working for me... #7 leaves me with only one obs per subject (i.e., identifier), whereas #8 leaves me with some (I would want to say random but I am not sure what the code does) empty cells. What I'd want is that, by subject, every obs in each column is showed in the first 3 rows.
                  Let me know if it is not clear what I want to achieve!
                  Thanks

                  Comment


                  • #10
                    There is a small problem with the code in #8: it does not fully pin down the sort order, which is why you get some random empty cells.

                    Try this instead:
                    Code:
                    gen `c(obs)' obnum = _n
                    bys identifier (obnum): gen byte seq=mod(_n,3)
                    collapse (firstnm) init* fin*, by(identifier seq)
                    drop seq
                    which produces:

                    Code:
                    . list, noobs sepby(identifier)
                    
                      +--------------------------------------------------------------------------------------------------+
                      | identi~r   initia..   initi~re   initi~ne   initia..   final~re   final_..   final~ne   final_.. |
                      |--------------------------------------------------------------------------------------------------|
                      |       30         20         24         -1         -1          .          .          .          . |
                      |       30         96         93          1          1          .          .          .          . |
                      |       30         50         50          0          0          .          .          .          . |
                      |--------------------------------------------------------------------------------------------------|
                      |       31         65         69          0          0         67         36          0         -1 |
                      |       31         87        100          1          1         10         57         -1          0 |
                      |       31         33          9         -1         -1         99         77          1          1 |
                      |--------------------------------------------------------------------------------------------------|
                      |       61         89        100          1          0        100         89          1          0 |
                      |       61         15          0         -1         -1          0         10         -1         -1 |
                      |       61         93         60          0          1         57         97          0          1 |
                      +--------------------------------------------------------------------------------------------------+
                    The only reason we're seeing some blanks for #30 is because the data example does not have the complete set of observations for that identifier.
                    Last edited by Hemanshu Kumar; 05 Oct 2022, 07:39.

                    Comment


                    • #11
                      Fab! Thanks Hemanshu, that worked for me

                      Comment


                      • #12
                        I have data, and I tried too many ways to put some rows into one.
                        The data looks like this:
                        Code:
                        +-------------------------------------------------------------+
                        | CLPR Episode P# dpp Fud pp icd |
                        |-------------------------------------------------------------|
                        1. | 20160082 1 1 20sep2016 20sep2016 53 1 |
                        2. | 20160082 2 1 02jan2017 03jan2017 39 981 |
                        3. | 20160082 2 2 02jan2017 03jan2017 40 802 |
                        4. | 20160082 2 3 02jan2017 03jan2017 56 676 |
                        5. | 20160082 2 4 03jan2017 03jan2017 53 1 |
                        |-------------------------------------------------------------|
                        6. | 20160082 3 1 04oct2017 04oct2017 32 1 |
                        7. | 20160082 4 1 25sep2018 25sep2018 53 1 |
                        8. | 20160082 5 1 13mar2019 24sep2019 59 1 |
                        9. | 20160082 5 2 24sep2019 24sep2019 53 1 |
                        10. | 20160082 6 1 10jun2020 27apr2021 59 1 |
                        |-------------------------------------------------------------|
                        11. | 20160082 6 2 02dec2020 27apr2021 59 1 |
                        12. | 20160082 6 3 27apr2021 27apr2021 53 1 |
                        13. | 20160082 7 1 18oct2021 18oct2021 48 855 |
                        14. | 20160082 8 1 19apr2022 19apr2022 53 1 |
                        +-------------------------------------------------------------+
                        I want it to look like this:
                        Code:
                        +-------------------------------------------------------------------------------------------------------------------------------------------------+
                        | CLPR Episode dpp_1 fud_1 pp_1 ic9_1 dpp_2 fud_2 pp_2 ic9_2 dpp_3 fud_3 pp_3 ic9_3 dpp _4 fud _4 pp_4 ic9_4 |
                        |-------------------------------------------------------------------------------------------------------------------------------------------------|
                        1. | 20160082 1 20sep2016 20sep2016 53 1 |
                        2. | 20160082 2 02jan2017 03jan2017 39 981 02jan2017 03jan2017 40 802 02jan2017 03jan2017 56 676 03jan2017 03jan2017 53 1 |
                        3. | 20160082 3 04oct2017 04oct2017 32 1 |
                        4. | 20160082 4 25sep2018 25sep2018 53 1 |
                        5. | 20160082 5 13mar2019 24sep2019 59 1 24sep2019 24sep2019 53 1 |
                        |-------------------------------------------------------------------------------------------------------------------------------------------------|
                        6. | 20160082 6 10jun2020 27apr2021 59 1 02dec2020 27apr2021 59 1 27apr2021 27apr2021 53 1 |
                        7. | 20160082 7 18oct2021 18oct2021 48 855 |
                        8. | 20160082 8 19apr2022 19apr2022 53 1 |
                        +-------------------------------------------------------------------------------------------------------------------------------------------------+
                        because i want the repeated procedure to be in one row
                        P#= procedure number
                        dpp=date of the procedure performed
                        fud= follow up date
                        pp= procedure performed
                        Last edited by Fatimah Gh; 16 Oct 2023, 13:07.

                        Comment


                        • #13
                          Code:
                          * Example generated by -dataex-. For more info, type help dataex
                          clear
                          input long clpr byte(episode procedure_id) str9(dpp fud) byte pp int ic9
                          20160082 1 1 "20sep2016" "20sep2016" 53   1
                          20160082 2 1 "02jan2017" "03jan2017" 39 981
                          20160082 2 2 "02jan2017" "03jan2017" 40 802
                          20160082 2 3 "02jan2017" "03jan2017" 56 676
                          20160082 2 4 "03jan2017" "03jan2017" 53   1
                          20160082 3 1 "04oct2017" "04oct2017" 32   1
                          20160082 4 1 "25sep2018" "25sep2018" 53   1
                          20160082 5 1 "13mar2019" "24sep2019" 59   1
                          20160082 5 2 "24sep2019" "24sep2019" 53   1
                          20160082 6 1 "10jun2020" "27apr2021" 59   1
                          20160082 6 2 "02dec2020" "27apr2021" 59   1
                          20160082 6 3 "27apr2021" "27apr2021" 53   1
                          20160082 7 1 "18oct2021" "18oct2021" 48 855
                          20160082 8 1 "19apr2022" "19apr2022" 53   1
                          end
                          
                          rename (dpp fud pp ic9) =_
                          reshape wide dpp_ fud_ pp_ ic9_, i(clpr episode) j(procedure_id)
                          That said, you will probably regret doing this. There is very little useful data management or analysis you will be able to do with the data in Stata if you rearrange the layout in the way you described. The long layout you are starting out with is by far more useful for work in Stata. Unless you know specifically that you will be using the few Stata commands that prefer the wide layout, I urge you to keep the data the way you have it, at least as far as layout goes.

                          What you should change is: your date variables will not be useful unless they are Stata internal format numerical date variables. From the tableau you showed, it is impossible to tell whether they are that, or if they are just strings. If they are strings, you should use the -daily()- function to convert them.

                          The inability to tell whether these variables are strings or true Stata date variables arises because you showed your example data in a table, perhaps typed by yourself, or perhaps as -list- output. The helpful way to show example data is to use the -dataex- command, as I have done here. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

                          Comment


                          • #14
                            Code:
                            * Example generated by -dataex-. For more info, type help dataex
                            clear
                            input long CLPR byte(Episode Procedure_ID) int date_procedure_performed byte F int follow_up_date str36 Procedure_Perfomed str95 X
                            20160081  9 1  22957 . 22957 "53" "1"  
                            20160082  1 1  20717 . 20717 "53" "1"  
                            20160082  2 1  20821 . 20822 "39" "981"
                            20160082  2 2  20821 . 20822 "40" "802"
                            20160082  2 3  20821 . 20822 "56" "676"
                            20160082  2 4  20822 . 20822 "53" "1"  
                            20160082  3 1  21096 . 21096 "32" "1"  
                            20160082  4 1  21452 . 21452 "53" "1"  
                            20160082  5 1  21621 . 21816 "59" "1"  
                            20160082  5 2  21816 . 21816 "53" "1"  
                            20160082  6 1  22076 . 22397 "59" "1"  
                            20160082  6 2  22251 . 22397 "59" "1"  
                            20160082  6 3  22397 . 22397 "53" "1"  
                            20160082  7 1  22571 . 22571 "48" "855"
                            20160082  8 1  22754 . 22754 "53" "1"  
                            20160083  1 1  20717 . 20717 "53" "1"  
                            20160083  2 1  20948 . 20948 "53" "1"  
                            20160084  1 1  20717 . 20717 "53" "1"  
                            20160084  2 1  20737 . 20738 "39" "981"
                            20160084  2 2  20737 . 20738 "40" "802"
                            20160084  2 3  20737 . 20738 "56" "676"
                            20160084  2 4  20738 . 20738 "53" "1"  
                            20160084  3 1  20787 . 20787 "53" "1"  
                            20160084  4 1  20864 . 20864 "53" "1"  
                            20160084  5 1  20942 . 20942 "32" "1"  
                            20160084  6 1  21017 . 21017 "48" "856"
                            20160084  6 2  21017 . 21017 "41" "988"
                            20160084  6 3  21017 . 21017 "37" "691"
                            20160084  6 4  21017 . 21017 "56" "676"
                            20160084  7 1  21151 . 21151 "53" "1"  
                            20160084  8 1  21508 . 21508 "53" "1"  
                            20160084  8 2  21376 . 21508 "59" "1"  
                            20160084  9 1  21733 . 21886 "59" "1"  
                            20160084  9 2  21886 . 21886 "34" "1"  
                            20160084 10 1  22111 . 22243 "59" "1"  
                            20160084 10 2  22243 . 22243 "53" "1"  
                            20160084 11 1  22600 . 22600 "53" "1"  
                            20160084 12 1  22894 . 22894 "34" "1"  
                            20160085  1 1  20654 . 20722 "53" "1"  
                            20160085  1 2  20722 . 20722 "53" "1"  
                            20160085  2 1  20723 . 20723 "38" "895"
                            20160085  3 1 -21914 . 20736 "53" "1"  
                            20160085  4 1  20752 . 20752 "53" "1"  
                            20160085  5 1  20753 . 20753 "39" "981"
                            20160085  5 2  20753 . 20753 "40" "802"
                            20160085  5 3  20753 . 20753 "56" "676"
                            20160085  6 1  21096 . 21096 "32" "1"  
                            20160085  7 1  21209 . 21209 "41" "988"
                            20160085  7 2  21209 . 21209 "56" "676"
                            20160085  8 1  21326 . 21326 "53" "1"  
                            20160085  9 1  21612 . 21837 "59" "1"  
                            20160085  9 2  21837 . 21837 "53" "1"  
                            20160085 10 1  21907 . 22110 "37" "691"
                            20160085 10 2  21907 . 22110 "56" "676"
                            20160085 11 1  22446 . 22446 "53" "1"  
                            20160085 12 1  22803 . 22803 "34" "1"  
                            20160085 13 1  22858 . 22858 "48" "855"
                            20160085 14 1  22962 . 23118 "59" "1"  
                            20160085 14 2  23118 . 23118 "53" "1"  
                            20160086  1 1 -21914 . 20724 "41" "988"
                            20160086  1 2  20724 . 20724 "53" "1"  
                            20160086  1 3 -21914 . 20724 "56" "676"
                            20160086  2 1  20789 . 20962 "59" "1"  
                            20160086  2 2  20935 . 20962 "59" "1"  
                            20160086  2 3  20962 . 20962 "53" "1"  
                            20160086  3 1  21053 . 21053 "53" "1"  
                            20160086  4 1  21051 . 21165 "59" "1"  
                            20160086  4 2  21165 . 21165 "53" "1"  
                            20160086  5 1  21145 . 21254 "59" "1"  
                            20160086  5 2  21254 . 21254 "53" "1"  
                            20160086  6 1  21290 . 21389 "54" "1"  
                            20160086  6 2  21291 . 21389 "59" "1"  
                            20160086  6 3  21319 . 21389 "59" "1"  
                            20160086  6 4  21354 . 21389 "59" "1"  
                            20160086  6 5  21389 . 21389 "53" "1"  
                            20160086  7 1  21718 . 22368 "59" "1"  
                            20160086  7 2  21935 . 22368 "59" "1"  
                            20160086  7 3  21970 . 22368 "59" "1"  
                            20160086  7 4  22173 . 22368 "59" "1"  
                            20160086  7 5  22368 . 22368 "41" "990"
                            20160086  8 1  22460 . 22460 "53" "1"  
                            20160086  9 1  22587 . 22587 "32" "1"  
                            20160086 10 1  22683 . 22733 "54" "1"  
                            20160086 10 2  22684 . 22733 "59" "1"  
                            20160086 10 3  22733 . 22733 "53" "1"  
                            20160086 11 1  22733 . 22943 "59" "1"  
                            20160086 11 2  22775 . 22943 "59" "1"  
                            20160086 11 3  22858 . 22943 "53" "1"  
                            20160086 11 4  22943 . 22943 "53" "1"  
                            20160086 12 1  22959 . 22959 "31" "992"
                            20160086 13 1  22985 . 22985 "53" "1"  
                            20160086 14 1  23039 . 23039 "31" "992"
                            20160086 15 1  23111 . 23111 "53" "1"  
                            20160087  1 1  20724 . 20724 "53" "1"  
                            20160087  2 1  20817 . 20817 "41" "988"
                            20160087  3 1  20817 . 20817 "41" "988"
                            20160087  3 2  20817 . 20817 "56" "676"
                            20160087  4 1  20920 . 20920 "53" "1"  
                            20160087  5 1  21284 . 21284 "53" "1"  
                            20160087  6 1  21284 . 21648 "59" "1"  
                            end
                            format %td date_procedure_performed
                            format %td follow_up_date
                            the way i wanted is the only way to import it to REDCap
                            Last edited by Fatimah Gh; 16 Oct 2023, 13:33.

                            Comment


                            • #15
                              Thank you for the -dataex- output. So your variable names are not what you originally said. And there is an additional variable, F, which is always missing. So the code shown above just requires -drop-ping F, and changing the variable names.
                              Code:
                              drop F
                              
                              rename X ic9_
                              rename (date_procedure_performed follow_up_date Procedure_Perfomed) =_
                              reshape wide date_procedure_performed_ follow_up_date_ Procedure_Perfomed_ ic9_, ///
                                  i(CLPR Episode) j(Procedure_ID)

                              Comment

                              Working...
                              X