Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Problem generating a new variable based on previous values

    Hi all,

    I'm trying to generate a variable (steponecpn) so that is represents the change in Cumulative Plane Number (cpn) from one month to the next.

    I've accomplished this with this code:
    Code:
     gen steponecpn=cpn-cpn[_n-1]
    The problem that I'm having is that when there isn't a previous value (ex. when cpn[_n-1] = .) I end up with no value in my steponecpn. In these cases, I'd like to have steponecpn = cpn. Does anyone have an idea of how to do this?


  • #2
    If this below does not work, use -dataex- to put up some data we can toy around with:

    Code:
     
     gen steponecpn=cond(missing(cpn[_n-1]), cpn, cpn-cpn[_n-1])

    Comment


    • #3
      Dear Kate,

      You simply write in Stata:

      Code:
      gen steponecpn=d1.cpn
      Best,
      Emna

      Comment


      • #4
        Hi Joro,

        Thanks very much, it did work!

        I've got a follow on question. While it worked for most of my data, I just realized that I've got a few areas where there is a gap of two missing variables between cpn values,

        ex. cpn = 100 in 2001, but then I have missings up until 2004 when cpn=140. In this case, I'd like steponecpn in 2004 to be 140-100=40. Is there a way to do that? I've used -dataex- below, but it doesn't have any of the values where there is a chunk in the middle missing.

        input str19 Company str21 Plant str14 ModelClean str23 Model float Time long Fullid float(cpn mhep mhos mhpp pop weight pop2 steponecpn)
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -7305 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -7274 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -7245 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -7214 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -7184 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -7153 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -7123 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -7092 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -7061 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -7031 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -7000 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6970 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6939 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6908 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6880 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6849 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6819 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6788 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6758 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6727 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6696 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6666 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6635 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6605 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6574 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6543 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6515 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6484 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6454 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6423 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6393 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6362 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6331 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6301 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6270 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6240 1 . . . . . . . .
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6209 1 35 3106 2112 2.2185714 32 1400 .32 35
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6178 1 68 3249 2112 2.3207142 35 1400 .35 33
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6150 1 111 5500 3300 3.9285715 40 1400 .4 43
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6119 1 177 5500 3300 3.9285715 40 1400 .4 66
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6089 1 249 3333 2000 2.3807142 40 1400 .4 72
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6058 1 319 2241 1300 1.6007143 42 1400 .42 70
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -6028 1 390 2241 1300 1.6007143 42 1400 .42 71
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -5997 1 466 1781 1015 1.272143 43 1400 .43 76
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -5966 1 539 1776 977 1.2685714 45 1400 .45 73
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -5936 1 619 2205 1213 1.575 45 1400 .45 80
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -5905 1 695 1789 984 1.2778572 45 1400 .45 76
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -5875 1 770 2107 1159 1.505 45 1400 .45 75
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -5844 1 845 1835 1009 1.3107142 45 1400 .45 75
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -5813 1 895 2045 1125 1.4607143 45 1400 .45 50
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -5784 1 949 2045 1125 1.4607143 45 1400 .45 54
        "Aeronca" "Middletown" "PT-19-23" "*PT-19-23" -5753 1 995 . . 0 45 1400 .45 46

        Comment


        • #5
          I just realized I could use the if command with dataex, here is a more useful sample:
          clear
          input str19 Company str21 Plant str14 ModelClean str23 Model float Time long Fullid float(cpn mhep mhos mhpp pop weight pop2 steponecpn)
          "Aeronca" "Middletown" "L-3" "L-3" -7305 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -7274 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -7245 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -7214 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -7184 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -7153 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -7123 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -7092 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -7061 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -7031 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -7000 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6970 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6939 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6908 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6880 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6849 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6819 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6788 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6758 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6727 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6696 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6666 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6635 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6605 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6574 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6543 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6515 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6484 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6454 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6423 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6393 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6362 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6331 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6301 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6270 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6240 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6209 2 929 500 425 .8333333 15 600 .15 929
          "Aeronca" "Middletown" "L-3" "L-3" -6178 2 . 425 . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6150 2 939 518 425 .8633333 18 600 .18 939
          "Aeronca" "Middletown" "L-3" "L-3" -6119 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -6089 2 1004 531 425 .885 20 600 .2 1004
          "Aeronca" "Middletown" "L-3" "L-3" -6058 2 1051 545 425 .9083334 22 600 .22 47
          "Aeronca" "Middletown" "L-3" "L-3" -6028 2 1106 545 425 .9083334 22 600 .22 55
          "Aeronca" "Middletown" "L-3" "L-3" -5997 2 1156 662 510 1.1033334 23 600 .23 50
          "Aeronca" "Middletown" "L-3" "L-3" -5966 2 1191 749 562 1.2483333 25 600 .25 35
          "Aeronca" "Middletown" "L-3" "L-3" -5936 2 1254 744 558 1.24 25 600 .25 63
          "Aeronca" "Middletown" "L-3" "L-3" -5905 2 1309 557 418 .9283333 25 600 .25 55
          "Aeronca" "Middletown" "L-3" "L-3" -5875 2 1359 740 555 1.2333333 25 600 .25 50
          "Aeronca" "Middletown" "L-3" "L-3" -5844 2 1414 533 400 .8883333 25 600 .25 55
          "Aeronca" "Middletown" "L-3" "L-3" -5813 2 1443 567 425 .945 25 600 .25 29
          "Aeronca" "Middletown" "L-3" "L-3" -5784 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -5753 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -5723 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -5692 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -5662 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -5631 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -5600 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -5570 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -5539 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -5509 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -5478 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -5447 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -5419 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -5388 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -5358 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -5327 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -5297 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -5266 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -5235 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -5205 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -5174 2 . . . . . . . .
          "Aeronca" "Middletown" "L-3" "L-3" -5144 2 . . . . . . . .
          end

          Comment


          • #6
            Hi Emna,

            Thanks for your suggestion as well. I tried it out in case in would also fix my secondary issue that cropped up, but the message I get when I use it is: "time variable not set." I tried:
            Code:
            tsset Fullid Time
            and then

            Code:
            gen steponecpn=d1.cpn
            But this ended up generate missing for all the steponecpn values.

            Since gen steponecpn=cond(missing(cpn[_n-1]), cpn, cpn-cpn[_n-1]) is working I think I should stick with that.

            Comment


            • #7
              I am not on very clear what you re trying to do, so shooting in the dark...
              But try the following:

              Code:
              gen newcpn = cpn
              bysort Fullid (Time): replace newcpn = newcpn[_n-1]  if missing(newcpn) & _n>1
              gen steponecpn = newcpn if _n==1
              replace steponecpn = cpn-cpn[_n-1] if _n>1

              Comment


              • #8
                Hi Joro,

                Thanks verily. I tried your code, but it ended up producing more missing values. I'll try to be better with my question:

                "Aeronca" "Middletown" "L-3" "L-3" -6209 2 929 500 425 .8333333 15 600 .15 929
                "Aeronca" "Middletown" "L-3" "L-3" -6178 2 . . . . . . . .
                "Aeronca" "Middletown" "L-3" "L-3" -6150 2 939 518 425 .8633333 18 600 .18 939
                "Aeronca" "Middletown" "L-3" "L-3" -6119 2 . . . . . . . .
                "Aeronca" "Middletown" "L-3" "L-3" -6089 2 1004 531 425 .885 20 600 .2 1004
                "Aeronca" "Middletown" "L-3" "L-3" -6058 2 1051 545 425 .9083334 22 600 .22 47
                "Aeronca" "Middletown" "L-3" "L-3" -6028 2 1106 545 425 .9083334 22 600 .22 55
                "Aeronca" "Middletown" "L-3" "L-3" -5997 2 1156 662 510 1.1033334 23 600 .23 50
                "Aeronca" "Middletown" "L-3" "L-3" -5966 2 1191 749 562 1.2483333 25 600 .25 35
                "Aeronca" "Middletown" "L-3" "L-3" -5936 2 1254 744 558 1.24 25 600 .25 63
                "Aeronca" "Middletown" "L-3" "L-3" -5905 2 1309 557 418 .9283333 25 600 .25 55
                "Aeronca" "Middletown" "L-3" "L-3" -5875 2 1359 740 555 1.2333333 25 600 .25 50
                "Aeronca" "Middletown" "L-3" "L-3" -5844 2 1414 533 400 .8883333 25 600 .25 55
                "Aeronca" "Middletown" "L-3" "L-3" -5813 2 1443 567 425 .945 25 600 .25 29
                "Aeronca" "Middletown" "L-3" "L-3" -5784 2 . . . . . . . .

                In the above section of my data, it's mostly consecutive, so the adjcpn is, mostly, correct. Right now, it's the difference between the current and the previous cpn. Is there a way to make adjcpn the difference between the current and the last reported cpn? Ex. In the above data where cpn=1004 and the last reported cpn=939, I'd like the steponecpn to be 65 rather than 1004 as it is now.


                Comment


                • #9
                  OK Kate, but then what do you want the difference between what is now missing, and the previous one cpn=939 to be equal to? You explained to me how adjcpn on row 5 has to look like and I understood. But how do you want the adjcpn on row 4 to look like?

                  I also do not know how you want adjcpn to look like for the first observation of your Fullid group, that is, how does adjcpn look like when you are observing the first time period of the Fullid?

                  Comment


                  • #10
                    Okay, so:

                    Row one: 929
                    Row two: .
                    Row three: 10
                    Row four: .
                    Row five: 65

                    When I used
                    Code:
                    gen newcpn = cpn
                    bysort Fullid (Time): replace newcpn = newcpn[_n-1] if missing(newcpn) & _n>1
                    gen steponecpn = newcpn if _n==1
                    replace steponecpn = cpn-cpn[_n-1] if _n>1
                    Like you suggested, it just about fixed things, except, in the above dataex it doesn't start calculating the values of steponecpn until 47, so it's just those previous five rows where the directly previous cpn value is missing that no value appears. For the first time period of Fullid, it should just be equal to cpn.

                    Hopefully that makes some sense. I really appreciate your help, I'm trying to read as much as I can about Stata but I'm definitely still a beginner.

                    Comment


                    • #11
                      How about this one:

                      Code:
                       gen newcpn = cpn
                      
                      bysort Fullid (Time): replace newcpn = newcpn[_n-1] if missing(newcpn) & _n>1
                      
                      by Fullid: gen steponecpn = newcpn if _n==1
                      
                      by Fullid: replace steponecpn = cpn-newcpn[_n-1] if _n>1
                      Last edited by Joro Kolev; 03 Aug 2020, 15:21. Reason: Sorry, forgot the by statements

                      Comment


                      • #12
                        Joro, that's fantastic!

                        I only made one tweak (based on your earlier suggestion) because the first row was coming up blank; here's what got it working:

                        Code:
                        gen newcpn = cpn
                        bysort Fullid (Time): replace newcpn = newcpn[_n-1] if missing(newcpn) & _n>1
                        gen steponecpn = newcpn if _n==1
                        replace steponecpn = cond(missing(newcpn[_n-1]), cpn, cpn-newcpn[_n-1]) if _n>1
                        Thanks and thanks again Joro.

                        Comment


                        • #13
                          I am very glad that you are happy, Kate.

                          But I have a bad feeling about this one. Until the very end I am not on very clear what we are doing here, and if what you are showing above really works, it would be a miracle.

                          I would suggest that you leave this aside for a while, and after you have a good rest you look at the data again and make sure that we really achieved here what you wanted to.

                          Originally posted by Kate Pryce View Post
                          Joro, that's fantastic!

                          I only made one tweak (based on your earlier suggestion) because the first row was coming up blank; here's what got it working:

                          Code:
                          gen newcpn = cpn
                          bysort Fullid (Time): replace newcpn = newcpn[_n-1] if missing(newcpn) & _n>1
                          gen steponecpn = newcpn if _n==1
                          replace steponecpn = cond(missing(newcpn[_n-1]), cpn, cpn-newcpn[_n-1]) if _n>1
                          Thanks and thanks again Joro.

                          Comment

                          Working...
                          X