
  • How to create new variable X=1, if Y=1 in the 24 cells (A1:A24), and X=0, if in the 24 cells (A1:A24) =0

    Dear all,

    Y is a dummy variable (1 - a crisis occurs, 0 - no crisis).
    I have to create a new variable X such that X = 1 if a crisis occurs within the next 24 months (Y has a 1 somewhere in the next 24 cells, A1:A24), and X = 0 if Y is 0 throughout the next 24 months (A1:A24).

    How can I code in Stata that X searches the Y values in (A1:A24)?
    Could you kindly help me?
    Last edited by Anastasiya Lu; 25 Apr 2021, 00:47.

  • #2
    I tried to write the code myself: generate Newvaribale = "A" if sum(Crises15sddev[_n+24]) >=1
    Last edited by Anastasiya Lu; 25 Apr 2021, 01:20.



    • #3
      The function sum() is for cumulative or running sums, and would need to be applied twice here, as what you want is the difference between two cumulative sums.

      Here instead I use rangestat from SSC. I assume panel data -- but if you haven't such, you can just ignore the identifier --- and an explicit time variable.


      Code:
      . clear
      
      . set obs 24
      
      . gen id = cond(_n <= 12, 1, 2)
      
      . bysort id : gen t = _n
      
      . set seed 2803
      
      . gen crisis = runiform() > 0.7
      
      . rangestat (sum) crisis, int(t 1 4) by(id)
      
      . list, sepby(id)
      
      
           +-----------------------------+
           | id    t   crisis   crisis~m |
           |-----------------------------|
        1. |  1    1        1          1 |
        2. |  1    2        0          1 |
        3. |  1    3        1          0 |
        4. |  1    4        0          0 |
        5. |  1    5        0          0 |
        6. |  1    6        0          0 |
        7. |  1    7        0          1 |
        8. |  1    8        0          1 |
        9. |  1    9        0          1 |
       10. |  1   10        0          1 |
       11. |  1   11        1          0 |
       12. |  1   12        0          . |
           |-----------------------------|
       13. |  2    1        0          0 |
       14. |  2    2        0          0 |
       15. |  2    3        0          0 |
       16. |  2    4        0          0 |
       17. |  2    5        0          0 |
       18. |  2    6        0          1 |
       19. |  2    7        0          2 |
       20. |  2    8        0          2 |
       21. |  2    9        0          2 |
       22. |  2   10        1          1 |
       23. |  2   11        1          0 |
       24. |  2   12        0          . |
           +-----------------------------+
      In your data you want 24, not 4. The wanted indicator variable (you say dummy) is now easy: it is 1 whenever the sum is positive.
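
      A minimal sketch of that last step, assuming rangestat has been installed from SSC and that its result carries the default name crisis_sum (abbreviated to crisis~m in the listing). The name x is only illustrative. Stata treats missing as larger than any number, so the missing sums at the end of each panel need guarding, otherwise they would be coded as 1:

      Code:
      * one-time installation of rangestat from SSC
      ssc install rangestat

      * after rangestat (sum) has been run as above:
      * x = 1 if the window holds at least one crisis, 0 if none,
      * and missing where there were no later observations to look at
      generate byte x = crisis_sum > 0 if !missing(crisis_sum)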



      • #4
        Thank you, but I want to do something a bit different.
        We need to create a new column X. If the variable "crisis" has the value 1 anywhere in the next 24 cells, X = 1; if it does not have the value 1 anywhere in the next 24 cells, X = 0.

        I think it needs a loop with forvalues. I am so new to Stata and cannot figure it out.
        Attached Files
        Last edited by Anastasiya Lu; 25 Apr 2021, 08:12.



        • #5
          Building on Nick's example, but with a different example dataset, and again using 4 periods rather than 24 periods to make shorter output. The code below sets x to 1 in any period where crisis is 1 in that period or in any of the following three periods. Equivalently, x is 1 in the crisis period and in the three periods preceding it: periods 5 and 2-4 for the first crisis below, just like period 36 and the 23 periods 13-35 in your example.
          Code:
          . rangestat (max) x=crisis, int(t 0 3) by(id)
          
          . list, sepby(id) 
          
               +----------------------+
               | id    t   crisis   x |
               |----------------------|
            1. |  1    1        0   0 |
            2. |  1    2        0   1 |
            3. |  1    3        0   1 |
            4. |  1    4        0   1 |
            5. |  1    5        1   1 |
            6. |  1    6        0   0 |
            7. |  1    7        0   0 |
            8. |  1    8        0   1 |
            9. |  1    9        0   1 |
           10. |  1   10        0   1 |
           11. |  1   11        1   1 |
           12. |  1   12        0   0 |
               |----------------------|
           13. |  2    1        0   1 |
           14. |  2    2        0   1 |
           15. |  2    3        1   1 |
           16. |  2    4        0   0 |
           17. |  2    5        0   1 |
           18. |  2    6        0   1 |
           19. |  2    7        0   1 |
           20. |  2    8        1   1 |
           21. |  2    9        0   1 |
           22. |  2   10        1   1 |
           23. |  2   11        0   0 |
           24. |  2   12        0   0 |
               +----------------------+



          • #6
            I think #3 remains an answer to #1. As flagged, the indicator is just given by the sum being positive. William Lisowski is naturally correct that the maximum gets you there directly, but in my experience the sum is often directly useful and interesting.



            • #7
              In this case, within period 1, it works perfectly. But the new variable x does not search for crises in period 2. I marked where it should be 1, but we have 0 instead.

              I am very much a beginner with Stata; could you please explain to me what "set seed 2803" means? I also wonder why I have only period 1 and period 2. I tried to work out how to make more periods, but I could not. Could you please help me?


              Attached Files



              • #8
                In post #5, I provided code in which:
                • crisis is the variable that in post #1 you call Crises15sddev
                • x is the variable that in post #1 you call X, the answer you seek.
                • t is a variable indicating time - the month - for which in post #2 you use the observation number _n, which will be a problem if your data has a month in the middle for which data is not available, or if you have data from more than one country.
                • id is a variable indicating the country
                Because you provide no example data for readers to work with, both Nick and I assumed - apparently mistakenly - that your data would follow standard practices and include a variable indicating the month, and we allowed for the possibility that you are working with data from more than one country by including a variable indicating the country.
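
                As a rough sketch of that standard layout, assuming hypothetical variables country (a string name), year and month (both numeric) alongside Crises15sddev. Only Crises15sddev comes from this thread; the other names are made up for illustration:

                Code:
                * numeric country identifier and a monthly date variable
                egen id = group(country)
                generate t = ym(year, month)
                format t %tm

                * x = 1 if a crisis occurs in the current month or the following 23
                rangestat (max) x = Crises15sddev, interval(t 0 23) by(id)

                Because the window is defined on t rather than on the observation number, a missing month does not shift it, and by(id) keeps one country's crises from being counted for another.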

                Observation 12 is the final observation for id 1, the first country; it is not part of the crisis experienced by id 2 in the third month.

                In post #3, the lines of code preceding the rangestat command show how Nick generated made-up data. They are not part of the solution to your problem, they are a reaction to the lack of example data to work with.
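
                For what it is worth, set seed only fixes the starting point of Stata's random-number generator so that made-up data can be reproduced exactly; 2803 is an arbitrary choice. A small sketch, in which the variable names u1 and u2 are illustrative:

                Code:
                clear
                set obs 5

                set seed 2803
                generate u1 = runiform()

                * resetting the same seed restarts the same sequence of draws
                set seed 2803
                generate u2 = runiform()

                * same seed, same draws: this assertion passes silently
                assert u1 == u2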

                [Attached image: image_22320.png]
                Last edited by William Lisowski; 26 Apr 2021, 05:43.



                • #9
                  Let me separately add some advice to you as a new user of Stata.

                  I'm sympathetic to you as a new user of Stata - there is quite a lot to absorb. And even worse if perhaps you are under pressure to produce some output quickly. Nevertheless, I'd like to encourage you to take a step back from your immediate tasks.

                  When I began using Stata in a serious way, I started, as have others here, by reading my way through the Getting Started with Stata manual relevant to my setup. Chapter 18 then gives suggested further reading, much of which is in the Stata User's Guide, and I worked my way through much of that reading as well. There are a lot of examples to copy and paste into Stata's do-file editor to run yourself, and better yet, to experiment with changing the options to see how the results change.

                  All of these manuals are included as PDFs in the Stata installation and are accessible from within Stata - for example, through the PDF Documentation section of Stata's Help menu. The objective in doing the reading was not so much to master Stata - I'm still far from that goal - as to be sure I'd become familiar with a wide variety of important basic techniques, so that when the time came that I needed them, I might recall their existence, if not the full syntax, and know how to find out more about them in the help files and PDF manuals.

                  Stata supplies exceptionally good documentation that amply repays the time spent studying it - there's just a lot of it. The path I followed surfaces the things you need to know to get started in a hurry and to work effectively.

                  When gazing at the view of a Stata dataset provided in the Browse window, or some list output, it is tempting to think of a Stata dataset as a spreadsheet and, if one has been using Excel extensively, to then draw on one's Excel-based experience in approaching data analysis. But a Stata data set is not a spreadsheet, and your Excel-driven instincts are almost never helpful, and often get badly in the way.

                  Please take a few moments to review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. Note especially sections 9-12 on how to best pose your question. It's particularly helpful to copy commands and output from your Stata Results window and paste them into your Statalist post using code delimiters [CODE] and [/CODE], and to use the dataex command to provide sample data, as described in section 12 of the FAQ.
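
                  A short sketch of what that might look like for the variable named in post #2. Recent versions of Stata include dataex; on older versions it can be installed from SSC. The range 1/40 is only an illustrative choice:

                  Code:
                  * only needed if dataex is not already available
                  ssc install dataex

                  * write out the first 40 observations in a form others can paste into Stata
                  dataex Crises15sddev in 1/40

                  The output can then be pasted into a post between [CODE] and [/CODE].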

                  The more you help others understand your problem, the more likely others are to be able to help you solve your problem.



                  • #10
                    Thank you very much. I had two semesters of Stata at university, and until now my knowledge was enough for my work. I would be thankful for help in solving the above-mentioned problem for my thesis; the rest of the coding for my thesis is done. I did these steps in Excel, but my professor said that is not allowed and that everything has to be done in Stata.



                    • #11
                      Looking at the picture of your data in post #4, here are two different ways of accomplishing what you apparently want.
                      Code:
                      . // reproduce the example data in the picture in post #4
                      . * Example generated by -dataex-. For more info, type help dataex
                      . clear
                      
                      . input float(Crises15sddev var32)
                      
                           Crises1~v      var32
                        1. 0 0
                        2. 0 0
                        3. 0 0
                        4. 0 0
                        5. 0 0
                        6. 0 0
                        7. 0 0
                        8. 0 0
                        9. 0 0
                       10. 0 0
                       11. 0 0
                       12. 0 0
                       13. 0 1
                       14. 0 1
                       15. 0 1
                       16. 0 1
                       17. 0 1
                       18. 0 1
                       19. 0 1
                       20. 0 1
                       21. 0 1
                       22. 0 1
                       23. 0 1
                       24. 0 1
                       25. 0 1
                       26. 0 1
                       27. 0 1
                       28. 0 1
                       29. 0 1
                       30. 0 1
                       31. 0 1
                       32. 0 1
                       33. 0 1
                       34. 0 1
                       35. 0 1
                       36. 1 1
                       37. 0 1
                       38. 1 1
                       39. end
                      
                      . 
                      . // this is the code 
                      . generate t = _n
                      
                      . rangestat (max) new1_var32=Crises15sddev, int(t 0 23)
                      
                      . generate numcrisis = sum(Crises15sddev)
                      
                      . generate new2_var32 = numcrisis[min(_n+23,_N)]>numcrisis-Crises15sddev
                      
                      . 
                      . // notice that new1_var32 and new2_var32 from this code are each exactly the same as var32 
                      . list in 8/38, clean abbreviate(20)
                      
                             Crises15sddev   var32    t   new1_var32   numcrisis   new2_var32  
                        8.               0       0    8            0           0            0  
                        9.               0       0    9            0           0            0  
                       10.               0       0   10            0           0            0  
                       11.               0       0   11            0           0            0  
                       12.               0       0   12            0           0            0  
                       13.               0       1   13            1           0            1  
                       14.               0       1   14            1           0            1  
                       15.               0       1   15            1           0            1  
                       16.               0       1   16            1           0            1  
                       17.               0       1   17            1           0            1  
                       18.               0       1   18            1           0            1  
                       19.               0       1   19            1           0            1  
                       20.               0       1   20            1           0            1  
                       21.               0       1   21            1           0            1  
                       22.               0       1   22            1           0            1  
                       23.               0       1   23            1           0            1  
                       24.               0       1   24            1           0            1  
                       25.               0       1   25            1           0            1  
                       26.               0       1   26            1           0            1  
                       27.               0       1   27            1           0            1  
                       28.               0       1   28            1           0            1  
                       29.               0       1   29            1           0            1  
                       30.               0       1   30            1           0            1  
                       31.               0       1   31            1           0            1  
                       32.               0       1   32            1           0            1  
                       33.               0       1   33            1           0            1  
                       34.               0       1   34            1           0            1  
                       35.               0       1   35            1           0            1  
                       36.               1       1   36            1           1            1  
                       37.               0       1   37            1           1            1  
                       38.               1       1   38            1           2            1  
                      
                      .
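
                      The second approach works because numcrisis is a running count: numcrisis - Crises15sddev is the number of crises before the current observation, and numcrisis[min(_n+23,_N)] is the number up to the end of the 24-observation window, so the window contains a crisis exactly when the latter is larger. A sketch of the same idea for several countries, assuming a hypothetical country identifier id, a time variable t with no gaps within each country, and the illustrative new names numcrisis_p and new3_var32:

                      Code:
                      * running count of crises within each country, in time order
                      bysort id (t): generate numcrisis_p = sum(Crises15sddev)

                      * under by:, _n and _N refer to positions within each country
                      by id: generate byte new3_var32 = numcrisis_p[min(_n+23,_N)] > numcrisis_p - Crises15sddev

                      If months can be missing, the rangestat version with by(id) is the safer choice, since it works from t rather than from the observation number.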



                      • #12
                        Thank you so much! Variable new1_var32 is exactly what I need! Thank you very, very much!
