
  • Collapse

    Dear Stata Users;

    I'm having some trouble with the collapse command. Since I'm working with a dynamic panel model (GMM), I need to collapse all my data into 5-year averages (mean) and standard deviations (sd). I have created a new variable:

    gen period=.
    replace period=80 if year>=1980 & year<1985
    replace period=85 if year>=1985 & year<1990
    replace period=90 if year>=1990 & year<1995
    replace period=95 if year>=1995 & year<2000
    replace period=100 if year>=2000 & year<2005
    replace period=105 if year>=2005 & year<2010

    Then when I collapse everything by (country period) the outcome is wrong. Since I have missing observations it should collapse by each observation right?

    I guess I need to create another variable related to nr or id but I don't know how to do it.

    I really appreciate any advice.

    Best Regards

    Santiago

  • #2
    You are asking us to guess what you mean by "the outcome is wrong." We can't read your mind. Your statement "Since I have missing observations it should collapse by each observation right?" is particularly mysterious to me--I have no idea what you mean by that. Show us a sample of your data, then show us the exact code you used, then show us what you got, and then show us what you think you should have gotten. Then somebody may be able to help.

    In showing us data, code, output, etc., do it in code blocks so it's easy for everyone to read and work with. To create a code block, click on the underlined A button, then click on the # button. A pair of code block delimiters will appear. Paste your information between those delimiters.

    In showing us your data, code, output, etc., do it by copying and pasting from Stata itself: don't re-type it by hand as you might make subtle changes that matter.



    • #3
      This question was asked before, but no one answered. That does not seem surprising as what you seek is very unclear.

      Your first block of code can be simplified to

      Code:
      gen period = 5 * floor(year / 5)
      You can further subtract 1900 if you wish but the results of the above, presumably 1980(5)2005, will be more descriptive of the result.
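      For concreteness, here is a minimal sketch of how the period variable and a collapse to 5-year means and standard deviations could fit together. The data are made up, and the names x, x_mean, x_sd and x_n are hypothetical; this is not necessarily what the original poster ran.

      ```stata
      * Made-up data: two hypothetical countries observed yearly, 1980-2009
      clear
      set obs 60
      gen country = cond(_n <= 30, "A", "B")
      bysort country: gen year = 1979 + _n
      set seed 12345
      gen x = rnormal()

      * 5-year bins labelled by their first year: 1980, 1985, ..., 2005
      gen period = 5 * floor(year / 5)

      * Each statistic of the same variable needs its own new name
      collapse (mean) x_mean = x (sd) x_sd = x (count) x_n = x, by(country period)
      list, sepby(country)
      ```

      The mean and sd computed by collapse ignore missing values, and (count) reports how many nonmissing values went into each cell, which may help when judging whether a result looks "wrong".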

      That is a minor detail. The problems with your post are

      1. Your claim that this code produces a wrong outcome. What does that mean? What did you specify precisely?

      Code:
      collapse <everything>, by(country period)
      2. Your comment about "missing observations" is unexplained. Values can be missing, but what do you mean?

      3. You wrote "I guess I need to create another variable related to nr or id but I don't know how to do it." As you nowhere explain what these variables are or how they relate to the rest of the question, it remains a mystery how we can help on this either.
      Last edited by Nick Cox; 03 Oct 2014, 17:52.



      • #4
        This is still unclear. You refer to averages of "Rule of Law" but there is no variable having that name in your code. You list 5 numbers that presumably have something to do with Algeria in 1980-85, but you don't say what they are, and then follow that by saying that the original value (of what?) begins in 1996 and is -1.19. Finally you don't show us any of the results you got from Stata. Please follow the detailed instructions I gave you earlier about what to show us, and put it all in code blocks.

        Your descriptions are failing to convey the information needed to help.



        • #5
          Sorry, but we still have nothing that we can check here. For example "Rule of Law" is which variable in your command? If you (e.g.) take your original data and

          Code:
          list democracy year if country == "Algeria"
          or whatever specifies Algeria, then we have something to check.



          • #6
            Sorry, but I give up here. Evidently you don't understand what we're asking and nothing seems to be helping you. If Clyde or someone else doesn't solve your problem, your only recourse is to try StataCorp technical support.



            • #7
              Dear all,

              I have panel data on firms that were chosen to participate in a “productivity boost” program.
              For each firm, I have its production levels from Jan-1-2017 to Dec-31-2018. Here is an example of my data (sorted by date):

              [CODE]
              * Example generated by -dataex-. To install: ssc install dataex
              clear
              input long id float(date production treat)
              1001 20820 29.15 0
              1002 20820 20.67 0
              1003 20820 32.90 0
              1004 20820 33.30 0
              1005 20820 8.50 0
              1006 20820 16.75 0
              1007 20820 29.30 0
              1008 20820 8.70 0
              1009 20820 37.20 0
              1010 20820 28.15 0
              end
              format %td date
              [/CODE]

              The variable ‘treat’ is the before/after-treatment indicator; the treatment (participation in the program) starts on March 1, 2018.


              For each firm and each day of the treatment period, I need the difference between its production on that day and its production on the same day one year earlier, i.e. the modified data should look like this (the values of the ‘diff’ variable are made up):

              id date diff
              1001 01mar2018 19 [=production on 01mar2018 – production on 01mar2017]
              1001 02mar2018 -3 [=production on 02mar2018 – production on 02mar2017]

              1001 31dec2018 5

              1002 01mar2018 13

              1002 31dec2018 -1


              I tried to use the ‘foreach’ and ‘collapse’ commands, but I still cannot attain the result I want. Could you please help?


              Thanks!



              • #8
                I think you want
                Code:
                gen day_of_year = doy(date)
                by id day_of_year (date), sort: gen diff = production[2] - production[1]
                Now, this will calculate diff for every date in the data set. And the same result will be placed in the observations for any given date in both 2017 and 2018. It isn't clear to me if this is what you need, or if you only want to do it for those observations in the treatment period: you would just add -if treat- to the -...gen diff..- command if that's the case.
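                As a rough check of the approach above, here is a small sketch on made-up data for a single firm. The string variable sdate and the production numbers are invented for illustration, and treat is rebuilt from the stated start date of March 1, 2018. It assumes exactly one observation per firm per calendar day in each of the two years; otherwise production[1] and production[2] would not be the 2017 and 2018 values.

                ```stata
                * Made-up data: one firm, two matching days in 2017 and 2018
                clear
                input long id str9 sdate float production
                1001 "01mar2017" 10
                1001 "02mar2017" 20
                1001 "01mar2018" 29
                1001 "02mar2018" 17
                end
                gen date = daily(sdate, "DMY")
                format %td date
                gen byte treat = date >= td(01mar2018)

                * Within firm and day of year, sorted by date: [1] is 2017, [2] is 2018
                gen day_of_year = doy(date)
                by id day_of_year (date), sort: gen diff = production[2] - production[1] if treat

                * Keep only the treatment-period rows, as in the desired layout
                keep if treat
                keep id date diff
                list, noobs
                ```

                With these numbers diff is 19 for 01mar2018 and -3 for 02mar2018. Neither 2017 nor 2018 is a leap year, so doy() lines the two years up; with a leap year in the data, days after February 29 would be shifted by one.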

