  • Error: data not sorted, even though I have sorted them

    Hello,

    I am working on a project where I want to study how inheritances affect labour supply of recipients via a Diff-in-Diff approach.
    I have a panel structure (it is not 100% done yet, still some issues) so I ran the code

    Code:
    mi xtset id survey
    and then

    Code:
    gen E = 1 if l.expectation == 2 & gift_received == 2
    replace E=2 if l.expectation == 1 & gift_received == 2
    replace E=3 if l.expectation == 2 & gift_received == 1
    replace E=4 if l.expectation == 1 & gift_received == 1
    However, I always get the error "not sorted". I tried using sort and sort, stable on the variables involved before running the generate part of my code, but I still get the same error.

    Here is an example of my data with some variables.
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte survey int id byte(implicate expectation gift_received) long gift1_value
    1 123 1 2 2 .
    1 123 0 2 2 .
    1 123 5 2 2 .
    1 123 4 2 2 .
    1 123 2 2 2 .
    1 123 3 2 2 .
    2 234 3 2 1 7500
    1 234 5 2 1 15000
    1 234 3 2 1 15000
    1 234 4 2 1 15000
    1 234 1 2 1 15000
    2 234 5 2 1 7500
    2 234 1 2 1 7500
    1 234 2 2 1 15000
    2 234 2 2 1 7500
    2 234 4 2 1 7500
    2 234 0 2 1 7500
    1 234 0 2 1 15000
    1 345 4 1 1 25000
    1 345 2 1 1 25000
    1 345 0 1 1 25000
    1 345 3 1 1 25000
    1 345 1 1 1 25000
    1 345 5 1 1 25000
    1 456 4 1 1 50000
    1 456 3 1 1 50000
    2 456 5 2 2 .
    2 456 2 2 2 .
    2 456 3 2 2 .
    2 456 1 2 2 .
    1 456 1 1 1 50000
    2 456 4 2 2 .
    1 456 0 1 1 50000
    1 456 2 1 1 50000
    1 456 5 1 1 50000
    2 456 0 2 2 .
    1 567 4 2 2 .
    1 567 3 2 2 .
    1 567 5 2 2 .
    1 567 1 2 2 .
    1 567 0 2 2 .
    1 567 2 2 2 .
    1 678 4 2 2 .
    1 678 2 1 2 .
    1 678 1 2 2 .
    1 678 0 . 2 .
    1 678 5 2 2 .
    1 678 3 2 2 .
    1 789 2 2 1 50000
    1 789 1 2 1 50000
    1 789 4 2 1 50000
    1 789 0 2 1 50000
    1 789 5 2 1 50000
    1 789 3 2 1 50000
    end
    What is my mistake?


  • #2
    Oscar it would help to show all the code needed to replicate the problem with your sample data set. One problem I immediately encounter is

    Code:
    . xtset id survey
    repeated time values within panel
    r(451);
    This is true. The time variable, survey, equals 1 or 2 several times within each id value.

    So, perhaps there is an error in the sample data you sent.
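
    One way to see which id and survey combinations repeat is to tag duplicates before calling xtset. This is only a sketch on a made-up four-row dataset; the values and the variable name dup are assumptions for illustration, not taken from your file:

```stata
* Toy data (hypothetical values): household 234 appears twice in each survey wave
clear
input byte survey int id byte implicate
1 234 0
1 234 1
2 234 0
2 234 1
end

* Summarise how many id-survey pairs occur more than once
duplicates report id survey

* dup holds the number of extra copies of each id-survey pair
duplicates tag id survey, generate(dup)

* Show the offending rows together with the variable that tells them apart
list id survey implicate if dup > 0, sepby(id)
```

    If any row is tagged, xtset id survey will stop with r(451), so the list output is a reasonable place to start looking for what distinguishes the repeated rows.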

    Again, I would suggest creating your extract, and then make sure that the extract can reproduce the problems you are having. Also show the commands in sequence.
    -------------------------------------------
    Richard Williams, Notre Dame Dept of Sociology
    Stata Version: 17.0 MP (2 processor)

    EMAIL: [email protected]
    WWW: https://www3.nd.edu/~rwilliam



    • #3
      Richard, thank you for your time.

      "This is true. The time variable, survey, equals 1 or 2 several times within each id value."
      Yes, but why is this an issue? I changed that variable myself to indicate whether the observation came from the first or second wave. This was just to create a time variable that distinguishes the two waves.


      "Again, I would suggest creating your extract, and then make sure that the extract can reproduce the problems you are having. Also show the commands in sequence"
      Sorry, what do you mean by extract?

      Also, funnily enough, I never got the error that you got. The entire sequence is as follows (I posted the data I use too):


      Code:
      use "./HFCS/hfcs_smallsample.dta"
      
      keep survey im0100 sa0100 sa0010 di1100 di1200i hh0100 hh0201 hh0202 hh0203 hh0401 hh0402 hh0403 hh0501 hh0502 hh0503 hh0700 hi0100 /*
      */ hi0200 pe0600_2 pe0700_2 pe0100a_2 pe1100_2 pg0200_2 ra0300_2 da1000 ra0200_2 _mi_m _mi_id _mi_miss sa0110 sa0210 sa0200
      
      rename (im0100 sa0100 sa0010 di1100 di1200i hh0100 hh0201 hh0202 hh0203) (implicate count id employee_Income self_employment_income gift_received gift1_year gift2_year gift3_year)
      rename (hh0401 hh0402 hh0403 hh0501 hh0502 hh0503 hh0700 hi0100 hi0200) (gift1_value gift2_value gift3_value gift1_transfer gift2_transfer gift3_transfer expectation food_home food_outside)
      rename (pe0600_2 pe0700_2 pe0100a_2 pe1100_2 pg0200_2 ra0300_2) (job_hours job_years labour_status expect_retire received_selfincome age)
      rename (da1000 ra0200_2) (total_assets gender)
      
      tab survey
      
      ********************************************************************
      *Variablen Erstellung*
      ********************************************************************
      mi xtset id survey  
      
      
      
      gen E = 1 if l.expectation == 2 & gift_received == 2
      replace E=2 if l.expectation == 1 & gift_received == 2
      replace E=3 if l.expectation == 2 & gift_received == 1
      replace E=4 if l.expectation == 1 & gift_received == 1
      The data were prepared in a different do-file, but that one is quite long and complex (it follows the instructions of the Austrian Central Bank, which provided the documentation). All in all, the only things I changed there were to recode the survey variable from 1 to 2 in wave 2 and then append wave 2 to wave 1. So unless those two commands cause a major issue, I don't see how I made a mistake prior to this.
      Last edited by Oscar Weinzettl; 15 Apr 2019, 00:20.



      • #4
        When you used dataex you created an extract of 100 cases from your larger dataset. What I was suggesting was, after the dataex code, tack on the code needed to replicate your problem. If you couldn't reproduce your own problems using the data you gave us then we wouldn't be able to reproduce them either.

        In general, it is a fairly common problem to create an extract using dataex, but then the user leaves out some key variable (e.g. the variables needed for xtsetting or weighting) and then nobody can address their problem. Users should make sure that their examples actually work before they post them.

        I don't have the time now to look at the new data you posted, but I might be able to later. However, many people refuse to look at posted data sets because they might contain a virus or have some other problem, so most prefer that you use dataex to present your data, followed by any other necessary code, if at all possible. My guess is that you would get more responses if you had a complete example using dataex that showed the problem, but you can see what happens.


        • #5
          You are using lowercase L in your gen statements, not the number 1. Is that what you want?
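
          For reference, l. (equivalently L.) is Stata's lag operator, which is only available once the data have been xtset or tsset. Here is a minimal sketch of what it returns, using made-up data; the values and the name prev_expectation are assumptions for illustration only:

```stata
* Toy panel (hypothetical values): two households observed in two waves
clear
input id survey expectation
1 1 2
1 2 1
2 1 1
2 2 2
end

* Declare the panel structure so that time-series operators are allowed
xtset id survey

* L.expectation is the value of expectation one survey wave earlier;
* it is missing in each id's first wave because there is no earlier wave
generate prev_expectation = L.expectation

list, sepby(id)
```

          If the lag is what you intended, then a condition such as l.expectation == 2 can only be true for wave 2 observations, since wave 1 has no previous value to compare with.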


          • #6
            Your data set is confusing. It already includes 5 imputations. You didn't say that, and the dataex file did not include the vars needed to reproduce the mi settings. That is why I got the repeated time values error.

            Did you create the 5 imputations yourself? If so, I would do that last, and get all the other data manipulation taken care of first.

            If the data came to you with 5 imputations done, then I guess you'll have to figure out how to work with them. See the mi xeq subsection of the mi manual.
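
            To check how a data set is mi set and to run a command separately within each imputation, something like the following may help. It is a sketch on toy data that I mi set myself; the flong style and M = 2 are assumptions chosen for illustration, so check what mi query reports for your own file:

```stata
* Toy data (hypothetical values): one household observed in two waves
clear
input id survey expectation
1 1 2
1 2 1
end

* Declare the data as mi data in flong style and add two empty imputations
mi set flong
mi set M = 2

* Report the mi style and the number of imputations
mi query

* Run the commands once per imputation (m = 0, 1, 2), each time on that m only
mi xeq: sort id survey; list id survey expectation
```

            Because mi xeq works on one imputation at a time, each id and survey pair is seen only once inside each run, even though the stored data set holds one copy of every row per imputation.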


            • #7
              Maybe there is a simpler way, but Stata seems to be happier if you give your commands like this:

              Code:
              mi xeq: sort id survey; gen E = 1 if l.expectation == 2 & gift_received == 2
              mi xeq: sort id survey; replace E=2 if l.expectation == 1 & gift_received == 2
              mi xeq: sort id survey; replace E=3 if l.expectation == 2 & gift_received == 1
              mi xeq: sort id survey; replace E=4 if l.expectation == 1 & gift_received == 1