
  • Changing a binary policy variable shifting it to next year for repeated cross section data


    I've coded policy as a binary variable that I need to shift one year forward. For example, in my data the policy takes the value 1 for county 1003 in year 2021, but I need to create a new variable in which the policy takes the value 1 for county 1003 in year 2022. The same goes for every observation. I have a repeated cross-section dataset.

    Can anyone kindly tell me how to do this? Here is some sample data for your convenience.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float policy double county int year
    0 1003 2002
    1 1003 2021
    0 2012 2002
    0 1003 2002
    1 1004 2015
    0 1004 2004
    0 1013 2002
    1 1013 2018
    1 10003 2004
    0 12003 2009
    0 10003 2000
    1 12003 2014
    0 10101 2021
    1 12003 2014
    end

  • #2
    There's something strange about this data that makes me question its suitability for this purpose.

    1. There are some duplicate observations for the same county and year. Look at county 1003 in year 2002 or county 12003 in 2014. Now, in the example data, it happens that in those duplicate pairs, the policy variable takes the same value in both observations, so one can at least make sense of the notion of "use the value of policy from the preceding year." But if in the real data sometimes the value of policy differs within such pairs, then what you are asking for isn't even definable. To check whether policy is always consistent within groups of observations having the same county and year, you can run:
    Code:
    by county year (policy), sort: assert policy[1] == policy[_N]
    If it is always consistent, this command will give you no output. If there are some inconsistencies it will give you an error message telling you about exceptions.
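
    If the assert fails, it may help to see which county-year groups disagree. The following is only a sketch: the last two input rows (county 1004 in 2004 with conflicting policy values) are made up for illustration, and the variable name inconsistent is hypothetical.

```stata
* Made-up example: county 1004 has conflicting policy values in 2004
clear
input float policy double county int year
0 1003 2002
0 1003 2002
1 1003 2021
0 1004 2004
1 1004 2004
end

* Flag every observation in a county-year group whose policy values differ.
* Sorting by policy within group puts the smallest value first and the
* largest last, so the group is consistent only if the two are equal.
by county year (policy), sort: gen byte inconsistent = policy[1] != policy[_N]

* Show the offending groups, separated by county and year
list county year policy if inconsistent, sepby(county year)
```

    Note that a missing value of policy sorts last, so a group mixing missing and non-missing values would also be flagged.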

    2. There are large gaps between years. So, looking at county 1003, the year skips from 2002 to 2021. There are no observations for years 2003 or 2022. Do you want to just change the year variable to 2003 and 2022, respectively? Or do you want to add new observations for years 2003 and 2022, with each of those observations having the new policy variable set to the original policy variable's value in the preceding year? I just can't figure out what you want the end result to look like.



    • #3
      Mr. Schechter, for each county there are observations from year 2000 to year 2022.

      Since these are repeated cross-section data, there are many sampled individuals for each county in a single year. That's why I included 3 observations for county 1003.

      For example, for county 12003 the policy takes the value 1 from 2014 onward in my data. I want the new policy variable to take the value 1 for county 12003 from 2015 through the last year of my data, which is 2022. The same applies to all observations in my data.

      My apologies for the unclear message in post #1.



      • #4
        This is clearer, but still leaves a little to my imagination. I imagine that the policy is applied county wide, not at the individual level. In other words, I assume that for any given year the value of policy is the same for all persons in that county. I also imagine that there are no gaps between years at all within your data: for every county and person, there is an observation for that county and person in every year from the first year they appear in the data through 2022. Here's an example of a data set that satisfies these conditions, followed by code to do what you ask:
        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input float policy double county int year float person_id
        0  1003 2002 2
        0  1003 2002 1
        0  1003 2003 1
        0  1003 2003 2
        0  1003 2004 2
        0  1003 2004 1
        0  1003 2005 2
        0  1003 2005 1
        0  1003 2006 2
        0  1003 2006 1
        0  1003 2007 2
        0  1003 2007 1
        0  1003 2008 2
        0  1003 2008 1
        0  1003 2009 2
        0  1003 2009 1
        0  1003 2010 1
        0  1003 2010 2
        0  1003 2011 1
        0  1003 2011 2
        0  1003 2012 1
        0  1003 2012 2
        0  1003 2013 1
        0  1003 2013 2
        0  1003 2014 1
        0  1003 2014 2
        0  1003 2015 1
        0  1003 2015 2
        0  1003 2016 1
        0  1003 2016 2
        0  1003 2017 2
        0  1003 2017 1
        0  1003 2018 1
        0  1003 2018 2
        0  1003 2019 1
        0  1003 2019 2
        0  1003 2020 1
        0  1003 2020 2
        1  1003 2021 2
        1  1003 2021 1
        1  1003 2022 2
        1  1003 2022 1
        0  1004 2004 1
        0  1004 2005 1
        0  1004 2006 1
        0  1004 2007 1
        0  1004 2008 1
        0  1004 2009 1
        0  1004 2010 1
        0  1004 2011 1
        0  1004 2012 1
        0  1004 2013 1
        0  1004 2014 1
        1  1004 2015 1
        1  1004 2016 1
        1  1004 2017 1
        1  1004 2018 1
        1  1004 2019 1
        1  1004 2020 1
        1  1004 2021 1
        1  1004 2022 1
        0  1013 2002 1
        0  1013 2003 1
        0  1013 2004 1
        0  1013 2005 1
        0  1013 2006 1
        0  1013 2007 1
        0  1013 2008 1
        0  1013 2009 1
        0  1013 2010 1
        0  1013 2011 1
        0  1013 2012 1
        0  1013 2013 1
        0  1013 2014 1
        0  1013 2015 1
        0  1013 2016 1
        0  1013 2017 1
        1  1013 2018 1
        1  1013 2019 1
        1  1013 2020 1
        1  1013 2021 1
        1  1013 2022 1
        0  2012 2002 1
        0  2012 2003 1
        0  2012 2004 1
        0  2012 2005 1
        0  2012 2006 1
        0  2012 2007 1
        0  2012 2008 1
        0  2012 2009 1
        0  2012 2010 1
        0  2012 2011 1
        0  2012 2012 1
        0  2012 2013 1
        0  2012 2014 1
        0  2012 2015 1
        0  2012 2016 1
        0  2012 2017 1
        0  2012 2018 1
        0  2012 2019 1
        0  2012 2020 1
        0  2012 2021 1
        0  2012 2022 1
        0 10003 2000 1
        0 10003 2001 1
        0 10003 2002 1
        0 10003 2003 1
        1 10003 2004 1
        1 10003 2005 1
        1 10003 2006 1
        1 10003 2007 1
        1 10003 2008 1
        1 10003 2009 1
        1 10003 2010 1
        1 10003 2011 1
        1 10003 2012 1
        1 10003 2013 1
        1 10003 2014 1
        1 10003 2015 1
        1 10003 2016 1
        1 10003 2017 1
        1 10003 2018 1
        1 10003 2019 1
        1 10003 2020 1
        1 10003 2021 1
        1 10003 2022 1
        0 10101 2021 1
        0 10101 2022 1
        0 12003 2009 1
        0 12003 2010 1
        0 12003 2011 1
        0 12003 2012 1
        0 12003 2013 1
        1 12003 2014 2
        1 12003 2014 1
        1 12003 2015 1
        1 12003 2015 2
        1 12003 2016 1
        1 12003 2016 2
        1 12003 2017 1
        1 12003 2017 2
        1 12003 2018 2
        1 12003 2018 1
        1 12003 2019 2
        1 12003 2019 1
        1 12003 2020 2
        1 12003 2020 1
        1 12003 2021 2
        1 12003 2021 1
        1 12003 2022 2
        1 12003 2022 1
        end
        
        //  VERIFY CRITICAL ASSUMPTIONS
        by county year (policy), sort: assert policy[1] == policy[_N] // POLICY APPLIED AT COUNTY LEVEL
        by county person_id (year), sort: assert _N == year[_N] - year[1] + 1 // NO GAPS FROM FIRST YEAR TO LAST FOR ANY PERSON_ID
        
        //  CREATE SHIFTED POLICY VARIABLE
        by county person_id (year), sort: gen shifted_policy = policy[_n-1]
        The first two commands after the -dataex- input verify that these assumptions I have made hold in your actual data. If either command gives an error message, do not proceed because the code will give incorrect results if these assumptions are not true.
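
        If the data do not track the same person_id across years, which is often the case for repeated cross sections, one possible alternative is to do the shift at the county-year level and merge the result back. This is only a sketch under the same assumption that policy is constant within each county and year. The input rows are made up for illustration, and the names policy_lag1 and lagged are hypothetical.

```stata
* Made-up example: policy switches on for county 1004 in 2015
clear
input float policy double county int year
0 1004 2013
0 1004 2013
0 1004 2014
1 1004 2015
1 1004 2015
1 1004 2016
end

preserve
    * Reduce to one row per county and year
    keep county year policy
    duplicates drop
    isid county year        // fails if policy varies within a county-year
    xtset county year
    * L. takes the value from the previous calendar year; it is missing
    * in a county's first year and after any gap in years
    gen policy_lag1 = L.policy
    keep county year policy_lag1
    tempfile lagged
    save `lagged'
restore

* Attach the county-level lag to every individual observation
merge m:1 county year using `lagged', nogenerate
```

        In this sketch policy_lag1 is missing in each county's first year, because there is no earlier year to take the value from. Whether that should be recoded to 0 depends on what is known about the policy before the data begin.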



        • #5
          Mr. Schechter,

          Those assumptions are absolutely right, and I wish I had stated my question more clearly.

          As always, you have been nothing short of a reliable and helpful mentor in my journey of learning Stata. I'm humbled to receive this level of guidance as a novice. I truly appreciate your tireless contributions to this wonderful forum!



          • #6
            Cross-posted at https://www.reddit.com/r/stata/comme...y_shifting_it/

            Please note our policy on cross-posting at https://www.statalist.org/forums/help#crossposting -- which is that you should tell us about it.

            It seems that Reddit asks you to tell them about cross-posting too.



            • #7
              I sincerely had no idea about that, and I apologize if it caused any inconvenience. I'll be more careful from now on.



              • #8
                We ask that you read the FAQ Advice before posting.

                What you can do now is tell people at Reddit about this thread, so that further duplication of effort is avoided.



                • #9
                  I have done so, Mr. Cox. Before posting next time, I'll carefully check the FAQ Advice section. I appreciate your time.
