
  • How to code my new variable.

    Hello!

    I'm not the best at Stata and need help coding a new variable in my panel dataset (2007-2023). The purpose of the variable is to tell me how depressed/gloomy someone was feeling during the latest election year.

    I have a variable for "ID" (each person), "year" (some people are not participating every year) and "gloomy" (scale from 1-7). The election years were 2010, 2012, 2017, 2021.

    I want the new variable "DepressedElection" to hold the value of "gloomy" from the latest election year. If the person did not participate in the survey in the latest election year, I want "DepressedElection" to be missing.

    So to be concrete: for one specific person (the same "ID"), "DepressedElection" should hold the following value, depending on the survey year in which that person participated:

      • 2010 or 2011: the value of "gloomy" for 2010
      • 2012, 2013, 2014, 2015 or 2016: the value of "gloomy" for 2012, since there was a new election in 2012
      • 2017, 2018, 2019 or 2020: the value of "gloomy" for 2017, since there was a new election in 2017
      • 2021, 2022 or 2023: the value of "gloomy" for 2021, since 2021 was an election year

    If the person did not participate in the survey in an election year, the following years up to the next election should have missing values. For instance, if that same person, with the same "ID", participated in 2022 and 2023 but not in 2021, "DepressedElection" should be missing for 2022 and 2023 for that specific person/"ID". The years before 2010 should also be missing.

    Sorry for the long question. Thank you for your help!

    Best regards,
    Klara
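
The rule described in #1 can be read as two steps: map every survey year to the most recent election year, then look up "gloomy" in that year for the same person. Below is a minimal sketch of the first step only, assuming the variable names ID, year and gloomy from the post. The helper name lastelection and the toy data are invented for illustration.

```stata
clear
input float(ID year gloomy)
1 2009 3
1 2010 7
1 2011 4
1 2013 5
1 2022 6
end

* Most recent election year on or before each survey year.
* The list is ascending, so later elections overwrite earlier ones.
* Years before 2010 are never touched and stay missing.
gen int lastelection = .
foreach e of numlist 2010 2012 2017 2021 {
    replace lastelection = `e' if year >= `e' & !missing(year)
}

list ID year lastelection, sepby(ID)
```

The `!missing(year)` guard is there because Stata treats a missing value as larger than any number, so a missing year would otherwise be assigned to 2021.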

  • #2
    Welcome to Statalist. For your future posts, please read the FAQ Advice, and in particular, FAQ Advice #12, to familiarize yourself with the dataex command for presenting data examples.


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(ID year gloomy)
    1 2010 7
    1 2011 .
    1 2012 .
    1 2013 .
    1 2014 .
    1 2015 .
    1 2016 .
    1 2017 3
    1 2018 .
    1 2019 .
    1 2020 .
    1 2021 2
    1 2022 .
    1 2023 .
    2 2010 .
    2 2011 .
    2 2012 5
    2 2013 .
    2 2014 .
    2 2015 .
    2 2016 .
    2 2017 .
    2 2018 .
    2 2019 .
    2 2020 .
    2 2021 1
    2 2022 .
    2 2023 .
    3 2010 4
    3 2011 .
    3 2012 5
    3 2013 .
    3 2014 .
    3 2015 .
    3 2016 .
    3 2017 6
    3 2018 .
    3 2019 .
    3 2020 .
    3 2021 7
    3 2022 .
    3 2023 .
    end
    
    *START HERE
    gen election= inlist(year, 2010, 2012, 2017, 2021)
    bys ID (year): gen wanted= gloomy
    by ID: replace wanted= wanted[_n-1] if !missing(wanted[_n-1]) & missing(wanted) & !election
    Res.:

    Code:
    . l, sepby(ID)
    
         +----------------------------------------+
         | ID   year   gloomy   election   wanted |
         |----------------------------------------|
      1. |  1   2010        7          1        7 |
      2. |  1   2011        .          0        7 |
      3. |  1   2012        .          1        . |
      4. |  1   2013        .          0        . |
      5. |  1   2014        .          0        . |
      6. |  1   2015        .          0        . |
      7. |  1   2016        .          0        . |
      8. |  1   2017        3          1        3 |
      9. |  1   2018        .          0        3 |
     10. |  1   2019        .          0        3 |
     11. |  1   2020        .          0        3 |
     12. |  1   2021        2          1        2 |
     13. |  1   2022        .          0        2 |
     14. |  1   2023        .          0        2 |
         |----------------------------------------|
     15. |  2   2010        .          1        . |
     16. |  2   2011        .          0        . |
     17. |  2   2012        5          1        5 |
     18. |  2   2013        .          0        5 |
     19. |  2   2014        .          0        5 |
     20. |  2   2015        .          0        5 |
     21. |  2   2016        .          0        5 |
     22. |  2   2017        .          1        . |
     23. |  2   2018        .          0        . |
     24. |  2   2019        .          0        . |
     25. |  2   2020        .          0        . |
     26. |  2   2021        1          1        1 |
     27. |  2   2022        .          0        1 |
     28. |  2   2023        .          0        1 |
         |----------------------------------------|
     29. |  3   2010        4          1        4 |
     30. |  3   2011        .          0        4 |
     31. |  3   2012        5          1        5 |
     32. |  3   2013        .          0        5 |
     33. |  3   2014        .          0        5 |
     34. |  3   2015        .          0        5 |
     35. |  3   2016        .          0        5 |
     36. |  3   2017        6          1        6 |
     37. |  3   2018        .          0        6 |
     38. |  3   2019        .          0        6 |
     39. |  3   2020        .          0        6 |
     40. |  3   2021        7          1        7 |
     41. |  3   2022        .          0        7 |
     42. |  3   2023        .          0        7 |
         +----------------------------------------+
    
    .



    • #3
      Thank you for your reply!

      The problem now is that there are no missing values. People who did not participate in the survey in an election year should have missing values for the years they participated afterwards, until the next election, if that makes sense. As I said before: if one person, with the same "ID", participated in the survey in 2022 and 2023 but not in 2021, "DepressedElection" should be missing for 2022 and 2023, and not take the value from 2012.

      When I try your code it says that no changes are made:

      by ID: replace DepressedElection= DepressedElection[_n-1] if !missing(DepressedElection[_n-1]) & missing(DepressedElection) & !election
      (0 real changes made)

      This is what I have a hard time solving.

      Thank you for your help!
      /Klara
      Last edited by Klara Danielsson; 25 Oct 2023, 07:33.
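
One possible reading of the "(0 real changes made)" message, assuming "gloomy" is recorded in every year a person took part (unlike the example data in #2, where it is missing outside election years): once "gloomy" has been copied into the new variable, no values are missing, so the replace has nothing to fill. A minimal sketch on invented toy data, contrasting a wholesale copy with a variable seeded from election years only:

```stata
clear
input float(ID year gloomy)
1 2021 2
1 2022 5
1 2023 6
2 2022 4
2 2023 3
end

gen byte election = inlist(year, 2010, 2012, 2017, 2021)

* Copying gloomy wholesale leaves nothing missing to fill in later
gen copied = gloomy
count if missing(copied)

* Seeding from election years only keeps all other years missing
gen DepressedElection = gloomy if election

list, sepby(ID)
```

Here person 2 skipped 2021, so the seeded variable is missing in all of that person's rows, which is the starting point the question asks for.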



      • #4
        gen ElectionInterval = .

        * Define the intervals based on Year
        replace ElectionInterval = 1 if Year >= 2010 & Year <= 2011
        replace ElectionInterval = 2 if Year >= 2012 & Year <= 2016
        replace ElectionInterval = 3 if Year >= 2017 & Year <= 2020
        replace ElectionInterval = 4 if Year >= 2021 & Year <= 2023

        * Display the unique values of ElectionInterval to verify the intervals
        tab ElectionInterval


        gen election= inlist(Year, 2010, 2012, 2017, 2021)

        gen DepressedElection=.
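        * bysort sorts by ID and Year first, so the by ID: prefix and [_n-1] below follow calendar order
        bysort ID (Year): replace DepressedElection= gloomy if election==1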
        by ID: replace DepressedElection= gloomy[_n-1] if !missing(DepressedElection[_n-1]) & missing(DepressedElection) & !election & ///
        ElectionInterval ==ElectionInterval[_n-1]

        This is my code now and I get it almost right, although in case the election interval is longer than two years, the value DepressedElection takes from gloomy is the year of the election for the year after but after two years it takes the value from the year after the election year. I understand that by writing gloomy[_n-1] that is what I tell my code to do, but I can't work out how to write the condition so that it takes the value from the election year within the same election interval.
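
The behaviour described in #4 follows from copying gloomy[_n-1], which is the previous row's own answer. Copying DepressedElection[_n-1] instead passes the election-year value down the interval, because replace works through the rows in order. The following is a hedged sketch using the variable names from #4 on invented toy data, not a tested solution for the real dataset. Person 2 skips 2021, so 2022 and 2023 should stay missing.

```stata
clear
input float(ID Year gloomy)
1 2012 5
1 2013 2
1 2014 6
1 2016 1
1 2017 3
2 2020 4
2 2022 7
2 2023 2
end

gen ElectionInterval = .
replace ElectionInterval = 1 if inrange(Year, 2010, 2011)
replace ElectionInterval = 2 if inrange(Year, 2012, 2016)
replace ElectionInterval = 3 if inrange(Year, 2017, 2020)
replace ElectionInterval = 4 if inrange(Year, 2021, 2023)

gen byte election = inlist(Year, 2010, 2012, 2017, 2021)
gen DepressedElection = gloomy if election

* Sort within person, then carry the election-year value forward one
* row at a time, but only while the row stays in the same interval
bysort ID (Year): replace DepressedElection = DepressedElection[_n-1] ///
    if missing(DepressedElection) & !election ///
    & ElectionInterval == ElectionInterval[_n-1]

list, sepby(ID)
```

Because the comparison is with the previous observation rather than the previous calendar year, a person who skips a non-election year (2015 for person 1 here) still receives the election-year value, while a person who skips the election year itself receives nothing to carry forward.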



        • #5
          As I wrote in #2, the best way to describe the problem is to present a data example using the dataex command. So you may copy and paste the output of

          Code:
          sort ID Year
          dataex
          Then describe the issue based on the presented data example. I cannot follow what you have written below:

          although in case the election interval is longer than two years, the value DepressedElection takes from gloomy is the year of the election for the year after
          but after two years it takes the value from the year after the election year
