
  • Creating a new variable with different occupation fields

    Dear All,

    I would like to analyze occupational fields. I have a variable that is coded according to the ISCO job classification. From this variable I need to create a new variable with different fields/areas, e.g. physical, mathematical and engineering professionals, or life science and health professionals, etc.
    In the original data the codes run from 1000 to 9320.

    I would like to group codes; however, the codes that I need to group are not in consecutive order. For example, I need to group one interval of codes, e.g. from 3100 to 4100, with another interval, 6100 to 7100.

    How could I do that?

    Please find an example:
    ----------------------- copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str4 t7job_isco88
    ""   
    "-2" 
    ""   
    ""   
    ""   
    "-2" 
    ""   
    ""   
    "8280"
    ""   
    ""   
    "-2" 
    "-2" 
    "-2" 
    ""   
    ""   
    "-2" 
    ""   
    ""   
    "-2" 
    ""   
    ""   
    ""   
    "-2" 
    "2412"
    "-2" 
    ""   
    "-2" 
    ""   
    "4190"
    ""   
    ""   
    ""   
    ""   
    "-2" 
    "3133"
    ""   
    ""   
    ""   
    ""   
    "-2" 
    "-2" 
    ""   
    ""   
    ""   
    "-2" 
    ""   
    "3320"
    ""   
    ""   
    ""   
    ""   
    ""   
    ""   
    ""   
    ""   
    ""   
    "-2" 
    "-2" 
    "-2" 
    "-2" 
    ""   
    "4190"
    ""   
    ""   
    "7141"
    "-2" 
    ""   
    ""   
    "5132"
    "5141"
    ""   
    "-2" 
    ""   
    "5141"
    ""   
    "-2" 
    "6112"
    "-2" 
    ""   
    "-2" 
    ""   
    ""   
    ""   
    "-2" 
    "4190"
    "-2" 
    ""   
    ""   
    "-2" 
    ""   
    "5132"
    ""   
    "5220"
    ""   
    "-2" 
    ""   
    "-2" 
    ""   
    ""   
    end
    ------------------ copy up to and including the previous line ------------------


    Thank you in advance!


  • #2
    See the -group()- function available from -egen-.
    Kind regards,
    Carlo
    (Stata 18.0 SE)
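
    A minimal sketch of what -egen-'s group() does, using a toy dataset with the variable name from #1 (the new variable name isco_id is only illustrative). As far as I understand it, group() numbers each distinct code 1, 2, 3, ... in sort order; it does not merge ranges of codes on its own, so a range-based step is still needed to combine, say, 3100-4100 with 6100-7100.

```stata
* Toy data; the variable name follows the example in #1
clear
input str4 t7job_isco88
"3133"
"4190"
"4190"
"6112"
end

* group() assigns consecutive integers to the distinct values in sort order;
* identical codes receive the same integer
egen isco_id = group(t7job_isco88)
list t7job_isco88 isco_id
```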



    • #3
      A simple method would be to destring the occupation variable, which would then allow you to work with numeric ranges.

      Code:
      destring t7job_isco88, generate(t7job_isco88_numeric)
      
      /* in separate columns */
      generate group1=1 if (t7job_isco88_numeric>=3100 & t7job_isco88_numeric<=4100) | (t7job_isco88_numeric>=6100 & t7job_isco88_numeric<=7100)
      generate group2=1 if (t7job_isco88_numeric>=1000 & t7job_isco88_numeric<2000)
      generate group3=1 if (t7job_isco88_numeric>=2000 & t7job_isco88_numeric<3000)
      
      /* in one column, if mutually exclusive */
      generate group=1 if (t7job_isco88_numeric>=3100 & t7job_isco88_numeric<=4100) | (t7job_isco88_numeric>=6100 & t7job_isco88_numeric<=7100)
      replace group=2 if (t7job_isco88_numeric>=1000 & t7job_isco88_numeric<2000)
      replace group=3 if (t7job_isco88_numeric>=2000 & t7job_isco88_numeric<3000)
      label define group 1 "Codes 3100-4100, 6100-7100" 2 "Codes 1###" 3 "Codes 2###"
      label values group group
      tab group, missing
      You might have to first split the occupation variable into the part that is numeric and the part that has the description, if they are combined.
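
      A rough sketch of such a split, assuming a hypothetical raw variable occ_raw in which the code comes first, followed by a space and the description, with no leading blanks. The variable names and the toy descriptions are made up for illustration and are not from the original data.

```stata
* Hypothetical raw variable that combines code and description
clear
input str40 occ_raw
"3133 some description"
"6112 another description"
"-2"
end

* word(s, 1) returns the first space-delimited word, here the code
generate str4 occ_code = word(occ_raw, 1)

* everything after the code is kept as the description
generate occ_desc = strtrim(substr(occ_raw, strlen(occ_code) + 1, .))

* convert the code part to numeric so that ranges can be used
destring occ_code, generate(occ_code_numeric)
list
```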

      And be careful with leading zeros in your occupation string variable: strings such as "0100" and "100" would both be destringed into the same numeric value, 100.
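
      One possible way to check for this before destringing is sketched below on toy data. The helper variables leading_zero and code_length are illustrative names, not part of the original dataset.

```stata
* Toy data with two codes that differ only by a leading zero
clear
input str4 t7job_isco88
"0100"
"100"
"3133"
end

* flag strings that start with "0" and record the length of each code
generate byte leading_zero = substr(t7job_isco88, 1, 1) == "0"
generate byte code_length = strlen(t7job_isco88)
tabulate code_length leading_zero

* after destring, "0100" and "100" can no longer be told apart
destring t7job_isco88, generate(t7job_isco88_numeric)
list
```

      If the tabulation shows codes of different lengths or any leading zeros, it may be safer to settle how those should be treated before building groups from numeric ranges.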



      • #4
        Thank you both!

        Jenny Williams, thank you for the code example. Now I see the logic and can apply it. It works!

        Best Regards,
        Violeta
