  • Redefining layer numbers based on all values the variable 'layer' takes for a given firm

    I have a panel of workers in different companies over several years. Each worker is assigned a number between 0 and 3, corresponding to their layer (level of importance). In some companies, some layers are missing; for example, I may only have workers from layers 0 and 2. In those cases I would like to redefine the variable so that it keeps the order but uses consecutive numbers starting at 0, i.e. change those with layer = 2 to layer = 1. I wrote some code that loops over all companies that need changes, but it is taking forever to run. I tried other options, but I can't find a way to combine 'by' with 'levelsof' to write the condition I want to check in a single line. Does anyone have suggestions for a more efficient way to do this?

    Code:
    egen num_layers = nvals(layer), by(firm_id year)
    egen firm_year = group(firm_id year) 
    
    gen wrong = .
    replace wrong = 1 if (num_layers == 2 & layer > 1) | (num_layers == 3 & layer == 3)
    gen need_fix = wrong*firm_year
    
    levelsof need_fix, local(f) 
    foreach i of local f {
        levelsof layer if firm_year == `i'
        if (r(levels) == "0 2" | r(levels) == "0 3") {
            replace layer = 1 if firm_year == `i' & layer > 0
            }
        if r(levels) == "1 2" {
            replace layer = 0 if firm_year == `i' & layer == 1
            replace layer = 1 if firm_year == `i' & layer == 2
            }
        if r(levels) == "2 3" {
            replace layer = 0 if firm_year == `i' & layer == 2
            replace layer = 1 if firm_year == `i' & layer == 3
            }
        if r(levels) == "0 1 3" {
            replace layer = 2 if firm_year == `i' & layer == 3
            }
        if r(levels) == "0 2 3" {
            replace layer = 1 if firm_year == `i' & layer == 2
            replace layer = 2 if firm_year == `i' & layer == 3
            }
        if r(levels) == "1 2 3" {
            replace layer = 0 if firm_year == `i' & layer == 1
            replace layer = 1 if firm_year == `i' & layer == 2
            replace layer = 2 if firm_year == `i' & layer == 3
            }
    }
    Thank you!

  • #2
    Kamila, below please find an example.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(workerid year firmid layer)
    1 2019 1 0  
    1 2020 1 0 
    2 2019 1 2  
    2 2020 1 2 
    3 2019 2 3  
    3 2020 2 3 
    4 2019 2 1
    4 2020 2 1
    end
    
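    * Within each firm, sorted by layer, count the distinct layer values seen so far
    bys firmid (layer): gen num_layer = sum(layer!=layer[_n-1])
    * Renumber consecutively, starting from the firm's lowest observed layer
    bys firmid (layer): replace layer = layer[1] + num_layer - 1
    drop num_layer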
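
    To see why the running sum works, it may help to split it into two steps. The sketch below is only an illustration: the small dataset and the variable name step are made up for this purpose and are not part of the original example. Within a by-group, layer[_n-1] is missing on the first row, so the comparison is true there and the count starts at 1.

    ```stata
    clear
    input float(firmid layer)
    1 0
    1 0
    1 2
    2 1
    2 3
    2 3
    end

    * step (hypothetical name) is 1 on the first row of each distinct layer value
    * within a firm; layer[_n-1] is missing on a firm's first row, so it is 1 there too
    bysort firmid (layer): gen step = layer != layer[_n-1]

    * the running sum of step numbers the distinct layers 1, 2, ... within each firm
    bysort firmid (layer): gen num_layer = sum(step)

    list firmid layer step num_layer, sepby(firmid)
    ```

    In this illustration both firms end up with num_layer running from 1 to 2, even though their original layer values differ.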


    • #3
      Thank you so much Fei! Your approach was incredibly helpful.

      After some minor changes, the following code does exactly what I needed, and runs significantly faster than my previous attempts:

      Code:
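      * Within each firm-year, sorted by layer, count the distinct layer values seen so far
      bys year firmid (layer): gen num_layer = sum(layer!=layer[_n-1])
      * Subtract 1 so the renumbered layers start at 0
      bys year firmid (layer): replace layer = num_layer - 1
      drop num_layer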
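
      One possible way to check the result is to assert that every firm-year now starts at 0 and has no gaps. This is a minimal sketch on made-up data, not part of the original thread. It assumes layer has no missing values: a missing layer would sort last and be counted as a level of its own.

      ```stata
      clear
      input float(workerid year firmid layer)
      1 2019 1 0
      2 2019 1 2
      3 2019 2 3
      4 2019 2 1
      1 2020 1 0
      2 2020 1 2
      end

      * renumber layers within each firm-year, as in the code above
      bysort year firmid (layer): gen num_layer = sum(layer != layer[_n-1])
      bysort year firmid (layer): replace layer = num_layer - 1
      drop num_layer

      * every firm-year should now start at layer 0 ...
      bysort year firmid (layer): assert layer[1] == 0
      * ... and consecutive sorted rows should never differ by more than 1
      bysort year firmid (layer): assert layer - layer[_n-1] <= 1 if _n > 1
      ```

      If either assert fails, Stata stops with an error, which makes the check easy to leave in a do-file.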
