
  • Generating a new variable when both vars missing

    Consider the following panel data that tracks each agent's salary information in some form (e.g., annual, post-tax).

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int(id recordyr) byte salary long(annual posttax) float hourly
    112 2001 . 50000 46000     .
    112 2002 .     .     .    17
    113 2002 .     . 32000     .
    113 2003 .     . 33500     .
    113 2004 .     .     . 17.88
    end
    I want to create a new variable -salary- that holds the -hourly- wage for each observation where both -annual- and -posttax- are missing.
    If either -annual- or -posttax- has a valid entry, I want -salary- to hold -posttax-.

    How would I go about doing this for each observation in a loop? Also, although the example above has no string variables, I want to do the same exercise when the variables are all strings (e.g., city, state, country, or other geographic information).


  • #2
    I do not think that I follow what you are asking, but if you are doing an operation observation by observation, identifiers are irrelevant. If the below is not correct, create an example that depicts the wanted variable.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int(id recordyr) byte salary long(annual posttax) float hourly
    112 2001 . 50000 46000     .
    112 2002 .     .     .    17
    113 2002 .     . 32000     .
    113 2003 .     . 33500     .
    113 2004 .     .     . 17.88
    end
    
    gen wanted= cond(missing(annual) & missing(posttax), hourly, posttax)
    For more on the -cond()- function, see https://journals.sagepub.com/doi/pdf...867X0500500310.

    Result:


    Code:
    . l
    
         +--------------------------------------------------------------+
         |  id   recordyr   salary   annual   posttax   hourly   wanted |
         |--------------------------------------------------------------|
      1. | 112       2001        .    50000     46000        .    46000 |
      2. | 112       2002        .        .         .       17       17 |
      3. | 113       2002        .        .     32000        .    32000 |
      4. | 113       2003        .        .     33500        .    33500 |
      5. | 113       2004        .        .         .    17.88    17.88 |
         +--------------------------------------------------------------+
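
    The original question also asks about string variables, which the numeric example above does not cover. A minimal sketch of the same pattern for strings follows, assuming hypothetical variables -city-, -state-, and -country- (not in the original data), with -state- playing the role of -posttax- and -country- the role of -hourly-. It relies on -missing()- treating the empty string "" as missing and on -cond()- accepting string arguments when both result arguments are strings. No loop is needed in either case, since -generate- already evaluates the expression observation by observation.

    ```stata
    * Hypothetical string data, invented for illustration only
    clear
    input int id str20(city state country)
    1 "Boston" "MA" ""
    2 ""       ""   "USA"
    3 ""       "OH" ""
    end

    * For string variables, missing() is true when the value is the empty string ""
    * Both result arguments of cond() are strings, so -location- is created as a string
    gen location = cond(missing(city) & missing(state), country, state)

    list
    ```

    As in the numeric case, the second result argument is used whenever either of the two tested variables is nonmissing, so an observation with -city- filled but -state- empty would get an empty -location-.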



    • #3
      This is exactly what I wanted -- thanks!
