  • create new value based on variable name

    Hi all,

    I have a dataset with variables named taxdebt_20120101 through taxdebt_20131231, where the last eight digits are a date in YYYYMMDD format. The variables hold values from 0 to 100,000.
    I am trying to create a new variable that equals the date of the variable name if the value of that variable is > 0.

    My code so far:

    gen date = .

    forval j = 20120101/20131231 {
    replace date = `j' if taxdebt_`j' > 0
    }

    tostring date, replace format(%20.0f)
    gen date2 = date(date,"YMD")
    format date2 %td


    Unfortunately the code doesn't quite work. I keep getting the date 20120132. A date is also displayed for values = 0.
    Hope you can help.

  • #2
    "Unfortunately the code doesn't quite work. I keep getting the date 20120132."
    And you'd better not put that date on a credit report!

    What happens is that you can't just iterate over dates like that. The forvalues loop increments by 1 on every pass, while dates written as YYYYMMDD numbers jump: no month has 32 days, so 20120131 is followed by 20120201, not 20120132.
    Once the loop hits a date for which there is no variable in your dataset (Jan 32, 2012 is the first such occurrence), you should get an error. Why you don't get one, I don't know.

    Here is an example of what you could do:

    Code:
    clear all
    input d_20120101 d_20120102 d_20120213
    0 0 12
    0 10 0
    17 0 0
    0 . 3
    0 . .
    . 0 7
    . . .
    . 7 0
    0 0 0
    end
    
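    * collect the names of all date-stamped variables
    ds d_*
    local vlist `"`r(varlist)'"'

    * store the name of the first variable (in dataset order) holding a positive, nonmissing value
    generate d=""
    foreach v in `vlist' {
      replace d="`v'" if `v'>0 & !missing(`v') & missing(d)
    }

    * the last 8 characters of the name are the YYYYMMDD date; keep it as a number
    generate double date=real(substr(d,-8,8))
    format date %25.0g
    drop d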
    
    list, ab(20)


    Code:
         +-------------------------------------------------+
         | d_20120101   d_20120102   d_20120213       date |
         |-------------------------------------------------|
      1. |          0            0           12   20120213 |
      2. |          0           10            0   20120102 |
      3. |         17            0            0   20120101 |
      4. |          0            .            3   20120213 |
      5. |          0            .            .          . |
         |-------------------------------------------------|
      6. |          .            0            7   20120213 |
      7. |          .            .            .          . |
      8. |          .            7            0   20120102 |
      9. |          0            0            0          . |
         +-------------------------------------------------+
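
    An alternative that stays closer to the original loop, as a rough sketch only: let forvalues step over Stata's numeric daily dates, which do increase by exactly 1 per day, and rebuild each variable name from the date. The toy data, the name date2, and the skip for calendar days without a variable are assumptions for illustration, not something taken from the real dataset.

    ```stata
    clear all
    * toy data (made up): three consecutive days spanning a month boundary
    input taxdebt_20120131 taxdebt_20120201 taxdebt_20120202
    0 5 .
    0 0 9
    . . .
    end

    generate date2 = .
    forvalues d = `=td(31jan2012)'/`=td(02feb2012)' {
        * render the numeric daily date as YYYYMMDD to rebuild the variable name
        local stamp = string(`d', "%tdCCYYNNDD")
        * skip calendar days for which no variable exists
        capture confirm variable taxdebt_`stamp', exact
        if _rc continue
        * numeric missing counts as larger than any number in Stata, hence the !missing() guard;
        * missing(date2) keeps the first qualifying day only
        replace date2 = `d' if taxdebt_`stamp' > 0 & !missing(taxdebt_`stamp') & missing(date2)
    }
    format date2 %td
    list
    ```

    For the full range in the question the bounds would presumably be td(01jan2012) and td(31dec2013). Because date2 is already a Stata daily date, no tostring or date() conversion is needed afterwards.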

    You also wrote "I am trying to create a new variable that equals the date of the variable name if the value of that variable is > 0."
    This is imprecise, since there could be multiple such variables. It may happen only once per observation in your data, but more likely not. Unfortunately you didn't post a sample.

    My code retains the first such occurrence.
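
    If every qualifying day matters rather than only the first, one possible approach is to count them per observation. This is a minimal sketch on made-up data; the names ndays and cutoff are hypothetical, and the cutoff of 0 is only a placeholder for whatever threshold applies.

    ```stata
    clear all
    * toy data (made up), same variable naming as in the example above
    input d_20120101 d_20120102 d_20120213
    0 0 12
    0 10 0
    17 4 0
    5 . 3
    . . .
    end

    * hypothetical threshold; change it to whatever cutoff is relevant
    local cutoff 0

    * count, per observation, the variables that exceed the cutoff and are not missing
    generate ndays = 0
    foreach v of varlist d_* {
        replace ndays = ndays + 1 if `v' > `cutoff' & !missing(`v')
    }

    list, ab(20)
    ```

    The !missing() check is there for the same reason as in the loop above: a missing value would otherwise pass the > comparison and be counted.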

    Best, Sergiy Radyakin


    • #3
      Thank you so much for your response, Sergiy! It was very helpful for my understanding.

      You are right, there are multiple values over 5,000 for various dates. I was so focused on displaying the proper date that I did not consider that.
      In the attachment you can see that 2012-01-02 and 2012-01-03 have the same values. How can I count the days on which the condition value > 5,000 is met?

      Thanks!