Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Create a transaction variable from date variables

    Dear all,

    I am relatively new to stata (and on the list) and need some help with an issue i have.
    I have a data set created from merging a bunch of data files on car registration from the Netherlands. The data files have been collected each month from Nov 2017 to Aug 2018, and have 3 date variables: 1) The date when the car was new from factory, 2) The date when the car was first registered in the NL and 3) The date the car was last re-registered.
    The two first dates are always the same, but the last one changes between the monthly data files (when for example a car is sold, exported and so on). I have an interest in looking at how many times a car changes owner based on how the registration date changes.

    I managed to create a variable for the nb of sales by looking at if the dates differ by:

    generate transaction = 0
    forvalues i=1(1)21 {
    local j = `i'+1
    replace transaction= `i' if Date`i' < Date`j' & scrp_status ==0
    }

    Where scrp_status is a dummy for if the car is scrapped or not (sorry for the ugly code input). The id variable i used for the merging etc is the cars segregationist number

    The problem is that sometimes the date changes more than once for a car between Nov 2017 and Aug 2019 which i can't capture with the code above, and i don't know how to proceed...
    Any ideas or input is very very welcome!!

    Many thanks
    Elin

  • #2
    You may want to try working with the data in long format (versus wide format). Type "help reshape".

    So for each ID, you would have multiple rows for each month.

    Then you can do something like this (assuming that each monthly row is unique within each ID and not duplicated):

    Code:
    sort id date_reregistered
    generate transaction=.
    replace transaction=1 if id[_n]==id[_n+1] & date_reregistered[_n]<date_reregistered[_n+1] & scrp_status==0
    egen transaction_count=sum(transaction), by(id)
    Or, if you want to keep the data in wide format, just change one of your lines of code to this:
    Code:
    generate transaction`i'=1 if Date`i' < Date`j' & scrp_status ==0
    And then add:
    Code:
    egen transaction_count=rowtotal(transaction*)
    But as for your question on "sometimes the date changes more than once for a car between Nov 2017 and Aug 2019 which i can't capture with the code above, and i don't know how to proceed...", wouldn't your existing code already capture how many times it changes?

    If you need more help, you might need to provide a data example or paste some code using the code formatting (see the hashtag/pound symbol).
    Last edited by Jenny Williams; 10 Oct 2019, 12:00.

    Comment


    • #3
      Thank you! Your second suggestion gave me what I was looking for.

      Comment

      Working...
      X