Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Dbase transformation help

    Hello,

    I have to significantly modify a database that we have in order to correctly transform it for analysis. I am so far stumped, and cant seem to find what I should be doing in the help resources. I've attached a screenshot of a sample of the data that I am working with. For every person record (PER_REC) I need to identify the last (max value) occurrence of the activity sequence (ACT_SEQ), and then assign that last occurrence a DEP_time of 259. I was originally looking at the FAQ's for instructions on identify the last or first occurrence of panel data, but the examples did not seem correct for my application. Any help would be appreciated,

    Thanks,

    ​Robert
    Attached Files

  • #2
    Didn't intend for there to be a double post, moderator please delete this one.

    Comment


    • #3
      Data posted as CODE is much more readable than data posted as an image attachment: please see FAQ Advice.

      In principle your calculation is

      Code:
       
      bysort PER_REC (ACT_SEQ) : replace DEP_time = 259 if _n == _N
      but I note that

      1. Your ACT_SEQ is string with values such as "01". That's a little unusual for a sequence identifier.

      2. It has many repeated values. That's even more unusual, without some extra story.

      The calculation seems correct for your example to the extent that 259 is aligned with "09" for 1000001. No value of 259 appears for the other block of observations.

      Comment


      • #4
        Nick,

        Thank you for the code, it worked wonderfully from what I can tell.

        To answer your other questions, the ACT_SEQ is usually an integer, however, this is taken from a development version of the database we are using which has some odd attribute formatting from the export import process.

        The repeated values I assume are what you see for households, persons, and trip records. The Household ID is nested in the Person ID and all of those are nested in a trip ID which serves as a master record ID. The dbase is a survey of several thousands of person records of daily travel combined with household and person attribute data that we use to model travel behavior. So Household 1000001 has 3 household members which are 100000101, 100000102, 100000103 and every trip for every person that is made throughout the day is added to the end of the unique person record. In many cases we would have dropped the household and person ID since they are nested in the trip ID but we use another program to model some of the specific travel behavior elements, and that program requires the ID's to be broken out like that.

        Thanks again,

        Robert

        Comment

        Working...
        X