
  • Problems with daily panel data, "repeated time values within panel"

    Hey!

    I am currently gathering data from the BlaBlaCar website and will have daily data on around 40 different trips within Germany. Each trip has multiple offers, so I get around 1000 offers each day. I am trying to set up daily panel data, but I run into the problem that the same date occurs many times, since all 1000 observations from a given day share that date. I want to regress prices on the attributes of the driver who offered a specific trip.

    This is the error I get :

    tsset Date2, daily
    repeated time values in sample
    r(451);

    Does anyone know how to solve this? I am stuck.

    Kind regards,
    Herbrink Kevin

  • #2
    Kevin:
    welcome to this forum.
    The main issue with your data is not the repeated date, but the need to create a different panel_id for each offer within the same trip.
    Let's assume that you have three variables: trip (which does not change across offers); offer (which is assumed to change within the same trip); and date.


    You may want to consider something along the following lines:
    Code:
    . set obs 3
    
    . g trip="Milan_Rome"
    
    . g offer=runiform()*50
    
    . g offer_id=_n
    
    . egen new_id=group(trip offer_id)
    
    . g date=1960
    
    . format date %ty
    
    . xtset new_id date
           panel variable:  new_id (strongly balanced)
            time variable:  date, 1960 to 1960
                    delta:  1 year
    Kind regards,
    Carlo
    (Stata 19.0)
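
A minimal sketch of the same idea with daily dates instead of years. The toy data, the route name, and the variables offer_id, price and new_id are hypothetical. It also assumes that the same offer carries a stable identifier across days, which the thread does not confirm.

```stata
clear
set obs 6
set seed 12345

* Hypothetical toy data: 3 offers on one route, each observed on 2 days
generate trip = "Berlin_Hamburg"
generate offer_id = mod(_n - 1, 3) + 1
generate Date2 = mdy(1, 1, 2020) + floor((_n - 1) / 3)
format Date2 %td
generate price = 10 + 5 * runiform()

* One panel unit per (trip, offer) pair
egen new_id = group(trip offer_id)

* Each new_id appears at most once per day, so xtset accepts the daily date
xtset new_id Date2
```

If each offer is scraped only once, every panel would contain a single observation, and this setup would not add much.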



    • #3
      A tsset in terms of the date alone not only doesn't work; it does not make sense for your data.

      Whether an xtset in terms of a panel identifier alone makes sense I can't say, but it seems a bit more likely. I don't understand the data well enough to be firm on that, and I don't see a statement of your intended analysis methods.
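
One way to see why the date alone cannot work is to check which variables uniquely identify an observation. The sketch below uses hypothetical toy data; trip and offer_id are assumed names, not variables confirmed in the thread.

```stata
clear
set obs 6

* Hypothetical toy data: 3 offers on one route, each observed on 2 days
generate trip = "Berlin_Hamburg"
generate offer_id = mod(_n - 1, 3) + 1
generate Date2 = mdy(1, 1, 2020) + floor((_n - 1) / 3)
format Date2 %td

* The date alone is repeated (3 copies per day), which is what triggers r(451)
duplicates report Date2

* trip + offer_id + Date2 should identify each row; isid exits with an error if not
isid trip offer_id Date2
```

If isid fails on the real data, some combination of identifier and date is still repeated, and xtset with a time variable will report the same error.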



      • #4
        In reply to Carlo Lazzaro (#2):
        Thank you very much for your help! I am trying to set up the different trips as a separate variable and group on it. As you mentioned, that should get rid of the error and let me set up the panel data, right?



        • #5
          In reply to Nick Cox (#3):
          In general, the data consist of the price of a certain trip offered by a person. I have multiple trips for certain routes that are taken on a daily basis. For a single day I have around 1000 trips in total.
          I am in fact planning to run a two-stage least squares regression with trip fixed effects on this panel data. I am trying to analyse which driver attributes (experience, star rating) affect the price as well as the number of seats sold.

          I hope that clarifies it a bit!



          • #6
            You'll increase your chances of a useful answer by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stata output (fixed spacing fonts help), and sample data using dataex.

            So, your observation is the bid, and you're setting up trips as dummy variables, and then looking at how attributes of drivers interact with trips in determining bid price? If it is "offered" then I'm not sure where you get "number of sold seats".

            You can xtset with only the panel variable and leave the time dimension unset. You can then run panel estimators, but you cannot lead or lag variables with the L. or F. operators. But what is your panel variable? Is it the person offering the trip?
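
A minimal sketch of a panel-only xtset followed by a trip fixed-effects regression. The toy data and the names trip_id, star_rating, experience and price are hypothetical, and the trip is assumed to be the panel unit, which the thread leaves open.

```stata
clear
set obs 200
set seed 12345

* Hypothetical toy data: 10 trips with 20 offers each
generate trip_id = mod(_n - 1, 10) + 1
generate star_rating = 1 + 4 * runiform()
generate experience = floor(10 * runiform())
generate price = 15 + 2 * star_rating + 0.3 * experience + trip_id + rnormal()

* Panel variable only: repeated dates within a trip are no longer a problem
xtset trip_id

* Trip fixed effects, with standard errors clustered by trip
xtreg price star_rating experience, fe vce(cluster trip_id)
```

With no time variable set, something like L.price would be rejected. For the two-stage least squares version, xtivreg with the fe option follows the same pattern, but it needs an instrument, and the thread does not name one.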
