  • Should I define the data as a panel?

    Hello,

    I have a dataset on air travel. It includes 7 million tickets, each with its own code. There is information on the route (origin and destination), airline name, and date. Additional information includes the price of each ticket, location-specific characteristics, the price of gasoline in each week, distance, and so on.
    I tried to declare the panel using xtset in the following way: xtset flight num_day, where flight is the flight code and num_day is the number of each day in 2013 (from 1 to 365).

    Stata reports "repeated time values within panel". I now understand that this is because there were many passengers on each flight.
    But my question is: how should I define the panel if each coupon number is unique and "contains" information on route, airline, flight number, and date at the same time?
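
    For readers who hit the same message, the duplication behind it can be inspected before calling xtset. This is only a sketch that assumes the variable names used in this post (flight, num_day); n_other is a hypothetical name for the new variable.
    Code:
    * Report how many observations share the same flight and day
    duplicates report flight num_day
    * For each ticket, count the other tickets on the same flight and day
    duplicates tag flight num_day, generate(n_other)
    tabulate n_other
    Any nonzero value of n_other marks a flight-day with more than one ticket, which is exactly what xtset rejects when num_day is given as the time variable.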

    P.S. If I do not treat the data as a panel, then tickets of the same airline on the same route but on different dates will be treated as completely unrelated observations, which, as I understand it, is not good.

    Thank you very much

    Alexander

  • #2
    You can perhaps define each passenger in each flight as having a unique id and then have the panel variable as flight and the "time variable" as passenger. Though passengers are of course not "time", panel data can also just have multiple dimensions of hierarchy.
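
    One possible reading of this suggestion, sketched under assumptions: flight and num_day are the names from the first post, coupon is a hypothetical name for the unique ticket code, and passenger is a hypothetical name for the new within-flight counter.
    Code:
    * Number the tickets within each flight, ordered by day and ticket code
    bysort flight (num_day coupon): generate passenger = _n
    * Declare flight as the panel variable and passenger as the second dimension
    xtset flight passenger
    Because passenger is only a counter, time-series operators such as L. or D. would have no meaningful interpretation under this setup.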

    Generally though, data speaks louder than words. If you can provide a small sample using dataex (see FAQ), we could help you more.



    • #3
      Alexander:
      welcome to the list.
      As far as I can get your query, it seems that you're not measuring the same sample of panelid a given number of times (data waves) on the same set of variables: hence, you're not dealing with a panel dataset.
      If you are (wisely) worried about heterogeneity due to airlines and routes, you can create an indicator that puts those predictors together and cluster OLS standard errors on it:
      Code:
      egen indicator=group(airline route)
      PS: Crossed with Ariel's reply, who, interestingly, suggests a different approach.
      Last edited by Carlo Lazzaro; 02 May 2017, 08:16.
      Kind regards,
      Carlo
      (Stata 19.0)



      • #4
        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input double coup_num int reis str2 ak_code str10 fl_date float tar str7 route int num_day double(jetfueli gasolineprice)
        6145070721 1 "S7" "06.01.2013" 4500 "MOW-GOJ"  6 34246 28.06
        6145127046 1 "UN" "07.01.2013" 2090 "MOW-LED"  7 34246 28.06
        6145125134 1 "UN" "07.01.2013" 1375 "MOW-LED"  7 34246 28.06
        6145126553 1 "UN" "07.01.2013" 2475 "MOW-LED"  7 34246 28.06
        6145176157 1 "S7" "14.01.2013" 4500 "MOW-GOJ" 14 34246 28.08
        6145176156 1 "S7" "14.01.2013" 4500 "MOW-GOJ" 14 34246 28.08
        6145202528 1 "UN" "14.01.2013" 2975 "MOW-LED" 14 34246 28.08
        6145278558 1 "UN" "14.01.2013" 1375 "MOW-LED" 14 34246 28.08
        6145202527 1 "UN" "14.01.2013" 2975 "MOW-LED" 14 34246 28.08
        6145228558 1 "UN" "14.01.2013" 1375 "MOW-LED" 14 34246 28.08
        6145153277 1 "UN" "14.01.2013" 1375 "MOW-LED" 14 34246 28.08
        6145379683 1 "S7" "16.01.2013" 4500 "MOW-GOJ" 16 34246 28.13
        6145332697 1 "UN" "16.01.2013" 2090 "MOW-LED" 16 34246 28.13
        6145292190 1 "UN" "16.01.2013" 1375 "MOW-LED" 16 34246 28.13
        6145319945 1 "S7" "16.01.2013" 4500 "MOW-GOJ" 16 34246 28.13
        6145350252 1 "UN" "16.01.2013" 2090 "MOW-LED" 16 34246 28.13
        6145359934 1 "UN" "17.01.2013" 2090 "MOW-LED" 17 34246 28.13
        6145202008 1 "S7" "17.01.2013" 4500 "MOW-GOJ" 17 34246 28.13
        6145371228 1 "UN" "17.01.2013" 2090 "MOW-LED" 17 34246 28.13
        6145338750 1 "S7" "17.01.2013" 4500 "MOW-GOJ" 17 34246 28.13
        6145380253 1 "UN" "17.01.2013" 3825 "MOW-LED" 17 34246 28.13
        6145331090 1 "UN" "17.01.2013" 1375 "MOW-LED" 17 34246 28.13
        6145338752 1 "S7" "17.01.2013" 4500 "MOW-GOJ" 17 34246 28.13
        end
        Thank you for your answers. Perhaps this example of the data will be useful.



        • #5
          Some explanation of the variables can help. In the main post you suggested that you tried to "xtset flight", yet no such variable (flight) exists in the data example...



          • #6
            Ok, sure

            reis = flight, ak_code = airline code, tar = tariff or price, jetfueli = price of jet fuel in each airport, gasolineprice = price of gasoline by week
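
            These descriptions could be attached to the data as variable labels so that they travel with the dataset. A minimal sketch, assuming the variable names in the -dataex- listing above (where the jet fuel variable appears as jetfueli):
            Code:
            * Attach the descriptions given in this post as variable labels
            label variable reis "Flight"
            label variable ak_code "Airline code"
            label variable tar "Tariff (ticket price)"
            label variable jetfueli "Price of jet fuel in each airport"
            label variable gasolineprice "Price of gasoline by week"
            * Show the variables together with their labels
            describe reis ak_code tar jetfueli gasolineprice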

            thank you



            • #7
              Originally posted by Carlo Lazzaro
              Alexander:
              welcome to the list.
              As far as I can get your query, it seems that you're not measuring the same sample of panelid a given number of times (data waves) on the same set of variables: hence, you're not dealing with a panel dataset.
              If you are (wisely) worried about heterogeneity due to airlines and routes, you can create an indicator that puts those predictors together and cluster OLS standard errors on it:
              Code:
              egen indicator=group(airline route)
              PS: Crossed with Ariel's reply, who, interestingly, suggests a different approach.
              Mr Carlo Lazzaro,

              Am I right that the coefficient of that indicator would not be interpretable and that clustering of errors should be done separately?

              Thank you



              • #8
                Alexander:
                if you mean that:
                - indicator should not be included among predictors;
                - indicator should only be used for clustering standard errors via -vce(cluster indicator)-;

                your interpretation of my previous post is correct.
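
                Put together with the variable names from the -dataex- listing earlier in the thread, the two steps might look like the sketch below. The choice of tar as the outcome and gasolineprice as the only regressor is an assumption made purely for illustration, not a recommended specification.
                Code:
                * Build one group identifier per airline-route combination
                egen indicator = group(ak_code route)
                * OLS with standard errors clustered on that identifier;
                * indicator itself is not included among the predictors
                regress tar gasolineprice, vce(cluster indicator)
                The short data excerpt contains only two airline-route combinations, so clustered standard errors would only become informative on the full dataset.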
                Kind regards,
                Carlo
                (Stata 19.0)
