  • Compute median price for each quarter

    Hi guys, I'm a beginner with Stata and would like to ask for help here, thanks!

    [Question]
    I have a dataset with 4 variables: 1. bought year 2. bought quarter 3. bought city 4. price

    I would like to ask how to calculate the median price for 1) each quarter-year and 2) each city


    [I tried the below code]
    Code:
    . sort year quarter
    . egen median_price = median(price), by(year quarter)
    . list, sepby(year quarter)
    [Problem]
    It seems I already have the 1) quarter-year median price. The problem is that I have 55,000 observations, and I want the result shown once per group, by 1) quarter-year and 2) city, rather than repeated on each of the 55,000 observations.

    Thanks!!!

  • #2
    Code:
    by year quarter, sort: egen median_price = median(price)
    egen flag = tag(year quarter)
    list year quarter median_price if flag, noobs clean
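
    The original question also asked for medians by city. A minimal sketch of the same tag approach extended to quarter-year and city groups, assuming the city variable is named city (the thread never gives the actual variable names):

    Code:
    * Median price within each quarter-year and city combination
    bysort year quarter city: egen median_price_city = median(price)
    * Tag one observation per combination so each median is listed once
    egen flag_city = tag(year quarter city)
    list year quarter city median_price_city if flag_city, noobs sepby(year quarter)
    If only the summary is needed, collapse with the (median) statistic and by(year quarter city) is another route, but it replaces the data in memory.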


    • #3
      Thank you so much for your speedy reply, Clyde! I've now learned how to use tag() to flag one observation per group!

      I'm facing a similar question when I am plotting the time series graph using the same set of data. I want to plot the graph with x axis as time and y axis as median price. However, I'm struggling with the time variable.

      This is my code:
      Code:
      gen time = rocyear*10 + rocquarter2digit
      tsset time
      and it shows this error:
      Code:
      repeated time values in sample
      I think this is similar to my previous question: for example, the first 100 observations have the same quarter-year (time) and the same median price. Is there a similar method by which I could group observations with the same quarter-year and median price together and then draw a time series graph?


      • #4
        Also, referring to my original question with 4 variables: 1. bought year 2. bought quarter 3. bought city 4. price

        I'm wondering whether I could add one more variable, say weight, restrict the sample to the products in the top 20% by weight, and show the median price in each quarter-year?

        Thank you !!!


        • #5
          Your time variable does not look like a good idea for graphics, as the values will presumably run like 20214, 20221, 20222, 20223, 20224, 20231, so that every year has 3 gaps of 1 and 1 gap of 7.

          You need something more like

          Code:
          gen timevar = yq(year, quarter)

          followed by an appropriate display format setting, on which see

          Code:
          help datetime display formats
          tsset in terms of a single time variable is doomed to fail, as the main point behind #1 is that you have several observations in each quarter, which you want to summarize.

          tsset with a time variable alone will fail if there is more than one observation for any distinct time. So the error message is in a strong sense telling you what you already know.
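
          As a small sketch of the display format step, assuming timevar was created with yq() as above: the %tq format shows quarterly dates such as 2021q4. One further guess, based only on the variable name rocyear in #3: if that holds Republic of China calendar years (such as 110), yq() would return missing, because it expects four-digit years, and 1911 would need to be added first. The commented-out line also assumes rocquarter2digit is numeric and runs from 1 to 4.

          Code:
          * Display the quarterly dates as e.g. 2021q4 rather than as integers
          format timevar %tq
          * Check that yq() did not return missing for any observation
          count if missing(timevar)
          * If it did because the year is an ROC calendar year (an assumption), rebuild with
          * replace timevar = yq(rocyear + 1911, rocquarter2digit)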

          Code:
          line median_price timevar if flag, sort
          should give you a start on a reasonable graph.
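
          If tsset itself is wanted, one possibility is to reduce the data to one observation per quarter first. A rough sketch, assuming the variables price and timevar as above; preserve and restore bring back the original observations afterwards:

          Code:
          preserve
          * One observation per quarter, holding the median price
          collapse (median) median_price = price, by(timevar)
          format timevar %tq
          tsset timevar
          tsline median_price
          restore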

          Your extra question on weight really needs a data example for an answer to be attempted.
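
          Purely as a guess at what is wanted, and assuming a numeric variable named weight, with "top 20%" meaning at or above the 80th percentile of weight over the whole dataset (not within each quarter), something like this might be a starting point:

          Code:
          * 80th percentile of weight across all observations
          _pctile weight, percentiles(80)
          gen byte top20 = weight >= r(r1) & !missing(weight)
          * Median price by quarter-year among the top-20% observations only
          egen median_price_top20 = median(cond(top20, price, .)), by(year quarter)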

          Please read the FAQ Advice, especially #12. https://www.statalist.org/forums/help
