
  • Timeline of percentiles

    Hey Statalist community,

    I'm working with a dataset of monthly survey responses and want to create a time series (line chart) of several percentiles (p10, p25, p50, p75, p90) of one of the variables. A variable, YEAR_MONTH, records the survey year and month (format: YYYYMM) in which each response was collected, and any given month can have around 200 responses. I can create a table using tabstat with YEAR_MONTH as the rows and the percentiles as the columns, but I can't figure out how to turn that table into an actual time series chart. I've thought about creating a new dataset, but that seems like a very backwards way of doing it.

    Please help.

  • #2
    Welcome to Statalist, Mathieu! Please read the FAQ on how to post questions most effectively, especially #12. As advised there, please provide a sample of your data using the dataex command, to help others help you.

    Furthermore, you may want to explore the pctile function of the egen command, to help create the variables you need to graph. See
    Code:
    help egen
    If your YEAR_MONTH variable is currently string or numbers like 202301 (to signify January 2023), you may also want to create a numeric variable that Stata internally understands as a date. See
    Code:
    help datetime
    More specific advice will be forthcoming once we have more clarity about your data from a data extract, as I suggested.
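
    In the meantime, here is a minimal sketch of the egen route. It assumes the variable of interest is called response, which is a placeholder name and not something from your post; the new variables p10 to p90 are likewise just illustrative names.
    Code:
    // response is a hypothetical variable name; substitute your own
    // creates p10, p25, p50, p75, p90, each constant within a survey month
    foreach p of numlist 10 25 50 75 90 {
        egen p`p' = pctile(response), p(`p') by(YEAR_MONTH)
    }
    Each new variable repeats the same value on every observation of a given month, so a graph only needs one observation per month.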
    Last edited by Hemanshu Kumar; 19 Jan 2023, 16:22.



    • #3
      I agree with Hemanshu Kumar.

      One detail seems predictable. As Hemanshu says, it seems that your monthly dates are either numbers like 202301 or strings like "202301". Neither is any use for line charts as it stands. A numeric variable like that has gaps, such as the jump of 89 between 202212 and 202301, that graphics can only take literally, and strings are even less use for a time axis.

      Run-together monthly dates can be tricky. Here is an instant tutorial:

      Code:
      . clear
      
      . set obs 1
      Number of observations (_N) was 0, now 1.
      
      . gen mdate_n = 202301
      
      . gen mdate_s = "202301"
      
      . gen mdate1 = ym(floor(mdate_n/100), mod(mdate_n, 100))
      
      . gen mdate2 = monthly(substr(mdate_s, 1, 4) + " " + substr(mdate_s, 5, 2), "YM")
      
      . l
      
           +-------------------------------------+
           | mdate_n   mdate_s   mdate1   mdate2 |
           |-------------------------------------|
        1. |  202301    202301      756      756 |
           +-------------------------------------+
      
      . format mdate? %tm
      
      . l
      
           +-------------------------------------+
           | mdate_n   mdate_s   mdate1   mdate2 |
           |-------------------------------------|
        1. |  202301    202301   2023m1   2023m1 |
           +-------------------------------------+

      For more discussion see https://www.stata-journal.com/articl...article=dm0096
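
      Putting the pieces together, one way to get the line chart might look like the sketch below. It assumes YEAR_MONTH is numeric (for a string variable, use the monthly() version above instead) and that the variable of interest is called response, which is a placeholder name. The collapse is wrapped in preserve and restore, so the reduced dataset is only temporary.

      Code:
      // assumes numeric YEAR_MONTH like 202301; response is a hypothetical name
      gen mdate = ym(floor(YEAR_MONTH/100), mod(YEAR_MONTH, 100))
      format mdate %tm

      preserve
      // one observation per month, holding the five percentiles
      collapse (p10) p10=response (p25) p25=response (p50) p50=response ///
          (p75) p75=response (p90) p90=response, by(mdate)
      twoway line p10 p25 p50 p75 p90 mdate, sort xtitle("")
      restore

      After restore the original data are back in memory, with mdate added.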



      • #4
        Hi Nick and Hemanshu,

        Thank you for your help, it really helped me! Next time I'll take note of the FAQ to post more effectively.

        Best,
        Matt
