  • Graphing the frequency of a dummy variable

    Hello,

    I'm working with daily financial data and I have two binary variables (called events and news) that take the value 1 when there is an event or news item on a given day that may affect my country financially. As an initial summary, I would like to graph the percentage of days in each month on which there were news and events, but I'm not sure how to code this.

    I have tried:
    Code:
    gen month=month(date)
    gen yr=year(date)
    graph bar (mean) events news , over(month) over(yr) ytitle(proportion events news)
    But I'm not really sure if this is it, because it appears to me that this just calculates the average of both dummies for the month, and not the percentage of days in a month.

    Here's a sample of the data:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int date float(month yr events news)
    17244  3 2007 0 0
    17245  3 2007 0 0
    17246  3 2007 0 0
    17247  3 2007 0 0
    17248  3 2007 0 0
    17251  3 2007 0 0
    17252  3 2007 0 0
    17253  3 2007 0 0
    17254  3 2007 0 0
    17255  3 2007 0 0
    17258  4 2007 0 0
    17259  4 2007 0 0
    17260  4 2007 0 0
    17261  4 2007 0 0
    17262  4 2007 1 1
    17265  4 2007 1 1
    17266  4 2007 0 0
    17267  4 2007 0 0
    17268  4 2007 0 0
    17269  4 2007 1 0
    17272  4 2007 1 0
    17273  4 2007 1 0
    17274  4 2007 0 0
    17275  4 2007 0 0
    17276  4 2007 0 0
    17279  4 2007 0 0
    17280  4 2007 0 0
    17281  4 2007 0 0
    17282  4 2007 0 0
    17283  4 2007 0 0
    17286  4 2007 0 0
    17287  5 2007 1 1
    17288  5 2007 0 0
    17289  5 2007 0 0
    17290  5 2007 0 0
    17293  5 2007 0 0
    17294  5 2007 0 0
    17295  5 2007 1 1
    17296  5 2007 1 1
    17297  5 2007 1 1
    17300  5 2007 1 1
    17301  5 2007 0 0
    17302  5 2007 0 0
    17303  5 2007 0 0
    end
    format %tdnn/dd/CCYY date
    Thank you in advance.

  • #2
    Is this any good? I imagine it'll look better if you have more data. FYI, if you have more than three months, you'll need to adjust the "replace month = month+... if ..." lines.

    Code:
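    //Check whether the cleanplots scheme is installed, and install it if not
    //(findfile searches the adopath and sets a nonzero return code if the file is missing)
    capture findfile scheme-cleanplots.scheme
    if _rc != 0 {
        net install cleanplots, from("https://tdmize.github.io/data/cleanplots")
    }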
    
    preserve
    //Collapse data
    collapse events news, by(month)
    
    //Stack month and events into one variable
    stack month events  month news, into(month events_news) clear
    
    //Adjust month values for x-axis
    replace month = month+0.5 if _stack==2
    replace month = month+1 if month>=4
    replace month = month+1 if month>=6

    //Draw plot
    tw (bar events_news month if _stack==1, barw(0.45)) (bar events_news month if _stack==2, barw(0.45)), ///
        xlab(3.25 "Month 3" 5.25 "Month 4" 7.25 "Month 5") xsc(extend) xti("") ylab(,format(%9.1f)) ///
        yti(proportion of events/news in a month) leg(lab(1 "Events") lab(2 "News")) scheme(cleanplots)
    
    restore
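
A minimal sketch of one way to avoid hand-editing the "replace month = month+..." lines when there are more than three months. It assumes data already collapsed to one row per month; the proportions below are invented for illustration, and the names xpos and labels are hypothetical helpers, not part of the code in this post.

```stata
clear
input byte month float(events news)
3 0 0
4 .2 .1
5 .4 .4
6 .1 .3
end

* Stack as above: _stack==1 holds events, _stack==2 holds news
stack month events  month news, into(month events_news) clear

* Give each month a slot of width 2: events bar at 2*month, news bar at 2*month + 0.5
gen xpos = 2 * month + 0.5 * (_stack == 2)

* Build the axis labels, centred between the two bars of each month
levelsof month, local(months)
local labels
foreach m of local months {
    local pos = 2 * `m' + 0.25
    local labels `labels' `pos' `"Month `m'"'
}

twoway (bar events_news xpos if _stack == 1, barwidth(0.45)) ///
       (bar events_news xpos if _stack == 2, barwidth(0.45)), ///
    xlabel(`labels') xscale(extend) xtitle("") ///
    ytitle("proportion of events/news in a month") ///
    legend(label(1 "Events") label(2 "News"))
```

Because the bar positions are computed from month itself, the same lines should work for any set of months without further edits.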
    [Attached graph: Graph.png]

    Last edited by Chris Larkin; 10 Sep 2020, 17:56.


    • #3
      This works well, thank you! My only issue is that I have all 12 months per year and 7 years of data (2006 to 2013), and I would like to have the frequencies per month of the year. Any advice on how to do that?

      Also, just out of curiosity, is it possible to draw this as a two-way line graph instead of bars? If not, this works just fine as well. Thank you so much!


      • #4
        Both of those things are possible. To make this a line graph, change the word bar in the twoway command to line or connected, and then remove the barw() options.
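
A minimal sketch of the line version, assuming one row per month after the collapse step. The proportions are invented for illustration. With lines, both series can share the same x values, so the stacking and offset steps used for the side-by-side bars are not needed here.

```stata
clear
input byte month float(events news)
3 0 0
4 .25 .1
5 .3 .3
end

* connected draws markers joined by lines; use line instead for lines only
twoway (connected events month) (connected news month), ///
    xlabel(3 "Month 3" 4 "Month 4" 5 "Month 5") xtitle("") ///
    ylabel(, format(%9.1f)) ytitle("proportion of events/news in a month") ///
    legend(label(1 "Events") label(2 "News"))
```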

        If you can share more data, I can mock up something for the 12 months over 7 years. It wouldn't be much 'fun' to invent the data I'd need to play around with the code.
        Last edited by Chris Larkin; 11 Sep 2020, 05:50.


        • #5
          The mean of a (0, 1) variable is just the proportion of values that are 1. If you want to see results between 0 and 100, then either multiply by 100 or fix the axis labels.
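
A small check of this point on invented data (the eight observations and the variable names prop_events and pct_events are assumptions for illustration only):

```stata
clear
input byte(month events)
3 0
3 0
3 1
3 1
4 0
4 0
4 0
4 1
end

* Mean of a 0/1 variable within month = proportion of days with an event
egen prop_events = mean(events), by(month)

* The same quantity on a 0 to 100 scale
gen pct_events = 100 * prop_events

tabstat prop_events pct_events, by(month) statistics(mean)
```

Here month 3 has 2 event days out of 4, so the mean is 0.5 (50 percent); month 4 has 1 out of 4, so the mean is 0.25 (25 percent).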

          With 8 years (not 7) from 2006 to 2013 and 12 months in each, the code in #1 won't yield a very helpful graph, while the helpful answer in #2 is moving in a different direction.

          For up to 96 values a line plot is a more obvious alternative. Some basic technique can be seen from this self-contained example. Here I invent means and it's immaterial to the technique that they aren't proportions or percentages. All the technique here extends to plotting two variables against time.

          In your case I would recommend using egen to calculate the means directly in new variables.

          Code:
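          clear
          set obs 96
          * 8 years (2006 to 2013) of 12 months each
          gen year = 2005 + ceil(_n/12)
          bysort year : gen month = _n
          * monthly date for plotting against time
          gen mdate = ym(year, month)
          format mdate %tm

          * invented means with trend and seasonality
          gen mean = 1 + 2 * (year - 2000) + (month - 6.5)^2

          set scheme s1color

          * one series over all 96 months
          line mean mdate

          * one panel per month, plotted against year
          line mean year, by(month)

          * one panel per year, plotted against month
          line mean month, by(year) xla(1/12)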
          Some exotic plots for data that may show trend and seasonality were discussed in https://www.stata-journal.com/articl...article=gr0025
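
A minimal sketch of the egen route recommended above, applied to daily data shaped like that in #1. The dates and indicator values are invented, and the names pct_events, pct_news and tag are hypothetical. The daily data stay in memory, so no preserve and restore are needed.

```stata
clear
set obs 120
gen date = mdy(1, 1, 2007) + _n - 1
format date %td

* Invented indicators, for illustration only
gen byte events = mod(_n, 7) == 0
gen byte news = mod(_n, 5) == 0

* Monthly date from daily date
gen mdate = mofd(date)
format mdate %tm

* Percent of days in each month with an event or with news
egen pct_events = mean(100 * events), by(mdate)
egen pct_news = mean(100 * news), by(mdate)

* Tag one observation per month so each month is plotted once
egen tag = tag(mdate)

line pct_events pct_news mdate if tag, sort ///
    ytitle("% of days in month") xtitle("")
```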


          • #6
            See the concurrent thread https://www.statalist.org/forums/for...-99-categories for much more on getting means of an indicator variable as a proportion or percent.


            • #7
              Chris and Nick, thank you both so much for your help. The last link shared by Nick was exceptionally helpful.

              Chris: I'm not sure how much more data you were thinking about, but here is a bit over a year:

              Code:
              * Example generated by -dataex-. To install: ssc install dataex
              clear
              input int date float(yr month events news)
              17167 2007  1 0 0
              17168 2007  1 0 0
              17169 2007  1 0 0
              17170 2007  1 0 0
              17171 2007  1 0 0
              17174 2007  1 0 0
              17175 2007  1 0 0
              17176 2007  1 0 0
              17177 2007  1 0 0
              17178 2007  1 0 0
              17181 2007  1 0 0
              17182 2007  1 0 0
              17183 2007  1 0 0
              17184 2007  1 0 0
              17185 2007  1 0 0
              17188 2007  1 0 0
              17189 2007  1 0 0
              17190 2007  1 0 0
              17191 2007  1 0 0
              17192 2007  1 0 0
              17195 2007  1 0 0
              17196 2007  1 0 0
              17197 2007  1 0 0
              17198 2007  2 0 0
              17199 2007  2 0 0
              17202 2007  2 0 0
              17203 2007  2 0 0
              17204 2007  2 0 0
              17205 2007  2 0 0
              17206 2007  2 0 0
              17209 2007  2 0 0
              17210 2007  2 0 0
              17211 2007  2 0 0
              17212 2007  2 0 0
              17213 2007  2 0 0
              17216 2007  2 0 0
              17217 2007  2 0 0
              17218 2007  2 0 0
              17219 2007  2 0 0
              17220 2007  2 0 0
              17223 2007  2 0 0
              17224 2007  2 0 0
              17225 2007  2 0 0
              17226 2007  3 0 0
              17227 2007  3 0 0
              17230 2007  3 0 0
              17231 2007  3 0 0
              17232 2007  3 0 0
              17233 2007  3 0 0
              17234 2007  3 0 0
              17237 2007  3 0 0
              17238 2007  3 1 1
              17239 2007  3 0 0
              17240 2007  3 0 0
              17241 2007  3 0 0
              17244 2007  3 0 0
              17245 2007  3 0 0
              17246 2007  3 0 0
              17247 2007  3 0 0
              17248 2007  3 0 0
              17251 2007  3 0 0
              17252 2007  3 0 0
              17253 2007  3 0 0
              17254 2007  3 0 0
              17255 2007  3 0 0
              17258 2007  4 0 0
              17259 2007  4 0 0
              17260 2007  4 0 0
              17261 2007  4 0 0
              17262 2007  4 1 1
              17265 2007  4 1 1
              17266 2007  4 0 0
              17267 2007  4 0 0
              17268 2007  4 0 0
              17269 2007  4 1 0
              17272 2007  4 1 0
              17273 2007  4 1 0
              17274 2007  4 0 0
              17275 2007  4 0 0
              17276 2007  4 0 0
              17279 2007  4 0 0
              17280 2007  4 0 0
              17281 2007  4 0 0
              17282 2007  4 0 0
              17283 2007  4 0 0
              17286 2007  4 0 0
              17287 2007  5 1 1
              17288 2007  5 0 0
              17289 2007  5 0 0
              17290 2007  5 0 0
              17293 2007  5 0 0
              17294 2007  5 0 0
              17295 2007  5 1 1
              17296 2007  5 1 1
              17297 2007  5 1 1
              17300 2007  5 1 1
              17301 2007  5 0 0
              17302 2007  5 0 0
              17303 2007  5 0 0
              17304 2007  5 0 0
              17307 2007  5 0 0
              17308 2007  5 0 0
              17309 2007  5 0 1
              17310 2007  5 0 1
              17311 2007  5 0 1
              17314 2007  5 0 1
              17315 2007  5 0 1
              17316 2007  5 0 1
              17317 2007  5 0 1
              17318 2007  6 1 1
              17321 2007  6 1 1
              17322 2007  6 1 1
              17323 2007  6 1 1
              17324 2007  6 1 1
              17325 2007  6 1 1
              17328 2007  6 1 1
              17329 2007  6 1 1
              17330 2007  6 0 0
              17331 2007  6 0 0
              17332 2007  6 0 0
              17335 2007  6 0 0
              17336 2007  6 0 0
              17337 2007  6 0 0
              17338 2007  6 0 0
              17339 2007  6 0 0
              17342 2007  6 0 0
              17343 2007  6 0 0
              17344 2007  6 0 0
              17345 2007  6 0 0
              17346 2007  6 0 0
              17349 2007  7 0 0
              17350 2007  7 0 0
              17351 2007  7 0 0
              17352 2007  7 0 0
              17353 2007  7 0 0
              17356 2007  7 0 0
              17357 2007  7 0 0
              17358 2007  7 0 0
              17359 2007  7 0 0
              17360 2007  7 0 0
              17363 2007  7 0 0
              17364 2007  7 0 0
              17365 2007  7 0 0
              17366 2007  7 0 0
              17367 2007  7 0 0
              17370 2007  7 0 0
              17371 2007  7 0 0
              17372 2007  7 0 0
              17373 2007  7 0 0
              17374 2007  7 0 0
              17377 2007  7 0 0
              17378 2007  7 0 0
              17379 2007  8 0 0
              17380 2007  8 0 0
              17381 2007  8 0 0
              17384 2007  8 0 0
              17385 2007  8 0 0
              17386 2007  8 0 0
              17387 2007  8 0 0
              17388 2007  8 0 0
              17391 2007  8 0 0
              17392 2007  8 0 0
              17393 2007  8 0 0
              17394 2007  8 0 0
              17395 2007  8 0 0
              17398 2007  8 0 0
              17399 2007  8 0 0
              17400 2007  8 0 0
              17401 2007  8 0 0
              17402 2007  8 1 0
              17405 2007  8 1 0
              17406 2007  8 1 0
              17407 2007  8 0 0
              17408 2007  8 0 0
              17409 2007  8 0 0
              17412 2007  9 0 0
              17413 2007  9 0 0
              17414 2007  9 0 0
              17415 2007  9 0 0
              17416 2007  9 0 0
              17419 2007  9 0 0
              17420 2007  9 0 0
              17421 2007  9 0 0
              17422 2007  9 0 0
              17423 2007  9 0 0
              17426 2007  9 0 0
              17427 2007  9 0 0
              17428 2007  9 0 0
              17429 2007  9 0 0
              17430 2007  9 0 0
              17433 2007  9 0 0
              17434 2007  9 0 0
              17435 2007  9 0 0
              17436 2007  9 0 0
              17437 2007  9 0 0
              17440 2007 10 0 0
              17441 2007 10 0 0
              17442 2007 10 0 0
              17443 2007 10 0 0
              17444 2007 10 0 1
              17447 2007 10 0 1
              17448 2007 10 0 0
              17449 2007 10 0 0
              17450 2007 10 0 0
              17451 2007 10 0 0
              17454 2007 10 0 0
              17455 2007 10 0 0
              17456 2007 10 0 0
              17457 2007 10 0 0
              17458 2007 10 0 0
              17461 2007 10 0 0
              17462 2007 10 0 0
              17463 2007 10 0 0
              17464 2007 10 0 0
              17465 2007 10 0 0
              17468 2007 10 0 0
              17469 2007 10 0 0
              17470 2007 10 0 0
              17471 2007 11 0 0
              17472 2007 11 0 0
              17475 2007 11 0 0
              17476 2007 11 0 0
              17477 2007 11 0 0
              17478 2007 11 0 0
              17479 2007 11 0 0
              17482 2007 11 0 0
              17483 2007 11 0 0
              17484 2007 11 0 0
              17485 2007 11 0 0
              17486 2007 11 0 0
              17489 2007 11 0 0
              17490 2007 11 0 0
              17491 2007 11 0 0
              17492 2007 11 0 0
              17493 2007 11 0 0
              17496 2007 11 0 0
              17497 2007 11 0 0
              17498 2007 11 0 0
              17499 2007 11 0 0
              17500 2007 11 0 0
              17503 2007 12 0 0
              17504 2007 12 0 0
              17505 2007 12 1 0
              17506 2007 12 1 0
              17507 2007 12 1 0
              17510 2007 12 1 0
              17511 2007 12 0 0
              17512 2007 12 0 0
              17513 2007 12 0 0
              17514 2007 12 1 0
              17517 2007 12 1 0
              17518 2007 12 1 0
              17519 2007 12 1 0
              17520 2007 12 0 0
              17521 2007 12 1 1
              17524 2007 12 1 1
              17525 2007 12 1 1
              17526 2007 12 1 1
              17527 2007 12 1 1
              17528 2007 12 1 1
              17531 2007 12 0 0
              17532 2008  1 1 1
              17533 2008  1 0 0
              17534 2008  1 0 0
              17535 2008  1 0 0
              17538 2008  1 0 0
              17539 2008  1 0 0
              17540 2008  1 0 0
              17541 2008  1 0 0
              17542 2008  1 0 0
              17545 2008  1 0 0
              17546 2008  1 0 0
              17547 2008  1 0 0
              17548 2008  1 0 0
              17549 2008  1 0 0
              17552 2008  1 0 0
              17553 2008  1 0 0
              17554 2008  1 0 0
              17555 2008  1 0 0
              17556 2008  1 0 0
              17559 2008  1 0 0
              17560 2008  1 0 0
              17561 2008  1 0 0
              17562 2008  1 0 0
              17563 2008  2 0 0
              17566 2008  2 0 0
              17567 2008  2 0 0
              17568 2008  2 0 0
              17569 2008  2 0 0
              17570 2008  2 0 0
              17573 2008  2 0 0
              17574 2008  2 0 0
              17575 2008  2 0 0
              17576 2008  2 0 0
              17577 2008  2 0 0
              17580 2008  2 0 0
              17581 2008  2 0 0
              17582 2008  2 0 0
              17583 2008  2 0 0
              17584 2008  2 0 0
              17587 2008  2 0 0
              17588 2008  2 0 0
              17589 2008  2 0 0
              17590 2008  2 0 0
              17591 2008  2 0 0
              17594 2008  3 0 0
              17595 2008  3 0 0
              17596 2008  3 0 0
              17597 2008  3 0 0
              17598 2008  3 0 0
              17601 2008  3 0 0
              17602 2008  3 0 0
              17603 2008  3 0 0
              17604 2008  3 0 0
              17605 2008  3 0 0
              17608 2008  3 0 0
              17609 2008  3 0 0
              17610 2008  3 0 0
              17611 2008  3 0 0
              17612 2008  3 1 1
              17615 2008  3 1 1
              17616 2008  3 0 0
              17617 2008  3 0 0
              17618 2008  3 0 0
              17619 2008  3 0 0
              17622 2008  3 0 0
              17623 2008  4 0 1
              17624 2008  4 0 1
              17625 2008  4 0 1
              17626 2008  4 0 1
              17629 2008  4 0 1
              17630 2008  4 0 1
              17631 2008  4 0 1
              17632 2008  4 0 1
              17633 2008  4 0 1
              17636 2008  4 0 1
              17637 2008  4 1 1
              17638 2008  4 1 1
              17639 2008  4 1 1
              17640 2008  4 1 1
              17643 2008  4 1 1
              17644 2008  4 1 1
              17645 2008  4 1 1
              17646 2008  4 0 1
              17647 2008  4 0 1
              17650 2008  4 0 1
              17651 2008  4 0 0
              17652 2008  4 0 0
              17653 2008  5 1 1
              17654 2008  5 0 1
              17657 2008  5 0 0
              17658 2008  5 0 0
              17659 2008  5 0 0
              17660 2008  5 0 0
              17661 2008  5 0 0
              17664 2008  5 0 0
              17665 2008  5 0 0
              17666 2008  5 0 0
              17667 2008  5 0 0
              17668 2008  5 0 0
              17671 2008  5 0 0
              17672 2008  5 0 0
              17673 2008  5 0 0
              17674 2008  5 0 0
              17675 2008  5 0 0
              end
              format %tdnn/dd/CCYY date
              Any help you could give me with the code would be greatly appreciated.

              Thank you once again.


              • #8
                That's really helpful. How does the below work for you?

                Code:
                preserve
                //Collapse data
                collapse events news, by(yr month)
                //Generate a month-year variable
                gen mdate = ym(yr, month)
                format mdate %tm 
                
                //Plot data -- if you want a data point for each year-month combination
                tw (line events mdate) (line news mdate), ///
                    xlab(#16, angle(25)) xti("") ylab(,format(%9.1f)) ///
                    yti(proportion of events/news in each month) ///
                    leg(lab(1 "Events") lab(2 "News")) name(month_year, replace) scheme(cleanplots)
                
                restore
                
                preserve 
                //Collapse data
                collapse events news, by(month)
                
                //Plot data -- if you want a data point for each month independent of year
                tw (line events month) (line news month), ///
                    xti("") ylab(,format(%9.1f)) ///
                    yti(proportion of events/news in each month) ///
                    xlab(1 "Jan" 2 "Feb" 3 "Mar" 4 "Apr" 5 "May" 6 "Jun" 7 "Jul" 8 "Aug" 9 "Sep" 10 "Oct" 11 "Nov" 12 "Dec") ///
                    leg(lab(1 "Events") lab(2 "News")) name(month, replace) scheme(cleanplots)
                
                restore


                • #9
                  This works perfectly, thank you so much!!!
