  • Density plot: unreadable ("illisible") chart, how to zoom to get a nicer chart?

    Hi everyone,

    I have created an unreadable ("illisible") chart, in which the density curves are superimposed so that you can't really read them.

    How could I "zoom in" (for example, restrict the x scale to 0 to 500,000) to see the density curves better?

    I've tried -xscale- but it doesn't seem to work.

    Also, I have another question: is it possible to automate the legend labels, to avoid the tedious manual approach I use now, where I type every label myself?

    Below is the code I used and the resulting graph. The code is inspired by the fantastic blog created by Fahad Mirza "Top 25 Stata Visualizations — With Full Code" on Medium.



    Code:
    encode tariff_ekon_id, gen(tariff_ekon_id_)
    
     levelsof tariff_ekon_id_, local(tariff_type1)
     foreach typeof_tariff of local tariff_type1 {
      
      quietly summarize power_p1
      local kden_t "`kden_t' (kdensity power_p1 if tariff_ekon_id_ == `typeof_tariff', range(`r(min)' `r(max)') recast(area) fcolor(%50) lwidth(*0.25))"
        
     }
     
    
     
     twoway `kden_t',  scheme(white_tableau) ///
         legend(subtitle("Contract Types:", size(2)) label(1 "20A") label(2 "20DHA") label(3 "20DHS") label(4 "20TD") label(5 "21A") label(6 "21DHA") label(7 "21DHS") label(8 "30") ///
         label(9 "30TD") label(10 "31") label(11 "61A") label(12 "61TD") label(13 "62TD") label(14 "63TD") label(15 "64TD") rowgap(0.25) size(2)) ///
         title("{bf}Density Plot", pos(11) size(2.75)) ///
         ytitle("Density", size(2) orient(horizontal)) ///
         ylabel(, nogrid labsize(2)) ///
         xlabel(0(50000)900000, labsize(tiny) alternate nogrid format(%9.0fc)) ///
         xtitle("Contracted Powers", size(2)) ///
         subtitle("Tariff Types, 1{sup:st} Period", pos(11) size(2))
    
        
    graph export "../figures/distr_tariff_types_kdens.png", replace
    graph export "../figures/distr_tariff_types_kdens.pdf", replace
    [Figure: distr_tariff_types_kdens.png, the density plot produced by the code above]
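On the second question (automating the legend labels), one possible approach is to read each value label inside the loop and build the label() suboptions alongside the plots. This is only a sketch under stated assumptions: the auto dataset stands in for the real data, with foreign playing the role of tariff_ekon_id_ and price the role of power_p1, and the macro names plots, labels, lo, and hi are made up for illustration.

```stata
sysuse auto, clear

* Common x range for all densities (assumption: price stands in for power_p1)
quietly summarize price
local lo = r(min)
local hi = r(max)

local plots
local labels
local i = 0

* foreign stands in for the encoded tariff variable
levelsof foreign, local(levels)
foreach l of local levels {
    local ++i
    * Look up the value label attached to this level
    local txt : label (foreign) `l'
    local plots `plots' (kdensity price if foreign == `l', range(`lo' `hi') recast(area) fcolor(%50) lwidth(*0.25))
    local labels `labels' label(`i' "`txt'")
}

twoway `plots', legend(`labels' subtitle("Car type:"))
```

Because the labels are taken from the value labels created by encode, adding or dropping a category should not require retyping the legend.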



    Thanks in advance for your help.
    Best,

    Michael
    Last edited by Michael Duarte Goncalves; 22 Sep 2023, 04:44.

  • #2
    There is a bundle of questions here on different levels. Your graph isn't based on a dataset you give or reference, so that limits mightily what can be said.

    The big question seems to be why is your main plot unreadable (= illisible), meaning not readable in a way that is helpful for analysis. The main answer I have is that you have 15 different groups that don't obviously differ that much. Perhaps some are quite rare -- but that's not clear either. It is unusual at the best of times that you can superimpose 15 different groups -- whether as points, lines or areas -- and then make easy and effective comparisons between those groups. Sometimes separate graphs for each group can help.
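A minimal sketch of the separate-graphs idea, using the by() option to draw one panel per group. The auto dataset is an assumption here, standing in for the tariff data, which is not available in this thread.

```stata
sysuse auto, clear

* One panel per group instead of superimposed curves
twoway kdensity price, by(foreign, note("")) ///
    xtitle("Price (USD)") ytitle("Density")
```

With many groups the panels become small, so this is likely to work best after rare categories have been dropped or combined.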

    I don't know what "contracted powers" are. Perhaps they must be shown as you do -- perhaps a transformation would make sense.

    Alternatively, or additionally, it can make sense to show density on a log scale. That way you give up on the area interpretation of probability density, but it can be a useful thing to do.

    I don't know what you expect xscale() to do and you don't show what you tried. Assuming that you specified the range() suboption, what it does do and does not do are documented clearly at
    Code:
    help axis scale options
    range() never narrows the scale of an axis or causes data to be omitted from the plot.
    So xscale(range()) can be used to extend the x axis: nothing more, nothing less. People somehow, sometimes, expect that it is a way to subset the data and provide a zoom on part of the graph, but that is pure fantasy.

    What you could do is calculate the densities as new variables and then show part of the curves.
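As a rough sketch of that suggestion, the densities can be stored with the generate() option of kdensity and only part of each curve plotted with an if qualifier. The auto dataset, the variable names x, fx, fx0, and fx1, and the cutoff of 3,000 are all stand-ins chosen for illustration, not the data from this thread.

```stata
sysuse auto, clear

* Common grid of evaluation points from the pooled data
kdensity weight, nograph generate(x fx)

* Densities for each group, evaluated on that grid
kdensity weight if foreign == 0, nograph generate(fx0) at(x)
kdensity weight if foreign == 1, nograph generate(fx1) at(x)
label variable fx0 "Domestic"
label variable fx1 "Foreign"

* xscale(range()) could only widen the axis; the if qualifier
* is what restricts the plot to part of the curves
line fx0 fx1 x if x <= 3000, sort ytitle("Density") xtitle("Weight (lbs.)")
```

The densities are still estimated from all the data in each group; only the display is restricted.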

    I am a moderate fan of kernel density estimation, but its ability to make highly skewed distributions clearer is strongly limited.

    • #3
      Thanks Nick Cox for the helpful tips. Just out of curiosity and to extend the original question, what if the data contains a lot of zero values?

      Also, you mentioned that you are not a fan of kernel density in this scenario. If I may ask, what would be the best method, in your opinion, for this kind of skewed data? I was thinking of a bean plot, maybe, but I would love to hear from you on this.

      Thank you.

      • #4
        Michael Duarte Goncalves, thank you for the citation; it's good to hear the post was useful.

        • #5
          Fahad Mirza

          If the data contain zeros, the only thing ruled out is straight logarithmic transformation. But for visualization log(whatever + 1) might still be helpful.
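A small illustration of the point about zeros, using made-up values rather than the data from this thread: log() returns missing at zero, while log1p() is defined there. The variable names are hypothetical, and log1p() assumes a reasonably recent Stata (it was added in Stata 16).

```stata
clear
input x
0
1
10
100
end

generate logx = log(x)       // missing when x == 0
generate log1px = log1p(x)   // log(1 + x), equal to 0 when x == 0
list
```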

          Sorry, but I have no confident ideas on what will work best because I can't see the data and I don't have any information on what makes sense for "contracted powers".

          But my wild guess would be to try a quantile plot with possibly a front-and-back design. I can't see that bean plots would work any better.

          • #6
            Hi Michael,
            One option is to use joy_plot, a command I wrote. To install it:
            net install joy_plot, from(https://friosavila.github.io/stpackages)

            For example:

            Code:
            ssc install frause
            net install fra_tools, from(https://friosavila.github.io/stpackages)
            
            frause oaxaca, clear
            vjoin married single divorced, name(mstatus)
            joy_plot lnwage, over(mstatus) color(%50) alegend notext gap0
            
            ** or with range
            
            joy_plot lnwage, over(mstatus) color(%50) alegend notext gap0 range(2 5)
            HTH
            Fernando

            • #7
              Hi everyone,

              Nick Cox: Thank you for your feedback, and I apologise for the lack of detail. Yes, I had used xscale() with the range() suboption, as follows: xscale(r(0(50000)900000)).

              You have corrected one of my beliefs: I had thought that -xscale- could narrow the scale.

              Fahad Mirza: My pleasure. In fact, I use your Stata visualization tips a lot. Thanks for providing them to everyone. I hope there will be more before long.

              FernandoRios: Thank you very much for your command. I will try it.

              Thank you all for your help.

              Michael

              • #8
                Nick Cox:

                Here is a data example. Sorry, I forgot to include -dataex- output earlier:

                Code:
                * Example generated by -dataex-. For more info, type help dataex
                clear
                input float id_numerical long(tariff_ekon_id_ power_p1)
                  12089  4   0
                 547007 15   0
                 463980  4   0
                 167391  1   0
                 467938 15   0
                 727027  1   0
                 332979  4   0
                 573193 15   0
                 196748  4   0
                 369693  4   1
                1192145 12   1
                   6117 12   1
                 347416 12   1
                 976872 12   1
                1076215 12   1
                  88765 12   1
                 878192 11   1
                1095955 12   1
                 261637 12   1
                1008453 12   1
                 846853 12   1
                1064514  4   1
                 922230 12   1
                1236813 12   1
                1085856 12   1
                  66665  4   1
                 803598 11   1
                 542584 12   1
                 151802 12   1
                 853336 12   1
                 630905  4   1
                 223320  4   1
                 373646 11   1
                1244020  4   1
                 870891 12   1
                 775905  4   1
                 870793  4   1
                 304150 12   1
                 655872 12   1
                 527149  4   1
                 363069 12   1
                 410882 12   1
                 595716  4   1
                 604293 12   1
                 785459  4   1
                 131820  4   1
                 602225  4   2
                 754578  4   3
                 313726  4   3
                 832257  4   3
                 924145  4   3
                1074655  4   3
                  88766  4   3
                  22159  4   3
                1093714  4   4
                 231538  4   4
                1070575  4   4
                   6946  4   4
                 816495  4   4
                 317723  4   4
                 694927  4   4
                 787832  4   5
                 790012  4   5
                 688670 12  10
                  71616 12  10
                 402604 12  10
                 541007 12  10
                 623626 12  10
                 216999 12  10
                 637690 12  10
                1201430 12  10
                  12994 11  10
                 906167 12  10
                1034942 12  10
                 636757 12  10
                 660741 12  10
                1198526 12  10
                 527614 12  10
                 564077 12  10
                  41368 12  10
                 917189 12  10
                 458250 12  10
                 474135 11  10
                 953820 12  10
                 681857  4  14
                1146181 12  15
                1018475 12  15
                 631736 12  50
                 431043  4 100
                 224655  4 100
                 125897  4 100
                 461207  4 100
                 164922  4 100
                  80483  4 100
                1130413  4 100
                 359671  4 100
                 717726  4 100
                  30644  4 100
                 832842  4 100
                 306203  4 100
                end
                label values tariff_ekon_id_ tariff_ekon_id_
                label def tariff_ekon_id_ 1 "20A", modify
                label def tariff_ekon_id_ 4 "20TD", modify
                label def tariff_ekon_id_ 11 "30", modify
                label def tariff_ekon_id_ 12 "30TD", modify
                label def tariff_ekon_id_ 15 "61TD", modify
                Thank you.
                Michael
                Last edited by Michael Duarte Goncalves; 22 Sep 2023, 07:29.

                • #9
                  Thanks for the data example, but I don’t understand its relation to your original question. What corresponds to contracted powers?

                  • #10
                    Contracted Powers are basically the power in kwh of household electricity during period 1 (basically a time slot during the day). Above, this is represented by -power_p1-.

                    My question was whether it was possible to obtain a graph where the values and patterns of the various electricity tariffs could be read more clearly.
                    My initial idea was to use -graph combine-. However, using -graph combine- with 20 different tariffs to represent contracted power in period 1 is not very useful I imagine.

                    And it would get even more confusing than my graph above, right? Sorry, I'm still inexperienced in the Stata world.
                    Last edited by Michael Duarte Goncalves; 22 Sep 2023, 09:08.

                    • #11
                      Thanks for the detail. Strictly, your units should be denoted by kWh -- noting that W stands for watt, itself named for the engineer Watt -- and those are units of energy, not power. But your data example only includes values up to 100 whereas if I understand correctly the values go up to about 900,000.

                      As you have values of zero, a pragmatic choice of scale might be log(energy + 1).

                      Here I use qplot and mylabels from the Stata Journal.

                      You need to work on the y axis labels for the full dataset. You might need to omit some of the less common tariffs.

                      Code:
                      * Example generated by -dataex-. For more info, type help dataex
                      clear
                      input float id_numerical long(tariff_ekon_id_ power_p1)
                        12089  4   0
                       547007 15   0
                       463980  4   0
                       167391  1   0
                       467938 15   0
                       727027  1   0
                       332979  4   0
                       573193 15   0
                       196748  4   0
                       369693  4   1
                      1192145 12   1
                         6117 12   1
                       347416 12   1
                       976872 12   1
                      1076215 12   1
                        88765 12   1
                       878192 11   1
                      1095955 12   1
                       261637 12   1
                      1008453 12   1
                       846853 12   1
                      1064514  4   1
                       922230 12   1
                      1236813 12   1
                      1085856 12   1
                        66665  4   1
                       803598 11   1
                       542584 12   1
                       151802 12   1
                       853336 12   1
                       630905  4   1
                       223320  4   1
                       373646 11   1
                      1244020  4   1
                       870891 12   1
                       775905  4   1
                       870793  4   1
                       304150 12   1
                       655872 12   1
                       527149  4   1
                       363069 12   1
                       410882 12   1
                       595716  4   1
                       604293 12   1
                       785459  4   1
                       131820  4   1
                       602225  4   2
                       754578  4   3
                       313726  4   3
                       832257  4   3
                       924145  4   3
                      1074655  4   3
                        88766  4   3
                        22159  4   3
                      1093714  4   4
                       231538  4   4
                      1070575  4   4
                         6946  4   4
                       816495  4   4
                       317723  4   4
                       694927  4   4
                       787832  4   5
                       790012  4   5
                       688670 12  10
                        71616 12  10
                       402604 12  10
                       541007 12  10
                       623626 12  10
                       216999 12  10
                       637690 12  10
                      1201430 12  10
                        12994 11  10
                       906167 12  10
                      1034942 12  10
                       636757 12  10
                       660741 12  10
                      1198526 12  10
                       527614 12  10
                       564077 12  10
                        41368 12  10
                       917189 12  10
                       458250 12  10
                       474135 11  10
                       953820 12  10
                       681857  4  14
                      1146181 12  15
                      1018475 12  15
                       631736 12  50
                       431043  4 100
                       224655  4 100
                       125897  4 100
                       461207  4 100
                       164922  4 100
                        80483  4 100
                      1130413  4 100
                       359671  4 100
                       717726  4 100
                        30644  4 100
                       832842  4 100
                       306203  4 100
                      end
                      label values tariff_ekon_id_ tariff_ekon_id_
                      label def tariff_ekon_id_ 1 "20A", modify
                      label def tariff_ekon_id_ 4 "20TD", modify
                      label def tariff_ekon_id_ 11 "30", modify
                      label def tariff_ekon_id_ 12 "30TD", modify
                      label def tariff_ekon_id_ 15 "61TD", modify
                      
                      gen toshow = log1p(power_p1)
                      label var toshow "energy (kWh; log1p scale)"
                      
                      mylabels 0 1 5 10 50 100, myscale(log1p(@)) local(yla)
                      
                      qplot toshow, by(tariff_ekon_id, note("")) yla(`yla') xla(0(0.25)1)
                      [Figure: energy_qplot.png, the quantile plots by tariff produced by the code above]

                      • #12
                        Hi Nick Cox,

                        It works very well! Beautiful, thanks.

                        Best,
                        Michael
