  • Connect each bar in histogram graph to become line graph

    Dear Stata expert,

    I have a dataset of discrete prices for five products (each product has roughly 12 million observations), and I would like to
    (1) generate a histogram (percent) of the price of each product
    (2) convert each histogram to a line (that is, connect the centers of the tops of the bars)
    (3) overlay the lines by product type

    I know I can either:
    (1) generate and overlay the histograms first, then use the Graph Editor to change each histogram to a line; or
    (2) generate a frequency variable and use graph twoway line.

    However, is there code that changes a histogram directly into a line graph?

    Any help is greatly appreciated.

    CHT

  • #2
    Chen-Hao: I'm not sure I am fully understanding your question, but might something like this work (assume your discrete price variable is named dx):
    Code:
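    * dx is the discrete price variable; missing values of dx are left out of the counts
    sort dx
    by dx: egen cx = count(dx)
    quietly count if !missing(dx)
    gen px = cx/r(N)
    twoway (scatter px dx, connect(l) msymbol(i)) (dropline px dx, base(0) msymbol(i))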
    This is not very elegant. You would want to be sure you handled missing values in the calculations, and you would perhaps have to play around with axis label options in the twoway command.
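
    For several products at once, the same idea could be sketched by first reducing the data to one observation per product and price, which may also keep the graph manageable with millions of observations. This is only a sketch on the auto data: foreign stands in for the product variable and mpg for the price, and Total and Percent are names made up for the illustration.
    Code:
    sysuse auto, clear
    * one observation per group and value, holding its frequency
    contract foreign mpg, freq(Frequency)
    * percent within each group
    bysort foreign: egen Total = total(Frequency)
    gen Percent = 100 * Frequency / Total
    twoway (line Percent mpg if foreign == 0, sort) ///
           (line Percent mpg if foreign == 1, sort), ///
           legend(order(1 "Domestic" 2 "Foreign")) ytitle("Percent")
    The /// line continuations assume the commands are run from a do-file. Values that never occur are not filled in with zeros here, so each line simply skips over them.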


    • #3
      I guess what is wanted is what is often called a frequency polygon. My good news is that the option -recast(line)- may be enough. My bad news is that this won't be smart on your behalf about empty bins inside the histogram. But both are guesses based on the last time I tried it.


      • #4
        If your goal is to show the distribution of prices for all products in one graph, have you considered histogram, by()?
        Code:
        histogram price, by(product)


        • #5
          For examples of both behaviours mentioned in #3 see


          Code:
          sysuse auto, clear
          twoway histogram mpg, discrete  || histogram mpg, discrete recast(line) lc(red)
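
          The same option might be used to overlay one line per group, which is closer to the original question. The following is a sketch only, with foreign in the auto data standing in for the product variable; the colors and legend labels are arbitrary choices for the illustration.
          Code:
          sysuse auto, clear
          * percent is computed separately within each subset selected by if
          twoway (histogram mpg if foreign == 0, discrete percent recast(line) lcolor(blue)) ///
                 (histogram mpg if foreign == 1, discrete percent recast(line) lcolor(red)), ///
                 legend(order(1 "Domestic" 2 "Foreign"))
          The /// line continuations assume a do-file. The caveat about empty bins in #3 applies to each line.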


          • #6
            Nick Cox Can someone do what you describe in #3 with a bar graph instead of a histogram?


            • #7
              Belinda Foster

              graph bar won't make it easier at all. You can't recast graph bar results as line graphs.

              twoway bar isn't different in this respect, so far as I can see.

              I am not fond of frequency polygons: the bins are harder to infer than on histograms, and if the interest is in density as continuously varying, then why not kernel density estimation? But if anyone is serious that they want one, then they have to fill in bins with zero explicitly.

              Here's one method. If the bins aren't successive integers, then I'd map them to successive integers.

              Code:
              sysuse auto, clear
              
              * observed values 
              contract mpg, freq(Frequency) 
              su mpg, meanonly 
              local max = r(max)
              local min = r(min) 
              levelsof mpg, local(levels)
              
              * add extra bins with zero frequency 
              set obs `= `max' - `min' + 1'
              numlist "`min'/`max'"
              local numlist `r(numlist)'
              local needed : list numlist - levels
              local added : word count `needed' 
                       
              tokenize "`needed'"
              
              quietly forval i = 1/`added' {
                 local which = _N - `i' + 1
                 replace mpg = ``i'' in `which'
              }

              replace Frequency = 0 if missing(Frequency)
              line Frequency mpg, sort xtic(`min'/`max')


              • #8
                Thanks Nick, I was just curious as people seem to ask about it. I agree, kernel density estimation is easier to implement.
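
                For comparison, overlaid kernel density estimates could look something like the sketch below. It uses the auto data with foreign standing in for a grouping variable, and it relies on the default kernel and bandwidth, which may not suit every dataset.
                Code:
                sysuse auto, clear
                * one density curve per group; the y axis is density, not percent
                twoway (kdensity mpg if foreign == 0) ///
                       (kdensity mpg if foreign == 1), ///
                       legend(order(1 "Domestic" 2 "Foreign")) ytitle("Density")
                Note that this treats the variable as continuous, so it smooths over the discrete values rather than showing the share of observations at each one.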


                • #9
                  @ John and @ Nick

                  Thank you for your kind help. I apologize for not describing my question more clearly.
                  What I want to achieve is:

                  Step 1:
                  I use the code:
                  twoway histogram price, discrete width(0.25) percent
                  to get the following graph.

                  [Image: histogram.png]



                  Step 2: I use Stata's Graph Editor to convert the bars to a line, as below.

                  [Image: step2.png]

                  [Image: line.png]


                  I understand that John's suggestion is to calculate the percentage of observations at each price and then make a twoway graph of this new variable.
                  I am wondering whether there is code that achieves the same result as the Graph Editor steps shown above (I understand that kdensity can give me something very similar).

                  Thanks again for all suggestions and help.


                  • #10
                    Originally posted by Chen-Hao Tsai
                    I am wondering if there is code that I can use to achieve the same thing of using graph editor as I showed above
                    You can use gr_edit, a command that is not documented. Let's start with a histogram.
                    Code:
                    sysuse auto
                    twoway histogram mpg, discrete percent
                    Start the Graph Editor, click the "Start recording" icon and change the plot type to "Line". Click the "End recording" icon and save the recording on your PC. Open the saved grec file in a text editor. It contains this line:
                    Code:
                    .plotregion1.plot1._set_type line
                    This modification of the graph can be integrated in a do-file with the gr_edit command, followed by the line above.
                    Code:
                    gr_edit .plotregion1.plot1._set_type line
                    You can now draw a histogram and convert it to a line graph with these commands, without using the Graph Editor:
                    Code:
                    sysuse auto
                    twoway histogram mpg, discrete percent
                    gr_edit .plotregion1.plot1._set_type line


                    • #11
                      @ Friedrich et al

                      Thank you for your time and kind help.

                      A follow-up question:

                      Can I use all observations to generate the histogram but display it only for prices below $50?
                      It's like cutting the long right tail shorter in the above graph.

                      CHT
                      Last edited by Chen-Hao Tsai; 23 May 2017, 10:25.
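
                      One possible approach to this follow-up is to compute the percentages from all observations first and only then restrict what is drawn. The sketch below uses the auto data, with mpg standing in for price and 30 standing in for the $50 cutoff; Frequency and Percent are names chosen for the illustration. Restricting the histogram itself with an if qualifier would instead recompute the percentages within the subset.
                      Code:
                      sysuse auto, clear
                      * percentages are based on every observation
                      contract mpg, freq(Frequency) percent(Percent)
                      * only the part of the distribution below the cutoff is drawn
                      line Percent mpg if mpg < 30, sort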


                      • #12
                        @ Friedrich

                        Sorry to bother you with another question.

                        Where does Stata save the grec file? I carefully recorded each step in the Graph Editor but cannot find the saved grec file.

                        Thanks again for your help.

                        CHT


                        • #13
                          Originally posted by Friedrich Huebler
                          Click the "End recording" icon and save the recording on your PC.
                          When you come to this step you can specify where the grec file is saved.
