  • Export data in histogram

    I am new to Stata (I am using Stata/IC 12.1 for Mac, 64-bit Intel) and have succeeded in producing a histogram. For the record, I am using
    Code:
    histogram residual, bin(400) percent normal
    It produces a nice figure, but I need the raw data because they have to be put into a LaTeX file (and the figure reconstructed using tikz). I did not find a way to export the data behind the graph (I do not need the normal curve, but it would be nice to have). Could anybody help me with this?

  • #2
    PS: I need the raw numbers, not a graphical format such as ps, eps, tiff, etc.



    • #3
      There are two ways to do this.

      One is to use standard Stata techniques to recreate the binning done to create the histogram, and the x-y pairs for the normal curve.
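
      A minimal sketch of this first approach, assuming the auto data shipped with Stata and 10 bins. The variable names bin, pct, center, x and y and the file names are made up for illustration, empty bins would be dropped by contract, and values that fall exactly on a bin edge may be assigned differently from histogram because of floating-point rounding. outsheet is used because export delimited only arrived in Stata 13. Run it as a do-file so the local macros survive from line to line.
      Code:
      sysuse auto, clear
      * statistics needed for both the bars and the normal curve
      summarize price
      local start = r(min)
      local stop  = r(max)
      local mean  = r(mean)
      local sd    = r(sd)
      local width = (`stop' - `start') / 10
      * bin index 0..9; the maximum is folded into the last bin
      generate bin = min(floor((price - `start') / `width'), 9)
      * one row per non-empty bin with its percentage of observations
      contract bin, percent(pct)
      generate center = `start' + (bin + 0.5) * `width'
      outsheet center pct using bars_manual.csv, comma replace
      * normal density on a 300-point grid, rescaled to the percent axis
      clear
      set obs 300
      generate x = `start' + (_n - 1) * (`stop' - `start') / 299
      generate y = 100 * `width' * normalden(x, `mean', `sd')
      outsheet x y using normal_manual.csv, comma replace
      The factor 100 * width converts the density into the same percent-per-bin scale as the bars.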

      The other is to use a relatively obscure Stata programming feature called (for reasons I haven't yet learned) serset. It is documented in the output of help serset and in the Stata Programming Reference PDF linked from that help. Perhaps this example will point you in a helpful direction.
      Code:
      sysuse auto, clear
      histogram price, bin(10) percent normal
      serset dir
      serset set 0
      serset use, clear
      ds
      rename (__000000 __000002) (percent center)
      save bars, replace
      list, clean
      serset set 1
      serset use, clear
      ds
      rename (__000003 __000004) (y x)
      list in 1/5, clean
      list in 296/300, clean
      save normal, replace
      Code:
      . sysuse auto, clear
      (1978 automobile data)
      
      . histogram price, bin(10) percent normal
      (bin=10, start=3291, width=1261.5)
      
      . serset dir
      
        0.  11 observations on 3 variables
            __00000A __00000B __000009
      
        1.  300 observations on 2 variables
            __00000F __00000E
      
      . serset set 0
      
      . serset use, clear
      
      . ds
      __000000  __000001  __000002
      
      . rename (__000000 __000002) (percent center)
      
      . save bars, replace
      file bars.dta saved
      
      . list, clean
      
             percent   __000001     center  
        1.     37.84          0    3,921.8  
        2.     28.38          0    5,183.3  
        3.     12.16          0    6,444.8  
        4.     4.054          0    7,706.3  
        5.     1.351          0    8,967.8  
        6.     5.405          0     10,229  
        7.     4.054          0     11,491  
        8.     1.351          0     12,752  
        9.     4.054          0     14,014  
       10.     1.351          0     15,275  
       11.         .          0      3,291  
      
      . serset set 1
      
      . serset use, clear
      
      . ds
      __000003  __000004
      
      . rename (__000003 __000004) (y x)
      
      . list in 1/5, clean
      
                     y           x  
        1.   10.613032        3291  
        2.   10.760907   3333.1906  
        3.    10.90861   3375.3813  
        4.   11.056078   3417.5719  
        5.   11.203246   3459.7625  
      
      . list in 296/300, clean
      
                     y           x  
      296.   .08811901   15737.237  
      297.   .08411325   15779.428  
      298.   .08027316   15821.619  
      299.   .07659272   15863.809  
      300.   .07306606       15906  
      
      . save normal, replace
      file normal.dta saved
      
      .
      Note that sersets, like local macros, vanish when the do-file within which they were created comes to an end.
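
      Because the example saves both sersets as bars.dta and normal.dta, they can be turned into plain text for LaTeX afterwards. A sketch, assuming those two files exist in the working directory; the .csv file names are hypothetical, and outsheet is used because it is available in Stata 12.
      Code:
      use bars, clear
      * row 11 in the listing holds only the start value, with percent missing
      drop if missing(percent)
      outsheet center percent using bars.csv, comma replace
      use normal, clear
      outsheet x y using normal.csv, comma replace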

      And if you noted that the variable names given by serset dir do not match the variable names created when the serset is used, you get extra credit. I don't know why that is, but I haven't spent any effort trying to figure out why, either. I will admit to building this example one or two lines at a time, hacking at it until it worked the way I needed. Maybe I need to spend some time with that PDF I recommended.
      Last edited by William Lisowski; 03 Oct 2021, 13:47.



      • #4
        Another way to proceed is via

        Code:
        help twoway__histogram_gen
        Watch the underscores carefully: that's twoway underscore underscore histogram underscore gen

        As said, see the help and/or https://www.stata-journal.com/articl...article=gr0014
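
        A minimal sketch of how this might be used, assuming the auto data and 10 bins, and assuming that the percent option is accepted here as it is by histogram (check the help file for your version). The names pct and center and the file name are made up for illustration.
        Code:
        sysuse auto, clear
        * pct receives the bar heights, center the bin midpoints
        twoway__histogram_gen price, bin(10) percent generate(pct center)
        * only the first 10 observations are filled in; the rest are missing
        list pct center if !missing(pct), clean
        outsheet center pct using bars_gen.csv if !missing(pct), comma replace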



        • #5
          Originally posted by William Lisowski
          Stata programming feature called (for reasons I haven't yet learned) serset
          Excuse me for being a foreigner, but I lack the appropriate feeling for the English language. Is serset an abbreviation? An artificial word? How does it "sound" to someone who grew up speaking English?



          • #6
            see
            Code:
            help serset
            re: pronunciation - I am unaware of any official way to say this - I believe that this is a word made up by StataCorp



            • #7
              I think you could pretend you’re saying series and then change your mind halfway through and make it serset. As Rich says, it’s a Stata neologism like varlist or varname.



              • #8
                We're all foreigners when it comes to speaking Stata. I don't know anyone who would enter "Stata" as the answer to "what was your first spoken language" on a survey.

                I must admit to having no idea what the "word" serset means. The documentation explains a serset as being like a dataset but used, primarily, for the creation of Stata graphics. With that in mind, I will hazard a guess that it may be an artificial word constructed as a shortening of serviceset. And based on that, what I hear in my head when I read it is "sir set".

                But none of the above is authoritative, and nobody should pay attention to the voices in my head.

                If Nick Cox or one of the Stata developers who occasionally contribute here cannot explain serset more thoroughly, with a historical background and intended pronunciation, then I'm not sure there is an answer.

                Added in edit: crossed with Nick's post. The implication for pronunciation is that if I thought about it as seriesset then the voice in my head would sound like "sear set".
                Last edited by William Lisowski; 22 Oct 2021, 13:47.



                • #9
                  The pronunciation sear set matches what I remember from talks I heard from Stata developers.

                  I remember advising against transmorphic on the grounds that metamorphic already existed, but the good folks at StataCorp went ahead anyway. Hyper-purists who get queasy at Latin-Greek or Greek-Latin hybrids never got over television.



