
  • stata histogram with weight (NOT integer)

    Hello Statalist colleagues,

    I am trying to draw histograms with weights, but my weight variables are decimals, not integers. So I don't think these are frequency weights (integers).

    Q1. Could you please let me know how I can draw histograms in Stata with these decimal weights?

    Following is the sample data I have, and the code I use.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str1 industry double(group total frac2 year weight)
    "D" 1 1                  0 2013 .024959253996062778
    "F" 1 1                  0 2013   3.193310095417693
    "D" 1 2                  0 2013 .024959253996062778
    "C" 1 4                  0 2013  1.9570856152388283
    "C" 1 4                  0 2013  1.9570856152388283
    "F" 1 5                  0 2013   3.193310095417693
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "D" 1 5                  0 2013 .024959253996062778
    "D" 1 5                  0 2013 .024959253996062778
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "A" 1 5                  0 2013  .05989453080455009
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "B" 1 5                  0 2013   .0930922210972081
    "C" 1 5 .20227818884757595 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "F" 1 5                  0 2013   3.193310095417693
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "A" 1 5                  0 2013  .05989453080455009
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "F" 1 5  .6000000000000001 2013   3.193310095417693
    "C" 1 5 .18620077946143354 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "F" 1 5                  0 2013   3.193310095417693
    "C" 1 5 .20816900214545864 2013  1.9570856152388283
    "F" 1 5                  0 2013   3.193310095417693
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5 .19533739945932332 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "F" 1 5                  0 2013   3.193310095417693
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5  .7757481281975991 2013  1.9570856152388283
    "D" 1 5                  0 2013 .024959253996062778
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "D" 1 5                  0 2013 .024959253996062778
    "A" 1 5                  0 2013  .05989453080455009
    "C" 1 5                  0 2013  1.9570856152388283
    "F" 1 5                  0 2013   3.193310095417693
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "B" 1 5                  0 2013   .0930922210972081
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "D" 1 5                  0 2013 .024959253996062778
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "A" 1 5  .2184818581010063 2013  .05989453080455009
    "E" 1 5                  0 2013   .8025529376696152
    "F" 1 5                  0 2013   3.193310095417693
    "C" 1 5                  0 2013  1.9570856152388283
    "F" 1 5                  0 2013   3.193310095417693
    "F" 1 5                  0 2013   3.193310095417693
    "C" 1 5                  0 2013  1.9570856152388283
    "F" 1 5                  0 2013   3.193310095417693
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "C" 1 5                  0 2013  1.9570856152388283
    "A" 1 5                  0 2013  .05989453080455009
    "F" 1 5                  0 2013   3.193310095417693
    "B" 1 5                  0 2013   .0930922210972081
    "C" 1 5                  0 2013  1.9570856152388283
    "F" 1 5                  0 2013   3.193310095417693
    "C" 1 5                  0 2013  1.9570856152388283
    "F" 1 5                  0 2013   3.193310095417693
    "C" 1 5                  0 2013  1.9570856152388283
    end

    Code I use is as follows.

    Q2. It would also be helpful to know how I can do the repeated drawing with a 'forvalues' loop. I don't want to cram all the histograms into one compact graph, so I am manually dividing the industries using 'keep if', but I was wondering whether this can be done more easily with loops.

    Code:
    use mw_exposure.dta, clear
    
    keep if year == 2013
    keep if industry == "A" | industry == "B" | industry == "C" | industry == "D" | industry == "E" | industry == "F"
    histogram frac2, xtitle(frac2) by(year industry)
    
    use mw_exposure.dta, clear
    
    keep if year == 2013
    keep if industry == "G" | industry == "H" | industry == "I" | industry == "J" | industry == "K" | industry == "L"
    histogram frac2, xtitle(frac2) by(year industry)
    Thanks!

  • #2
    Q2 is moot if Q1 is impossible, and Q1 is impossible as posed, because histogram supports only frequency weights.

    That aside, this is how I would start on Q2. I would not want to go round and round on keep and save.

    Code:
    use mw_exposure.dta, clear
    
    histogram frac2 if year == 2013 & inlist(industry, "A", "B", "C", "D", "E", "F"), xtitle(frac2) by(year industry)
    There is nothing wrong with that code in #1. It's just that I think experienced users would lean on if there.
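
    For the looping part of Q2, one possible sketch is to number the industries, assign six to each "page", and loop over pages with forvalues. The toy data, the variable names ind_id and page, and the graph names page1, page2 are assumptions made for illustration only; they are not from the original dataset.

    ```stata
    * Toy data: 12 one-letter industries "A" to "L" (made up for illustration)
    clear
    set obs 240
    set seed 2022
    gen str1 industry = char(65 + mod(_n - 1, 12))
    gen int year = 2013
    gen double frac2 = runiform()

    * Number the industries 1, 2, ... and put six on each page
    egen long ind_id = group(industry)
    gen int page = ceil(ind_id / 6)

    * One by() graph per page, selected with if rather than keep
    summarize page, meanonly
    forvalues p = 1/`r(max)' {
        histogram frac2 if year == 2013 & page == `p', ///
            xtitle(frac2) by(year industry) name(page`p', replace)
    }
    ```

    This avoids rereading the dataset for each graph, and the number of panels per graph can be changed by editing the divisor 6.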

    I would look at kdensity here as supporting weights more flexibly. If you need a histogram, strongly, I suspect you need to set up the basic calculations yourself and feed the result to twoway bar.
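
    As a minimal sketch of the kdensity route, assuming the decimal weights can sensibly be treated as analytic weights (that is an assumption about the data, not something established in this thread), the following uses made-up data with hypothetical variable names x, w, grid and dens:

    ```stata
    * Toy data with non-integer weights (made up for illustration)
    clear
    set obs 200
    set seed 12345
    gen double x = runiform()
    gen double w = 0.5 + 3 * runiform()

    * kdensity accepts aweights; store the estimate instead of drawing it
    kdensity x [aweight = w], nograph generate(grid dens)

    * Plot the stored density estimate
    line dens grid, sort xtitle(x) ytitle(Density)
    ```

    Note that a variable bounded on [0, 1], such as frac2 here, will have some estimated density smoothed beyond those limits, so the choice of kernel and bandwidth deserves some thought.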



    • #3
      Hi Nick Cox,
      Thanks for the reply and the great suggestions. I will look at kdensity!

      Thanks again.



      • #4
        Here's indicative code for a do-it-yourself histogram based on weights. You must decide first on a bin width and then calculate what you want to show from the total weight in each bin and the total weight in each graph. The calculations for percents or densities are easy variations on that for fractions.

        For more on binning, see https://journals.sagepub.com/doi/pdf...867X1801800311

        With more industries and more years, you might run into further problems of a different kind.


        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input str1 industry double(group total frac2 year weight)
        "D" 1 1                  0 2013 .024959253996062778
        "F" 1 1                  0 2013   3.193310095417693
        "D" 1 2                  0 2013 .024959253996062778
        "C" 1 4                  0 2013  1.9570856152388283
        "C" 1 4                  0 2013  1.9570856152388283
        "F" 1 5                  0 2013   3.193310095417693
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "D" 1 5                  0 2013 .024959253996062778
        "D" 1 5                  0 2013 .024959253996062778
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "A" 1 5                  0 2013  .05989453080455009
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "B" 1 5                  0 2013   .0930922210972081
        "C" 1 5 .20227818884757595 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "F" 1 5                  0 2013   3.193310095417693
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "A" 1 5                  0 2013  .05989453080455009
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "F" 1 5  .6000000000000001 2013   3.193310095417693
        "C" 1 5 .18620077946143354 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "F" 1 5                  0 2013   3.193310095417693
        "C" 1 5 .20816900214545864 2013  1.9570856152388283
        "F" 1 5                  0 2013   3.193310095417693
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5 .19533739945932332 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "F" 1 5                  0 2013   3.193310095417693
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5  .7757481281975991 2013  1.9570856152388283
        "D" 1 5                  0 2013 .024959253996062778
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "D" 1 5                  0 2013 .024959253996062778
        "A" 1 5                  0 2013  .05989453080455009
        "C" 1 5                  0 2013  1.9570856152388283
        "F" 1 5                  0 2013   3.193310095417693
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "B" 1 5                  0 2013   .0930922210972081
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "D" 1 5                  0 2013 .024959253996062778
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "A" 1 5  .2184818581010063 2013  .05989453080455009
        "E" 1 5                  0 2013   .8025529376696152
        "F" 1 5                  0 2013   3.193310095417693
        "C" 1 5                  0 2013  1.9570856152388283
        "F" 1 5                  0 2013   3.193310095417693
        "F" 1 5                  0 2013   3.193310095417693
        "C" 1 5                  0 2013  1.9570856152388283
        "F" 1 5                  0 2013   3.193310095417693
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "C" 1 5                  0 2013  1.9570856152388283
        "A" 1 5                  0 2013  .05989453080455009
        "F" 1 5                  0 2013   3.193310095417693
        "B" 1 5                  0 2013   .0930922210972081
        "C" 1 5                  0 2013  1.9570856152388283
        "F" 1 5                  0 2013   3.193310095417693
        "C" 1 5                  0 2013  1.9570856152388283
        "F" 1 5                  0 2013   3.193310095417693
        "C" 1 5                  0 2013  1.9570856152388283
        end
        
        preserve
        
        * bin width 0.1
        gen bin = floor(10 * frac2)
        bysort industry year bin : egen double bin_total = total(weight)
        by industry year : egen double plot_total = total(weight)
        gen fraction = bin_total / plot_total
        
        gen toshow = (bin / 10) +  0.05
        set scheme s1color
        twoway bar fraction toshow , xtitle(sensible name for frac) barw(0.1) by(industry year) bstyle(histogram)
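
        The percent and density variations on the fraction calculation could be sketched as follows, keeping the same bin width of 0.1. The toy values and the names percent, density and first are assumptions for illustration; a density is the fraction divided by the bin width, so that the bar areas sum to 1.

        ```stata
        * Toy data: five observations with non-integer weights (made up)
        clear
        input double(frac2 weight)
        0.02 1.5
        0.07 0.5
        0.21 2.0
        0.27 1.0
        0.63 3.0
        end

        * Bin width 0.1, as in the code above
        gen bin = floor(10 * frac2)
        bysort bin : egen double bin_total = total(weight)
        egen double plot_total = total(weight)

        gen double fraction = bin_total / plot_total
        gen double percent  = 100 * fraction
        gen double density  = fraction / 0.1

        * Tag one observation per bin so that each bar is drawn once
        egen byte first = tag(bin)
        gen double toshow = (bin / 10) + 0.05

        twoway bar density toshow if first, barw(0.1) ///
            xtitle(frac2) ytitle(Density) bstyle(histogram)
        ```

        With these values the bin totals are 2, 3 and 3 out of 8, so the fractions are 0.25, 0.375 and 0.375.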
        [Image: histogram2.png]

        Last edited by Nick Cox; 18 May 2022, 09:21.
