  • egen rowtotal of squared proportions?

    Hi all,

    I'm trying to make a variable that is basically a sum of squared proportions (a Berry Diversification Index: https://de.wikipedia.org/wiki/Berry-Index). I was wondering if there is an egen command I could use to speed up this process. It's a bit tedious doing it manually (as you can see in the attached screenshot), so any suggestions would be appreciated.
    [Attached screenshot: Screen Shot 2018-07-04 at 1.01.41 PM.png]

    Thanks
    -Bryan
    Bryan Gensits
    Royal University of Bhutan: College of Natural Resources

  • #2
    Please don't show code in images: use CODE delimiters around copied and pasted code as explained at FAQ Advice #12. Given your data layout I would probably recommend a different layout for most Stata purposes, but failing that something with this flavour should help. Your crops (?) and total cultivated areas (?) are here a b c d e f g and total.

    Code:
    gen double wanted = 1
    
    quietly foreach v in a b c d e f g {
          replace wanted = wanted - (`v'/total)^2
    } 
    That index (or its complement) goes back way beyond Herfindahl even, at least to Gini (whose name is attached to many different indexes).

    On working rowwise see also https://www.stata-journal.com/sjpdf....iclenum=pr0046

    Is there an egen function for directly summing squares of variables within each observation? I doubt it. One could be written that is almost identical to rowtotal(), but it is better just to do the calculation as above.
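    For a long list of categories, the same loop can take a variable range instead of a typed-out list. A minimal sketch, assuming the category variables sit next to each other in the dataset as a through g and that the row total still has to be computed; total2 and wanted2 are hypothetical names, and the two observations are toy data, not the poster's:

```stata
* Toy data standing in for the real layout (assumption)
clear
input a b c d e f g
7 2 1 0 0 0 0
1 1 1 1 1 1 1
end

* Row total across the categories; rowtotal() treats missing values as 0
egen double total2 = rowtotal(a-g)

* Start at 1 and subtract each squared proportion in turn
gen double wanted2 = 1
quietly foreach v of varlist a-g {
    replace wanted2 = wanted2 - (`v'/total2)^2
}

list a-g total2 wanted2
```

    Note that a missing value in any category, or a total of zero, would leave the index missing for that observation, so such cases may need handling first.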

    If you have say 7 or fewer categories, note that some variation on

    Code:
    gen wanted = 1 - (a/total)^2 - (b/total)^2
    is not outrageous.

    EDIT

    Berry, Charles H. 1971. Corporate growth and diversification. _The Journal of Law and Economics_ 14: 371-383.
    https://doi.org/10.1086/466714

    appears to be the grandfather reference.
    Last edited by Nick Cox; 04 Jul 2018, 01:52.


    • #3
      Thanks Nick!
      Sorry about the picture. It's been a long time since I've been on here, and I'm just getting back into Stata, so I had forgotten about the built-in code delimiters. In any event, the foreach command was a good suggestion. I have over 50 different categories, so doing it the second way you suggested is a bit much. I'm really trying to streamline and clean up my Stata do-files as much as possible.

      In your first code, wanted would just wind up being replaced with the solution to the formula for whatever the final v is if I'm not mistaken. In this case:
      Code:
       1 - (g/total)^2
      Is that right, or am I missing something obvious in your code that would sum the terms to generate the index for each observation?

      Thanks again.

      Also, Berry definitely wouldn't have been my go-to, but I'm working on a more or less similar research project to a prior USDA study in Bhutan and they had used a Negative Berry Index so I'm trying that out for the time being.

      Hellerstein, Daniel and Higgins, Nathaniel Alan and Horowitz, John, The Predictive Power of Risk Preference Measures for Farming Decisions (March 19, 2012). European Review of Agricultural Economics pp. 1–27 doi:10.1093/erae/jbs043
      Last edited by Bryan Gensits; 04 Jul 2018, 03:52.
      Bryan Gensits
      Royal University of Bhutan: College of Natural Resources


      • #4
        You're right to query whatever looks wrong or dubious, but the loop looks fine to me on second survey.

        You start with values of 1 and then subtract each squared proportion in turn. So, the core statement is

        Code:
         replace wanted = wanted - (`v'/total)^2
        So, suppose in a toy example my proportions are 0.7, 0.2 and 0.1. I start with 1 and each time around the loop I subtract one of 0.49, 0.04 and 0.01, which leaves 1 - 0.49 - 0.04 - 0.01 = 0.46.
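        The toy example can be run directly to watch the value accumulate. A small sketch, assuming made-up variables a, b, c and total holding 70, 20, 10 and 100, so that the proportions are 0.7, 0.2 and 0.1:

```stata
* One toy observation with proportions 0.7, 0.2 and 0.1 (assumed data)
clear
input a b c total
70 20 10 100
end

gen double wanted = 1
foreach v in a b c {
    * Each pass subtracts one squared proportion from the running value
    quietly replace wanted = wanted - (`v'/total)^2
    list wanted
}
```

        Because each replace starts from the current value of wanted rather than from 1, the listed value should pass through 0.51 and 0.47 and end at 0.46, up to floating-point rounding.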

        Nothing stops you initialising at 0, adding the squared proportions and then subtracting from 1.
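        That alternative might look like the sketch below, again on assumed toy data; sumsq is a hypothetical name for the running sum of squared proportions:

```stata
* Same toy observation as an assumption: proportions 0.7, 0.2 and 0.1
clear
input a b c total
70 20 10 100
end

* Initialise at 0 and add each squared proportion
gen double sumsq = 0
quietly foreach v in a b c {
    replace sumsq = sumsq + (`v'/total)^2
}

* Subtract the sum of squares from 1 at the end
gen double wanted = 1 - sumsq
list sumsq wanted
```

        One possible reason to prefer this form is that it keeps the sum of squares itself available as a variable, in case the complement of the index is also of interest.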


        • #5
          Ah, I've got it now. I missed the loop after the initial generation of the variable wanted. Thanks for following up; this will definitely help me out as I'm getting back in the groove of Stata. Take it easy.
          Bryan Gensits
          Royal University of Bhutan: College of Natural Resources
