
  • Help with Calculating Gini coefficients

    Hello everyone. I'm doing a project on inequality with regard to the distribution of tech. My data looks like this:
    Country Field Year Scientists
    Britain Eng. 2022 5
    Britain Med. 2022 20
    France Eng. 2022 7
    France Med. 2022 4
    Essentially I used the collapse command so the patents are broken up into country-field-year categories. I have been trying to calculate Gini coefficients by field but the repeated years and country names are posing a problem (I do need to keep them separate).

    How do I calculate inequality measures for each Field across all countries and years? I have tried using ineqdeco.

  • #2
    Gini's name has been attached to at least three quite different measures, most commonly in my reading in the context of income inequality. What would be the measure you want for your toy data example? Show us a hand calculation, perhaps.

    I would expect variations between countries and across time to be much of the interest!
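For reference, here is what a hand calculation could look like under one common definition of the Gini coefficient (half the relative mean absolute difference). It treats each field's 2022 counts across the two countries in the toy data as the distribution. This is only one of the measures that carry Gini's name, so the choice of formula is an assumption, not a statement of what the original poster wants.

```latex
G = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} |x_i - x_j|}{2 n^2 \bar{x}}

% Eng.: x = (5, 7), n = 2, \bar{x} = 6
G_{\text{Eng.}} = \frac{|5-7| + |7-5|}{2 \cdot 2^2 \cdot 6} = \frac{4}{48} \approx 0.083

% Med.: x = (20, 4), n = 2, \bar{x} = 12
G_{\text{Med.}} = \frac{|20-4| + |4-20|}{2 \cdot 2^2 \cdot 12} = \frac{32}{96} \approx 0.333
```

On this definition, Med. is distributed more unequally across the two countries than Eng.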



    • #3
      Code:
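      clear all
      set seed 12345    // make the simulated counts reproducible
      
      * Simulated data: 50 countries x 2 fields x 10 years
      set obs 50
      g country = _n
      expand 2
      bys country: g field = _n
      expand 10
      * number the years within country-field so each pair gets 1991-2000
      bys country field: g year = _n+1990
      
      * xtset is not needed here; a panel id would be country-field, not country
      g scientists = int(runiform(0,20)) 
      
      * Gini by field, pooled over countries and years
      * (ineqdeco is from SSC; it leaves out zero values, ineqdec0 keeps them)
      g gini = .
      ineqdeco scientists if field == 1
      replace gini = r(gini) if field == 1
      
      ineqdeco scientists if field == 2
      replace gini = r(gini) if field == 2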



      • #4
        Originally posted by Nick Cox (#2)
        Hello Dr. My goal is to find the Gini coefficient for the number of scientists (not for income, as it is usually used). Essentially I am trying to see whether the scientists in a particular field are concentrated in some countries more than in others.

        Once I get the field Gini sorted, I can do the same across years.



        • #5
          Originally posted by George Ford (#3)
          Hello Dr. This seems like it will work but I do have a trivial question. I have quite a few fields to do this for. Is there any way to automate the process?



          • #6
            Apply -levelsof- to the field variable and put the values into a local macro. Then use George's code within a loop over the contents of the local (-foreach-; -forvalues- also works if the field codes are consecutive integers).
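A minimal sketch of that loop, using the four toy rows from the first post. It assumes ineqdeco is installed (ssc install ineqdeco); the variable fieldid and the local fields are names made up here. Because field is a string in the toy data, it is first mapped to a numeric id with egen group().

```stata
clear
input str8 country str4 field year scientists
"Britain" "Eng." 2022 5
"Britain" "Med." 2022 20
"France"  "Eng." 2022 7
"France"  "Med." 2022 4
end

* numeric id for each field, so the if-condition needs no string quoting
egen fieldid = group(field)

generate double gini = .
levelsof fieldid, local(fields)

* one pass per field: compute the Gini, then store it on that field's rows
foreach f of local fields {
    quietly ineqdeco scientists if fieldid == `f'
    replace gini = r(gini) if fieldid == `f'
}

list country field year scientists gini, sepby(field)
```

With many fields the loop body stays the same; only the contents of the local change.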



            • #7
              What you want to do still remains unclear to me.

              You’ve said that you don’t regard your measure as income or anything like it, but you seem to warm to George’s suggestion, which follows that path.

              You haven’t picked up my suggestion of a hand calculation.



              • #8
                Following up on Nick's comment, I note that my code is purely mechanical. I have not considered whether what you're up to makes sense.

                It looks like what you're after is a measure of inequality in the distribution of scientists in different fields. Do some countries have high concentrations of scientists in particular fields, rather than a good mix?



                • #9
                  Originally posted by Stephen Jenkins (#6)
                  Thank you for your help.



                  • #10
                    Originally posted by George Ford (#8)
                    Bingo. You explained my motive much better than I could have. Thank you for your help.
                    Last edited by John Atanasios; 31 Jul 2024, 11:29.



                    • #11
                      If you do accept George's phrasing of your goal, I wonder whether a measure of inequality is what you want. Here's my thinking: Presumably all countries would have more concentration of scientists in particular fields, so that within country-year, an equal or uniform distribution would not be typical. Let's say, for example, that it's "ordinary/typical/desirable" for countries to have higher percentages of their scientists in Engineering than Medicine. On that view, I'd say that what you might want to know is the extent to which each country's percentage distribution matches up with some "standard" or ideal distribution. That standard might be posited on a theoretical basis, or you might use the overall distribution across all countries as the standard, or you might take the percentage scientist distribution collapsed across some selection of countries known to be "good." With that standard distribution, each country could have a measure that is (for example) the sum of absolute differences in percentage between its distribution and the standard. If something like that fits what you want, that kind of calculation would be easier to do, I think, with the data arranged in a so-called wide layout for each country-year.

                      The other possibility here -- one that *is* oriented to inequality -- would be to regard the total pool of scientists in each field but across all countries as a pooling of a valued resource, the distribution of which could be analyzed for inequality in the same way a pool of income could. I would think that you might want some kind of norming here for country population, i.e., per capita scientist count.
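The first possibility (distance from a standard distribution) could be sketched as follows, taking the overall distribution across all countries in each year as the standard. This is an illustration on the toy rows from the first post, not a recommended measure. The variable names ctotal, share, ftotal, ytotal, stdshare, absdiff and dissim are made up here, and the sketch stays in long layout with egen rather than reshaping to wide.

```stata
clear
input str8 country str4 field year scientists
"Britain" "Eng." 2022 5
"Britain" "Med." 2022 20
"France"  "Eng." 2022 7
"France"  "Med." 2022 4
end

* each field's share of a country's scientists in a given year
bysort country year: egen double ctotal = total(scientists)
generate double share = scientists / ctotal

* the standard: each field's share of all scientists in that year
bysort field year: egen double ftotal = total(scientists)
bysort year: egen double ytotal = total(scientists)
generate double stdshare = ftotal / ytotal

* sum of absolute differences from the standard, per country-year
generate double absdiff = abs(share - stdshare)
bysort country year: egen double dissim = total(absdiff)

list country field year share stdshare dissim, sepby(country)
```

A value of 0 means the country's field mix matches the standard exactly; larger values mean a bigger departure from it.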



                      • #12
                        Originally posted by Mike Lacy (#11)
                        It is the latter possibility. My colleagues and I do not expect (or desire) the number of scientists to be equal across all countries and fields. Thank you for reminding me to control for country population.
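A rough sketch of that per capita version, computing one Gini per field-year across countries without any add-on command. The population variable pop_m (millions) and its values are placeholders invented for this illustration, not real data, and sci_pc, rank, ncountries, ybar, wsum and gini_pc are likewise made-up names. The formula is the rank-based form of the Gini on sorted values, with each country counted equally (no population weights).

```stata
clear
* counts from the first post; pop_m values are placeholders, not real data
input str8 country str4 field year scientists pop_m
"Britain" "Eng." 2022 5  50
"Britain" "Med." 2022 20 50
"France"  "Eng." 2022 7  100
"France"  "Med." 2022 4  100
end

generate double sci_pc = scientists / pop_m    // scientists per million people

* G = 1 + 1/n - 2 * sum((n - i + 1) * y_i) / (ybar * n^2), y sorted ascending
bysort field year (sci_pc): generate double rank = _n
by field year: generate double ncountries = _N
by field year: egen double ybar = mean(sci_pc)
by field year: egen double wsum = total((ncountries - rank + 1) * sci_pc)
generate double gini_pc = 1 + 1/ncountries - 2 * wsum / (ybar * ncountries^2)

list country field year sci_pc gini_pc, sepby(field)
```

Grouping by field and year keeps the repeated country and year rows separate, so the same lines give a series of Ginis per field over time once more years are present.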
