  • dividing the sample

    Hi experts and researchers,

    I use panel data and need to divide the sample into two using Stata.


    How can I divide the sample into two groups, observations above and below median inflation? The median inflation rate calculated from the full sample is 8.3%.

    Thank you

    Badiah

  • #2
    A panel dataset consists of several observations of the same entity (i.e., the entity is observed over time). If you want to assume that the average value represents the entity, you can proceed as follows (assume "invest" is inflation in the example below):

    Code:
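    webuse grunfeld, clear
    * Copy the panel identifier and the variable of interest into a new frame
    frame put company invest, into(median)
    frame median{
        * One row per company holding its mean of invest (collapse defaults to the mean)
        collapse invest, by(company)
        qui sum invest, d
        * 1 if the company mean exceeds the median of company means, 0 otherwise
        gen hinvest= invest>r(p50)
    }
    * Link each observation to its company row and copy the indicator back
    frlink m:1 company, frame(median)
    frget hinvest, from(median) 
    frame drop median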
    Result:

    Code:
    . tab company hinvest
    
               |        hinvest
       company |         0          1 |     Total
    -----------+----------------------+----------
             1 |         0         20 |        20 
             2 |         0         20 |        20 
             3 |         0         20 |        20 
             4 |         0         20 |        20 
             5 |         0         20 |        20 
             6 |        20          0 |        20 
             7 |        20          0 |        20 
             8 |        20          0 |        20 
             9 |        20          0 |        20 
            10 |        20          0 |        20 
    -----------+----------------------+----------
         Total |       100        100 |       200
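
    For readers on a Stata version without frames, roughly the same company-level split can be sketched with egen. This is only an illustration under the same assumption as above (the company mean represents the company); the names mean_invest, one_per_company and hinvest2 are made up for this example.

    ```stata
    * Sketch only: same Grunfeld example, without frames
    webuse grunfeld, clear

    * Company-level mean of invest, repeated on every row of that company
    bysort company: egen mean_invest = mean(invest)

    * Tag one row per company so each company counts once in the median
    egen byte one_per_company = tag(company)
    quietly summarize mean_invest if one_per_company, detail

    * 1 = company mean above the median of company means, 0 = otherwise
    gen byte hinvest2 = mean_invest > r(p50) if !missing(mean_invest)

    tabulate company hinvest2
    ```

    Because the indicator is built from the company mean, it is constant within each company, so the cross-tabulation should have the same shape as the one above.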



    • #3
      Andrew Musau Hi Andrew, thank you very much for your response. I'm not sure I understand this part of your explanation correctly:
      "If you want to assume that the average value represents the entity,"
      Let me give more details.
      I use system GMM for dynamic panel data and, following the previous literature, I want to estimate the marginal effect by dividing the sample into two subsamples at the median (above and below the median). I need to examine the impact of inflation on the financial development-growth nexus.

      Dependent variable: GDP
      Independent variables: financial development, inflation, trade and government

      So I need to divide the sample into two groups, observations above and below median inflation. The median inflation rate calculated from the full sample is 8.3%.

      How can I divide the sample according to this median value (8.3%)?

      If your code is appropriate for my case, could you please rewrite it with my variables so that I can understand it better?

      I appreciate your help
      Thank you

      Badiah



      • #4
        It depends on whether you want to break the panel structure of the data when splitting the sample. If you want to split strictly on observations:

        Code:
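        sum inflation if e(sample), d
        * Stata treats missing as larger than any number, so exclude missing values explicitly
        gen hinflation = inflation>r(p50) if !missing(inflation)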
        If you want to respect the panel structure of the data, then assuming that your panel identifier is named "country":

        Code:
        frame put country inflation if e(sample), into(median)
        frame median{
            collapse inflation, by(country)
            qui sum inflation, d
            gen hinflation= inflation>r(p50)
        }
        frlink m:1 country, frame(median)
        frget hinflation, from(median)
        frame drop median

        The first code splits observations into two groups: those above the median inflation value and those at or below it. The determination of the median ignores time periods and the number of observations per country. It simply orders all inflation values from smallest to largest, takes the median and splits the observations at this value, so the same country can be in both groups. The second code calculates the average inflation of each country, finds the median of these averages and splits the sample into countries above and below this median. What is appropriate for you, I do not know.
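
        To see what "the same country can be in both groups" looks like in practice, here is a rough sketch on a small simulated panel. The data are made up purely for illustration (6 countries, 10 years), and the names h_obs, h_83, any_low, any_high and straddles are hypothetical. The last step uses the 8.3 cutoff reported in #3 directly, which assumes inflation is stored in percent.

        ```stata
        * Simulated panel for illustration only (hypothetical data)
        clear
        set seed 2021
        set obs 6
        gen country = _n
        expand 10
        bysort country: gen year = 2000 + _n
        gen inflation = 2 + 2*country + rnormal(0, 3)

        * Observation-level split around the pooled median
        quietly summarize inflation, detail
        gen byte h_obs = inflation > r(p50) if !missing(inflation)

        * Flag countries that contribute observations to both groups
        bysort country: egen byte any_low  = min(h_obs)
        bysort country: egen byte any_high = max(h_obs)
        gen byte straddles = any_low != any_high
        tabulate country h_obs
        tabulate country straddles

        * If the median is already known, the cutoff can be typed in directly
        * (8.3 is the value reported in #3; assumes inflation is in percent)
        gen byte h_83 = inflation > 8.3 if !missing(inflation)
        ```

        Any country flagged by straddles has observations on both sides of the pooled median, which is exactly what the country-level split rules out.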
        Last edited by Andrew Musau; 24 Nov 2021, 08:21.



        • #5
          Andrew Musau Thank you very much, I understand clearly now, and it worked well.
