  • PPML with negative values

    I want to use a PPML estimator to see whether variables such as GDP per capita, distance, etc. (a migration gravity model) affect female migration flows, defined as flow = [(migration stock in year t) - (migration stock in year t-1)] / (population of origin country).

    The problem is that this definition also produces negative values. For example, if there are 145 migrants in 1980 and only 140 in 1985, then flow = (140 - 145)/250 = -0.02.

    Does someone have any idea how to solve this?
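A minimal sketch of the flow calculation described above, using the numbers from the example (145 migrants in 1980, 140 in 1985, origin population 250). The variable name `pop_o` and the placeholder origin code "XXX" are assumptions for illustration; they are not in the posted data. In the real data the panel unit would be the origin-destination pair rather than the origin alone.

```stata
* Hypothetical two-row panel reproducing the numbers in the example.
* pop_o (origin population) is an assumed name, not a variable in the posted data.
clear
input str3 iso3_o int year long female double pop_o
"XXX" 1980 145 250
"XXX" 1985 140 250
end

encode iso3_o, generate(origin)      // numeric panel identifier
xtset origin year, delta(5)          // observations are five years apart

* The parentheses matter: divide the change in the stock, not only the lagged stock
generate double f_flow = (female - L.female) / pop_o
list iso3_o year female pop_o f_flow
```

The first observation of each panel has no lag, so its flow is missing; any period in which the stock falls gives a negative flow.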

    Here is an example of my dataset:

    f_flow = female migration flows


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str28(iso3_d iso3_o) int year long female float(gdpcap_o gdpcap_d) double distw float f_flow
    "AUS" "AFG" 1980     0 276.29773 10187.797   11086.1405715168              .
    "AUS" "AFG" 1985     0  217.1051 11436.447   11086.1405715168              0
    "AUS" "AFG" 1990  1275         . 18221.105   11086.1405715168    .0002131698
    "AUS" "AFG" 1995  2895         .  20364.25   11086.1405715168   .00019635924
    "AUS" "AFG" 2000  4520         .  21666.95   11086.1405715168    .0001667219
    "AUS" "AFG" 2005  7654  252.4079  33995.85   11086.1405715168   .00025897508
    "AUS" "AFG" 2010  8396 561.19763  51800.93   11086.1405715168   .00005297929
    "AUS" "AGO" 1980     0  779.6896 10187.797  12954.90535159645   -.0018387728
    "AUS" "AGO" 1985     0  750.7219 11436.447  12954.90535159645              0
    "AUS" "AGO" 1990   171  992.8729 18221.105  12954.90535159645  .000027490247
    "AUS" "AGO" 1995   168  416.3216  20364.25  12954.90535159645    -4.1113e-07
    "AUS" "AGO" 2000   163  655.6295  21666.95  12954.90535159645  -5.947157e-07
    "AUS" "AGO" 2005   198 1706.5436  33995.85  12954.90535159645   3.501472e-06
    "AUS" "AGO" 2010   217 4218.6494  51800.93  12954.90535159645   1.591821e-06
    "AUS" "ALB" 1980   405  612.1152 10187.797 15208.496627803803   .00014486154
    "AUS" "ALB" 1985   341  662.9148 11436.447 15208.496627803803  -.00004436182
    "AUS" "ALB" 1990   304  639.4639 18221.105 15208.496627803803  -.00002307743
    "AUS" "ALB" 1995   436  760.5594  20364.25 15208.496627803803   .00008154959
    "AUS" "ALB" 2000   641 1193.4662  21666.95 15208.496627803803   .00013415827
    "AUS" "ALB" 2005   979 2798.9495  33995.85 15208.496627803803   .00022511977
    "AUS" "ALB" 2010  1074 4175.1216  51800.93 15208.496627803803   .00006545305
    "AUS" "AND" 1980     0         . 10187.797  16706.34578852234              .
    "AUS" "AND" 1985     0         . 11436.447  16706.34578852234              .
    "AUS" "AND" 1990     0         . 18221.105  16706.34578852234              .
    "AUS" "AND" 1995     0         .  20364.25  16706.34578852234              .
    "AUS" "AND" 2000     0         .  21666.95  16706.34578852234              .
    "AUS" "AND" 2005     0         .  33995.85  16706.34578852234              .
    "AUS" "AND" 2010     0         .  51800.93  16706.34578852234              .
    "AUS" "ARE" 1980     0  42961.84 10187.797 11543.805450162507              0
    "AUS" "ARE" 1985     0 30138.887 11436.447 11543.805450162507              0
    "AUS" "ARE" 1990   257  28066.15 18221.105 11543.805450162507    .0004018067
    "AUS" "ARE" 1995   436  28020.09  20364.25 11543.805450162507   .00021774434
    "AUS" "ARE" 2000   680 34476.285  21666.95 11543.805450162507    .0002379158
    "AUS" "ARE" 2005  1295  43533.89  33995.85 11543.805450162507    .0004514888
    "AUS" "ARE" 2010  1421 33885.926  51800.93 11543.805450162507   .00005973846
    "AUS" "ARG" 1980  4041 2736.8975 10187.797 12044.574133735276    .0001839228
    "AUS" "ARG" 1985  4531  2914.154 11436.447 12044.574133735276   .00003171649
    "AUS" "ARG" 1990  5308 4332.6563 18221.105 12044.574133735276   .00004657566
    "AUS" "ARG" 1995  5456  8973.321  20364.25 12044.574133735276   8.290453e-06
    "AUS" "ARG" 2000  5519  9329.113  21666.95 12044.574133735276  3.3287315e-06
    "AUS" "ARG" 2005  5880  5767.657  33995.85 12044.574133735276  .000018053088
    "AUS" "ARG" 2010  6450 11460.376  51800.93 12044.574133735276  .000027065404
    "AUS" "ARM" 1980   121         . 10187.797 13188.615966196823    -.003979139
    "AUS" "ARM" 1985   111         . 11436.447 13188.615966196823  -5.856272e-06
    "AUS" "ARM" 1990   268  636.6807 18221.105 13188.615966196823   .00008610525
    "AUS" "ARM" 1995   425  455.5503  20364.25 13188.615966196823   .00009271677
    "AUS" "ARM" 2000   482  621.4248  21666.95 13188.615966196823   .00003499788
    "AUS" "ARM" 2005   531  1625.397  33995.85 13188.615966196823  .000031024676
    "AUS" "ARM" 2010   582  3124.785  51800.93 13188.615966196823   .00003331243
    "AUS" "ATG" 1980     0 1565.6487 10187.797  16622.68029241066    -.015476254
    "AUS" "ATG" 1985     0 3070.4395 11436.447  16622.68029241066              0
    "AUS" "ATG" 1990     0  6325.241 18221.105  16622.68029241066              0
    "AUS" "ATG" 1995    13  7230.321  20364.25  16622.68029241066      .00034357
    "AUS" "ATG" 2000    15  10094.76  21666.95  16622.68029241066   .00004525399
    "AUS" "ATG" 2005    18 12079.865  33995.85  16622.68029241066   .00006455639
    "AUS" "ATG" 2010    20  13017.31  51800.93  16622.68029241066   .00004057536
    "AUS" "AUT" 1980 10362  10843.36 10187.797 15608.417659805298     .002600321
    "AUS" "AUT" 1985 10225  9150.002 11436.447 15608.417659805298 -.000034520806
    "AUS" "AUT" 1990  9953  21628.76 18221.105 15608.417659805298  -.00006791117
    "AUS" "AUT" 1995  9550 30252.795  20364.25 15608.417659805298  -.00009800646
    "AUS" "AUT" 2000  9022  24517.27  21666.95 15608.417659805298  -.00012805678
    "AUS" "AUT" 2005  8471  38241.09  33995.85 15608.417659805298  -.00013031167
    "AUS" "AUT" 2010  9292  46444.18  51800.93 15608.417659805298    .0001914894
    "AUS" "AZE" 1980    18         . 10187.797 12862.659950720978    -.002958742
    "AUS" "AZE" 1985    17         . 11436.447 12862.659950720978  -2.942864e-07
    "AUS" "AZE" 1990     9 1237.3246 18221.105 12862.659950720978  -2.186682e-06
    "AUS" "AZE" 1995    39  397.1981  20364.25 12862.659950720978   7.630038e-06
    "AUS" "AZE" 2000    83  655.0974  21666.95 12862.659950720978  .000010696757
    "AUS" "AZE" 2005   127 1578.3672  33995.85 12862.659950720978  .000010335075
    "AUS" "AZE" 2010   139  5842.806  51800.93 12862.659950720978  2.6267355e-06
    "AUS" "BDI" 1980     0 222.88095 10187.797 12135.006432344055  -.00006553367
    "AUS" "BDI" 1985     0 240.87598 11436.447 12135.006432344055              0
    "AUS" "BDI" 1990     6 201.94914 18221.105 12135.006432344055  2.1645849e-06
    "AUS" "BDI" 1995     9  161.1016  20364.25 12135.006432344055   9.800177e-07
    "AUS" "BDI" 2000    12 130.42386  21666.95 12135.006432344055   9.202442e-07
    "AUS" "BDI" 2005   358 143.78354  33995.85 12135.006432344055   .00009171253
    "AUS" "BDI" 2010   393  219.5298  51800.93 12135.006432344055    7.84408e-06
    "AUS" "BEL" 1980  2116 12912.997 10187.797 16319.186800683594    .0003417986
    "AUS" "BEL" 1985  2271  8784.148 11436.447 16319.186800683594  .000030756775
    "AUS" "BEL" 1990  2328  20678.84 18221.105 16319.186800683594   .00001119695
    "AUS" "BEL" 1995  2406 28522.047  20364.25 16319.186800683594  .000015071832
    "AUS" "BEL" 2000  2521 23151.936  21666.95 16319.186800683594   .00002198218
    "AUS" "BEL" 2005  2581 36927.105  33995.85 16319.186800683594   .00001124043
    "AUS" "BEL" 2010  2831  44358.26  51800.93 16319.186800683594   .00004508717
    "AUS" "BEN" 1980     0  377.9566 10187.797 15227.900650333022    -.001472961
    "AUS" "BEN" 1985     0  243.9115 11436.447 15227.900650333022              0
    "AUS" "BEN" 1990     0  391.8935 18221.105 15227.900650333022              0
    "AUS" "BEN" 1995     0 362.47095  20364.25 15227.900650333022              0
    "AUS" "BEN" 2000     0   339.473  21666.95 15227.900650333022              0
    "AUS" "BEN" 2005    12   532.611  33995.85 15227.900650333022  2.9800176e-06
    "AUS" "BEN" 2010    13  690.0023  51800.93 15227.900650333022    2.15963e-07
    "AUS" "BFA" 1980     0 282.68576 10187.797  15918.28391053009  -3.738065e-06
    "AUS" "BFA" 1985     0 200.89426 11436.447  15918.28391053009              0
    "AUS" "BFA" 1990     0 351.97925 18221.105  15918.28391053009              0
    "AUS" "BFA" 1995     0 235.83223  20364.25  15918.28391053009              0
    "AUS" "BFA" 2000     0 224.92865  21666.95  15918.28391053009              0
    "AUS" "BFA" 2005     9  406.9988  33995.85  15918.28391053009  1.3235432e-06
    "AUS" "BFA" 2010    10  592.6075  51800.93  15918.28391053009  1.2710365e-07
    "AUS" "BGD" 1980   413  219.5756 10187.797  8667.331951999282   .00001009465
    "AUS" "BGD" 1985   489 229.22635 11436.447  8667.331951999282  1.6687867e-06
    end


  • #2
    Hi Lea,
    It looks like your "flow" variable could be better described as the "change in the migrant stock", which obviously could be negative based on how you've defined it.

    One general point to make is that this is not an issue specific to PPML. If you are trying to fit a log-linear relationship between migration and distance using OLS, there is no way you can do this if your migration variable takes on negative values.

    Here are some suggestions:
    - Since the original variable you observe is a stock, you could use stocks as your dependent variable so that you can identify cross-sectional variation in stocks as a function of log distance, etc.

    - If you are interested in how migration stocks change over time, you could again use stocks as your dependent variable but also include a bilateral (origin-destination) fixed effect so that identification is solely based on changes in stocks over time. However, this would also mean that you could only identify the effects of variables that themselves change over time (such as changes in visa requirements, for example)

    - If you instead really want to look at changes in the stock as a function of log distance, etc, you could define the change in the stock as (migration stock in year t ) divided by (migration stock in year t-1). This will generally give you positive values (though it will be undefined if the prior year's migration stock is zero). The problem with this approach is that the functional form you would be assuming seems atheoretical; for this reason, there would not seem to be a compelling reason to use PPML as opposed to taking logs and using OLS.

    - Finally, you could re-define a "flow" so that it only refers to the number of people who migrated from country A to country B in a given period, irrespective of what country they are originally native to. This would again give you a dependent variable that is always positive, but may change the question you are trying to ask.

    Hope this is helpful.

    Regards,
    Tom
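The first two suggestions above could be sketched roughly as follows. This is only an illustration on simulated data: all variable names and parameter values are hypothetical, and Stata's official `poisson` and `xtpoisson, fe` commands with robust standard errors stand in for PPML here (a community-contributed command such as `ppmlhdfe` would be an alternative when many fixed effects are needed).

```stata
* Simulated stand-in for a migrant-stock panel; every name and number is hypothetical.
clear
set seed 12345
set obs 20                                   // 20 origin-destination pairs
generate pair = _n
generate double ln_dist = rnormal(9, 0.5)    // time-invariant log distance
expand 7                                     // 7 five-year waves per pair
bysort pair: generate year = 1975 + 5*_n     // 1980, 1985, ..., 2010
generate double ln_gdp_o = rnormal(8, 1)     // time-varying origin log GDP per capita
generate long stock = rpoisson(exp(2 + 0.5*(ln_gdp_o - 8) - (ln_dist - 9)))

* Suggestion 1: stocks in levels as the dependent variable;
* log distance is identified from cross-sectional variation.
poisson stock ln_gdp_o ln_dist i.year, vce(cluster pair)

* Suggestion 2: add a bilateral (pair) fixed effect, so identification comes
* only from changes within a pair over time. Time-invariant regressors such
* as ln_dist cannot be estimated and are left out.
xtset pair
xtpoisson stock ln_gdp_o i.year, fe vce(robust)
```

In both cases the dependent variable is the non-negative stock, so the negative values in the differenced variable never enter the estimation.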


    • #3
      Dear Tom Zylkin, I am facing a similar issue.
      In my setup, I am trying to fit a gravity model, but instead of nominal trade flows, my dependent variable is quantity (in kg).
      1) Is it theoretically consistent to estimate a gravity model of this kind, where the dependent variable is quantity in kg rather than nominal trade flows?
      2) I am trying to specify an exponential mean model [log(quantity) = exp(x'ij,t)eij,t] and estimate it using PPML. However, PPML does not work because some values of log(quantity) are negative, and this is also the wrong functional form when the estimator is PPML. When I instead specify quantity in levels (rather than in logarithms) and use PPML, Stata reports that the dependent variable takes large values and suggests rescaling. The model then runs, but the estimated elasticities for the variables of interest are greater than 1. Does an estimated elasticity greater than 1 make any sense?
      3) I can use OLS on log(quantity) with origin, destination, and time fixed effects, but I am not sure how suitable OLS would be in that case.

      Thanks and regards,
      (Ridwan)


      • #4
        Ridwan: I would not use log(quantity) along with an exponential mean. You can define quantity as, say, thousands of kg, and so divide quantity by 1000. This changes nothing important. To say more, I'd have to see your output. (Please read the FAQ for tips on effective posting.) I assume you used both country. If quantity > 0 always, you can try a linear model with log(quantity) for comparison. But if you have lots of zeros, you should stick with Poisson FE with an exponential mean.
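To illustrate why dividing quantity by 1000 "changes nothing important", here is a small sketch on simulated data (all names and values are hypothetical). With an exponential mean, rescaling the dependent variable should shift only the constant, by ln(1000), and leave the slope coefficients unchanged.

```stata
* Simulated data; variable names and parameter values are hypothetical.
clear
set seed 2718
set obs 500
generate double x = rnormal()
generate double quantity   = 1000 * rpoisson(exp(1 + 0.5*x))   // "kg"
generate double quantity_k = quantity / 1000                    // thousands of kg

poisson quantity x, vce(robust)
scalar b_kg = _b[x]
scalar c_kg = _b[_cons]

poisson quantity_k x, vce(robust)
display "slope, kg:          " b_kg
display "slope, thousand kg: " _b[x]
display "shift in constant:  " c_kg - _b[_cons] "   ln(1000) = " ln(1000)
```

The two slopes should agree up to numerical tolerance, so the rescaling warning is about numerical stability, not about the interpretation of the elasticities.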


        • #5
          Negative logarithms indicate positive arguments below 1. There is nothing wrong with that. I am puzzled at the implication that there is.

          Jeff Wooldridge recommends scaling to avoid big numbers -- which is often a good idea. That would make negative logarithms more likely, but again that's no problem. Measurements (meaning here, not counts) depend on a choice of units of measurement. In principle, you could have a unit such that all measurements are below 1 in value. Probability is often a case in point.
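A quick check of both points, using an arbitrary illustrative value of 250 kg: the logarithm of a positive number below 1 is negative, and changing the unit of measurement only shifts the logarithm by a constant.

```stata
display ln(0.5)              // negative, because the argument is below 1
display ln(250/1000)         // 250 kg expressed in thousands of kg
display ln(250) - ln(1000)   // same value: rescaling subtracts the constant ln(1000)
```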


          • #6
            Thanks, Jeff Wooldridge. I am sorry for being a little ambiguous and for not posting a dataex example. I will be more careful in future.
            The problem is that I have a lot of zeros in my dependent variable, so, as you said, I should stick to Poisson FE.
            Yes, I have bilateral quantity flows (in kg). Once I include importer, exporter, and time fixed effects, the model produces large estimates for the trade elasticities; if I include only time dummies, the coefficients have perverse signs.

            Dear Nick Cox, large values are a problem only if the PPML estimator fails to converge; since my command runs, I can ignore the rescaling warning. The negative values arise in the logarithm, not in levels. Because PPML requires the dependent variable of the exponential mean model to be in levels rather than logarithms, I need neither a log transformation of my dependent variable nor any rescaling, as PPML does converge. The problem I am facing is that OLS gives theoretically consistent estimates but PPML (which is more suitable here) does not. I also tried glm with the family(poisson) and link(log) options and clustered standard errors; this produces the same estimates as PPML, since PPML is technically built on the glm framework.

            Thanks,
            (Ridwan)
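The equivalence mentioned above can be checked with a minimal sketch on simulated data (names and values are hypothetical; Stata's official `poisson` with clustered standard errors stands in for PPML). Both commands fit the same Poisson pseudo-likelihood with a log link, so the point estimates should match up to numerical tolerance.

```stata
* Simulated data; id is a hypothetical cluster variable (for example a country pair).
clear
set seed 4242
set obs 400
generate id = ceil(_n/4)
generate double x = rnormal()
generate double y = rpoisson(exp(0.5 + 0.8*x))

poisson y x, vce(cluster id)
scalar b_pois = _b[x]

glm y x, family(poisson) link(log) vce(cluster id)
display "poisson: " b_pois "    glm: " _b[x]
```

Matching estimates therefore say nothing about whether the specification is right; they only confirm that the two commands solve the same problem.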
