
  • Adapting Multinomial Logit model (MLOGIT) to land use shares as the dependent variable

    I have a panel data set for 200 states from 1980 to 2011. My data contain information on the area under different crops and their yields. It looks like this:


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float YEAR int(id1 AREA1 AREA2) long AREA3 float AREA4 int YIELD1 float(YIELD2 YIELD3 YIELD4)
    1980 407  762  500  332   1.9  368    5.3    0      .1
    1981 407  569  500  379     2  450    5.1    0      .1
    1982 407  582 1000  389   1.8  389    4.7    0      .1
    1983 407  516    0  365     2  450    6.1    0      .1
    1984 407  491 1000  483   2.2  455    4.5    0      .2
    1985 407  533 1000  433   2.3  304    4.7    0      .2
    1986 407  649    0  561   2.2  318    4.6    0      .2
    1987 407  535    0  459   2.3  522    4.7    0      .3
    1988 407  462    0  481   2.5  560    4.7    0      .4
    1989 407  518    0  506   2.6  538    5.4    0      .6
    1990 407  685 1000  512  2.75  575   6.47    0     .56
    1991 407  621    0  646  2.03  576   7.36    0     .86
    1992 407  365    0  563  3.16  601   7.37    0     .67
    1993 407  704    0  628  3.13  505   8.44  .01     .84
    1994 407  805    0  654   2.9  621   9.52    0     1.1
    1995 407  714 1000  494   2.7  481  9.214 .006   1.079
    1996 407  804    0  664   2.4  500  9.169    0    1.08
    1997 407  514    0  440   2.6  385  8.429 .001    .815
    1998 407  622    0  591  4.33  474  8.546 .002    .783
    1999 407    0    0  670  1.07  826  9.399 .005   1.125
    2000 407  588    0  507  2.59  251 11.681 .001   2.782
    2001 407  702 1000  736  2.59  324 13.867 .005   5.194
    2002 407  754 1000  638   2.8  303 14.021 .001   9.295
    2003 407  779  667  981  2.96  374 15.507 .004   13.12
    2004 407  636 1000  501  3.57  341 15.439    0  23.961
    2005 407  636 1000  501  3.57  341 15.462 .001  30.107
    2006 407  723    0  724  4.29  294 15.957 .043  37.183
    2007 407  921 1167  910  3.25  375 18.614 .032  48.909
    2008 407  855 1000  857  3.63  340 20.257 .035  61.332
    2009 407  898  167  883   3.7  428 24.512  .04  63.689
    2010 407  995  778  949  6.09  356 27.677    0  73.916
    2011 407 1116 1133 1008  3.84  419 28.441 .101  79.584
    2012 407 1087  579 1131  3.77  435 28.285 .147  91.007
    2013 407 1043  545  756  3.57  469 26.278 .128 109.003
    2014 407 1151  786 1056  3.73  473 28.757 .114 103.116
    2015 407 1121  750  755  4.88  494 32.698 .031 106.764
    1980 270 1069    0  333  16.7  479     .9    0       0
    1981 270 1100 1000  400  14.8  473     .9    0       0
    1982 270 1219 1000  435  14.9  423     .7    0       0
    1983 270  931    0  409  16.2  469     .7    0       0
    1984 270 1071    0  417  16.5  455     .5    0       0
    1985 270 1000    0  435  16.5  406     .6    0       0
    1986 270 1000    0  333  16.4  366     .6    0       0
    1987 270  667    0  333  16.6  560     .7    0       0
    1988 270 1071    0  407  16.9  586     .8    0       0
    1989 270 1038    0  417  15.9  560     .7    0       0
    1990 270 1200    0  474  15.3  639    .68    0     .04
    1991 270 1364    0  588 15.26  689   1.03    0     .03
    1992 270 1250    0  471 11.99  531    .81    0     .13
    1993 270 1444    0  550 12.77  509    .95    0     .08
    1994 270 1500    0  632  12.4  573    .85    0      .1
    1995 270 1562    0  550  11.7  573   .959    0    .075
    1996 270 1650    0  600  11.4  640   .828    0    .075
    1997 270 1571    0  643  10.3  534   .962    0    .043
    1998 270 1583    0  548  16.2  549   .727    0     .08
    1999 270    0    0  591 10.08 1045   .825    0    .079
    2000 270 1618    0  555  9.14  336  1.391    0    .074
    2001 270 1552    0  789  9.25  369  2.199    0    .064
    2002 270 1799    0  655  6.76  246  1.864    0    .081
    2003 270 1583    0  918  7.53  452   2.04    0    .068
    2004 270 1479    0  683  7.34  413  1.765    0    .061
    2005 270 1479    0  683  7.34  413  1.565    0    .049
    2006 270 1667    0  728  7.28  260   1.45    0     .04
    2007 270 1970    0  928  7.06  402  1.413    0    .055
    2008 270 1520    0  740  6.73  331  1.431    0    .083
    2009 270 1799    0 1079  6.36  453  1.374    0    .081
    2010 270 2000    0 1133   5.9  441   1.29    0     .12
    2011 270 1872    0  936  5.03  577  1.085    0    .101
    2012 270 1976    0 1306     5  586  1.059    0    .104
    2013 270 1804    0  813  4.85  598   .864    0    .115
    2014 270 2296    0 1279  4.92  573   .713    0    .126
    2015 270 2083    0  881  4.03  640   .747    0    .167
    1980 253 1150    0  436    .3  667    9.2    0      .1
    1981 253 1006    0  512    .3  667    9.3    0      .1
    1982 253 1088    0  510    .2  500    8.3    0      .6
    1983 253  641    0  479    .3  667    8.4    0      .3
    1984 253  726    0  491    .3  667    6.2    0      .2
    1985 253  699    0  537    .3  667    8.7    0      .5
    1986 253 1111    0  444    .3  333    6.3    0      .3
    1987 253  941    0  429    .4  750      6    0      .6
    1988 253  703    0  404    .7  571    4.1    0      .5
    1989 253  799    0  494    .6  667    6.4    0      .5
    1990 253 1112    0  515   .81  691   8.03    0     .98
    1991 253  907    0  669    .9  467   6.25    0     .88
    1992 253  988    0  617  2.39  536   7.04    0    1.22
    1993 253 1153    0  695  2.37  430   9.72    0    1.83
    1994 253 1217    0  770   1.9  474   9.85    0    1.74
    1995 253 1226    0  598   1.8  611   7.58    0   1.583
    1996 253 1176    0  640   1.7  588  6.497    0   1.285
    1997 253  978    0  568   1.5  467  6.608    0   1.137
    1998 253 1006    0  595   3.7  541  5.458    0    .962
    1999 253    0    0  724  1.46  918  7.061    0   1.016
    2000 253 1307    0  658   .94  319  5.509    0    .922
    2001 253 1258    0  838  2.33  365 10.772    0   1.696
    2002 253 1281    0  723  2.16  319  9.859    0   2.528
    2003 253 1233    0 1016  6.92  397 12.337    0   5.994
    2004 253 1037    0  564  6.11  363 10.133    0   4.093
    2005 253 1037    0  564  6.11  363  9.304    0   5.387
    2006 253  929    0  780  6.74  337  8.353    0   4.846
    2007 253 1145    0  843  4.62  398  9.124    0   4.765
    end
    I want to run a multinomial regression with area as the dependent variable and the yields as independent variables. I am aware that a multinomial regression needs a categorical dependent variable, so I reshape my data like this:


    Code:
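    reshape long AREA, i( YEAR id1 ) j(CROPCAT)
    * build a numeric id for each state-crop pair
    egen concatenate = concat( id1 CROPCAT )
    destring concatenate, replace
    * xtset expects the panel variable first, then the time variable
    xtset concatenate YEAR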
    Now it looks like this:


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float YEAR int id1 byte CROPCAT float AREA int YIELD1 float(YIELD2 YIELD3 YIELD4) int concatenate
    1980 25 1 2333    0   .29  .999    0 251
    1980 25 2 1900    0   .29  .999    0 252
    1980 25 3    0    0   .29  .999    0 253
    1980 25 4    0    0   .29  .999    0 254
    1980 26 1  820  179 2.738  .305 .004 261
    1980 26 2 1111  179 2.738  .305 .004 262
    1980 26 3    0  179 2.738  .305 .004 263
    1980 26 4  2.8  179 2.738  .305 .004 264
    1980 30 1 2187  500 123.1    .2   .4 301
    1980 30 2  500  500 123.1    .2   .4 302
    1980 30 3    0  500 123.1    .2   .4 303
    1980 30 4    2  500 123.1    .2   .4 304
    1980 31 1 1645  250  .084  .006    0 311
    1980 31 2 1958  250  .084  .006    0 312
    1980 31 3    0  250  .084  .006    0 313
    1980 31 4   .4  250  .084  .006    0 314
    1980 33 1 1400    0  .811 1.549    0 331
    1980 33 2 1200    0  .811 1.549    0 332
    1980 33 3    0    0  .811 1.549    0 333
    1980 33 4    0    0  .811 1.549    0 334
    1980 34 1 2715  792 288.5    .6  4.7 341
    1980 34 2 2000  792 288.5    .6  4.7 342
    1980 34 3  333  792 288.5    .6  4.7 343
    1980 34 4   24  792 288.5    .6  4.7 344
    1980 35 1 2045 1000  69.6     0   .1 351
    1980 35 2    0 1000  69.6     0   .1 352
    1980 35 3  750 1000  69.6     0   .1 353
    1980 35 4    1 1000  69.6     0   .1 354
    1980 36 1 1431  250 9.977  .198 .072 361
    1980 36 2 1340  250 9.977  .198 .072 362
    1980 36 3  500  250 9.977  .198 .072 363
    1980 36 4   .4  250 9.977  .198 .072 364
    1980 37 1 1437  250 2.535  .558 .006 371
    1980 37 2 1571  250 2.535  .558 .006 372
    1980 37 3    0  250 2.535  .558 .006 373
    1980 37 4   .4  250 2.535  .558 .006 374
    1980 39 1 1278  429 2.092  .023 .062 391
    1980 39 2 1667  429 2.092  .023 .062 392
    1980 39 3  269  429 2.092  .023 .062 393
    1980 39 4   .7  429 2.092  .023 .062 394
    1980 40 1 2455 2000   208    .1   .5 401
    1980 40 2    0 2000   208    .1   .5 402
    1980 40 3 1000 2000   208    .1   .5 403
    1980 40 4    1 2000   208    .1   .5 404
    1980 42 1 2194 1000  45.6    .5   .3 421
    1980 42 2 1500 1000  45.6    .5   .3 422
    1980 42 3  308 1000  45.6    .5   .3 423
    1980 42 4    1 1000  45.6    .5   .3 424
    1980 43 1 2778    0  90.9    .1   .2 431
    1980 43 2    0    0  90.9    .1   .2 432
    1980 43 3    0    0  90.9    .1   .2 433
    1980 43 4    0    0  90.9    .1   .2 434
    1980 44 1 1056  500   3.7    .1    0 441
    1980 44 2 1111  500   3.7    .1    0 442
    1980 44 3    0  500   3.7    .1    0 443
    1980 44 4   .2  500   3.7    .1    0 444
    1980 45 1 1275  167 5.475  .191 .051 451
    1980 45 2 1611  167 5.475  .191 .051 452
    1980 45 3  196  167 5.475  .191 .051 453
    1980 45 4   .6  167 5.475  .191 .051 454
    1980 46 1  952 1000     2     0    0 461
    1980 46 2 1100 1000     2     0    0 462
    1980 46 3    0 1000     2     0    0 463
    1980 46 4   .1 1000     2     0    0 464
    1980 47 1 1281  600 8.546  .953 .193 471
    1980 47 2 1436  600 8.546  .953 .193 472
    1980 47 3  750  600 8.546  .953 .193 473
    1980 47 4   .5  600 8.546  .953 .193 474
    1980 48 1 3162 1000 264.2   4.6   .8 481
    1980 48 2 1800 1000 264.2   4.6   .8 482
    1980 48 3  500 1000 264.2   4.6   .8 483
    1980 48 4    1 1000 264.2   4.6   .8 484
    1980 49 1 1403 3500  12.6    .1   .1 491
    1980 49 2 1360 3500  12.6    .1   .1 492
    1980 49 3  750 3500  12.6    .1   .1 493
    1980 49 4   .2 3500  12.6    .1   .1 494
    1980 50 1 1303    0   3.7    .2    0 501
    1980 50 2 1176    0   3.7    .2    0 502
    1980 50 3 1000    0   3.7    .2    0 503
    1980 50 4   .1    0   3.7    .2    0 504
    1980 51 1 1110  500   9.7    .3    0 511
    1980 51 2  964  500   9.7    .3    0 512
    1980 51 3    0  500   9.7    .3    0 513
    1980 51 4   .2  500   9.7    .3    0 514
    1980 54 1 2633  400 268.4   5.8  1.2 541
    1980 54 2 1875  400 268.4   5.8  1.2 542
    1980 54 3  667  400 268.4   5.8  1.2 543
    1980 54 4    5  400 268.4   5.8  1.2 544
    1980 55 1 3066  545 330.4  11.9  7.4 551
    1980 55 2 2000  545 330.4  11.9  7.4 552
    1980 55 3  731  545 330.4  11.9  7.4 553
    1980 55 4   11  545 330.4  11.9  7.4 554
    1980 56 1 1016  500   9.4    .3    0 561
    1980 56 2  963  500   9.4    .3    0 562
    1980 56 3    0  500   9.4    .3    0 563
    1980 56 4   .2  500   9.4    .3    0 564
    1980 57 1  940  333   5.1    .2   .1 571
    1980 57 2  871  333   5.1    .2   .1 572
    1980 57 3    0  333   5.1    .2   .1 573
    1980 57 4   .6  333   5.1    .2   .1 574
    end

    My problems are the following:

    1) Is it OK to run mlogit on panel data? I saw a thread by Clyde Schechter (https://www.statalist.org/forums/for...for-panel-data) that says to xtset the data and use femlogit. However, in my case femlogit takes a very long time, so I xtset my data and tried mlogit instead.


    2) Several papers use ‘land use shares’ as the dependent variable (e.g. https://doi.org/10.1111/agec.12551), whereas I understand that the multinomial model requires a categorical variable. Is there any way to use shares as the dependent variable and link them to the categorical variable (something like what the cmmprobit command in Stata 16 does)?

    3) Following from point 2, I am evidently missing something, because when I run mlogit I get the following result:


    [Image: mlogit output (1Untitled.png)]


    However, if I run a fixed-effects regression on the AREA variable, I get the following result, which seems to indicate that 'yields do impact area under a certain crop':

    [Image: fixed-effects regression output (Untitled.png)]


    What am I missing? Any suggestions would be helpful.
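
    One possible explanation, sketched under the assumption that the model was fitted as mlogit with CROPCAT as the outcome and the yields as regressors (the output image is not reproduced here, so this is a guess): in the long layout every YEAR-id1 cell contains each CROPCAT value exactly once, with identical yields on all four rows. The outcome frequencies therefore cannot vary with the yields, and AREA never enters the likelihood. A quick check on the reshaped data, where n_cat is a hypothetical helper variable:

    ```stata
    * Assumes the long data from the second -dataex- listing are in memory.
    * Count how many crop categories each state-year cell contains.
    bysort YEAR id1: generate byte n_cat = _N
    tabulate n_cat

    * If every cell holds all four categories, each CROPCAT value is equally
    * frequent whatever the yields are, so mlogit has nothing to explain.
    tabulate CROPCAT
    ```

    If both tables are perfectly balanced, the fixed-effects regression and the mlogit are answering different questions: only the former uses the AREA values.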

  • #2
    It seems that what you are looking for is a fractional multinomial logit; see ssc desc fmlogit and http://maartenbuis.nl/publications/prop.html
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------
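
    A minimal sketch of how the suggestion in #2 might be applied to the wide data from #1. It assumes that AREA1-AREA4 are measured in the same units and together make up the total cropped area, which the thread does not confirm; total_area, share1-share4 and share_sum are hypothetical names introduced for illustration.

    ```stata
    * Sketch only: assumes the wide layout of the first -dataex- listing.
    egen double total_area = rowtotal(AREA1 AREA2 AREA3 AREA4)
    forvalues j = 1/4 {
        generate double share`j' = AREA`j' / total_area if total_area > 0
    }

    * The shares should add up to 1 in every usable observation.
    generate double share_sum = share1 + share2 + share3 + share4
    assert abs(share_sum - 1) < 1e-6 if !missing(share_sum)

    * fmlogit is user-written; install it once with: ssc install fmlogit
    fmlogit share1 share2 share3 share4, eta(YIELD1 YIELD2 YIELD3 YIELD4)
    ```

    Shares of exactly zero are kept as they are; a fractional model does not require every crop to be grown in every state-year.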



    • #3
      Dear Mr. Maarten Buis, thank you for your prompt reply.
      After your reply I went through the fmlogit help file and followed the examples demonstrated there. That does seem to be what I need.
      Reading through some of the literature also shows that these studies use fmlogit (e.g. https://pubs.er.usgs.gov/publication/70197705).

      Once again, thank you.



      • #4
        Dear Mr. Maarten Buis
        (I am not sure whether I should start a new thread or continue with this one)

        FMLOGIT is working fine.

        Using the same data as above, I run the following command:

        Code:
        fmlogit RABI_OTHERS_PRO_AREA WHEAT_PRO_AREA RM_PRO_AREA  CPEA_PRO_AREA  BARLEY_PRO_AREA, ///
            eta(WHEAT_PROD CPEA_PROD RM_PRO BARLEY_PROD )

        However, I am still facing two issues.

        1) The signs of the explanatory variables are the opposite of what they should be. For example, wheat productivity should have a positive impact on wheat land use, but its coefficient is negative.
        So I checked the correlations between the explanatory variables (i.e. the yields of all crops) and found that two pairs are highly correlated (0.7 and 0.56).
        What would be a feasible way to deal with this issue? I ask because I need to see the impact of every yield variable on its respective land use, so I am reluctant to drop any variable. I realize this is a rather open-ended question, but perhaps you have a suggestion?


        2) I run the suest postestimation command and get the error "model was estimated with a nonstandard vce (robust)". Having followed a few threads on the topic, I think this may be because the robust option is included. Is it possible to use suest after fmlogit?
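
        On issue 1, one thing that may be worth checking before dropping anything: in a multinomial logit type model the coefficients are expressed relative to the base outcome, so a negative coefficient on wheat productivity in the wheat equation does not by itself mean that the wheat share falls. A hedged sketch of the two checks, using the variable names from the command above; the dfmlogit call assumes that this marginal-effects command ships with the fmlogit package, which should be confirmed in the help file.

        ```stata
        * Pairwise correlations among the yield regressors.
        correlate WHEAT_PROD CPEA_PROD RM_PRO BARLEY_PROD

        * Refit the model from the post above.
        fmlogit RABI_OTHERS_PRO_AREA WHEAT_PRO_AREA RM_PRO_AREA CPEA_PRO_AREA BARLEY_PRO_AREA, ///
            eta(WHEAT_PROD CPEA_PROD RM_PRO BARLEY_PROD)

        * Assumption: dfmlogit reports marginal effects on the shares themselves;
        * check -help fmlogit- before relying on this call.
        dfmlogit
        ```

        If the marginal effect of WHEAT_PROD on the wheat share is positive, the sign of the raw coefficient only reflects the comparison with the base category.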
