Problem about interpretation of "percentage" variables

Beni Imeraj

Join Date: May 2018

Posts: 7
#1

30 May 2018, 05:59

Dear users,

This is my first post on Statalist. I am empirically investigating the relationship between investment banks and firms during SEOs (seasoned equity offerings). The main goal is to find out whether the reputation of investment banks influences the issue costs. The sample consists of 1,035 observations (SEOs) and 178 investment banks during 1990-2017. So far I have managed to build a literature-based regression model. Here is the regression output I obtained from Stata:

Regression Output.pdf

The dependent variable is "GPCT", which shows the issue costs as a percentage of the issue amount. The independent variable is "MKSH" (the market share of the bank). It is defined as the total issue amount of the bank divided by the total issue amount of all banks in the sample. Since both variables are defined as "percentages" (i.e. with a range from zero to one hundred), I am unsure how to interpret the coefficient (-3.106) correctly. My first intuition was this: if "MKSH" increases by one percent, "GPCT" decreases on average by 3.106 percent. But, as already mentioned, I am quite uncertain about that.

Does anyone have any experience with proportional variables? Any form of help is appreciated.

Kind regards,
Beni
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#2

31 May 2018, 12:36

You didn't get a quick answer. You'll increase your chances of a useful answer by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stata output, and sample data using dataex. Also, it is not a good idea to attach files - for concerns with viruses, many won't open files.

That is probably the right interpretation.
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#3

31 May 2018, 12:37

To be specific, you might want to say "predicted GPCT declines" rather than "on average". These are slightly different and what you have is a predicted change. You may find that using the margins command after your regression will be helpful in interpretation.
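
A minimal sketch of the margins suggestion, assuming the variable names used elsewhere in this thread. The values inside at() are arbitrary illustrations, not taken from the poster's analysis.

```stata
* Sketch only: variable names follow the regression discussed in this thread.
regress GPCT MKSH LNSIZE LNAMT OP RETURN VOLA MLEV

* Average marginal effect of MKSH on predicted GPCT.
* After a linear model without interactions this equals the coefficient on MKSH.
margins, dydx(MKSH)

* Predicted GPCT at a few illustrative market shares (values chosen arbitrarily,
* assuming MKSH is stored as a proportion between 0 and 1)
margins, at(MKSH = (0.05 0.10 0.15 0.20))
```

The second margins call reports predicted levels of GPCT rather than a slope, which is often easier to describe as a "predicted change" between two market shares.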
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#4

31 May 2018, 12:50

I have not seen the output, since it is not under code delimiters. That said, since the DV is a percentage, you may wish to check previous posts on this matter. In short, a generalized linear model with the binomial family and logit link is often mentioned as a good strategy. Hopefully that helps.

Best regards,

Marcos
Amin Sofla

Join Date: May 2018

Posts: 67
#5

31 May 2018, 12:54

Duplicate reply #4

Originally posted by Beni Imeraj

Dear users,
The dependent variable is "GPCT", which shows the issue costs as a percentage of the issue amount.
Beni

Note: As far as I understand, your dependent variable is bounded (between 0 and 1). If you use models such as OLS, logit, or tobit, you might face some estimation challenges (e.g., OLS might produce predicted values that lie outside the admissible range). I suggest you use the fractional response model (Papke and Wooldridge 1996, 2008), which overcomes several limitations of the other models.
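
A hedged sketch of how a fractional response model could be fitted in Stata. It assumes that GPCT is stored on a 0-100 percent scale and therefore must be divided by 100 first; gprop is a hypothetical variable name introduced only for illustration.

```stata
* Assumption: GPCT is a percent between 0 and 100, so rescale to a proportion.
generate double gprop = GPCT / 100    // gprop is a hypothetical new variable

* Fractional logit (fracreg is available from Stata 14 onward)
fracreg logit gprop MKSH LNSIZE LNAMT OP RETURN VOLA MLEV

* Comparable quasi-likelihood fit with glm for older Stata versions
glm gprop MKSH LNSIZE LNAMT OP RETURN VOLA MLEV, family(binomial) link(logit) vce(robust)

* Average marginal effect of MKSH on the predicted proportion
margins, dydx(MKSH)
```

The margins result is on the proportion scale, so multiplying it by 100 would return it to percentage points of the issue amount.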
Beni Imeraj

Join Date: May 2018

Posts: 7
#6

01 Jun 2018, 01:28

Originally posted by Phil Bromiley

You didn't get a quick answer. You'll increase your chances of a useful answer by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stata output, and sample data using dataex. Also, it is not a good idea to attach files - for concerns with viruses, many won't open files.

That is probably the right interpretation.

Hello Phil,

Thank you for your advice on using Stata. I'll keep that in mind for the future.

Beni Imeraj

Join Date: May 2018

Posts: 7
#7

01 Jun 2018, 02:16

Hello Marcos and Amin,

Thank you for your replies. Unfortunately, I have to use multiple linear regression (OLS), since it is the only estimation method I know from the lectures. Also, I do not have any experience with logistic or fractional response models as suggested by Amin.

Here is the data:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input double GPCT float(MKSH DEALS LNSIZE LNAMT) double(OP RETURN VOLA) float MLEV
 2.35  .184667 71  7.971431  4.832306     50   .08403893560171127  .09115817124758006    .21262294
 .802  .184667 71  8.543835   5.46806   23.7   .09594570472836494  .07297996183052784     .5929236
 .719  .184667 71    9.4892  5.595407  36.88   -.1807723673991859   .0889716048763638    .03987548
5.013  .184667 71  5.769698  4.147585 28.125   .16072244849056005 .033895644536343944     .4177664
3.153  .184667 71  7.727737  4.901475  50.75 .0025220802053809166  .07276648292016454    .26218113
 .241  .184667 71  9.320098  6.983039    137 -.018764515407383442  .03897641466675567     .2273697
4.587  .184667 71  8.286655  6.190711  16.35  -.14634296670556068  .24538693617712057      .495086
 .498  .184667 71  9.637445  6.531242  152.5   .17802192131057382   .0211344983626902     .2930205
    4  .184667 71  7.109841 4.4998097     20  -.03629590105265379  .08308287291030995    .27959198
 .344  .184667 71   7.63292   5.44889  23.25   .28656600415706635 .058558392066924016     .4616725
  1.2  .184667 71   9.10093  5.569298  52.45     .288580946624279   .0211344983626902    .49189475
    4  .184667 71  9.347186  6.907755    5.5   -1.003104072064161  .35573440958067604    .17157084
2.505  .184667 71  9.563318  6.774224   45.9   .14528556168079376 .023163206981263996     .3906057
3.226  .184667 71  7.002659 4.6279097     31    .7493090853095055  .20208232646790023     .7719988
2.247  .184667 71  7.974396   5.13049  22.25  .050643717870116234  .14310206079634358     .1960281
  .95  .184667 71  7.050526  3.570518  18.95    .2615654394030571  .06557364177223586     .3777544
 .423  .184667 71  8.508331  6.421622  30.75   .20116464421153069  .11628903560505101     .3882173
 .133  .184667 71  7.909074  5.665561    105    .3900191185530275  .08078607423744823     .1745572
4.688  .184667 71 4.7044544 4.3820267     16                    0                   .            0
  .85  .184667 71  9.545109  5.940829   58.5  -.21892344672232866  .12980281248559888    .28018346
1.368  .184667 71  8.053697  5.515443  24.85  .009648095816373825  .08697413369643892     .3811419
    2  .184667 71  5.048798  5.676754   36.5    .3389544580131769  .07362044341901973    .11031355
 1.29  .184667 71  5.674491  3.919495   7.75   .13256442826241255  .09153615939545921     .1762318
 1.42  .184667 71   5.95605  5.026692  28.76  -.15887927636504173  .15229152963300172    .02842673
 .563  .184667 71 10.744473  5.948557     16   .12391593866050243  .10958125856020848     .7610158
 .888  .184667 71   8.35494  5.304349  52.95    .6274538673460484  .14241499908996447     .4586891
    1  .184667 71  6.399388  5.253582   25.5     .428437327966094  .18538867701028594   .006629554
1.105  .184667 71  8.632733  5.986279     19    .3650582693517208  .08634364629093495      .573481
 .984  .184667 71  9.503068  6.463165   31.5    .0125001585111022  .03649550852928136     .3735777
1.116  .184667 71  7.015883  5.746341  92.24    .6519124432234094  .14932422064868756     .1137495
3.234  .184667 71  7.917194 4.1502523 29.375  -.06453852262347937  .05166505428114556     .3717685
2.629  .184667 71  6.895411  5.724402   87.5   .38099225610494614 .046225530881965564    .11083287
    3  .184667 71  8.562491  5.480639     30  -.17185026314109564  .11787302549791355    .24320415
 .868  .184667 71  6.322362  5.523459   33.4   .27206784300506115  .02379178001725137  .0024910606
2.778  .184667 71  4.633223  3.052491     36   .23163817450404167   .1274592692984387            0
3.226  .184667 71  5.879974  5.056959     31   -.1831250786781311  .16010615633848654    .14580448
1.267  .184667 71  6.570109  5.276338  28.25   -.1582496613264084  .10695415658097711    .08901337
1.184  .184667 71  7.181857  4.671482  10.98    .3451331942342222  .08937155946771218    .27637178
  1.5  .184667 71 12.576808   5.90525     91  -.29182301089167595  .18261398951817115     .7671688
1.035  .184667 71  7.674068 4.6349716  79.25   .02095681242644787  .05252093826920835    .22914417
4.829  .184667 71  6.273436 3.3322046     28 -.019502240465953946  .08073803607502335     .4528725
 .397  .184667 71 8.0330715 4.6051702  37.75   .24665410118177533   .2624884269843744     .5218191
1.113  .184667 71  6.510464 3.9871304  26.95    .2649298217147589  .06306133748307655    .19435094
2.756  .184667 71  13.69283  8.517193    123  -.18840251117944717  .18356574465770767     .9311063
 .632  .184667 71  8.418234  5.221706  14.25 -.043560296413488686  .07036241082690346     .4615819
1.085  .184667 71  7.604845  4.859812  32.25   .21154207873041742  .09232019569294842    .21937226
  .94  .184667 71  8.196178 4.6109037  37.25    .1284463454503566 .028957044803129266     .4377641
5.737  .184667 71 2.9976804  2.944439     19   .17783723399043083  .23841584520286827            0
3.005  .184667 71  9.188197  6.697775   9.65   -.3420655392110348   .1139160130796965     .4083571
    3  .184667 71  8.446932  5.371972     27    .3528213668614626  .03146069386924564     .7844225
 .559  .184667 71  8.459141  6.234705  26.85    .2639068104326725  .07504918833405361     .3673933
4.412  .184667 71  5.681135  3.303217    6.8   -.8754687299951911  .12787735282874285    .01462418
1.473  .184667 71  9.354267  5.603962   5.43     1.29472716152668   .3692937537114834    .55000466
1.299  .184667 71  6.724668   4.64391  11.55   .22909900406375527  .04136934742887819    .13515094
1.554  .184667 71  9.424209  6.171283   39.9   .11005536839365959  .16175408212102457    .38933495
2.756  .184667 71 13.928658  8.517193    123  -.03393610846251249  .09419537411763893     .8788741
  3.5  .184667 71  7.677598 3.4773865   18.5  .014929777011275291  .03965568515159191     .5693086
 .724  .184667 71  8.123513  4.612434  34.55   .08942968677729368  .08020627097959156     .7435487
1.546  .184667 71  8.784085  4.418636   4.85     .958699893206358     .84544851333111     .3636561
  .43  .184667 71  7.596116 -.3147107   39.5     .644860319327563  .08639016619853328     .2401595
 .426  .184667 71  8.460475  6.068426     54    .5144746135920286  .05269109052079194     .4241902
4.129  .184667 71  8.293065  5.919727     31   .08055201172828674  .21256817023636312     .3537462
2.083  .184667 71 10.158372  7.090077     24   -.2870521694421768   .2052432129379847     .2776395
1.846  .184667 71  6.364407 4.4367514     26    .6095025613903999  .19633671654602053    .25653303
  1.5  .184667 71 12.651457  7.180597     93   .07788868620991707  .09125201851197555     .7944344
  .79  .184667 71  9.618911   5.53418  31.65   .23559929989278316  .06287969398255927     .4268032
1.171  .184667 71  8.093821  5.927752   20.5  -.11995471827685833   .1085271307963974     .3246204
 .913  .184667 71  8.271647  5.634789     23   .14343246491625905  .04417838885240488     .3843912
    3  .184667 71 10.290608  4.638605     47   .45907454937696457  .21931196461566663     .9288774
 .785  .184667 71  7.793128  3.937496   44.6   .14028429659083486   .0211344983626902     .4101256
2.857  .184667 71  7.341703  5.059425  26.25    .6506794821470976   .1253309185847216    .58279806
 3.25 .1545828 43  4.983565  4.744932   57.5    .4996480457484722   .2978136562781373            0
1.961 .1545828 43  10.21994  6.639876  43.35   -.3137439927086234  .11934545719395896     .3538321
1.371 .1545828 43  9.846354  5.820913     62    .3885155562311411  .15858910154723338     .3504521
2.089 .1545828 43  5.622583 4.6793494   35.9     .372711393982172  .06211437737037708            0
3.382 .1545828 43  8.621008 4.6395717   20.7 -.018447123700752854  .03232309690994448     .6047401
2.098 .1545828 43   7.29059  3.273288   7.15   .37481136340647936  .08511924225411426    .38441715
 .357 .1545828 43  7.858061  5.269728  53.25   .34781939536333084  .06969361036005257     .2484216
 .398 .1545828 43 10.015774  5.556152 123.25   .08624245226383209  .07153185380911076     .3989606
  .36 .1545828 43  8.961387  5.850765   69.5  -.02017359109595418 .049627019542600445    .14503181
 .119 .1545828 43  7.282053  3.921478  33.65    .2714935161639005  .05803583975592963     .3302673
    5 .1545828 43  5.856383 4.6979227   30.5      .17289147246629  .09091432450181787 .00058121874
 .783 .1545828 43  7.782797  5.437209  19.15   1.2005863087251782  .19850946626301513     .8549332
 .265 .1545828 43  9.361869  5.493062   16.2  -.03712723567150533  .07022852918081908     .2370153
 .284 .1545828 43  7.529183  3.559625  35.15   .17811250174418092  .04156607278458014     .5368909
  5.5 .1545828 43  5.368743  3.962716  13.15    .1298801302909851  .08487283827014248     .1627186
  2.5 .1545828 43 14.261572 9.2103405   40.5 -.054600369185209274  .14583644942564208     .7435288
3.504 .1545828 43  4.443251  3.565327 43.375   .47082076454535127  .10005624230890829   .006174415
1.308 .1545828 43  7.037745 4.3851466  26.75  .029086739756166935  .03251304873214222    .21222636
2.751 .1545828 43  7.770875  6.213958  63.25    .2814764748327434  .11388281759502845    .23509933
    6 .1545828 43 4.2468367  3.937067    2.8   .06856267340481281  .24344255552910232            0
5.389 .1545828 43 4.6736603 3.1696014  25.05    .2667304575443268   .2539736283842976    .08603836
5.287 .1545828 43 4.2619047 3.2317526  19.48   .11878353683277965    .170819092549315 .00043109275
 .933 .1545828 43  9.802423  5.642935   67.5  -.09172967728227377  .04491585513457121    .07118935
    2 .1545828 43  8.653621  5.195731     19   .13362956792116165  .36223386055228174     .6214647
 .756 .1545828 43  7.625888  5.019661   17.2  -.05706942453980446   .1206897699862096     .5541437
 .373 .1545828 43 8.9383745  5.997695  40.25   .08503249287605286  .11484360281278093    .52851105
    6 .1545828 43  3.567192  3.948509     24   .09439147682860494  .22402522199691133    .09293515
2.752 .1545828 43 14.592563  8.518292  35.25   .15315750474110246  .16448126307302985     .8515996
 .606 .1545828 43  7.612217  5.629195  41.25    .1661089714616537  .08985361520112993     .4391505
end

Code:

reg GPCT MKSH LNSIZE LNAMT OP RETURN VOLA MLEV

Output:

I hope the data example helps further.
What seems clear is that a larger market share of the bank (MKSH) is associated with lower issue costs (negative sign). But the question of by how much remains tricky.
I have looked for answers in many statistical forums, unfortunately without success.

Beni Imeraj

Join Date: May 2018

Posts: 7
#8

01 Jun 2018, 02:18

Please ignore the second table. It was an older version of the output.
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#9

01 Jun 2018, 04:54

In #1, it seems you wanted information about the best approach for a DV measured as a proportion. That being so, the recommendation to use - glm - with the binomial family and logit link still applies.

In #7, the reason you gave for using OLS is that it is the only method covered in your lectures. Logically, that argument conflicts with asking for advice from members with expertise in using a proportion as the DV.

By the way, there is a difference between a proportion and a percentage, and I recommend you take a look at that.

What is more, the data set you shared apparently provides neither a percentage nor a proportion for the DV.

If we look carefully, out of 100 observations, we find in the DV - lo and behold! - 65 with values beyond 1 (it ranges from .119 to 6).

With regard to the standard interpretation of coefficients in OLS regression, I gather that won't change on account of the DV being a proportion. I mean, the rule of "for a 1 point increase in the IV...", found in any textbook, still applies. The use of - margins - may be an interesting strategy. So may elasticities or semielasticities, depending on the case.
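
To make the "1 point increase" rule concrete, here is a small sketch. It assumes MKSH is stored as a proportion between 0 and 1 (the sample data in #7 show values of roughly 0.15 to 0.18) and that GPCT is in percent units. Under those assumptions, a one-unit change in MKSH would mean moving from 0% to 100% market share, so a rescaled change is easier to read.

```stata
regress GPCT MKSH LNSIZE LNAMT OP RETURN VOLA MLEV

* Predicted change in GPCT for a 0.01 rise in MKSH, i.e. one percentage
* point of market share if MKSH is stored as a proportion (an assumption)
lincom 0.01*MKSH

* Same calculation for a 10-percentage-point rise in market share
lincom 0.10*MKSH
```

With the coefficient of -3.106 reported in #1, the first line would correspond to about -3.106 × 0.01 ≈ -0.031 percentage points of GPCT. If MKSH were instead stored on a 0-100 scale, the coefficient itself would already be the change per percentage point.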

All in all, these aspects are not related to the main issue when dealing with DV as a proportion: the fact that, under OLS, you may get predicted values beyond 1 as well as below zero, which are somewhat preposterous.

Hopefully that helps.

Last edited by Marcos Almeida; 01 Jun 2018, 05:07.

Best regards,

Marcos
Nick Cox

Join Date: Mar 2014

Posts: 35699
#10

01 Jun 2018, 05:25

Looking at the sample data I see that the response -- named GPCT -- has a distribution like this:

On this evidence, it is not a bounded proportion (values fall between 0 and 1).

1. It may be a bounded percent (values fall between 0 and 100)

If so, logit applies, but you need to divide by 100 and use glm.

2. It may be an unbounded percent (notably, negative values are possible).

If so, regress is possibly OK.

3. It may be unbounded percent, but negative values are impossible.

If so, regress is not OK, as it could predict negative responses (and your model does do that for the sample data).

If so, poisson is a better bet.

Moral: There is no unequivocal good advice unless you tell us more about your data. You do give us a verbal definition for GPCT as issue costs as a percent of the issue amount, but I really have no idea of whether that can be negative. I guess not, so mark #3 just above.
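
A rough sketch of how case 3 could be checked and handled, assuming the variable names from #7; xbhat is a hypothetical name for the stored predictions.

```stata
* Inspect the range and shape of the response
summarize GPCT, detail

* Does the linear fit predict impossible negative values?
regress GPCT MKSH LNSIZE LNAMT OP RETURN VOLA MLEV
predict double xbhat, xb              // xbhat is a hypothetical variable name
count if xbhat < 0

* Poisson regression for a non-negative response. Robust standard errors
* are requested because GPCT is not a count.
poisson GPCT MKSH LNSIZE LNAMT OP RETURN VOLA MLEV, vce(robust)
```

Because Poisson regression uses a log link, its predictions are always positive, and each coefficient is read approximately as a proportional change in predicted GPCT per unit change in the predictor.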

Last edited by Nick Cox; 01 Jun 2018, 05:28.