  • Negative independent values

    Hello,

    For my research I am currently investigating the effect of financial ratios on the initial return of a company (that is, the return on the first trading day after an IPO). However, these ratios take both positive and negative values because earnings can be negative at the time of the IPO. I wanted to use a logarithmic transformation to bring the distributions closer to normal, since these values have outliers and span a wide range. I have read many forum threads where people add a constant to solve this problem, but this is not my preferred method, as the results are very sensitive to the chosen constant. What is the best method to deal with this problem?

    A second question: I would like to investigate whether the relation between my dependent variable and my independent variable differs for positive and negative values of the latter. In other words, I would like to estimate

    Initial Return = a + b1*X- + b2*X+ + c

    where X+ = X if X > 0 and X+ = 0 if X <= 0 (and X- is defined analogously for X < 0), so that I can compare the values of b1 and b2.

    Thanks in advance,

  • #2
    Anke:
    welcome to this forum.
    If your concern is about the normality of the regressand and/or regressors, this is definitely not an issue, as normality is a (weak) requirement for the distribution of the regression residuals only: hence, keep the variables on their original metric.
    As far as your second question is concerned, unless what you're after is a simple -if- clause, please provide an excerpt/example of your data via -dataex-. Thanks.
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Sorry for not posting my data in the first place. I have included it now.

      My first question was regarding the variables ir, EP and MTBV. I want to regress ir on EP and MTBV. However, I believe heteroskedasticity is a problem for these variables, which was the reason I wanted to transform them.

      My second question was regarding ir and Park1. I want to regress Park1 on ir+ and ir-, where ir+ = ir if ir > 0 and 0 if ir <= 0, and ir- = ir if ir < 0 and 0 if ir >= 0.



      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input float(ir Park1) double(EP MTBV)
       -5.166667  .07506804  -.159280365600182             -1.21
       10.813824   .1032066 -.0066878660794277 -147.541614689595
        38.88889  .10917184 -.0662938710029665 -16.0235100278121
       34.349354  .23550603                  .  44.5581505927566
            48.5  .23550603 -.0111059090909091  244.116695177771
              -6  .12051503 -.0634009625160069  40.1109628597523
        69.81132   .7214728 -.0387479433288622  12.0501387314184
             -.8 .067390166 -.0355592676937005  33.0226105578414
           41.25  .08019377 -.0058494607843137  10.5911645390367
       -.3344482   .0543181 -.0260356337194098  98.8715063543959
       -8.695652   .0736302 -.0885745137984951  27.5266791660271
        37.43483   .2933486 -.0182533510540454 -5.15005761859218
       16.666666  .17131655 -.0260799456562239 -53.1452300809544
       14.636364  .14934006  -.017592212037795 -26.5432238984674
        44.17647   .1694825 -.0234113339649493 -26.0375007988284
             -10  .09197666 -.0470971682358634 -24.2358647333957
       14.802982   .0839516  -.050039886364976 -8.62371968726303
       1.3806707  .04028942 -.0512810570243555 -27.4959817514443
       12.472648   .0987192    -.1176154709482  -114.41515534373
         .464499  .05978204 -.0427312446626371 -13.2141207643215
       4.1666665  .04987947  -.085104811811677  -14.767256910096
        48.31256   .3466376 -.0282117760430546 -13.5544252436707
            11.5  .11276045 -.0352587786951645 -25.5120351203628
               2  .20114894 -.0816238197529043  -41.957090780234
       18.333334   .1211018 -.0309908751030619 -10.6989072293378
           -12.4  .12330186  -.103037493421294  7.76397048559178
       22.764227   .1165677 -.0024608129532119 -407.001762830482
       .27272728  .06753794 -.0263740485637518 -175.255595723014
       12.663755  .18736294  -.188837195853655 -8.66848198080891
       -13.55034  .04105661                  . -19.9408845953907
       29.605005   .1001632 -.0221103246609368  5.38980968515628
              36  .18792374 -.0196124819603476 -46.1387510609276
      -.13333334  .08526183 -.0451997177028823 -6.01585341985216
               0  .20665532  .0345480179998136  6.71121859411459
       -26.74271  .17444123  -.115604815755071 -1.62307989478876
       73.611115  .14753233 -.0323318648033409 -12.3952738905527
        22.01835   .1401964 -.0309239076547588 -25.9581190644933
            54.5  .27170283  .0699988441780822  4.38565447914064
       12.533334   .1410097 -.0262878751526938 -45.1066701680672
       64.210526   .1335804 -.0423969216033562 -19.0120274475902
        20.07992   .2088904 -.0484806101478577 -12.0719395762315
       -2.189781  .12871909  -.044957321504329 -8.83061844993994
            -3.2  .07972112 -.0688405087649353 -12.2407048055463
       18.918919  .07915023 -.0576209196793244 -14.0787035358114
       23.076923   .1691419                  . -12.9471547611634
            43.5  .18526524  .0154932864864865  28.1516537194421
       -6.666667  .08354575 -.0884703687045884 -5.42923845397969
       -6.382979  .10697188 -.0968848480683023 -8.70365557370355
        6.716418     .06168  -.156361725678508  4.25546260202251
        2.407287   .0507281 -.0653625478126614 -6.42481135663829
       22.135706  .24989253 -.0618220015939023 -6.32630520766506
        7.692307  .05562487   .141569144867227 -2.62950702789534
           26.25   .1112235 -.0616005698264012 -13.1068935998791
       -2.803738  .07818856  .0020503578474999  15.9652924427408
       20.459566  .09424493 -.0236279929625642 -38.9227069027828
        26.82927   .1980817 -.0536719651764872 -5.79647306866294
            34.8  .15181875 -.0886804383172765 -5.17784359329068
        5.817174  .12433068 -.0833231423171306 -4.89439852953822
       37.716263  .12071796 -.0464325715693879  -11.027244143288
        .1996008 .072239205  -.095297694555838  -4.7969980729702
      -16.785715  .10386935 -.0881968668821188  59.4221463573926
       10.714286  .08019377 -.0730490175361597 -6.39302537274392
         37.9123  .06091908 -.0583552288629429 -5.11387718129746
              26   .2423067  .0365911140687149  13.1290175792633
       32.333332   .0757264 -.0839369175692802 -4.76621061702784
        51.87437  .12840733 -.0136950716949654 -41.9593618581907
       1.0666667  .05094146 -.0425916617602054 -16.4806471782813
       11.242603  .17985405 -.0294926698711139 -20.5307319485658
          41.375  .17106213  -.043507116712349 -14.4226456210737
       10.634495   .3031678  -.176480366661885 -3.97945860913623
      -1.1538461  .05663421  -.162041081497423  4.59493844634628
       -29.53368  .14875774  -.198623633668869 -12.7983741070909
       13.793103  .13401136 -.0821903965486526 -1.60853109341809
        4.952381  .08535817  .0293445841843133  13.1366193348594
           -6.25  .05742464  -.105357359681444 -4.82182943615611
       -22.22222  .09257692   .150285069090909  3.08630710807278
        34.15638   .2037856 -.0491291776998745 -7.02240870146442
        24.88263   .1268361 -.0125287466756425 -7.17244296480765
       -53.30189    .327733 -.0090721627920684  104.479257612128
               0  .05002808 -.0096439877523572 -104.142255731911
        74.94118  .11723028 -.0675605532254492 -5.45105602348277
      -18.518518  .12894647  -.104828021954057  -5.8652461755929
       -28.93333  .15406907   -.15257939643256  -3.5152229790285
              50  .13197936  -.035969679638163  6.48430538513974
       20.318726  .10739539 -.0460410650864324 -15.7888643422812
       28.607853   .1734828  .0157503523937914  5.14302255289468
      -2.6694045  .07484468  -.341927928702089 -2.08348003136965
       25.438597  .03728819  .0075169245890396  15.7504459674215
      -2.0588236   .1404235 -.0626873923965982 -8.57098343219472
       36.842106  .13908553 -.0538867662922857 -8.70591670926046
        5.434783  .19772157 -.0250199472700409 -40.7146471719276
             -25  .15120652 -.0330262107048005 -31.3808385026949
      -23.333334    .090041  -.139476553447125 -2.32457541623199
        38.81453  .28467557 -.0649390292861098 -7.27790262566629
      -29.411764  .20207217  -.224893330899793 -13.4969132495994
         -5.9375  .03277328 -.0936620611751568  5.74347185545518
       68.882355  .25030366 -.0377385016703177  -12.661791243504
       35.064934  .07817847 -.0388599205239236 -15.1826768410328
          1.5625  .02538486 -.0669341963663729 -9.31845227783564
               0  .05655833  -.267602458624033 -1.63422547238951
      end
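
      A minimal sketch of the split described above, using the example data. The names ir_pos and ir_neg are hypothetical (they are not in the posted data), and whether a piecewise-linear form suits these data is an assumption to check, not a recommendation.

      Code:
      * positive part of ir: ir if ir > 0, else 0 (missing ir stays missing)
      generate double ir_pos = cond(ir > 0, ir, 0) if !missing(ir)
      * negative part of ir: ir if ir < 0, else 0
      generate double ir_neg = cond(ir < 0, ir, 0) if !missing(ir)
      * separate slopes for positive and negative initial returns
      regress Park1 ir_pos ir_neg
      * Wald test of equal slopes
      test ir_pos = ir_neg

      Since ir_pos + ir_neg equals ir, equal slopes would reduce this to a plain regression of Park1 on ir.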


      • #4
        I'd say that heteroscedasticity is the least of your worries. How confident you are in Xb as a functional form and whether long tails or even outliers warp and tilt regression results seem to me bigger deals.
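
        A rough sketch of how those concerns could be checked on the example data with standard post-estimation commands. The variable name resid is hypothetical, and nothing here is output from this thread.

        Code:
        regress ir EP MTBV
        * residual-versus-fitted plot: look for curvature, unequal spread, outliers
        rvfplot, yline(0)
        * Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
        estat hettest
        * normal quantile plot of the residuals
        predict double resid, residuals
        qnorm resid
        * same coefficients with heteroskedasticity-robust standard errors
        regress ir EP MTBV, vce(robust)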

        Variables that can be positive, zero or negative but are prone to long tails or outliers are widely ignored in the literature on transformations, but they are covered in several journal papers. Cube roots, the so-called neglog, and the inverse hyperbolic sine (inverse sinh) are three candidates. Of these, cube roots are the most likely to be familiar from early algebra teaching, but if your education was like mine they were only touched on briefly. It's vital to notice that Stata isn't (and can't be) smart about cube roots of negative numbers, so you need to write your own code of the form

        Code:
        sign(x) * abs(x)^(1/3)
        as explained at https://www.stata-journal.com/articl...article=st0223

        The so-called neglog transformation is in Stata terms

        Code:
        sign(x) * log(1 + abs(x))
        or (better)

        Code:
        sign(x) * ln1p(abs(x)) 
        and has an arm-waving justification as preserving the sign of its argument (including mapping 0 to 0) and as behaving like log x for x >> 0 and like -log(-x) for x << 0.
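
        As a sketch, the three candidates could be generated for MTBV as follows. The new variable names are hypothetical, and ln1p() is assumed to be available in the installed Stata version (otherwise log(1 + abs(MTBV)) does the same job).

        Code:
        * cube root that respects the sign of its argument
        generate double MTBV_curt = sign(MTBV) * abs(MTBV)^(1/3)
        * neglog
        generate double MTBV_neglog = sign(MTBV) * ln1p(abs(MTBV))
        * inverse hyperbolic sine
        generate double MTBV_asinh = asinh(MTBV)
        * one possible use: the neglog-transformed predictor in place of the original
        regress ir EP MTBV_neglog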


        Using your example data, I pushed one variable, MTBV, through transplot (SSC) (see https://www.stata-journal.com/articl...article=st0223). Here I show normal quantile plots for the variable as it comes and as transformed by cube root and neglog. Using the normal as the reference distribution is just a convenience and doesn't imply that a predictor need have a normal distribution. But the converse is that skewness, long tails and outliers that might cause problems are explicit on such plots.

        Code:
        transplot qnorm MTBV , trans(@ sign(@)*abs(@)^(1/3) sign(@)*log(1+abs(@))) combine(row(1)) ms(Oh)
        Note that the gap around 0 for both transforms is a side-effect of the extent to which there are sample values at or near 0 and of the fact that the transformation function is steepest at 0. It's not pathological.

        Although there is always a question of interpretation, I would consider neglog as an alternative to the original scale for this predictor. It pulls in the outliers quite well.
        [Image: transplot.png, normal quantile plots of MTBV as it comes and as transformed by cube root and neglog]



        • #5
          Sorry, the second link is just the first repeated and should be https://www.statalist.org/forums/for...dable-from-ssc
