
  • Log Transformation Willingness to Pay Data

    I am working on an OLS meta-regression with a willingness to pay (WTP) variable as the dependent variable. Since the values range from a few US cents to about 30, I want to use it in log form. Unfortunately, some WTP values are negative. I've read about the procedure of adding a constant so that all values are positive. I think this is the easiest way, but I am not sure whether I can apply it to any data. Is there any other tool in Stata 16 that can be used for this?
    Additionally, do I also need to transform the explanatory variables? Some of them are monetary values, even though they do not directly influence WTP, and several others are categorical or discrete.
    Thanks in advance for your help!

  • #2
    30 in what units? Cents, dollars, million dollars, ...? I would not transform WTP unless there is a standard transformation used by people who write on it. Zero has a clear interpretation as a point on the scale, and transforming to fudge that won't make anything much easier. Consider Poisson regression (more generally, generalized linear models with logarithmic link).

    Whether it is a good idea to transform explanatory variables is a different question. Categorical variables aren't usually candidates in any case.


    • #3
      30 dollars. I thought logarithms were a good way of keeping the variance low, since the WTP values differ in magnitude, which is probably not unusual for a meta-analysis. Maybe I will give Poisson a try; thanks for the suggestion.
      Last edited by Janik Kaden; 17 Oct 2020, 10:06.


      • #4
        What is the smallest (meaning, largest negative) value?


        • #5
          -27, while the next is already -7


          • #6
            The fudge log(y + k) where k must be big enough for (y + k) > 0 is highly problematic:

            1. How do you choose k apart from that rule?

            2. Results are highly sensitive to choice of k. See graph below.

            3. The curve is necessarily steepest for the lowest arguments, which may not be what you want. At worst, outliers can be created by this transformation.

            4. The transformation doesn't even respect the sign of the argument.

            5. How do you compare results with any other study's results?

            For a range from -27 cents to 30 dollars, these problems aren't trivial. Your lowest two data points are plotted on the curve below.

            [Figure badidea.png: graph of the log(y + k) transformation, with the lowest two data points marked on the curve]
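Points 2 and 3 above can be checked numerically. The following is a minimal Python sketch, not Stata code and not the code behind the graph. The values -27, -7 and 30 come from this thread; the remaining values, the choices of k, and the name `shifted_log` are assumptions made for illustration, and all values are treated as being in the same unit.

```python
import math

def shifted_log(y, k):
    """Return log(y + k); only defined when y + k > 0."""
    if y + k <= 0:
        raise ValueError("k must be large enough that y + k > 0")
    return math.log(y + k)

# -27, -7 and 30 are taken from the thread; 0 and 1 are made-up filler values.
wtp = [-27, -7, 0, 1, 30]

# Three arbitrary shifts, all satisfying min(wtp) + k > 0.
for k in (27.01, 28, 37):
    transformed = [shifted_log(y, k) for y in wtp]
    low_gap = transformed[1] - transformed[0]     # distance between the two lowest points
    high_gap = transformed[-1] - transformed[-2]  # distance between the two highest points
    print(f"k = {k}: low gap = {low_gap:.2f}, high gap = {high_gap:.2f}")
```

The closer k sits to the minimum admissible value, the further the lowest observation is pushed away from the rest, which is how the transformation can manufacture an outlier.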

            The transformation sign(WTP) log(1 + |WTP|) has fewer problems and has the merit that it preserves sign, but I wouldn't use it unless you can find that other people in your field use it -- or that you can justify it confidently.
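For concreteness, here is a small Python sketch of sign(WTP) log(1 + |WTP|) and its inverse. The function names `neglog` and `neglog_inverse` are chosen here for illustration; this is only the arithmetic, not a recommendation to use the transformation.

```python
import math

def neglog(y):
    """sign(y) * log(1 + |y|): preserves sign and maps 0 to 0."""
    return math.copysign(math.log1p(abs(y)), y)

def neglog_inverse(z):
    """Inverse of neglog: sign(z) * (exp(|z|) - 1)."""
    return math.copysign(math.expm1(abs(z)), z)

# -27, -7 and 30 are values mentioned in the thread; 0 is added to show the fixed point.
for y in (-27, -7, 0, 30):
    z = neglog(y)
    print(f"y = {y}: neglog = {z:.3f}, back-transformed = {neglog_inverse(z):.3f}")
```

Unlike log(y + k), there is no constant to choose, and negative and positive values of the same size are treated symmetrically.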

            Some papers:

            Whittaker, J., Whitehead, C., & Somers, M. 2005. The neglog transformation and quantile regression for the analysis of a large credit scoring database. Journal of the Royal Statistical Society. Series C (Applied Statistics) 54: 863-878.

            Webber, J.B.W. 2013. A bi-symmetric log transformation for wide-range data. Measurement Science and Technology 24: 027001. doi 10.1088/0957-0233/24/2/027001.

            See also inverse hyperbolic sine or asinh (sometimes incorrectly called arcsinh).
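As a rough illustration in Python (not Stata), asinh(y) = log(y + sqrt(y^2 + 1)) is defined for negative, zero and positive y, behaves like y near zero, and like sign(y) log(2|y|) for large |y|. The helper name `asinh_explicit` is an assumption for this sketch; `math.asinh` is the standard library function.

```python
import math

def asinh_explicit(y):
    """Inverse hyperbolic sine written out as log(y + sqrt(y^2 + 1))."""
    return math.log(y + math.sqrt(y * y + 1))

# -27, -7 and 30 are values mentioned in the thread; 0 and 0.05 are made-up small values.
for y in (-27, -7, 0, 0.05, 30):
    print(f"y = {y}: asinh = {math.asinh(y):.4f}, explicit = {asinh_explicit(y):.4f}")
```

The explicit formula is shown only to make the definition visible; in practice the built-in function is the safer choice because the formula loses precision for large negative arguments.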


            • #7
              Some rummaging around found these papers too. Despite their length, the papers' abstracts fail to be concrete about exactly what they did. I have the impression that the papers should be freely available but at the time of writing the Wiley website appears to be down.

              Parks, D.R., Roederer, M. and Moore, W.A. 2006.
              A new "logicle" display method avoids deceptive effects of logarithmic scaling for low signals and compensated data.
              Cytometry A 69: 541-551.
              doi: 10.1002/cyto.a.20258. PMID: 16604519.

              Bagwell, C.B. 2005.
              Hyperlog - a flexible log-like transform for negative, zero, and positive valued data.
              Cytometry A 64: 34-42.
              doi: 10.1002/cyto.a.20114. PMID: 15700280.


              • #8
                Thanks a lot so far! I will work through these papers and try to find some cases where something like that was applied to WTP values. This is already a big help for me.