
  • Why does generating a variable in logarithm form change the sign of the correlation and the scatterplot?

    Hello, first of all I apologize for this simple question.
    I am running some tests to find a relationship between fertility and the stringency index. When I use the fertility rate, the correlation between them is positive, but when I use the logarithm of fertility, the correlation is negative. I do not think this makes sense. I have searched a lot but, unfortunately, could not find a specific reason.

    Code:
     
    Variables                 (1)      (2)      (3)      (4)
    (1) lnfer_change        1.000
    (2) GDP_growth2         0.333    1.000
    (3) stringencyinde~s   -0.238   -0.186    1.000
    (4) GDP_growth          0.250    0.863   -0.197    1.000
    and without ln:
    Code:
     
    Variables                 (1)      (2)      (3)      (4)
    (1) fer_change          1.000
    (2) GDP_growth2        -0.074    1.000
    (3) stringencyinde~s    0.346   -0.186    1.000
    (4) GDP_growth         -0.077    0.863   -0.197    1.000
    Best regards,

  • #2
    This thread may be of interest: https://www.statalist.org/forums/for...-two-variables



    • #3
      @Ken Chui thank you so much for sharing this.

      Best regards,



      • #4
        Below is an example where y and x are positively correlated while ln(y) and x are negatively correlated.

        Code:
        clear
        set obs 100
        gen x = _n
        gen y = x in 1/90
        replace y = 0.01 if mi(y)
        gen lny = ln(y)
        Code:
        . corr y x
        (obs=100)
        
                     |        y        x
        -------------+------------------
                   y |   1.0000
                   x |   0.4952   1.0000
        
        
        . corr lny x
        (obs=100)
        
                     |      lny        x
        -------------+------------------
                 lny |   1.0000
                   x |  -0.2321   1.0000
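
        One way to see why the sign flips, sketched on the assumption that the example data above are still in memory: the last ten observations have y = 0.01. On the raw scale they sit just below the smallest of the other values (y = 1), but ln(0.01) is about -4.61, while ln(y) for the first 90 observations runs only from 0 to about 4.50. After logging, those ten points at the highest x values lie far below everything else and can dominate the correlation. The graph names raw and logged are arbitrary.

        ```stata
        * Assumes x, y and lny from the example above are still in memory
        summarize y lny in 1/90       // bulk of the data: y = x
        summarize y lny in 91/100     // the ten observations set to 0.01

        * Compare the two scatterplots; graph names are arbitrary
        scatter y x, name(raw, replace)
        scatter lny x, name(logged, replace)
        ```

        In the second plot the ten low points at x = 91 to 100 should stand out far more than they do in the first.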



        • #5
          The logarithm is a nonlinear transformation. If its side-effect is to pull observations in or push them out relative to the rest of the data, a weak correlation can easily change sign, and the same can even be true of stronger correlations.

          You have already shown us that GDP growth can be negative (or, in principle, exactly zero). Taking its logarithm would make no sense here, as you would get results only for positive values.
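
          A minimal sketch of that last point, using made-up data (the variable names and the seed are only illustrative, not taken from the dataset in question): in Stata, ln() returns missing for zero and negative arguments, and corr uses only observations with no missing values in any of the listed variables, so a correlation with a logged variable is computed on a different, smaller sample.

          ```stata
          clear
          set seed 2023
          set obs 100
          gen x = _n
          gen y = rnormal()          // hypothetical variable that is often negative
          gen lny = ln(y)            // missing wherever y <= 0
          count if missing(lny)      // how many observations were lost
          corr y x                   // all 100 observations
          corr lny x                 // only observations with y > 0
          ```

          If a change variable such as fer_change takes zero or negative values, the same would apply to its logarithm; count if fer_change <= 0 is a quick way to check.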



          • #6
            @Fei Wang and @Nick Cox thank you so much for your reply.

            Best regards,
