
  • How to generate an empty matrix and fill it row by row in a loop?

    Hi guys, I'm very new to Stata and very stuck! I am trying to create an empty matrix (filled with '.' missing values) and then fill it row by row with 18 values generated for each of 5 categories of a variable in a loop. I first create the empty J(5, 18, .) matrix, then run the ladder command in a loop to generate the data to be stored in the matrix, then build a matrix from those results (I also wonder whether a vector would be better than a matrix at this stage?), and finally merge the 2 matrices. However, the new values do not replace the '.' missing values of the empty matrix. My code is as follows:

    /////////////////////

    matrix varsub_trans_matrix = J(5, 18, .)

    foreach var_sub in `var_subs' {
    ladder `var_sub'
    matrix `var_sub'_matrix = (r(ident), r(P_ident), r(square), r(P_square), r(cube), r(P_cube), r(sqrt), r(P_sqrt), r(inv), r(P_inv), r(invsq), r(P_invsq), r(invcube), r(P_invcube), r(invsqrt), r(P_invsqrt), r(log), r(P_log))
    matrix `var'_matrix = varsub_trans_matrix \ `var_sub'_matrix
    }

    /////////////////////

    Is there a different way of merging the matrices to input my newly generated data and replace the missing values, so that the new data is inputted in iterative rows?

    In the following step, I plan to then combine the 5 x 18 matrices of 9 different variables in a larger J(45, 18, .) matrix, which I also generated with missing values.

    Many thanks for your help!!
    Last edited by Liz Broom; 26 May 2023, 03:40.

  • #2
    I can't easily follow what you are trying to do without a data example.

    The natural order of transformations is that used by ladder: from cube downwards.

    For an alternative to ladder see transplot from SSC. https://www.statalist.org/forums/for...dable-from-ssc

    There was a detailed critique of ladder and its siblings in the presentation cited in that thread. Here is the reference again:

    The slides are accessible at https://www.stata.com/meeting/uk19/slides/uk19_cox.pptx
    Last edited by Nick Cox; 26 May 2023, 04:22.



    • #3
      In essence, your code adds extra elements to the matrix; it doesn't replace anything.

      This code works in the sense that it gets a result.

      Code:
      sysuse auto, clear 
      
      local vars mpg weight price turn trunk 
      
      capture matrix drop A 
      
      foreach v of local vars { 
          ladder `v'
          matrix A = nullmat(A) , (r(invcube), r(P_invcube), r(invsq), r(P_invsq), r(inv), r(P_inv), r(invsqrt), ///
                      r(P_invsqrt), r(log), r(P_log), r(sqrt), r(P_sqrt), r(ident), r(P_ident), r(square),  /// 
                      r(P_square), r(cube), r(P_cube))' 
      } 
      
      local rownames r(invcube), r(P_invcube), r(invsq), r(P_invsq), r(inv), r(P_inv), r(invsqrt), r(P_invsqrt), r(log), r(P_log), r(sqrt), r(P_sqrt), r(ident), r(P_ident), r(square), r(P_square), r(cube), r(P_cube 
      local rownames : subinstr local rownames "r(" "", all 
      local rownames : subinstr local rownames ")," "", all 
      
      mat colnames A = `vars'
      mat rownames A = `rownames'
      
      mat li A 
      
      A[18,5]
                       mpg     weight      price       turn      trunk
        invcube  24.296495  12.366109  6.7738147  5.1172701  62.503264
      P_invcube  5.298e-06  .00206411  .03381309  .07741033  2.677e-14
          invsq  11.987886  8.0306501  1.7523758   5.442786  41.465625
        P_invsq  .00249381  .01803709  .41636712  .06578306  9.905e-10
            inv  2.3593071  8.0371655  4.7115097  6.3145239  18.993858
          P_inv  .30738522  .01797843   .0948219  .04254206  .00007508
        invsqrt  .19910439  9.5796074  6.6180213  6.3303476  9.9333864
      P_invsqrt   .9052427  .00831409  .03655232  .04220681  .00696615
            log  .86951019  10.369083  10.487306  5.8491466  4.6114645
          P_log   .6474232   .0056025  .00528093   .0536876  .09968578
           sqrt  4.9426961  8.8087444  15.815618  4.9348465   3.749922
         P_sqrt  .08447091  .01222378  .00036786  .08480309  .15336095
          ident  10.949392  5.6597478  21.767246  3.7600703  4.1940211
        P_ident   .0041915   .0590203  .00001876  .15258474  .12282305
         square  27.027402  4.4900511  33.773732  2.2164426  4.5459277
       P_square  1.352e-06  .10592483  4.636e-08  .33014567  .10300643
           cube  43.592814  12.937007  44.972316  4.3309401  12.174223
         P_cube  3.419e-10  .00155155  1.715e-10  .11469601  .00227196
      Minimally, 18 rows and 5 columns in my view work better than the transpose.

      I wouldn't want to have to read this. If you remain wedded to this approach -- manifestly from #2 I am not -- I would suggest splitting the P-values from the rest.
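
      For completeness, the approach in the original question (preallocate with J() and fill one row per pass) can also be made to work. The following is only a minimal sketch, assuming the auto data and the same five variables as above; the matrix name B and the locals nvars and i are illustrative, and the column order (cube downwards) is my choice. It relies on the fact that assigning a row vector to B[i, 1] overwrites existing elements starting at that position, whereas the \ operator appends rows. Run it from a do-file, as it uses /// line continuation.

      ```stata
      sysuse auto, clear

      local vars mpg weight price turn trunk
      local nvars : word count `vars'

      * preallocate: one row per variable, 18 columns, all missing
      matrix B = J(`nvars', 18, .)

      local i = 0
      foreach v of local vars {
          local ++i
          quietly ladder `v'
          * a 1 x 18 row vector placed at [i, 1] replaces row i in place
          matrix B[`i', 1] = (r(cube), r(P_cube), r(square), r(P_square), r(ident), r(P_ident), ///
              r(sqrt), r(P_sqrt), r(log), r(P_log), r(invsqrt), r(P_invsqrt), ///
              r(inv), r(P_inv), r(invsq), r(P_invsq), r(invcube), r(P_invcube))
      }

      matrix rownames B = `vars'
      matrix list B
      ```

      The same indexing idea should extend to the larger 45 x 18 matrix mentioned in #1, with the row counter carried across variables.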



      • #4
        Here is the last suggestion taken forward:

        Code:
        sysuse auto, clear 
        
        local vars mpg weight price turn trunk 
        
        capture matrix drop A 
        capture matrix drop A1
        capture matrix drop A2
        
        
        foreach v of local vars { 
            ladder `v'
            matrix A = nullmat(A) , (r(invcube), r(P_invcube), r(invsq), r(P_invsq), r(inv), r(P_inv), r(invsqrt), ///
                        r(P_invsqrt), r(log), r(P_log), r(sqrt), r(P_sqrt), r(ident), r(P_ident), r(square),  /// 
                        r(P_square), r(cube), r(P_cube))' 
            matrix A1 = nullmat(A1) , (r(invcube), r(invsq), r(inv), r(invsqrt), r(log), r(sqrt), r(ident), r(square), r(cube))' 
            matrix A2 = nullmat(A2) , (r(P_invcube), r(P_invsq), r(P_inv), ///
                        r(P_invsqrt), r(P_log), r(P_sqrt), r(P_ident), r(P_square), r(P_cube))' 
        } 
        
        local rownames r(invcube), r(P_invcube), r(invsq), r(P_invsq), r(inv), r(P_inv), r(invsqrt), r(P_invsqrt), r(log), r(P_log), r(sqrt), r(P_sqrt), r(ident), r(P_ident), r(square), r(P_square), r(cube), r(P_cube 
        local rownames : subinstr local rownames "r(" "", all 
        local rownames : subinstr local rownames ")," "", all 
        
        mat colnames A = `vars'
        mat rownames A = `rownames'
        
        mat li A 
        
        local rownames r(invcube), r(invsq), r(inv), r(invsqrt), r(log), r(sqrt), r(ident), r(square), r(cube
        local rownames : subinstr local rownames "r(" "", all 
        local rownames : subinstr local rownames ")," "", all 
                      
        matrix rownames A1 = `rownames'
        matrix colnames A1 = `vars' 
        matrix rownames A2 = `rownames'
        matrix colnames A2 = `vars'
        
        mat li A1, format(%4.1f)
        
        mat li A2, format(%4.3f)

        Code:
         
        . mat li A1, format(%4.1f)
        
        A1[9,5]
                    mpg  weight   price    turn   trunk
        invcube    24.3    12.4     6.8     5.1    62.5
          invsq    12.0     8.0     1.8     5.4    41.5
            inv     2.4     8.0     4.7     6.3    19.0
        invsqrt     0.2     9.6     6.6     6.3     9.9
            log     0.9    10.4    10.5     5.8     4.6
           sqrt     4.9     8.8    15.8     4.9     3.7
          ident    10.9     5.7    21.8     3.8     4.2
         square    27.0     4.5    33.8     2.2     4.5
           cube    43.6    12.9    45.0     4.3    12.2
        
        . 
        . mat li A2, format(%4.3f)
        
        A2[9,5]
                    mpg  weight   price    turn   trunk
        invcube   0.000   0.002   0.034   0.077   0.000
          invsq   0.002   0.018   0.416   0.066   0.000
            inv   0.307   0.018   0.095   0.043   0.000
        invsqrt   0.905   0.008   0.037   0.042   0.007
            log   0.647   0.006   0.005   0.054   0.100
           sqrt   0.084   0.012   0.000   0.085   0.153
          ident   0.004   0.059   0.000   0.153   0.123
         square   0.000   0.106   0.000   0.330   0.103
           cube   0.000   0.002   0.000   0.115   0.002



        • #5
          The example is arbitrary -- just the good old auto data -- but it underlines the perils and pitfalls of letting any command automate transformation choices for you (and, despite its wonderful name, I am not much keener on Box-Cox).

          In particular ladder suggests or at least implies inverse square root for mpg as getting you closer to a normal distribution. I don't recollect anyone showing enthusiasm for inverse square roots. It's true that mpg is right-skewed and has higher kurtosis than a normal, but the marginal distribution is not problematic for most analyses.

          What can be an issue is when you look at relationships, say trying to predict mpg from weight, where moderate but definite curvature limits the utility or success of a linear fit. But that issue is solved by using the reciprocal of mpg -- labelled inv in ladder output -- which has a much simpler rationale in terms of units and dimensions than inverse square root, as gallons per mile is an easy reworking of miles per gallon. Metric equivalents of gallons per mile, such as litres per 100 km, are sensibly used in many countries. What's more, the physical or engineering interpretation of the relationship is now easier to approach.
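
          To see the point about the reciprocal in practice, here is a rough sketch, assuming the auto data again. The variable name gpm is made up for illustration. It just fits a straight line on each scale and plots the reciprocal version, so that the curvature (or lack of it) can be judged by eye and from the fits.

          ```stata
          sysuse auto, clear

          * gallons per mile: the reciprocal of miles per gallon
          generate gpm = 1 / mpg
          label variable gpm "Gallons per mile"

          * linear fit on the original scale and on the reciprocal scale
          regress mpg weight
          regress gpm weight

          * scatter plot with fitted line on the reciprocal scale
          twoway scatter gpm weight || lfit gpm weight
          ```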
