Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Help to recall Mata function inside postfile command

    Hi there,

    I am trying to run at least 10,000 replications of my code with a Mata function inside using the postfile command. The code should replicate the results of xtreg y x1 x2, fe r and collect 10,000 values of robust standard errors in a dataset. However, when I attempt to run the code I get the following error message:

    unknown function ()
    post: above message corresponds to expression 1, variable se1



    I might have made some mistake in either recalling the arguments of the function or in defining the function in Mata, or both. Unfortunately, I cannot see/understand where I made such mistake. Can someone help me identifying the problem, please? Thank you.


    Just to be clear, when I run a simplified code with Mata code inside Stata code and without the postfile command, everything runs smoothly and I get the desired results.


    Here, a simplified version of the code:
    _________________________________

    clear all
    ***Mata function
    mata:
    mata clear

    function robest(real matrix X,
    real matrix XtX,
    real vector res,
    real matrix info,
    real scalar k,
    real scalar N,
    real scalar nt)
    {

    V = J(k,k,.)

    for(i=1;i<=N; i++) {
    Xi = panelsubmatrix(X,i,info)
    res_i = panelsubmatrix(res,i,info)
    V = V + Xi'*(res_i*res_i')*Xi
    }

    Vhat = invsym(XtX)*V*invsym(XtX)*((nt-1)/(nt-k))*(N/(N-1))
    se = sqrt(diagonal(Vhat))

    results = se'
    return(results)
    }
    end


    ***Simulation program
    program mysim
    version 14
    tempname sim
    postfile `sim' se1 se2 se3 using reg_stimates, replace
    quietly {
    forvalues i = 1/10000 {
    drop _all
    set obs 50
    gen id = _n
    expand 5
    bysort id: generate time = _n

    gen y = rnormal()
    gen x1 = rnormal()
    gen x2 = rnormal()
    gen cons = 1

    foreach j in y x1 x2{
    bys id: egen m_`j' = mean(`j')
    gen dm_`j' = `j' - m_`j'
    sum `j'
    replace dm_`j' = dm_`j' + r(mean)
    }

    putmata y = dm_y X = (dm_x1 dm_x2 cons), replace
    m id = st_data(., "id")
    m info = panelsetup(id, 1)

    m N = rows(info)
    m nt = rows(y)
    m k = cols(X)

    m XtX = cross(X,X)
    m Xty = cross(X,y)
    m b = cholsolve(XtX,Xty)
    m res = y - X*b

    m robest(X,XtX,res,info,k,N,nt)

    post `sim' (se1)(se2)(se3)
    }
    }
    postclose `sim'
    end

    ***Run simulation
    set seed 12345
    mysim

  • #2
    Welcome to Statalist.

    Here is simplified code that should point you in a useful direction. Feel free to ask followup questions, but given how far you'd gotten on your own, I'm guessing you'll be able to see how this applies to your code.
    Code:
    cls
    clear all
    mata:
    mata clear
    function gnxl() {
    results = 1\2\3
    "here are the results that gnxl() will return to the Mata environment"
    results
    return (results)
    }
    end
    
    
    program mysim
    tempname sim
    postfile `sim' se1 se2 se3 using xkcd, replace
    mata sss = gnxl()
    display _newline "here is what gnxl() returned that Mata placed into the Mata matrix sss"
    mata sss
    mata st_matrix("se",sss)
    display _newline  "here is what Mata placed into the Stata matrix se"
    matrix list se
    post `sim' (se[1,1]) (se[2,1]) (se[3,1])
    postclose `sim'
    end
    
    mysim
    use xkcd, clear
    // here is what is in the postfile
    list
    Code:
    . clear all
    
    . mata:
    ------------------------------------------------- mata (type end to exit) ---------------------
    : mata clear
    
    : function gnxl() {
    > results = 1\2\3
    > "here are the results that gnxl() will return to the Mata environment"
    > results
    > return (results)
    > }
    
    : end
    -----------------------------------------------------------------------------------------------
    
    .
    .
    . program mysim
      1. tempname sim
      2. postfile `sim' se1 se2 se3 using xkcd, replace
      3. mata sss = gnxl()
      4. display _newline "here is what gnxl() returned that Mata placed into the Mata matrix sss"
      5. mata sss
      6. mata st_matrix("se",sss)
      7. display _newline  "here is what Mata placed into the Stata matrix se"
      8. matrix list se
      9. post `sim' (se[1,1]) (se[2,1]) (se[3,1])
     10. postclose `sim'
     11. end
    
    .
    . mysim
      here are the results that gnxl() will return to the Mata environment
           1
        +-----+
      1 |  1  |
      2 |  2  |
      3 |  3  |
        +-----+
    
    here is what gnxl() returned that Mata placed into the Mata matrix sss
           1
        +-----+
      1 |  1  |
      2 |  2  |
      3 |  3  |
        +-----+
    
    here is what Mata placed into the Stata matrix se
    
    se[3,1]
        c1
    r1   1
    r2   2
    r3   3
    
    . use xkcd, clear
    
    . // here is what is in the postfile
    . list
    
         +-----------------+
         | se1   se2   se3 |
         |-----------------|
      1. |   1     2     3 |
         +-----------------+
    
    .
    Added in edit: a slightly tighter version of the mysim program.
    Code:
    program mysim
    tempname sim
    postfile `sim' se1 se2 se3 using xkcd, replace
    mata st_matrix("se",gnxl())
    display _newline  "here is what Mata placed into the Stata matrix se"
    matrix list se
    post `sim' (se[1,1]) (se[2,1]) (se[3,1])
    postclose `sim'
    end
    Last edited by William Lisowski; 12 Feb 2019, 09:44.

    Comment


    • #3
      Hi William,

      Thank you for your detailed example. Now, I clearly see what was missing in my code.

      However, I still get the same error - r(133) - despite the corrections. I wonder if it makes any difference that you have an already populated vector in your example; whereas, I have a vector which is populated once the arguments are passed through the function.

      Comment


      • #4
        Without seeing your corrected code, I would only be guessing about why it is not working.

        Comment


        • #5
          You're perfectly right. So, this is the edited version of the code:


          __________________________________
          ***Mata function
          mata:
          mata clear

          function robest(real matrix X,
          real matrix XtX,
          real vector res,
          real matrix info,
          real scalar k,
          real scalar N,
          real scalar nt)
          {


          for(i=1;i<=N; i++) {
          Xi = panelsubmatrix(X,i,info)
          res_i = panelsubmatrix(res,i,info)
          V = J(k,k,.)
          V = V + Xi'*(res_i*res_i')*Xi
          }

          Vhat = invsym(XtX)*V*invsym(XtX)*((nt-1)/(nt-k))*(N/(N-1))
          se = sqrt(diagonal(Vhat))

          results = se

          results
          return(results)
          }
          end


          ***Simulation program
          program mysim
          version 14
          tempname sim
          postfile `sim' se1 se2 se3 using reg_stimates, replace
          quietly {
          forvalues i = 1/10000 {
          drop _all
          set obs 50
          gen id = _n
          expand 5
          bysort id: generate time = _n

          gen y = rnormal()
          gen x1 = rnormal()
          gen x2 = rnormal()
          gen cons = 1

          foreach j in y x1 x2{
          bys id: egen m_`j'=mean(`j')
          gen dm_`j' = `j' - m_`j'
          sum `j'
          replace dm_`j' = dm_`j'+r(mean)
          }

          putmata y = dm_y X = (dm_x1 dm_x2 cons), replace
          m id = st_data(., "id")
          m info = panelsetup(id, 1)

          m N = rows(info)
          m nt = rows(y)
          m k = cols(X)

          m XtX = cross(X,X)
          m Xty = cross(X,y)
          m b = cholsolve(XtX,Xty)
          m res = y - X*b

          m sss = robest(X,XtX,res,info,k,N,nt)

          m sss
          m st_matrix("se",sss)

          matrix list se

          post `sim' (se[1,1])(se[2,1])(se[3,1])
          }
          }
          postclose `sim'
          end

          ***Run simulation
          set seed 12345
          mysim
          u reg_stimates, clear
          list

          Comment


          • #6
            After playing with your code, I have found that the immediate cause of your problem is that your post command is does not match the post command in my code. Instead of
            Code:
            post `sim' (se[1,1])(se[2,1])(se[3,1])
            you need
            Code:
            post `sim' (se[1,1]) (se[2,1]) (se[3,1])
            However, that will not be the end of your problems. I changed the forvalues loop to run 10 times rather than 10000 time (always a good idea when you're running code that isn't working) and got the following results.
            Code:
            . u reg_stimates, clear
            
            . list
            
                 +-----------------+
                 | se1   se2   se3 |
                 |-----------------|
              1. |   .     .     . |
              2. |   .     .     . |
              3. |   .     .     . |
              4. |   .     .     . |
              5. |   .     .     . |
                 |-----------------|
              6. |   .     .     . |
              7. |   .     .     . |
              8. |   .     .     . |
              9. |   .     .     . |
             10. |   .     .     . |
                 +-----------------+
            So next I changed the forvalues loop to run 1 time, and changed quietly to noisily.
            Code:
            . ***Run simulation
            . set seed 12345
            
            . mysim
            number of observations (_N) was 0, now 50
            (200 observations created)
            
                Variable |        Obs        Mean    Std. Dev.       Min        Max
            -------------+---------------------------------------------------------
                       y |        250    .0966823     1.00641  -2.809036   3.021777
            (250 real changes made)
            
                Variable |        Obs        Mean    Std. Dev.       Min        Max
            -------------+---------------------------------------------------------
                      x1 |        250    .0116697     1.07827  -2.949293   3.653764
            (250 real changes made)
            
                Variable |        Obs        Mean    Std. Dev.       Min        Max
            -------------+---------------------------------------------------------
                      x2 |        250    .0287728    1.017686  -2.771651   2.599667
            (250 real changes made)
            (1 vector, 1 matrix posted)
                   1
                +-----+
              1 |  .  |
              2 |  .  |
              3 |  .  |
                +-----+
                   1
                +-----+
              1 |  .  |
              2 |  .  |
              3 |  .  |
                +-----+
            
            se[3,1]
                c1
            r1   .
            r2   .
            r3   .
            
            . u reg_stimates, clear 
            
            . list
            
                 +-----------------+
                 | se1   se2   se3 |
                 |-----------------|
              1. |   .     .     . |
                 +-----------------+
            Now we can see that the results vector calculated in robest() isn't what you hope it will be. Further experimentation shows that V is initialized to a matrix full of missing values, to which other matrices are added. Basic Stata: the result of an arithmetic calculation that includes a missing value will be a missing value. You needed to initialize V to a matrix full of zeroes.

            Actually, your code is in a loop, reinitializing V each time through the loop, then adding a matrix to it, the reinitializing it the next time through the loop. I think you are trying to add up a number of matrices, and what you want is
            Code:
            V = J(k,k,0) 
            for(i=1;i<=N; i++) {
            Xi = panelsubmatrix(X,i,info) 
            res_i = panelsubmatrix(res,i,info)
            V = V + Xi'*(res_i*res_i')*Xi 
            }
            Vhat = invsym(XtX)*V*invsym(XtX)*((nt-1)/(nt-k))*(N/(N-1))
            se = sqrt(diagonal(Vhat))
            When I make those changes to your robest() function and to your post command I at least get nonmissing values in the output. I have no idea if they're correct, but at least they are not missing, and the postfile is successfully created.
            Last edited by William Lisowski; 12 Feb 2019, 17:58.

            Comment


            • #7
              The code runs smoothly and I finally get the postfile .dta. I double-checked the results from xtreg and the standard errors are extremely close. So, the simulation results must be correct.

              Thank you for your guidance and your help, William.


              Anna

              Comment


              • #8
                Hi Anna Polsi , I'm a bit late to the party but I wanted to mention that i) that's some impressive code!, and ii) you can actually do it in a few lines of code if you just call the Mata functions underlying reghdfe (see "help reghdfe_mata" if you have reghdfe installed).

                -Sergio

                Comment

                Working...
                X