
  • double loops.

    Dear All, I run the following code
    Code:
    clear 
    set obs `=40 * 500 * 10' 
    set seed 2803 
    egen year = seq(), block(10000) 
    egen industry = seq(), block(4000) 
    gen x = runiform()
    
    gen x2 = .
    forvalues y = 1(1)20 {
      forvalues i = 1(1)50 {
        xtile temp = x if (year==`y')&(industry==`i'), nq(5)
        replace x2 = temp if (year==`y')&(industry==`i')
        drop temp
        local i = `i'+1
      }
      local y = `y'+1  
    }
    but I encounter the following error message:
    Code:
    nquantiles() must be less than or equal to number of observations plus one
    Can anyone help me out? Thanks.
    Ho-Chuan (River) Huang
    Stata 19.0, MP(4)

  • #2
    I know that this can be done more efficiently with community-contributed commands such as astile or runby (both available via ssc install):
    Code:
    // r; t=0.13 9:19:30
    astile x5 = x, nq(5) by(year industry)
    
    // r; t=0.37 9:19:25
    cap program drop by_xtile
    program define by_xtile
      xtile x6 = x, nq(5)  
    end
    runby by_xtile, by(year industry)
    I just want to compare their respective efficiency (time).
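    A minimal sketch of one way to time both approaches in a single run, using Stata's built-in timer commands. It assumes astile and runby are installed from SSC; the timer numbers 1 and 2 are arbitrary choices.

    ```stata
    * Assumes astile and runby are installed (ssc install astile, ssc install runby)
    clear
    set obs `=40 * 500 * 10'
    set seed 2803
    egen year = seq(), block(10000)
    egen industry = seq(), block(4000)
    gen x = runiform()

    cap program drop by_xtile
    program define by_xtile
        xtile x6 = x, nq(5)
    end

    timer clear

    * timer 1: astile with by()
    timer on 1
    astile x5 = x, nq(5) by(year industry)
    timer off 1

    * timer 2: xtile run once per by-group via runby
    timer on 2
    runby by_xtile, by(year industry)
    timer off 2

    timer list   // reports elapsed seconds for timers 1 and 2
    ```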
    Ho-Chuan (River) Huang
    Stata 19.0, MP(4)


    • #3
      The error message implies that there is some combination of year and industry for which there are fewer than 4 observations. It is, of course, impossible to calculate quintiles with fewer than 4 observations. You can find them:

      Code:
      by industry year, sort: gen byte too_small = (_N < 4)
      tab industry year if too_small
      Edit: Correct the minimum allowable sample size, originally specified as 5, to 4.
      Last edited by Clyde Schechter; 10 Feb 2019, 18:47.
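      One caveat worth checking: a by-group check can only flag cells that contain at least one observation, so a year-industry pair with no observations at all will not show up. A rough sketch that rebuilds the simulated data from #1 and compares the number of non-empty cells with the number of pairs the double loop visits (the variable name cell is made up for illustration):

      ```stata
      clear
      set obs `=40 * 500 * 10'
      set seed 2803
      egen year = seq(), block(10000)
      egen industry = seq(), block(4000)

      * number each year-industry pair that actually occurs in the data
      egen long cell = group(year industry)
      summarize cell, meanonly
      display "non-empty cells: " r(max)
      display "pairs visited by the double loop: " 20 * 50
      ```

      If the first number is smaller than the second, the loop will hit at least one empty cell.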


      • #4
        It is not clear to me what you hoped your data would look like, but when I run your code I obtain the following results.
        Code:
        . clear 
        
        . set obs `=40 * 500 * 10' 
        number of observations (_N) was 0, now 200,000
        
        . set seed 2803 
        
        . egen year = seq(), block(10000) 
        
        . egen industry = seq(), block(4000) 
        
        . gen x = runiform()
        
        . 
        . gen x2 = .
        (200,000 missing values generated)
        
        . forvalues y = 1(1)20 {
          2.   forvalues i = 1(1)50 {
          3.     xtile temp = x if (year==`y')&(industry==`i'), nq(5)
          4.     replace x2 = temp if (year==`y')&(industry==`i')
          5.     drop temp
          6.     local i = `i'+1
          7.   }
          8.   local y = `y'+1  
          9. }
        (4,000 real changes made)
        (4,000 real changes made)
        (2,000 real changes made)
        nquantiles() must be less than or equal to number of observations plus one
        r(198);
        
        end of do-file
        
        r(198);
        
        . tab industry year
        
                   |                               year
          industry |         1          2          3          4          5          6 |     Total
        -----------+------------------------------------------------------------------+----------
                 1 |     4,000          0          0          0          0          0 |     4,000 
                 2 |     4,000          0          0          0          0          0 |     4,000 
                 3 |     2,000      2,000          0          0          0          0 |     4,000 
                 4 |         0      4,000          0          0          0          0 |     4,000 
                 5 |         0      4,000          0          0          0          0 |     4,000 
                 6 |         0          0      4,000          0          0          0 |     4,000 
                 7 |         0          0      4,000          0          0          0 |     4,000 
                 8 |         0          0      2,000      2,000          0          0 |     4,000 
                 9 |         0          0          0      4,000          0          0 |     4,000 
                10 |         0          0          0      4,000          0          0 |     4,000 
                11 |         0          0          0          0      4,000          0 |     4,000 
                12 |         0          0          0          0      4,000          0 |     4,000 
                13 |         0          0          0          0      2,000      2,000 |     4,000 
                14 |         0          0          0          0          0      4,000 |     4,000 
                15 |         0          0          0          0          0      4,000 |     4,000 
                16 |         0          0          0          0          0          0 |     4,000 
                17 |         0          0          0          0          0          0 |     4,000 
                18 |         0          0          0          0          0          0 |     4,000 
                19 |         0          0          0          0          0          0 |     4,000 
                20 |         0          0          0          0          0          0 |     4,000 
                21 |         0          0          0          0          0          0 |     4,000 
                22 |         0          0          0          0          0          0 |     4,000 
        --Break--
        We can see that the inner loop succeeded three times before failing at year=1 and industry=4. The tabulation shows why: that cell is empty, so xtile was asked to create 5 quantiles from 0 observations, and 5 is not less than or equal to 0 observations plus 1.
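        If the intent was for every year to contain every industry, one possible fix is to generate a fully crossed design. This is only a sketch under that assumption; the choice of 200 observations per cell (200,000 / (20 * 50)) is an assumption, not something stated in #1.

        ```stata
        clear
        set obs `=40 * 500 * 10'
        set seed 2803
        * 20 years in blocks of 10,000 observations
        egen year = seq(), from(1) to(20) block(10000)
        * within each year, industries 1..50 in blocks of 200 observations
        egen industry = seq(), from(1) to(50) block(200)
        gen x = runiform()

        * every year-industry cell should now hold exactly 200 observations
        bysort year industry: assert _N == 200

        gen x2 = .
        forvalues y = 1/20 {
            forvalues i = 1/50 {
                xtile temp = x if year == `y' & industry == `i', nq(5)
                quietly replace x2 = temp if year == `y' & industry == `i'
                drop temp
            }
        }
        ```

        Because seq() restarts at from() after reaching to(), the industry sequence completes one full cycle of 50 * 200 = 10,000 observations inside each year block.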


        • #5
          Dear Clyde, Thanks for the reply. Is it possible to modify the simulation code to allow the code to run?
          Ho-Chuan (River) Huang
          Stata 19.0, MP(4)


          • #6
            Dear William, Thanks for the reply. Is it possible to modify the simulation code to allow the code to run?
            Ho-Chuan (River) Huang
            Stata 19.0, MP(4)


            • #7
              This might do.
              Code:
              clear
              set obs `=40 * 500 * 10'
              set seed 2803
              egen year = seq(), block(10000)
              egen industry = seq(), block(4000)
              gen x = runiform()
              
              gen x2 = .
              forvalues y = 1(1)20 {
                forvalues i = 1(1)50 {
                  count if (year==`y')&(industry==`i')
                  if r(N)>=4 {
                    xtile temp = x if (year==`y')&(industry==`i'), nq(5)
                    replace x2 = temp if (year==`y')&(industry==`i')
                    drop temp
                  }
                  local i = `i'+1
                }
                local y = `y'+1
              }
              The three new lines are the count command, the if r(N)>=4 { line, and its closing brace. The two lines local i = `i'+1 and local y = `y'+1 should be deleted; they serve no good purpose, because forvalues takes care of incrementing the loop indexes.
              Last edited by William Lisowski; 10 Feb 2019, 19:14.
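              For reference, a sketch of how the loop might read once the two redundant increment lines are removed. Adding quietly to suppress the per-cell output is an extra choice here, not part of the suggestion above.

              ```stata
              clear
              set obs `=40 * 500 * 10'
              set seed 2803
              egen year = seq(), block(10000)
              egen industry = seq(), block(4000)
              gen x = runiform()

              gen x2 = .
              forvalues y = 1/20 {
                  forvalues i = 1/50 {
                      // skip cells too small for quintiles (xtile needs nq <= N + 1)
                      quietly count if year == `y' & industry == `i'
                      if r(N) >= 4 {
                          xtile temp = x if year == `y' & industry == `i', nq(5)
                          quietly replace x2 = temp if year == `y' & industry == `i'
                          drop temp
                      }
                  }
              }
              ```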


              • #8
                Dear William, Thanks again for your helpful suggestions.
                Ho-Chuan (River) Huang
                Stata 19.0, MP(4)


                • #9
                  For this particular data structure, this should work:

                  Code:
                  cscript
                  set obs `=40 * 500 * 10'
                  set seed 2803
                  egen year = seq(), block(10000)
                  egen industry = seq(), block(4000)
                  gen x = runiform()
                  
                  gen x2 = .
                  local from = 1
                  local to   = 3
                  forvalues y = 1(1)20 {
                    forvalues i = `from'/`to' {
                      xtile temp = x if (year==`y')&(industry==`i'), nq(5)
                      replace x2 = temp if (year==`y')&(industry==`i')
                      drop temp
                    }
                    local one = !mod(`y',2)
                    local from = `from' + 2 + `one'
                    local to   = `to' + 2 + `one'
                  }
                  Last edited by Rafal Raciborski (StataCorp); 10 Feb 2019, 19:59.
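                  If the layout of the data is not known in advance, one alternative sketch loops over the year-industry pairs that actually occur rather than over fixed index ranges. The variable name cell is an assumption for illustration, and the r(N) >= 4 guard is borrowed from #7.

                  ```stata
                  clear
                  set obs `=40 * 500 * 10'
                  set seed 2803
                  egen year = seq(), block(10000)
                  egen industry = seq(), block(4000)
                  gen x = runiform()

                  * one id per year-industry pair that exists in the data
                  egen long cell = group(year industry)
                  summarize cell, meanonly
                  local ncells = r(max)

                  gen x2 = .
                  forvalues g = 1/`ncells' {
                      // guard against cells too small for quintiles
                      quietly count if cell == `g'
                      if r(N) >= 4 {
                          xtile temp = x if cell == `g', nq(5)
                          quietly replace x2 = temp if cell == `g'
                          drop temp
                      }
                  }
                  ```

                  Because empty pairs never receive a cell id, the loop cannot reach a cell with zero observations.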


                  • #10
                    Dear Rafal, Many thanks for this helpful suggestion.
                    Ho-Chuan (River) Huang
                    Stata 19.0, MP(4)
