
  • Is it possible for me to create a command similar to inlist?

    I am interested in creating a user-created command very similar to inlist, except that it will accept more than 10 arguments. Is it even possible to do this?

    Note that someone else has written "inlist2" (https://ideas.repec.org/c/boc/bocode/s458920.html), which does take more than 10 arguments, but it works differently from inlist: it creates a dummy variable, so it cannot be used in a single line such as "keep if inlist2(var, val1, val2)".

  • #2
    I'm pretty sure the answer is no.



    • #3
      Clyde is correct. Only StataCorp can create the specific functionality you are asking about. However, you could produce a command like -inlist2- if you desire, and work around the limitations that this may involve.

      Note that the built-in function -inlist()- is limited to a max of 10 strings or 250 numbers.



      • #4
        The key nuance here is that inlist() is a Stata function, not a command, and users cannot write functions in the strict sense. You can write commands, egen functions, and Mata functions, but none of those fits the bill here.

        The limit referred to is the number of string arguments; the limit for numeric arguments is higher.
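
        To make the command-versus-function distinction concrete, here is a minimal sketch of the kind of command a user can write. The name markin and its options values() and generate() are hypothetical, invented for this illustration, and the sketch assumes a string variable. Because it is a command, it needs its own line before the keep; it cannot be called inside an if expression the way inlist() can.

        Code:
        capture program drop markin
        program define markin
            version 17
            // hypothetical syntax: markin strvar, values("a" "b" ...) generate(newvar)
            syntax varname, Values(string asis) GENerate(name)
            confirm new variable `generate'
            quietly generate byte `generate' = 0
            // loop over the quoted values; there is no limit of 10 here
            foreach v of local values {
                quietly replace `generate' = 1 if `varlist' == `"`v'"'
            }
        end

        sysuse auto, clear
        markin make, values("AMC Concord" "AMC Pacer" "Buick Opel") generate(wanted)
        keep if wanted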





        • #5
          That makes sense (and is too bad!). Thanks all for your quick responses!



          • #6
            I've often seen people refer to the limit on string arguments as inhibiting if not irritating, although I wonder what limit would not be? 250?

            For now, apart from inlist2 from SSC, which I've never used, there are various work-arounds including

            0. Consider whether there is a more systematic and concise way to specify a list, e.g. working with states in regions of the US not states themselves.

            1. Consider whether the negation !inlist() will work better.

            2. Consider whether inrange() will do the job, as it will work with string arguments

            Code:
            . di inrange("Stata", "Analytic", "Zetetic")
            1

            3. Just chaining together multiple inlist() calls.

            4. Using merge instead: seriously! See https://www.stata.com/support/faqs/d...s-for-subsets/

            I said "Including" and there must be others.
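
            Work-arounds 3 and 4 could be sketched as follows, assuming the auto dataset and a do-file (the /// continuation does not work interactively). The names wanted and wantedlist are made up for the illustration, and the list is kept short here; the same pattern applies to longer lists.

            Code:
            sysuse auto, clear

            * 3. Chain inlist() calls with | so that no single call hits the string limit
            generate byte wanted = inlist(make, "AMC Concord", "AMC Pacer", "AMC Spirit") ///
                | inlist(make, "Buick Century", "Buick Electra", "Buick LeSabre")

            * 4. Put the wanted values in their own dataset and merge against it
            preserve
            clear
            input str18 make
            "AMC Concord"
            "AMC Pacer"
            "AMC Spirit"
            "Buick Century"
            "Buick Electra"
            "Buick LeSabre"
            end
            tempfile wantedlist
            save `wantedlist'
            restore

            * keep(match) retains only the observations found in both datasets
            merge m:1 make using `wantedlist', keep(match) nogenerate

            Both routes select the same observations in this sketch; the merge route scales to lists of any length, since the wanted values live in a dataset rather than in function arguments.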



            • #7
              I said "Including" and there must be others.
              5. -encode- the string variable. Then you can put up to 250 arguments in. And, if you strategically define a value label to use for the -encode-ing you might easily reduce the whole thing to a short -inrange()-.
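
              A minimal sketch of this idea, assuming the auto dataset; the label name makelbl and the variable name make_id are made up for the illustration. Defining the value label before -encode- gives the wanted strings the lowest codes, and -encode- then adds the remaining strings to the label with new codes.

              Code:
              sysuse auto, clear

              * define the value label first so the wanted makes get codes 1, 2, 3
              label define makelbl 1 "AMC Concord" 2 "AMC Pacer" 3 "AMC Spirit"

              * encode reuses the existing codes and adds the other makes to the label
              encode make, generate(make_id) label(makelbl)

              * numeric inlist() accepts many more arguments; "text":label looks up the code
              count if inlist(make_id, "AMC Concord":makelbl, "AMC Spirit":makelbl)

              * the wanted makes now form a contiguous range
              keep if inrange(make_id, 1, 3)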



              • #8
                Nick and Clyde show how to address the issue seriously.

                Here is a not so serious approach for calling community-contributed functions:

                I set up a wrapper command, ccfcn (community-contributed function), that acts as a prefix:

                Code:
                *! version 1.0.0  06apr2023
                program ccfcn
                    
                    version 17
                    
                    mata : ccfcn()
                    
                end
                
                
                version 17
                
                
                mata :
                
                
                mata set matastrict   on
                mata set mataoptimize on
                
                
                void ccfcn()
                {
                    transmorphic scalar t
                    string       scalar tok
                    string       scalar cmdline
                    real         scalar rc
                    
                    
                    t = tokeninit("", "@")
                        tokenset(t, st_local("0"))
                    
                    cmdline = ""
                    
                    while ((tok=tokenget(t)) != "") {
                        
                        if (tok == "@") tok = call_ccfcn(t)
                        
                        cmdline = cmdline+tok
                        
                    }
                    
                    if (rc = _stata(cmdline)) exit(rc)
                }
                
                
                string scalar call_ccfcn(transmorphic scalar t)
                {
                    string scalar fcnname
                    string scalar fcnargs
                    string scalar tmpname
                    real   scalar rc
                    
                    
                    tokenpchars(t, "")
                    tokenqchars(t, "()")
                    
                    fcnname = tokenget(t)
                    fcnargs = tokenget(t)
                    
                    tokenpchars(t, "@")
                    tokenqchars(t, (`""""', `"`""'"'))
                    
                    tmpname = st_tempname()
                    
                    if (rc = _stata("ccfcn"+" "+fcnname+" "+tmpname+" "+fcnargs))
                        exit(rc)
                    
                    return(tmpname)
                }
                
                
                end
                
                
                exit

                With this setup, I can then define my (pseudo-)function as a Stata program that takes two arguments: a temporary name and a list of arguments, enclosed in parentheses.

                Code:
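                program inlist_10_plus
                    
                    // first token: the temporary variable name supplied by ccfcn
                    gettoken varname 0 : 0
                    // second token: the argument list, with the outer parentheses stripped
                    gettoken fcnargs 0 : 0 , match(leftpar)
                    
                    // the indicator starts at 0 and is set to 1 for each match
                    generate `varname' = 0
                    
                    // the first argument is what the values are compared against
                    gettoken firstarg fcnargs : fcnargs  , parse(",") bind
                    
                    while (strtrim(`"`fcnargs'"') != "") {
                        
                        // consume the separating comma, then the next value
                        gettoken comma fcnargs : fcnargs , parse(",") bind
                        
                        gettoken value fcnargs : fcnargs , parse(",") quotes bind
                        
                        quietly replace `varname' = 1 if `firstarg' == `value'
                        
                    }
                    
                end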

                I can then call my (pseudo-)function

                Code:
                sysuse auto
                
                ccfcn keep if @inlist_10_plus(make, ///
                    "AMC Concord",   /// 1
                    "AMC Pacer",     /// 2
                    "AMC Spirit",    /// 3
                    "Buick Century", /// 4
                    "Buick Electra", /// 5
                    "Buick LeSabre", /// 6
                    "Buick Opel",    /// 7
                    "Buick Regal",   /// 8
                    "Buick Riviera", /// 9
                    "Buick Skylark", /// 10
                    "Cad. Deville"   /// 11
                    )

                to obtain

                Code:
                . ccfcn keep if @inlist_10_plus(make, ///
                >     "AMC Concord",   /// 1
                >     "AMC Pacer",     /// 2
                >     "AMC Spirit",    /// 3
                >     "Buick Century", /// 4
                >     "Buick Electra", /// 5
                >     "Buick LeSabre", /// 6
                >     "Buick Opel",    /// 7
                >     "Buick Regal",   /// 8
                >     "Buick Riviera", /// 9
                >     "Buick Skylark", /// 10
                >     "Cad. Deville"   /// 11
                >     )
                (63 observations deleted)
                Last edited by daniel klein; 06 Apr 2023, 13:19. Reason: typo in Mata code



                • #9
                  The limit of 10 string arguments may be a side-issue. Strings can be very long indeed, regardless of how often people want to type such long strings or even copy them somehow.
