Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • With -bayesmh-, what is the proper syntax for a custom prior density for a factor variable's parameter that is a function of the parameter?

    With bayesmh, you can create a custom prior density with the density() option of prior().

    And that probability density function can be a function of the parameter (see first example below).

    And a parameter for a factor variable can have a custom density (see second example below).

    But when I try to create a custom density that combines both of these, that is, is a function of the parameter for a factor variable, I get an error message whose complaints I've ruled out (third example below).

    Here's the code (I have also attached a do-file containing the complete code for two reproducible examples)
    Code:
    quietly sysuse auto
    
    generate byte k = mod(_n, 2)
    
    /* Not a problem for the custom density to be a function of a parameter (or its linear combination) */
    bayesmh foreign, likelihood(logistic) ///
        prior({foreign:_cons}, density(exp({foreign:_cons}) / (1 + exp({foreign:_cons}))^2)) ///
        nomodelsummary
    
    /* Not a problem for a factor variable to have custom density prior */
    bayesmh foreign i.k, likelihood(logistic) ///
        prior({foreign:_cons}, normal(0, 5)) ///
        prior({foreign:i1.k}, density(1)) ///
        nomodelsummary
    
    /* Problem when custom density prior is a function of the factor variable's
       parameter (or its linear combination) */
    bayesmh foreign i.k, likelihood(logistic) ///
        prior({foreign:_cons}, normal(0, 5)) ///
        prior({foreign:i1.k}, density(exp({foreign:i1.k}) / (1 + exp({foreign:i1.k}))^2))
    
    exit
    and here's the result (I've also attached an SMCL log file that contains the result).

    .ÿ/*ÿNotÿaÿproblemÿforÿtheÿcustomÿdensityÿtoÿbeÿaÿfunctionÿofÿaÿparameterÿ(orÿitsÿlinearÿcombi
    >ÿnation)ÿ*/
    .ÿbayesmhÿforeign,ÿlikelihood(logistic)ÿ///
    >ÿÿÿÿÿÿÿÿÿprior({foreign:_cons},ÿdensity(exp({foreign:_cons})ÿ/ÿ(1ÿ+ÿexp({foreign:_cons}))^2))
    >ÿÿ///
    >ÿÿÿÿÿÿÿÿÿnomodelsummary
    ÿÿ
    Burn-inÿ...
    Simulationÿ...

    BayesianÿlogisticÿregressionÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿMCMCÿiterationsÿÿ=ÿÿÿÿÿ12,500
    Random-walkÿMetropolis-HastingsÿsamplingÿÿÿÿÿÿÿÿÿBurn-inÿÿÿÿÿÿÿÿÿÿ=ÿÿÿÿÿÿ2,500
    ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿMCMCÿsampleÿsizeÿ=ÿÿÿÿÿ10,000
    ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿNumberÿofÿobsÿÿÿÿ=ÿÿÿÿÿÿÿÿÿ74
    ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿAcceptanceÿrateÿÿ=ÿÿÿÿÿÿ.4486
    Logÿmarginalÿlikelihoodÿ=ÿ-47.010554ÿÿÿÿÿÿÿÿÿÿÿÿÿEfficiencyÿÿÿÿÿÿÿ=ÿÿÿÿÿÿ.2163
    ÿ
    ------------------------------------------------------------------------------
    ÿÿÿÿÿÿÿÿÿÿÿÿÿ|ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿEqual-tailed
    ÿÿÿÿÿforeignÿ|ÿÿÿÿÿÿMeanÿÿÿStd.ÿDev.ÿÿÿÿÿMCSEÿÿÿÿÿMedianÿÿ[95%ÿCred.ÿInterval]
    -------------+----------------------------------------------------------------
    ÿÿÿÿÿÿÿ_consÿ|ÿ-.8417671ÿÿÿ.2630011ÿÿÿ.005654ÿÿ-.8390754ÿÿ-1.384746ÿÿ-.3504125
    ------------------------------------------------------------------------------

    .ÿ
    .ÿ/*ÿNotÿaÿproblemÿforÿaÿfactorÿvariableÿtoÿhaveÿcustomÿdensityÿpriorÿ*/
    .ÿbayesmhÿforeignÿi.k,ÿlikelihood(logistic)ÿ///
    >ÿÿÿÿÿÿÿÿÿprior({foreign:_cons},ÿnormal(0,ÿ5))ÿ///
    >ÿÿÿÿÿÿÿÿÿprior({foreign:i1.k},ÿdensity(1))ÿ///
    >ÿÿÿÿÿÿÿÿÿnomodelsummary
    ÿÿ
    Burn-inÿ...
    Simulationÿ...

    BayesianÿlogisticÿregressionÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿMCMCÿiterationsÿÿ=ÿÿÿÿÿ12,500
    Random-walkÿMetropolis-HastingsÿsamplingÿÿÿÿÿÿÿÿÿBurn-inÿÿÿÿÿÿÿÿÿÿ=ÿÿÿÿÿÿ2,500
    ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿMCMCÿsampleÿsizeÿ=ÿÿÿÿÿ10,000
    ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿNumberÿofÿobsÿÿÿÿ=ÿÿÿÿÿÿÿÿÿ74
    ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿAcceptanceÿrateÿÿ=ÿÿÿÿÿÿ.2666
    ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿEfficiency:ÿÿminÿ=ÿÿÿÿÿÿÿ.136
    ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿavgÿ=ÿÿÿÿÿÿ.1436
    Logÿmarginalÿlikelihoodÿ=ÿ-47.085067ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿmaxÿ=ÿÿÿÿÿÿ.1513
    ÿ
    ------------------------------------------------------------------------------
    ÿÿÿÿÿÿÿÿÿÿÿÿÿ|ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿEqual-tailed
    ÿÿÿÿÿforeignÿ|ÿÿÿÿÿÿMeanÿÿÿStd.ÿDev.ÿÿÿÿÿMCSEÿÿÿÿÿMedianÿÿ[95%ÿCred.ÿInterval]
    -------------+----------------------------------------------------------------
    ÿÿÿÿÿÿÿÿÿ1.kÿ|ÿ-.0355851ÿÿÿ.4936124ÿÿÿÿ.01269ÿÿ-.0221173ÿÿ-.9684661ÿÿÿ.9078394
    ÿÿÿÿÿÿÿ_consÿ|ÿ-.8504277ÿÿÿ.3452926ÿÿÿ.009364ÿÿ-.8448572ÿÿÿ-1.54843ÿÿÿ-.212137
    ------------------------------------------------------------------------------

    .ÿ
    .ÿ/*ÿProblemÿwhenÿcustomÿdensityÿpriorÿisÿaÿfunctionÿofÿtheÿfactorÿvariable'sÿ
    >ÿÿÿÿparameterÿ(orÿitsÿlinearÿcombination)ÿ*/
    .ÿbayesmhÿforeignÿi.k,ÿlikelihood(logistic)ÿ///
    >ÿÿÿÿÿÿÿÿÿprior({foreign:_cons},ÿnormal(0,ÿ5))ÿ///
    >ÿÿÿÿÿÿÿÿÿprior({foreign:i1.k},ÿdensity(exp({foreign:i1.k})ÿ/ÿ(1ÿ+ÿexp({foreign:i1.k}))^2))
    noÿpriorÿspecifiedÿforÿforeign:i1.k
    ÿÿÿÿPriorÿdistributionsÿmustÿbeÿspecifiedÿforÿallÿmodelÿparameters.ÿYouÿmayÿhaveÿomitted
    ÿÿÿÿoptionÿprior()ÿforÿsomeÿofÿtheÿparametersÿorÿmistypedÿtheÿnamesÿofÿtheÿparametersÿinÿthe
    ÿÿÿÿspecifiedÿprior()ÿoptions.

    r(198);
    Attached Files

  • #2
    With 1/0 variable, the factor variable notation is redundant: so instead of including i.female as a covariate in your model, just include female. Therefore, in your example above

    Code:
     bayesmh foreign k, likelihood(logistic) ///    
    prior({foreign:_cons}, normal(0, 5)) ///    
    prior({foreign:k}, density(exp({foreign:k}) / (1 + exp({foreign:k}))^2))
    -bayesmh- will complain if you do not specify all levels of your categorical variables. So in your case, the syntax is

    Code:
    bayesmh foreign i.k, likelihood(logistic)///
    prior({foreign:_cons}, normal(0, 5)) ///
    prior({foreign:1.k 0.k}, density(exp({foreign:1.k 0.k}) / (1 + exp({foreign:1.k 0.k}))^2))
    where the first in order defines the level that you want to be evaluated.
    Last edited by Andrew Musau; 12 May 2017, 06:28.

    Comment


    • #3
      Thank you for your reply. Do you know where this is documented? Before posting, I had searched the online help file and user's manual entry, and even Googled the terms, but didn't find anything about it.

      I chose the 0/1 variable as the simplest case just for illustration. The second example in the attached do-file has five categories.

      I see that when you explicitly reference the factor variables in the expression, Stata expands the term to the linear combination (the portion of the linear predictor that involves that factor variable). Up until now, I have been doing the following, i.e., referencing the parameter, itself, by name in the prior substitutable expression instead of by the factor variable name in order to keep reference to the data out of the prior.
      Code:
      tabulate k, generate(k)
      drop k1
      summarize k, meanonly
      local priors
      local k_list
      forvalues i = 2/`r(max)' {
          local k_list `k_list' k`i'
          local parm {foreign_k`i'}
          local expression exp(`parm') / (1 + exp(`parm'))^2
          local priors `priors' prior({foreign:k`i'}, density(`expression'))
      }
      bayesmh foreign `k_list', likelihood(logistic) ///
          prior({foreign:_cons}, normal(0, 5)) `priors' ///
          block({foreign:}, split) . . .
      Because "the data" in the case of factor variables are nothing but 0/1 indicator variables, it shouldn't matter (I think) that they are used in computing the prior. I thought that there should be a short-cut to all of that code above by referencing the factor variable as something like {foreign:i.k} just as when using one of the built-in prior density functions. That's when I encountered the problem. Thanks again.

      Comment


      • #4
        Joseph: This is illustrated in the example on p.66 of the manual where you have

        bayesmh y x i.cat, likelihood(probit)
        > prior(y:x _cons, normal(0, 1000))
        > prior(y:1.cat 2.cat, mvnormal0(2,{Sigma,m}))
        > prior({Sigma,m}, iwishart(2,10,V))
        > block({Sigma,m}, gibbs)
        although StataCorp could do a better job at explaining how the command handles factor variables.

        Comment


        • #5
          Thanks, Andrew.

          It seems that referencing the parameter by factor variable name in the substitutable expression of a custom prior doesn't tip Stata off to condense "the data" down to an identity matrix (one row for each of k-1 categories of the factor variable). Rather, it still appears to sum the linear combination across all N rows of the design matrix, just as if it were in a custom likelihood.

          That makes the substitutable expression you give in #2 above suitable for a likelihood, but not really for a prior density. For a custom prior, it seems that we need to reference the parameter in the substitutable expression by the parameter name {y_x} and not by the variable name {y:x}.

          (It doesn't help that the example of a custom prior in the user's manual appears to be wrong: the example with Gelman's suggested Cauchy prior for logistic regression that is shown at the top of Page 65 of the user's manual shows {y:x} instead of {y_x}. I'll send a note to technical services about this apparent erratum.)

          So, as far as I can tell, we cannot use factor variable notation in a model where we need to specify custom prior densities on the regression coefficients. It's a little late for Stata 15, but such a convenience would be something for the developers to consider later on. For now, it's back to
          Code:
          tabulate cat, generate(cat)
          forvalues i = 2/`r(r)' { . . .

          Comment


          • #6
            Thanks Joseph. You are right, for example the following will run

            Code:
            sysuse auto
            replace rep78=3 if missing(rep78)
            qui tab rep78, gen(rep78)
            bayesmh foreign rep783, likelihood(logistic) ///
            prior({foreign:_cons}, normal(0, 5))///
            prior({foreign:rep783}, density(exp({foreign:rep783}) / (1 + exp({foreign:rep783}))^2))
            But the following will give you the error


            Code:
            bayesmh foreign 3.rep78, likelihood(logistic) ///
            prior({foreign:_cons}, normal(0, 5)) ///
            prior({foreign:3.rep78}, density(exp({foreign:3.rep78}) / (1 + exp({foreign:3.rep78}))^2))
            failed to view  i3.rep78
            r(503);
            Please post if you get a solution from technical services.
            Last edited by Andrew Musau; 18 May 2017, 15:28.

            Comment

            Working...
            X