
  • APC analysis with survey data

    I'm using Stata 13.1 SE under Windows 7E. I'm using apc_ie (http://econpapers.repec.org/software...de/s456754.htm) with some survey data. Since the module was not originally set up for survey data, the author kindly provided me with a modification to accommodate these kinds of data. This consisted of:

    Definition of a new variable at the top of the program:
    Code:
     svyopts(string)
    A code block to fill that variable:
    Code:
     *survey options?
      if "`svyopts'"=="" local svypre ""
      else local svypre "`svyopts':"

    And modification of the glm command to pick up the svy prefix:
    Code:
     *estimate;
      if "`exposure'"~="" local exposurestr "exposure(`exposure')";
      if "`offset'"~="" local offsetstr "offset(`offset')";
      `svypre' glm `varlist' `xnames' if `touse' [`weight' `exp'], `constant'
      `exposurestr' `offsetstr' scale(`scale') `options' nodisplay;

    So, one should be able to write:
    Code:
    apc_ie2 y, age(age) period(period) cohort(cohort) svyopts("svy")
    for

    Code:
    svy: glm y 

    However, I can't get it to recognize the svypre variable in the code. It's as if it's a null string. I get the same results with or without the svyopts option. I'm at a loss to explain why this is happening and how to fix it. Does anyone have any ideas?

  • #2
    I get the same results with or without the svyopts option.
    Do you mean that you get the same ultimate output from apc_ie2? If so, the first question is how did you -svyset- your data?

    If you mean that you have run the apc_ie2 command with trace on and the -glm- line is showing up without -svy: - in front of it, that's a problem arising somewhere in the ado file that needs further investigation: presumably somehow the macro svypre is either being erased or going out of scope between the place where you show it being defined and the -glm- command.

    One more thing: you begin your post by referring to defining a new variable at the top of the program. But -svyopts(string)- does not define any variables: it is an additional option in the syntax of the command and its result is the creation of a local macro, not a variable. So if there is someplace else in the code of this .ado file that requires some new variable, then that might be the source of the problem.
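
    To illustrate the distinction, here is a minimal sketch using a hypothetical toy program, showopts, which is not part of apc_ie. It relies only on the standard -syntax- command: the contents of the option end up in a local macro of the same name, and no variable is added to the dataset.

    ```stata
    capture program drop showopts
    program define showopts
        // svyopts(string) declares an optional string option;
        // -syntax- stores what the user typed in the local macro `svyopts'
        syntax [, svyopts(string)]
        if "`svyopts'" == "" display as text "svyopts is empty"
        else display as text "svyopts contains: `svyopts'"
    end

    // option omitted, so the macro is empty
    showopts
    // option supplied, so the macro holds its contents
    showopts, svyopts("svy")
    ```

    Run this from a do-file. Running -describe- before and after should show the same list of variables, since only a macro was created.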

    I'm not familiar with this apc_ie2 program so I can't offer more specific advice here.


    • #3
      Yes, I've svyset the data.


      I haven't tried trace on, but I'm not much of a Stata programmer, so I'm not sure what to look for. I'll give it a shot and let you know what I see.


      Maybe I'm not using the correct lingo here since I mainly use the command and not the programming language. Here's the entire code block from the top of the program that defines the variables:

      [CODE syntax varlist(numeric ts) [fw aw pw iw] [if] [in],
      [age(varname numeric) period(varname numeric) cohort(varname numeric)
      GENerate(name) eigenvectors_in(name) eigenvectors_out(name)
      design_in(name) design_out(name) xe_in(name) xe_out(name)
      noCONStant EXPosure(varname numeric) OFFset(varname numeric)
      SCAle(string) LEvel(cilevel) EForm
      noHEADer notable nodisplay svyopts(string) *];
      ][/CODE]


      Not sure if that helps. If you can give me a clue on what to inspect while trace is on, I'll report my findings.


      • #4
        So, to limit the amount of extraneous output, I would -set tracedepth 1- and then -set trace on-. Be sure to log the output: there may be so much of it that it exceeds the buffer capacity of the Results window. Then run your apc_ie2 program.

        When running a program in this way Stata will show you all of the commands that are being executed "inside" the .ado file. It will show them to you first as they are actually written in the .ado file (including all the macro references), and it will also then show you what they look like with all the macros replaced by whatever they refer to. Search through that output for the -glm- command. Then you'll want to see whether it says -svy: glm- or just -glm-.
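
        Put together, the steps might look like the sketch below. The log file name apc_trace.log is a placeholder, and the apc_ie2 call is copied from the first post. -capture noisily- is used here only so that tracing is still switched off and the log closed if the command stops with an error.

        ```stata
        // close any log that is already open, then start a fresh text log (placeholder name)
        capture log close
        log using apc_trace.log, replace text

        // trace only the first level of nested programs, as suggested above
        set tracedepth 1
        set trace on
        capture noisily apc_ie2 y, age(age) period(period) cohort(cohort) svyopts("svy")
        set trace off

        log close
        ```

        Then search the log for "glm" and compare the line that starts with - (the command as written) with the line that starts with = (the command after macro expansion).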


        • #5
          I thought I'd try the instructions on a simple program:

          Code:
          program define xreg
            
              syntax varlist [,  svyopts(str)]
              *survey options?
              if "`svyopts'"=="" local svypre ""
              else local svypre "`svyopts':"
              `svypre'reg  `varlist'
          end
          which I saved in my PERSONAL ado directory, "x" folder. (adopath will show you where the directory is.)

          I then ran the following do file:
          Code:
          sysuse auto, clear
          svyset rep78 [pw = turn]
          xreg  length mpg, svyopts("svy")
          and it worked, with the svy features shown in the results. So the instructions are okay. Try prefacing your do file with

          Code:
          capture program drop _all
          When you've set trace as Clyde suggested, search for "svypre" in the results. When I did this, I found the following lines
          Code:
           - `svypre'reg `varlist'
            = svy:reg length mpg
          which indicated that "svypre" had been successfully translated.

          Your last opening CODE statement is missing a closing bracket. Always test this stuff with the Preview button before posting.
          Steve Samuels
          Statistical Consulting
          [email protected]

          Stata 14.2


          • #6
            I think that Stata isn't seeing your new ado file. I was able to get apc_ie to work after some other modifications.

            As I mentioned, I like to have
            Code:
            capture program drop _all
            at the top of my do files when I am tinkering with ado files.
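
            A quick way to check which file Stata is actually using is sketched below. It assumes the edited file is named apc_ie2.ado, as in this thread; -which-, -adopath-, and -discard- are standard commands.

            ```stata
            // report the path of the apc_ie2.ado that Stata finds
            which apc_ie2
            // list the directories Stata searches, in order
            adopath
            // remove any copy of the program already loaded in memory
            capture program drop apc_ie2
            // drop automatically loaded programs so the ado-file is reread on the next call
            discard
            ```

            If -which- reports a different directory from the one you edited, Stata is running an older copy of the file.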

            I copied apc_ie to my working directory and renamed it apc_ie2, then made the modifications you describe. I then ran the first part of "apc_example.do", distributed with the package, which now contains the following:

            Code:
             apc_ie2 death_f if age<=90,
               age(age) period(year) cohort(cohort) family(poisson) link(log)
                exposure(exp_f) scale(x2) svyopts("svy") ;
            After I svyset the data and run the do file, I get an error:

            Code:
             option scale() not allowed with svy prefix
             r(198);
            If I remove the scale() option from the do file, I get another error:
            Code:
             option nodisplay not allowed with svy prefix
            r(198);
            If I then remove the "nodisplay" option from the ado file header you quoted in your prior post, the command runs without further error and glm acknowledges the survey settings in its header information. Removing options is always risky, and if you need the scale() option, this solution won't help.

            Last edited by Steve Samuels; 22 Mar 2015, 07:16.
            Steve Samuels
            Statistical Consulting
            [email protected]

            Stata 14.2


            • #7
              I think this is the relevant section:

              - drop _merge
              - if "`exposure'"~="" local exposurestr "exposure(`exposure')"
              = if ""~="" local exposurestr "exposure()"
              - if "`offset'"~="" local offsetstr "offset(`offset')"
              = if ""~="" local offsetstr "offset()"
              - `svypre' glm `varlist' `xnames' if `touse' [`weight' `exp'], `constant' `ex
              > posurestr' `offsetstr' scale(`scale') `options' nodisplay
              = glm ADD2i __00000J __00000K __00000L __00000M __00000N __00000O __00000P
              > __00000Q __00000R __00000S __00000T __00000U __00000V __00000W __00000X __000
              > 00Y __00000Z if __000001 [ ], scale() family(binomial) link(logit) iterate
              > (20) nodisplay

              Iteration 0: log likelihood = -38606.063

              ADD2i is the dependent variable. This model iterates to completion. I've omitted the results. I'm not sure how to interpret this, but this line looks suspect to me:

              = glm ADD2i __00000J __00000K __00000L __00000M __00000N __00000O __00000P
              It seems like there should be a svy or svypre in front of the glm command. I'm not sure how to fix this. Maybe an if statement like some of the other options in the lines immediately preceding the glm call?


              if "`svypre'"~="" local svyprestr "svypre(`svypre')"


              • #8
                Steve, thanks for your suggestions. I tried to preview that post and it threw an error. Pretty bad day when both your program and the bulletin board vomit on you! Anyway, my log shows I'm reading apc_ie2 ok. I think it's just not seeing the svy prefix, as I indicated above. I'm not a good enough programmer to fix it. Adding the svy prefix is the simplest case. I also need to do subpopulation analyses, so once I get it to read the prefix, I'll try adding that. I want to get the svy prefix working first. I appreciate the suggestions you made and will keep them in mind for future debugging efforts.


                • #9
                  Thought about this some more and I believe my first hunch was incorrect. In the program, under the variable declarations, there is this block of code:

                  Code:
                   marksample touse;
                    markout `touse' `age' `period' `cohort' `offset' `exposure';
                    *survey options?
                    if "`svyopts'"=="" local svypre ""
                    else local svypre "`svyopts':"
                    *how many additional explanatory variables?;
                    local n_extra: word count `varlist';
                    local n_extra=`n_extra'-1;
                    *check that at least two of a, p, c are present;
                    if ("`age'"=="" & ("`period'"=="" | "`cohort'"=="")) |
                    ("`period'"=="" & "`cohort'"=="") {;
                    di as error "must specify at least two of age, period and cohort";
                    exit=1;
                    };

                  In my log I see:

                  Code:
                    - marksample touse
                    - markout `touse' `age' `period' `cohort' `offset' `exposure'
                    = markout __000001 apc_agei apc_periodi apc_cohorti 
                    - local n_extra: word count `varlist'
                    = local n_extra: word count ADD2i
                    - local n_extra=`n_extra'-1
                    = local n_extra=1-1
                    - if ("`age'"=="" & ("`period'"=="" | "`cohort'"=="")) | ("`period'"=="" & "`
                  > cohort'"=="") {
                    = if ("apc_agei"=="" & ("apc_periodi"=="" | "apc_cohorti"=="")) | ("apc_perio
                  > di"=="" & "apc_cohorti"=="") {
                      di as error "must specify at least two of age, period and cohort"
                      exit=1
                      }

                  So, what happened to the block?

                  Code:
                   *survey options?
                    if "`svyopts'"=="" local svypre ""
                    else local svypre "`svyopts':"
                  It doesn't seem like it's being executed. Maybe this is the issue?


                  • #10
                    Check your semicolons. Note that you have

                    Code:
                      *survey options?
                    and there is no semicolon to end the comment.
                    -------------------------------------------
                    Richard Williams, Notre Dame Dept of Sociology
                    StataNow Version: 19.5 MP (2 processor)

                    EMAIL: [email protected]
                    WWW: https://academicweb.nd.edu/~rwilliam/


                    • #11
                      Richard is right. You also need a semicolon after the "if" statement, so the whole block should be:

                      Code:
                      *survey options?;
                      if "`svyopts'"=="" local svypre "" ;
                      else local svypre "`svyopts':" ;
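
                      The effect is easy to reproduce outside apc_ie. The sketch below is a stand-alone do-file, not code from the package (#delimit works in do-files and ado-files, not interactively). Under #delimit ;, a comment that starts with * runs until the next semicolon, so an unterminated comment absorbs the command that follows it.

                      ```stata
                      #delimit ;
                      local svyopts "svy";

                      * unterminated comment, so the next command becomes part of it
                      local svypre1 "`svyopts':";

                      * terminated comment;
                      local svypre2 "`svyopts':";

                      display "svypre1 = [`svypre1']";
                      display "svypre2 = [`svypre2']";
                      #delimit cr
                      ```

                      Here svypre1 should come out empty, because its -local- line was read as part of the comment, while svypre2 should contain svy:. That matches the trace in #9, where the if/else lines never appear.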
                      Last edited by Steve Samuels; 22 Mar 2015, 13:13.
                      Steve Samuels
                      Statistical Consulting
                      [email protected]

                      Stata 14.2


                      • #12
                        Yes. I made these changes and I now see the svy prefix. As suggested, I also removed nodisplay throughout the program. However, it is still being passed to glm, which causes an error and stops the program:

                        Code:
                          = svy: glm ADD2i  __00000J __00000K __00000L __00000M __00000N __00000O __000
                        > 00P __00000Q __00000R __00000S __00000T __00000U __00000V __00000W __00000X _
                        > _00000Y __00000Z if __000001 [ ],    scale() family(binomial) link(logit) ite
                        > rate(20) nodisplay
                        option nodisplay not allowed with the svy prefix
                        
                          ------------------------------------------------------------- end apc_ie3 ---
                        r(198);
                        
                        end of do-file
                        
                        r(198);

                        Now, this is where things get really weird. When I modified the code, I saved it to a new file, apc_ie3. Here are the first few lines:

                        Code:
                        if !replay() {;
                        
                          syntax varlist(numeric ts) [fw aw pw iw] [if] [in],
                            [age(varname numeric) period(varname numeric) cohort(varname numeric)
                             GENerate(name) eigenvectors_in(name) eigenvectors_out(name)
                             design_in(name) design_out(name) xe_in(name) xe_out(name)
                             noCONStant EXPosure(varname numeric) OFFset(varname numeric)
                             SCAle(string) LEvel(cilevel) EForm
                             noHEADer notable  svyopts(string) *];
                        
                          marksample touse;
                          markout `touse' `age' `period' `cohort' `offset' `exposure';
                        
                          *survey options?;
                          if "`svyopts'"=="" local svypre "";
                          else local svypre "`svyopts':";

                        Note that the nodisplay has been removed.


                        Here's what I'm seeing in my log:

                        Code:
                        . apc_ie3 ADD2i /*
                        > */ if A_imputation_==1 & ADD2i<2, /*
                        > */ age(apc_agei)  period(apc_periodi) cohort(apc_cohorti) family(binomial) /*
                        > */  link(logit) iterate(20) /*
                        > */ svyopts("svy")
                          ----------------------------------------------------------- begin apc_ie3 ---
                          - version 9.2
                          - if !replay() {
                          - syntax varlist(numeric ts) [fw aw pw iw] [if] [in], [age(varname numeric) p
                        > eriod(varname numeric) cohort(varname numeric) GENerate(name) eigenvectors_in
                        > (name) eigenvectors_out(name) design_in(name) design_out(name) xe_in(name) xe
                        > _out(name) noCONStant EXPosure(varname numeric) OFFset(varname numeric) SCAle
                        > (string) LEvel(cilevel) EForm noHEADer notable nodisplay svyopts(string) *]
                          - marksample touse
                          - markout `touse' `age' `period' `cohort' `offset' `exposure'
                          = markout __000001 apc_agei apc_periodi apc_cohorti 
                          - if "`svyopts'"=="" local svypre ""
                          = if "svy"=="" local svypre ""
                          - else local svypre "`svyopts':"
                          = else local svypre "svy:"

                        The call is to apc_ie3, but for the life of me, I cannot figure out how the nodisplay is in the header. I removed it in other places in the program and the log reflects that, but not here. How can code not in a program suddenly appear? The changes I made to the survey options block are there.


                        • #13
                          Solved this problem by going back and starting from scratch. I edited apc_ie2 and saved it under another name. Everything ran fine. Thanks to everyone for your help.


                          • #14
                            Here's another thing that I could use some help with. I'm running analyses on a large number of variables in a loop. If there's a problem (specifically, convergence or collinearity) with a variable, I need it noted in the log, and then the program should move to the next variable. Note this section of code:


                            Code:
                             *check whether a regressor was dropped;
                              *if so, give up because we can't transform back to original coordinates;
                              if `princomps'+`cshift'+`n_extra'>e(k) {;
                              di as error "regressor collinear with age, period or cohort variables, or too many empty cells in age-by-period matrix";
                              exit=1;
                              };

                            If I have a collinearity issue, the program aborts. Using capture noisily is one option, but it would also trap other errors I'm unaware of that should terminate the program, so I'm not so excited about that. I need to trap the exit=1 condition specifically. However, I might also want to change that code, since exit=1 results in an r(1) error condition, the same as a Break. If my trap ignores r(1), I wouldn't be able to press Break to stop the program. Right now, per Clyde's suggestion, I'm enclosing the APC analysis in a block like so:


                            Code:
                            if e(converged) {
                            
                            
                            }
                            Could I change this to:

                            Code:
                            if e(converged) or `c(rc)' == 1 {
                            
                            
                            }
                            to trap the exit=1? How could I customize the error code exit=1 so that it is something other than break?


                            • #15
                              Well, "or" is not legal Stata syntax. It would be

                              Code:
                              if e(converged) | `c(rc)' == 1 {
                                  ...
                              }
                              Since you have already made some customizations to this particular .ado file, I suppose you can make some more. To change the error code from 1 to something else, you just have to replace the 1 in "exit=1" with whatever error number you would prefer to return. If, as it seems to me, your purpose is to trap and identify this particular part of the code, I would search through the ado-file to see what other exit codes it uses, and pick a number that is distinct from all of those. To be doubly safe, you might scan the list of Stata error codes in [P]'s -error- chapter and pick a number that is not used there either.
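
                              As a rough sketch of how the pieces could fit together: the code 9001 below is a hypothetical placeholder (check it against the ado-file and the [P] error list as described above), the variable list y1 y2 y3 is made up, and the apc_ie3 call is adapted from earlier in the thread. It assumes the collinearity check in the ado-file has been changed from exit=1 to exit=9001.

                              ```stata
                              // y1 y2 y3 are hypothetical outcome variables
                              foreach v of varlist y1 y2 y3 {
                                  capture noisily apc_ie3 `v', age(apc_agei) period(apc_periodi) ///
                                      cohort(apc_cohorti) family(binomial) link(logit) svyopts("svy")
                                  // 9001 is the hypothetical custom code for the collinearity exit
                                  if _rc == 9001 {
                                      display as text "`v': skipped, regressor dropped (collinearity)"
                                      continue
                                  }
                                  // any other error, including a Break (return code 1), still stops the run
                                  if _rc != 0 {
                                      exit _rc
                                  }
                              }
                              ```

                              Because only the custom code is skipped, unexpected errors are not silently swallowed, and a Break still ends the loop.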
