stata command unrecognized r(199) error

Dominique Bourget

Join Date: Sep 2019

Posts: 43
#1

12 Oct 2019, 08:10

Hello,
I am trying to run the Stata code below. Everything runs except that at the very end I get the error 'command i is unrecognized', r(199). How can I avoid this error? I am new to Stata and am not sure what is wrong. I have attached the pharmacy_small.dta file to this post so that you can run the code on your computer.

STATA CODE:

clear

//import the pharmacy_small Stata dataset
use pharmacy_small

// change the variables store_type, area, and compliance into binary categorical variables with 0's and 1's
generate chain = store_type == "CHAIN"
generate north = area == "North"

// numericize all the string categorical variables while retaining the same label
encode county, generate(county_num)

python:

# install sklearn, sfi, numpy, and pandas packages first
# make sure to install them first!
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import train_test_split
from sklearn import metrics # import scikit-learn metrics module for accuracy calculation
from sfi import Data
import numpy as np
import pandas as pd

# Use the sfi Data class to pull data from Stata variables
X = pd.DataFrame(Data.get("educate north county_num chain"),
columns = ['educate', 'north', 'county_num', 'chain'])

Y = pd.DataFrame(Data.get("compliance"), columns = ['compliance'])

# split the pharmacy_small dataset into a training and a test set using the python commands
# splitting data into a test and training set is much easier in Python than in Stata (takes 1 line)
# 'test_size = 0.25' tells Python that we want to reserve 25% of our data for the test set
# train_test_split() will automatically shuffle the data before the split
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.25)

end

clear

gen Alpha = .
gen AUC = .
local i = 0
range alphas 0.0 1.0 20

foreach a in alphas {

i++

python: a = Data.get("a")

// predict using the best value for alpha
python: mnb = MultinomialNB(alpha = a, class_prior = None, fit_prior = True)

// calculate probability of each class on the test set
// '[:, 1]' at the end extracts the probability for each pharmacy to be under compliance
python: Y_mnb_score = mnb.fit(X_train, np.ravel(Y_train)).predict_proba(X_test)[:, 1]

// make test_compliance python variable
python: test_compliance = Y_test['compliance']

// transfer the python variables Y_mnb_score and test_compliance to STATA
python: Data.setObsTotal(len(Y_mnb_score))
python: Data.addVarFloat('mnbScore')
python: Data.store(var = 'mnbScore', obs = None, val = Y_mnb_score)

python: Data.setObsTotal(len(test_compliance))
python: Data.addVarFloat('testCompliance')
python: Data.store(var = 'testCompliance', obs = None, val = test_compliance)

roctab testCompliance mnbScore
replace AUC = r(area) in `i' // at this point I am getting an error, I think
replace Alpha = `a'

}

Thank you for your help!
Attached Files

pharmacy_small.dta (41.2 KB, 1 view)
Clyde Schechter

Join Date: Apr 2014

Posts: 30116
#2

12 Oct 2019, 09:21

Just inside the loop that starts:

Code:

foreach a in alphas {
i++

you have the illegal command -i++-. There is no command -i-, as Stata is telling you. Presumably the intent here is to increment the value of local macro i. The correct syntax for that is:

Code:

local ++i

William Lisowski

Join Date: Dec 2014

Posts: 10150
#3

12 Oct 2019, 10:42

The following demonstration shows how local macro incrementation can occur inline rather than as a separate command, and the difference between `++i' and `i++'.

Code:

. clear

. set obs 5
number of observations (_N) was 0, now 5

. generate str8 text = "....."

. local i 0

. foreach a in dog cat frog {
  2. replace text = "`a'" in `++i'
  3. }
(1 real change made)
(1 real change made)
(1 real change made)

. list, clean

        text  
  1.     dog  
  2.     cat  
  3.    frog  
  4.   .....  
  5.   .....  

. display "`i'"
3

. foreach a in wren sparrow {
  2. replace text = "`a'" in `i++'
  3. }
(1 real change made)
(1 real change made)

. list, clean

          text  
  1.       dog  
  2.       cat  
  3.      wren  
  4.   sparrow  
  5.     .....  

. display "`i'"
5
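
For readers coming from Python, which has no `++` operator, the log above can be mimicked in a rough sketch. This is only an analogy, not Stata code: the list `text` stands in for the Stata variable, and the `i - 1` index arithmetic is an assumption needed because Python lists are 0-based while Stata observations are 1-based.

```python
# Rough Python analogy of the Stata log above (not Stata code).
# `text` stands in for the 5-observation string variable.
text = ["....."] * 5
i = 0

# `++i' : increment first, then use the new value
for a in ["dog", "cat", "frog"]:
    i += 1
    text[i - 1] = a      # Stata observation i is list index i - 1

after_first_loop = list(text)  # snapshot, like the first -list, clean-

# `i++' : use the current value, then increment
for a in ["wren", "sparrow"]:
    text[i - 1] = a
    i += 1

print(text, i)
```

The second loop starts by reusing the current value of `i`, which is why "frog" in observation 3 is overwritten by "wren".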


Nick Cox

Join Date: Mar 2014

Posts: 35711
#4

12 Oct 2019, 10:49

Note further that the loop starting

Code:

foreach a in alphas {

is a loop over one term, the variable name alphas. In particular, it is not a loop over the distinct values of that variable.
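
A rough Python analogy may help here, as a sketch only: `foreach a in alphas` behaves like iterating over a list containing the single word "alphas", not over the values stored under that name. The values below are illustrative assumptions, not taken from the dataset.

```python
# Illustrative values standing in for the Stata variable `alphas` (assumed).
alphas = [0.0, 0.5, 1.0]

# What `foreach a in alphas {` does: loop over the literal list of words.
words = "alphas".split()
literal_iterations = [a for a in words]

# What was presumably intended: loop over the values themselves.
value_iterations = [a for a in alphas]

print(literal_iterations)
print(value_iterations)
```

The first loop runs once, with `a` holding the name "alphas"; only the second visits each value.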
William Lisowski

Join Date: Dec 2014

Posts: 10150
#5

12 Oct 2019, 11:17

Expanding on Nick's answer, and thinking back to your earlier question on the range command at

https://www.statalist.org/forums/for...missing-values

perhaps what you want is to call the Python MultinomialNB function successively with the values 0, .05, .10, ..., .95, 1.0 for the alpha= argument. In that case I think something like this might do what you want.

Code:

forvalues a20=0(1)20 {
    local a = `a20'/20
    python: mnb = MultinomialNB(alpha = `a', class_prior = None, fit_prior = True)
    ...

which will in succession run the commands

Code:

python: mnb = MultinomialNB(alpha = 0, class_prior = None, fit_prior = True)
python: mnb = MultinomialNB(alpha = .05, class_prior = None, fit_prior = True)
python: mnb = MultinomialNB(alpha = .10, class_prior = None, fit_prior = True)
...
python: mnb = MultinomialNB(alpha = .95, class_prior = None, fit_prior = True)
python: mnb = MultinomialNB(alpha = 1, class_prior = None, fit_prior = True)

Note that since the fraction 1/20 cannot be precisely represented as a floating point number, I chose to index the loop on integer values and recalculate a on each iteration, rather than accumulate an increasingly imprecise sum of 20 terms.
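
The point about integer indexing can be checked in Python, as a small sketch rather than a statement about Stata's internals: it compares recomputing k/20 on each iteration with repeatedly adding 1/20. The names `recomputed`, `accumulated`, and `max_gap` are hypothetical.

```python
# Compare two ways of generating 0, .05, ..., 1.0 in double precision.
n = 20

# Recompute from the integer index on every iteration (the approach above).
recomputed = [k / n for k in range(n + 1)]

# Accumulate a running sum of the step 1/20 instead.
accumulated = []
total = 0.0
for _ in range(n + 1):
    accumulated.append(total)
    total += 1 / n

# Largest gap between the two sequences; any nonzero value is rounding
# error carried along by the running sum.
max_gap = max(abs(r - s) for r, s in zip(recomputed, accumulated))
print(recomputed[-1], accumulated[-1], max_gap)
```

The recomputed endpoint is exactly 1.0 because 20/20 is exact. Whether the running sum also lands exactly on 1.0 depends on how the rounding errors happen to fall, which is the reason for not relying on it.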

Let me add the following more general advice. Your coding suggests that perhaps you are an experienced Python user new to Stata? If so, I'm sympathetic to you as a new user of Stata - it's a lot to absorb. And even worse if perhaps you are under pressure to produce some output quickly. Nevertheless, I'd like to encourage you to take a step back from your immediate tasks.

When I began using Stata in a serious way, I started, as have others here, by reading my way through the Getting Started with Stata manual relevant to my setup. Chapter 18 then gives suggested further reading, much of which is in the Stata User's Guide, and I worked my way through much of that reading as well. There are a lot of examples to copy and paste into Stata's do-file editor to run yourself, and better yet, to experiment with changing the options to see how the results change.

All of these manuals are included as PDFs in the Stata installation (since version 11) and are accessible from within Stata - for example, through the PDF Documentation section of Stata's Help menu. The objective in doing the reading was not so much to master Stata as to be sure I'd become familiar with a wide variety of important basic techniques, so that when the time came that I needed them, I might recall their existence, if not the full syntax, and know how to find out more about them in the help files and PDF manuals.

Stata supplies exceptionally good documentation that amply repays the time spent studying it - there's just a lot of it. The path I followed surfaces the things you need to know to get started in a hurry and to work effectively.