Regression model with r103 error

Laurent Magon

Join Date: May 2015

Posts: 21
#1

20 May 2015, 02:09

Hi everyone,

I'd like to run a regression based on the model below:

Code:

ln(v) = Delta(ij) + Delta(ik) + Delta(jk)

with the Deltas being fixed effects.

I created my dummy variables ij, ik and jk, but when I try to run the regression I unfortunately get an r103 error. Even reducing my dataset doesn't change anything.

As you might guess, I'm not really used to Stata, so any help is welcome.

I'm currently stuck here, so any suggestion would be welcome, as I really need to move forward with my research.

I'm trying to compute the RCA index using the research paper by Elsa Leromain and Gianluca Orefice, if anyone is interested. i corresponds to the exporter country code and j to the importer country code; v corresponds to the value of exports in sector k. I computed my dummies as follows:

Code:

gen ij = group(i j)

Then I use them as follows in the regression:

Code:

reg lnv i.ij i.ik i.jk

Am I doing something wrong?

Thank you.
Nick Cox

Join Date: Mar 2014

Posts: 35734
#2

20 May 2015, 02:27

If you call up group() with generate you call up an undocumented and not very useful function. I am confident you meant to use egen.

See e.g. http://www.stata.com/statalist/archi.../msg00406.html

http://www.statalist.org/forums/foru...group-function

The rest of your post is well off my beaten track so I can't comment except that the FAQ Advice gives a clear signal about giving full citations to literature.

Last edited by Nick Cox; 20 May 2015, 02:49.
Laurent Magon

Join Date: May 2015

Posts: 21
#3

20 May 2015, 04:05

Yes, of course it's egen. I made a typo and can't correct the post, but the code I actually ran is the following:

Code:

egen ij = group(i j)

I used group() to identify the country pairs. A PhD student suggested this method.
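For completeness, the three group variables used in the regression could be built the same way. This is a minimal sketch, assuming the exporter, importer and sector codes are stored in variables named i, j and k, as the notation in #1 suggests:

```stata
* i = exporter code, j = importer code, k = sector code
* (variable names assumed from the notation in #1)
egen ij = group(i j)
egen ik = group(i k)
egen jk = group(j k)
```

Each call numbers the distinct combinations 1, 2, 3, ... so the result can be used with the i. factor-variable prefix.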
Martin Bresslein

Join Date: Apr 2014

Posts: 51
#4

20 May 2015, 04:06

Laurent,

the problem is actually a very simple one. If you use Stata SE or MP, the default number of variables allowed is 5000, which you can increase to 32767; see "help set maxvar". However, afterwards you run into matsize problems, since there the maximum is 11000. For those like Nick who might not know the precise data structure: the authors have 20 exporting countries i, 76 importing countries j and either 70 or a lot more industries/products k, depending on the level of aggregation. With this kind of dataset it's pretty easy to reach the maxvar and matsize limits.

Since the authors mention in footnote 16 that they used Stata to perform their calculations, it should work at the two-digit industry level when setting maxvar to 32767 and matsize to 11000, so before loading your dataset, type

Code:

set maxvar 32767
set matsize 11000

As I take it, for the 4-digit product level they outline a different estimation strategy, which you should follow, else you might have to use a different software package...

Best,
Martin
Laurent Magon

Join Date: May 2015

Posts: 21
#5

20 May 2015, 04:17

Hi,

I tried changing maxvar and matsize before opening the data, but the regression gives the same result: it still shows error r103.

I'm working on a larger dataset than the one in the paper: 200+ country codes in 97 sectors (aggregated at the HS2 chapter level).

Do you know what I should do?
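A back-of-the-envelope count shows the scale of the problem. This is only a sketch, assuming exactly 200 countries on both the exporter and the importer side, 97 sectors, and every combination present in the data (the real data will contain fewer). The local macro names are made up for illustration:

```stata
* Hypothetical round numbers, not taken from the actual dataset
local n_c = 200
local n_k = 97
* exporter-importer pairs (a country does not export to itself)
local n_ij = `n_c' * (`n_c' - 1)
* exporter-sector and importer-sector pairs
local n_ik = `n_c' * `n_k'
local n_jk = `n_c' * `n_k'
display "ij: `n_ij'  ik: `n_ik'  jk: `n_jk'"
display "total: " `n_ij' + `n_ik' + `n_jk'
```

Under these assumptions that is 39,800 + 19,400 + 19,400 = 78,600 potential fixed effects, far above the matsize maximum of 11000 mentioned in #4.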
Martin Bresslein

Join Date: Apr 2014

Posts: 51
#6

20 May 2015, 06:32

You could try doing it in R. I don't think there is a "variable" limit, apart from the size of RAM you can allocate. Else, I think you should follow the authors' strategy for the 4-digit product level data...
Laurent Magon

Join Date: May 2015

Posts: 21
#7

20 May 2015, 06:38

I'm not working with the 4-digit product codes but at the two-digit chapter level. I'm also following the authors' strategy up to the regression step.

Well, I might have misunderstood the regression, but I don't think so.

I never used R before so I'll have to see how to do it.
Martin Bresslein

Join Date: Apr 2014

Posts: 51
#8

20 May 2015, 06:50

Sorry if I did not make myself clear enough. I got that you work on the 2-digit level. Still, the problem is the same one the authors mention for their dataset when going from 2 to 4 digits. On page 10, they outline a strategy that reduces the number of fixed effects so Stata can cope with it, essentially assuming that the productivity can be split into 2 parts. This is what I meant you should try to adapt to your problem at hand, because else I don't know of any solution in Stata since the maximum number of variables is binding. Maybe somebody else has a workaround...
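One possible workaround, which is not discussed in this thread and is offered here only as an assumption, is a command that absorbs several sets of fixed effects without creating a dummy for each level. The sketch below uses the user-written reghdfe command from SSC (not part of official Stata) and assumes lnv, ij, ik and jk already exist as in #1 and #3:

```stata
* User-written commands, installed once from SSC
ssc install ftools
ssc install reghdfe

* Absorb the three sets of pair effects instead of entering them as dummies
reghdfe lnv, absorb(ij ik jk)
```

Whether this suits the RCA calculation depends on being able to recover the estimated effects themselves, so the command's help file would need checking for how to save them and how they are normalized.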
Laurent Magon

Join Date: May 2015

Posts: 21
#9

20 May 2015, 07:59

Oh, yes, I just checked what the authors did. I don't think I will be able to do the same, as they divided the productivity using the HS2 code and the specification (I don't know if I'm being clear). I don't know how I could divide my HS2 chapter code.

I also tried to reduce the number of observations, keeping only the countries with a country code > 480, just to see whether it works. Unfortunately, I get the same error.
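Before rerunning the regression on a subsample, it may help to check how many levels each group variable actually has. A minimal sketch, assuming ij, ik and jk were all created with egen's group() function as in #3:

```stata
* egen's group() numbers the groups 1, 2, ..., G,
* so the maximum of each variable equals its number of groups
foreach g in ij ik jk {
    quietly summarize `g'
    display "`g': " r(max) " levels"
}
```

If the three counts together are still well above the matsize maximum of 11000 from #4, the regression on i.ij i.ik i.jk is likely to hit the same limit. Note that if the group variables were created before dropping observations, they would need to be regenerated for the counts to reflect the subsample.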
Martin Bresslein

Join Date: Apr 2014

Posts: 51
#10

20 May 2015, 08:43

As it says here, the maximum number of independent variables, assuming you are running either Stata SE or MP, is 10,998. Thus, either you reduce your dataset so that the number of dummy variables generated (or, with factor-variable notation, the number of levels) does not exceed this limit, or you have to impose additional assumptions. In this case, a bit like their approach, you could assume (granted, this would be very strong) that there is a country-specific component and another component specific to each HS2 chapter. That would mimic the authors' approach. Very crude, but at least a viable option within Stata's limits.
I'm sorry, I don't have a good alternative as of yet...
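One possible reading of this cruder model, sketched under the assumption that it means separate exporter, importer and sector effects in place of the pair effects, with the codes stored in variables i, j and k as in #1:

```stata
* Harmless headroom in case there are more than a few hundred levels
set matsize 800

* Exporter effects are absorbed; importer and HS2 chapter effects enter as dummies
areg lnv i.j i.k, absorb(i)
```

With roughly 200 importers and 97 chapters this needs only a few hundred coefficients, so it stays well within the limits, at the cost of the strong assumption already noted.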

Last edited by Martin Bresslein; 20 May 2015, 09:02.
Laurent Magon

Join Date: May 2015

Posts: 21
#11

26 May 2015, 05:19

Excuse me for my late response.

What do you mean by a country-specific component? I'm not sure I understand it clearly (maybe because it's in English).
I know it might be too much to ask, but would you please explain what you meant?

Thank you.
