Fillin client market pairs, but only for markets with a branch in that client's municipality?

Peter Meier

Join Date: Apr 2016

Posts: 86
#1


07 Jun 2022, 12:37

Hi all,

I would like to estimate the determinants of how much each client buys in each market. Because I observe amounts only for client-market pairs in which something is bought this quarter, I would like to

Code:

fillin client market

. However, for each client I would like to fill in only those markets that have a branch in the client's municipality.

I cannot find an option for this in the fillin command, so I'm wondering if anyone has an idea of how to code this most efficiently?
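To make the problem concrete, here is a minimal sketch on made-up toy data (the values and the variable names clientid, marketid and town are illustrative assumptions, not real data). It shows that plain -fillin- creates every client-market combination regardless of town:

```stata
* Toy data (hypothetical): clients 1 and 2 are in town 1, client 3 is in town 2
clear
input double clientid long marketid int town
1 10 1
2 20 1
3 30 2
end

* -fillin- rectangularizes over ALL clientid x marketid combinations
fillin clientid marketid

* _fillin == 1 marks the created observations; town is missing for them,
* so pairs such as (client 3, market 10) appear even though market 10
* is only ever observed in town 1
list clientid marketid town _fillin, sepby(clientid)
count if _fillin == 1
```

The unwanted cross-town pairs are exactly what I would like to avoid.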

Thank you so much,
PM
Clyde Schechter

Join Date: Apr 2014

Posts: 30356
#2

07 Jun 2022, 13:28

How do you know which markets have a branch in the client's municipality?

The solution to this problem, since -fillin- doesn't do what you want, probably will not be a simple one or two liner. Knowing the details of how your data is organized will likely be important, perhaps a sine qua non, for solving your problem. So please, when posting back, show example data from your data set. And be sure to use the -dataex- command to do so. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

Peter Meier

Join Date: Apr 2016
Posts: 86
#3

08 Jun 2022, 01:41

Hi Clyde,

Thanks. I was not aware of the -dataex- command before, but, like -codebook- and related commands, it seems useful for giving a better idea of the dataset structure by specifying the format of each variable.
Does this make it easier to think about a solution to my question?

Thanks so much and best regards,
PM

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input double clientid long marketid double year int town
 163  831173 1996  104
 163  831173 1997  104
 163  831173 1998  104
 163  831173 1999  104
 163  831173 2000  104
 163  831173 2001  104
 163  831173 2002  104
 163  831173 2003  104
 163  831173 2004  104
 163 2505068 1998  104
 163 2505068 1999  104
 166 4138728 1998  105
 166 4138728 1999  105
 166 4138728 2000  105
 166 4138728 2001  105
 177 3044116 2002 1902
 177 3044116 2003 1902
 177 4138728 2002 1902
 177 4138728 2003 1902
 261 2056664 2002 1426
 261 2056664 2003 1426
 261 2056664 2004 1426
 591  352251 1996  233
 591 4138728 1997  233
 591 4138728 1998  233
 591 4138728 1999  233
 591 4349452 1999  233
1219 1017261 2003  106
1219 1017261 2004  106
1219 4138728 2003  106
1219 4138728 2004  106
1227  170544 1996  301
1227  170544 1998  301
1227  170544 1999  301
1227  170544 2000  301
1227  170544 2001  301
1227  170544 2002  301
1227  170544 2003  301
1227  170544 2004  301
1227  352251 1996  301
1227  394077 1996  301
1227  394077 1997  301
1227  394077 1998  301
1227  394077 1999  301
1227  394077 2000  301
1227  394077 2001  301
1227  394077 2002  301
1227  394077 2003  301
1227  394077 2004  301
1227  819333 1999  301
1227 1356075 1997  301
1227 1356075 1998  301
1227 1356075 1999  301
1227 1356075 2000  301
1227 1356075 2001  301
1227 1356075 2002  301
1227 1356075 2003  301
1227 1513355 2004  301
1227 2057004 1999  301
1227 2057004 2000  301
1227 2057004 2001  301
1227 2057004 2002  301
1227 2057004 2003  301
1227 2718948 2000  301
1227 2718948 2001  301
1227 2718948 2002  301
1227 2718948 2003  301
1227 2718948 2004  301
1227 2921933 2000  301
1227 2921933 2001  301
1227 2921933 2002  301
1227 2921933 2003  301
1227 2921933 2004  301
1227 3127159 1998  301
1227 3127159 1999  301
1227 3127159 2000  301
1227 3127159 2001  301
1227 3127159 2002  301
1227 3127159 2003  301
1227 3127159 2004  301
1227 3492307 2003  301
1227 3492307 2004  301
1227 3618948 1999  301
1227 3618948 2000  301
1227 3618948 2001  301
1227 3618948 2002  301
1227 3618948 2003  301
1227 3618948 2004  301
1227 3725142 1996  301
1227 3725142 2000  301
1227 3725142 2003  301
1227 3725142 2004  301
1227 4070578 2001  301
1227 4070578 2002  301
1227 4070578 2003  301
1227 4070578 2004  301
1227 4138728 1997  301
1227 4138728 1998  301
1227 4138728 1999  301
1227 4138728 2000  301
end


Clyde Schechter

Join Date: Apr 2014

Posts: 30356
#4

08 Jun 2022, 13:43

Thank you. Either I don't understand what you want, or the example data you posted doesn't illustrate the problem well.

What I think you want is, in effect, this: for each value of town, we compile a list of all marketids that occur with that value of town anywhere in the entire data set. Then, in the original data, we create new observations so that each clientid is paired with every marketid in the compiled list for that client's town. If that's what you want, the following code will, I believe, do that:

Code:

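* Keep a copy of the full data
tempfile original
save `original'

* One row per market-town pair observed anywhere in the data
keep marketid town
duplicates drop
tempfile markets_and_towns
save `markets_and_towns'

* One row per client (isid verifies each client has a single town)
use `original', clear
keep clientid town
duplicates drop
isid clientid, sort

* Pair each client with every market observed in that client's town
joinby town using `markets_and_towns'

* Bring back the observed rows; _merge == 1 marks filled-in pairs
merge 1:m clientid marketid town using `original'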

But when I apply this code to your example data, I just end up with the original data unchanged. And when I inspect your data by eye, it appears that every clientid is already paired with every marketid that appears with the clientid's town somewhere in the data. So I am uncertain whether this is just an anomaly of the example data or whether I have misunderstood what is wanted.
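To see the approach actually create new pairs, here is a minimal sketch on a made-up toy data set (the values are hypothetical and not taken from the example above; the variable filled is added purely for illustration). Two clients share town 1 but buy in different markets, so each should gain one filled-in pair:

```stata
* Toy data (hypothetical values): clients 1 and 2 in town 1, client 3 in town 2
clear
input double clientid long marketid double year int town
1 10 2000 1
1 10 2001 1
2 20 2000 1
3 30 2000 2
end

tempfile original
save `original'

* Market-town list
keep marketid town
duplicates drop
tempfile markets_and_towns
save `markets_and_towns'

* Client-town list, paired with every market in the same town
use `original', clear
keep clientid town
duplicates drop
isid clientid, sort
joinby town using `markets_and_towns'

* Restore observed rows; pairs found only in the master are the fill-ins
merge 1:m clientid marketid town using `original'
generate byte filled = (_merge == 1)
list clientid marketid year town filled, sepby(clientid)
```

In this sketch the filled-in rows have a missing year, since no purchase was observed for them. If a row per year is needed for each filled-in pair, that would take an additional step not shown here.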
Peter Meier

Join Date: Apr 2016

Posts: 86
#5

10 Jun 2022, 02:48

Hi Clyde,

thanks so much. I did not know

Code:

joinby

but it seems to do exactly what I was looking for, namely to fill in all pairwise combinations, not overall but only within each town.

The

Code:

tempfile

command was also useful. I would otherwise have saved the temporary datasets as ordinary files, but I understand that tempfile requires less space and possibly also less time, which is good to know.
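A minimal sketch of the pattern, on made-up data (the macro name clients and the values are illustrative assumptions). As far as I understand the documentation, the main convenience is that Stata picks the file name and erases the file automatically when the do-file or program ends, so nothing has to be cleaned up by hand:

```stata
* Toy data (hypothetical values)
clear
input double clientid int town
1 1
2 1
3 2
end

* tempfile stores a temporary file name in the local macro `clients'
tempfile clients
save `clients'

* The data in memory can now be changed freely...
keep if town == 1

* ...and the saved copy reloaded later in the same do-file
use `clients', clear
```

Because the name lives in a local macro, the whole block has to be run in one go (for example as a do-file) rather than line by line.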

Doing this, it turned out that a large share of clients, larger than I had expected, were indeed already connected to every market in their town, but not all were.
The example I gave was misleading in this respect, as its 100 observations were based on a 1% sample that did not include all market-town combinations. When I repeated the exercise on a larger sample, some fill-ins were made, so it seems to have worked now.

Thanks so much and best regards,
PM