
  • Fillin client market pairs, but only for markets with a branch in that client's municipality?

    Hi all,

    I would like to estimate the determinants of how much each client buys in each market. As I explicitly observe amounts only for client-market pairs in which something is bought this quarter, I would like to run
    Code:
     fillin client market
    However, I would like to fill in, for each client, only those markets that have a branch in the client's municipality.

    I cannot find an option for this in the fillin command, so I'm wondering whether anyone has an idea of how to code this most efficiently.
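
    To illustrate the problem, here is a minimal sketch on made-up toy data (the ids and the variable names clientid, marketid and town are purely illustrative). Plain -fillin- creates every client-market combination, including pairs that cross towns:

    ```stata
    * Toy data, hypothetical ids: clients 1 and 2 are in town 1, client 3 is in town 2
    clear
    input clientid marketid town
    1 10 1
    2 20 1
    3 30 2
    end

    * -fillin- completes the full 3 x 3 rectangle and flags new rows with _fillin == 1
    fillin clientid marketid

    * 9 combinations minus the 3 observed pairs leaves 6 filled-in rows
    count if _fillin == 1
    assert r(N) == 6

    * Only (1, 20) and (2, 10) are within-town pairs; the other 4 cross towns
    list clientid marketid town _fillin, sepby(clientid)
    ```

    Note that town is missing in the filled-in rows, so they cannot simply be filtered by town afterwards without first carrying the client's town forward.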

    Thank you so much,
    PM





  • #2
    How do you know which markets have a branch in the client's municipality?

    The solution to this problem, since -fillin- doesn't do what you want, probably will not be a simple one or two liner. Knowing the details of how your data is organized will likely be important, perhaps a sine qua non, for solving your problem. So please, when posting back, show example data from your data set. And be sure to use the -dataex- command to do so. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.



    • #3
      Hi Clyde,

      Thanks. I was not aware of the -dataex- command before, but, like -codebook- and related commands, it seems useful for giving a better idea of the dataset structure by specifying the format of each variable.
      Does this make it easier to think about a solution to my question?

      Thanks so much and best regards,
      PM

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input double clientid long marketid double year int town
       163  831173 1996  104
       163  831173 1997  104
       163  831173 1998  104
       163  831173 1999  104
       163  831173 2000  104
       163  831173 2001  104
       163  831173 2002  104
       163  831173 2003  104
       163  831173 2004  104
       163 2505068 1998  104
       163 2505068 1999  104
       166 4138728 1998  105
       166 4138728 1999  105
       166 4138728 2000  105
       166 4138728 2001  105
       177 3044116 2002 1902
       177 3044116 2003 1902
       177 4138728 2002 1902
       177 4138728 2003 1902
       261 2056664 2002 1426
       261 2056664 2003 1426
       261 2056664 2004 1426
       591  352251 1996  233
       591 4138728 1997  233
       591 4138728 1998  233
       591 4138728 1999  233
       591 4349452 1999  233
      1219 1017261 2003  106
      1219 1017261 2004  106
      1219 4138728 2003  106
      1219 4138728 2004  106
      1227  170544 1996  301
      1227  170544 1998  301
      1227  170544 1999  301
      1227  170544 2000  301
      1227  170544 2001  301
      1227  170544 2002  301
      1227  170544 2003  301
      1227  170544 2004  301
      1227  352251 1996  301
      1227  394077 1996  301
      1227  394077 1997  301
      1227  394077 1998  301
      1227  394077 1999  301
      1227  394077 2000  301
      1227  394077 2001  301
      1227  394077 2002  301
      1227  394077 2003  301
      1227  394077 2004  301
      1227  819333 1999  301
      1227 1356075 1997  301
      1227 1356075 1998  301
      1227 1356075 1999  301
      1227 1356075 2000  301
      1227 1356075 2001  301
      1227 1356075 2002  301
      1227 1356075 2003  301
      1227 1513355 2004  301
      1227 2057004 1999  301
      1227 2057004 2000  301
      1227 2057004 2001  301
      1227 2057004 2002  301
      1227 2057004 2003  301
      1227 2718948 2000  301
      1227 2718948 2001  301
      1227 2718948 2002  301
      1227 2718948 2003  301
      1227 2718948 2004  301
      1227 2921933 2000  301
      1227 2921933 2001  301
      1227 2921933 2002  301
      1227 2921933 2003  301
      1227 2921933 2004  301
      1227 3127159 1998  301
      1227 3127159 1999  301
      1227 3127159 2000  301
      1227 3127159 2001  301
      1227 3127159 2002  301
      1227 3127159 2003  301
      1227 3127159 2004  301
      1227 3492307 2003  301
      1227 3492307 2004  301
      1227 3618948 1999  301
      1227 3618948 2000  301
      1227 3618948 2001  301
      1227 3618948 2002  301
      1227 3618948 2003  301
      1227 3618948 2004  301
      1227 3725142 1996  301
      1227 3725142 2000  301
      1227 3725142 2003  301
      1227 3725142 2004  301
      1227 4070578 2001  301
      1227 4070578 2002  301
      1227 4070578 2003  301
      1227 4070578 2004  301
      1227 4138728 1997  301
      1227 4138728 1998  301
      1227 4138728 1999  301
      1227 4138728 2000  301
      end



      • #4
        Thank you. Either I don't understand what you want, or the example data you posted doesn't illustrate the problem well.

        What I think you want is that, in effect, for each value of town, we compile a list of all marketid's that occur with that value of town somewhere in the entire data set. Then we want to, in the original data, create new observations for each client id that includes all of the marketids that show up in the compiled list associated to that town. If that's what you want, the following code will, I believe, do that:
        Code:
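        * Keep a copy of the full data set to merge back onto at the end
        tempfile original
        save `original'

        * One observation per market-town pair observed anywhere in the data
        keep marketid town
        duplicates drop
        tempfile markets_and_towns
        save `markets_and_towns'

        * One observation per client; -isid- verifies that no client appears in more than one town
        use `original', clear
        keep clientid town
        duplicates drop
        isid clientid, sort
        * Pair each client with every market observed in that client's town
        joinby town using `markets_and_towns'

        * Bring back the original observations; pairs with _merge == 1 are the newly filled-in ones
        merge 1:m clientid marketid town using `original'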
        But when I apply this code to your example data, I just end up with the original data as it is. And when I explore your data by eye, it appears that every clientid in your data is already paired with every marketid that is matched with that clientid's town somewhere in the data. So that leaves me uncertain whether this is just an anomaly of the example data, or whether I have misunderstood what is wanted.
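
        For a case where fill-ins do occur, here is a minimal sketch that runs the same steps on made-up toy data (the ids are hypothetical and not taken from the example above). Two clients share town 1 but buy in different markets, so each should gain one new pair:

        ```stata
        * Toy data, hypothetical ids: clients 1 and 2 share town 1, client 3 is alone in town 2
        clear
        input clientid marketid year town
        1 10 2000 1
        1 10 2001 1
        2 20 2000 1
        3 30 2000 2
        end
        tempfile original
        save `original'

        keep marketid town
        duplicates drop
        tempfile markets_and_towns
        save `markets_and_towns'

        use `original', clear
        keep clientid town
        duplicates drop
        isid clientid, sort
        joinby town using `markets_and_towns'

        merge 1:m clientid marketid town using `original'

        * _merge == 1 marks pairs that exist only because of the fill-in:
        * client 1 with market 20, and client 2 with market 10
        count if _merge == 1
        assert r(N) == 2
        list clientid marketid town year _merge if _merge == 1
        ```

        The filled-in observations have a missing year, because they come from the client-market list rather than from the original data. Run the block in one go from a do-file, since the tempfile names are local macros.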



        • #5
          Hi Clyde,

          Thanks so much. I did not know -joinby-, but it seems to do exactly what I was looking for, namely to fill in all pairwise combinations, not overall but only within each town.

          The -tempfile- command was also useful. I would just have saved the temporary datasets without it, but I understand that with tempfile it requires less space and possibly also less time, which is good to know.

          Doing this, it turned out that a larger share of clients than I had expected was already connected to every market in their town, but not all were.
          The example I gave was misleading in this respect, as its 100 observations were based on a 1% sample that did not include all market-town combinations. When I repeated the exercise on a larger sample, some fill-ins were made, so it seems to have worked now.

          Thanks so much and best regards,
          PM


