  • count or tag with if condition

    Hello Statalist,

    I have the following variables:

    ID
    Region
    Productname
    Date
    Price


    I want another variable that tells me how many IDs sell Product1 or Product2 in each region.


    I have tried egen tag_product=tag(product Region) if productname=="Product1" | "Product2"
    (I only obtain nulls, which must be wrong)

    and
    egen tag_product=count(Productname) if Productname=="Product1"| "Product2"

    Here I got a type mismatch.

    Can you help me?
    Thanks.





  • #2
    Try this:

    Code:
    // MARK EACH ID TO SHOW WHETHER IT SELLS PRODUCT 1 OR PRODUCT 2 IN A REGION
    by Region ID, sort: egen sells_p1_or_p2 = max(inlist(Productname, "Product1", "Product2"))
    
    //    MARK ONE OBSERVATION FOR EACH REGION x ID
    egen flag = tag(Region ID)
    
    //    COUNT PROVIDING IDs PER REGION
    by Region, sort: egen number_of_sellers = total(flag & sells_p1_or_p2)
    Note: You didn't provide example data, so this is not tested. It may have typos, unbalanced parens, etc. It also may simply be dead wrong if your data are not as I imagine them. To protect yourself against future instances of incorrect untested code you should always post an example of your data when you want help with code. The helpful way to do that is with the -dataex- command, which you should install and always use when you show example data. (If you are running Stata version 15.1, then you don't need to install it--it's part of official Stata now. If you are running an earlier version, run -ssc install dataex- to get it.) Read -help dataex- for instructions for using it.
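A minimal sketch of the -dataex- step described in the Note, assuming the variable names listed in #1 and an arbitrary choice of the first 20 observations (pick whatever range shows the problem):

```stata
// Only needed if your Stata is older than 15.1, as the Note says
ssc install dataex

// Print a copy-and-paste excerpt of the five variables from #1.
// "in 1/20" is an illustrative assumption, not a requirement.
dataex ID Region Productname Date Price in 1/20
```

Paste the output of the second command into your post, so that anyone answering can recreate the same data before writing code.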

    Do note that the construction -if Productname == "Product1" | "Product2"- is an attempt to translate ordinary English into Stata code. It doesn't work, as you discovered. "Product1" and "Product2" are strings, so the | operator doesn't work with them: | requires logical expressions, which in Stata are always represented as numbers (0 = false, anything else = true). What you really mean here is "if Productname is Product1 or Productname is Product2", and the representation of that is -if Productname == "Product1" | Productname == "Product2"-. You cannot elide the second "Productname == " from this in Stata code the way you do in English.
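To make the point about | concrete, here is a small sketch on made-up data. The three product names and the variable is_p1_or_p2 are invented for illustration and are not from the original question:

```stata
clear
input str10 Productname
"Product1"
"Product2"
"Product3"
end

// Each side of | is a complete comparison, so each side evaluates to 0 or 1
generate byte is_p1_or_p2 = (Productname == "Product1") | (Productname == "Product2")

// The elided form leaves a bare string on the right of |.
// capture keeps the do-file running; #1 reports a type mismatch for this form.
capture count if Productname == "Product1" | "Product2"
display "return code from the elided form: " _rc

list
```

The first two observations should get is_p1_or_p2 = 1 and the third should get 0.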

    In any case, as shown in the code above, it isn't necessary to type all that out, as the -inlist()- function expresses the same meaning with less fuss.
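As a rough check, the sketch below runs the spelled-out condition, the -inlist()- form, and the three commands from the code above on a toy dataset. The data are invented for illustration (five IDs, two regions), and long_form and short_form are hypothetical variable names; your real data may well be organized differently.

```stata
clear
input ID str6 Region str10 Productname
1 "North" "Product1"
1 "North" "Product3"
2 "North" "Product2"
3 "North" "Product3"
4 "South" "Product1"
4 "South" "Product2"
5 "South" "Product3"
end

// The spelled-out condition and inlist() should agree in every observation
generate byte long_form  = (Productname == "Product1") | (Productname == "Product2")
generate byte short_form = inlist(Productname, "Product1", "Product2")
assert long_form == short_form

// The three steps from the code above
by Region ID, sort: egen sells_p1_or_p2 = max(inlist(Productname, "Product1", "Product2"))
egen flag = tag(Region ID)
by Region, sort: egen number_of_sellers = total(flag & sells_p1_or_p2)

// North: IDs 1 and 2 sell Product1 or Product2, ID 3 does not
assert number_of_sellers == 2 if Region == "North"
// South: ID 4 sells both products but is counted once, ID 5 sells neither
assert number_of_sellers == 1 if Region == "South"
```

If the -assert- commands pass silently on the toy data but your own data still give unexpected results, the difference most likely lies in the data, which is why an example posted with -dataex- matters.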



    • #3
      Thank you very much. Unfortunately, I obtain only nulls after running your commands. I am using Stata 13. Do you have any idea why this could be?



      • #4
        Writing code for imaginary data is difficult; debugging it is impossible. Please re-read the Note I wrote following the code in #2.
