  • Number of observations per country

    Hi everybody,

    I was just wondering how to calculate the number of firms per country?
    In my case, country is recorded as a set of dummy variables (I have dummies for asia, usa, europa and canada) and firm is a string variable.
    It is probably very easy but I'm struggling to find the answer.

    Kind regards,
    Astrid

  • #2
    Depending on whether you have one observation per firm, the solution is either simple or a bit cumbersome.
    If you have so few countries (...continents?), then you could just use something like distinct firmvariable if asia==1 or, if distinct is not installed, codebook firmvariable if asia==1, which reports the number of unique values.



    /edit: I just read that your thread is called "number of observations" while in the post you say "number of firms". This suggests that you have one observation per firm. Then you can also do something like

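    foreach country in asia usa europa canada {
        * _N is the size of the whole dataset, so count within each dummy instead
        quietly count if `country' == 1
        gen obs`country' = r(N) if `country' == 1
    }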

    /another edit: and now I read that firm is a string variable. Sorry! I first guessed that group() only works for numbers, but egen's group() accepts string variables as well.
    Last edited by Anya Fedyk; 03 Jul 2015, 03:55.


    • #3
      The original question is vague on data structure, but supposing we have categorical variables country and firm then regardless of repetitions or variable type

      Code:
      egen tag = tag(country firm)
      egen nfirms = total(tag), by(country)
      answers the question. The first command assigns 1 to just one observation for each distinct combination of the two variables and 0 to any others. The second command adds the 1s and 0s, equivalently adds the 1s, equivalently counts them, within countries.
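
      To see what those two commands do, here is a small sketch on made-up data. The firm names and the rows are hypothetical, chosen only so that one firm is repeated within a country; country and firm are both strings here, matching the "regardless of variable type" point.

```stata
* Toy data (hypothetical): firm A appears twice in asia
clear
input str6 country str1 firm
"asia"   "A"
"asia"   "A"
"asia"   "B"
"usa"    "C"
"europa" "D"
"europa" "E"
end

* Flag exactly one observation per distinct country-firm pair
egen tag = tag(country firm)

* Add up the flags within each country: the number of distinct firms
egen nfirms = total(tag), by(country)

* One line per country
tabdisp country, cell(nfirms)
```

      With these rows, the repeated observation for firm A is tagged only once, so asia is credited with 2 firms, not 3.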

      However, there are in this problem several indicator variables, not one categorical variable, so it would be a good idea to create a country variable, say by

      Code:
       
      gen country = ""

      foreach v in asia usa europa canada {
          replace country = "`v'" if `v' == 1
      }
      and then you have reduced your problem to the one stipulated.
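
      Putting the two steps together, a minimal end-to-end sketch might look like this. The data are invented for illustration, and the sketch assumes that each observation has exactly one of the four indicators equal to 1; the assert is an added safeguard that stops if some observation was left without a country.

```stata
* Toy data (hypothetical): one string firm variable and four indicators
clear
input str1 firm asia usa europa canada
"A" 1 0 0 0
"A" 1 0 0 0
"B" 1 0 0 0
"C" 0 1 0 0
"D" 0 0 1 0
"E" 0 0 0 1
end

* Collapse the indicators into one categorical (string) variable
gen country = ""
foreach v in asia usa europa canada {
    replace country = "`v'" if `v' == 1
}

* Safeguard (assumption): every observation belongs to some country
assert country != ""

* Count distinct firms within each country
egen tag = tag(country firm)
egen nfirms = total(tag), by(country)
tabdisp country, cell(nfirms)
```

      If an observation could have more than one indicator equal to 1, the loop would keep only the last match, so that case would need separate handling.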

      Anya's helpful answer refers to distinct but does not say where it comes from.

      Code:
      search distinct
      will point to download locations on the Stata Journal website: use the most recent, but note that the 2008 article

      http://www.stata-journal.com/sjpdf.h...iclenum=dm0042

      is accessible to all and is more than a guide to the distinct command: it covers how to answer these questions for yourself. In the same spirit I started this with an answer from first principles.
      Last edited by Nick Cox; 03 Jul 2015, 04:12.
