  • Export to excel the 10 most common countries in the dataset

    Hello, I'm new to Stata and am recreating code that I originally wrote in MATLAB. It's for a weekly report based on a user database, where I want to see the most common countries (for example, the top 10) where our users live. Right now I'm just tabulating the whole list of countries, like so:
    Code:
     tab2xl(country) using STATAtest_report.xlsx, col(11) row(2)
    Ideally I'd also want them sorted by the number of people from each country, rather than alphabetically. How would I go about this?

    In my old code I made a two-column array with the number of users in one column and the country name in the other, ran a sorting algorithm, deleted everything not in the top 10, and then just pasted the result into Excel.
    But MATLAB can handle a limitless number of arrays at the same time, while Stata seems to want only one open at a time?
    Thanks in advance

  • #2
    -tab2xl- is a user-written command, and I'm not familiar with it. Assuming your data set has a variable, country, that contains the country name, and another variable, call it n_users, that contains the number of users in the country, then you can basically emulate your Matlab approach:
    Code:
    keep country n_users
    gsort -n_users
    keep in 1/10
    Then you can send those results to a spreadsheet using the -putexcel- command.
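
    If the data are still at the user level (one observation per user, no n_users variable yet), a minimal sketch of the whole pipeline might look like the following. It uses -contract- to build the counts and swaps in -export excel- for -putexcel-, since the result is a small data set rather than a matrix. The toy data, the sheet name, and the K2 start cell (column 11, row 2, as in the original -tab2xl- call) are assumptions for illustration only.

```stata
* Toy data standing in for the user database: one observation per user.
clear
input str12 country
"Sweden"
"Norway"
"Sweden"
"Denmark"
"Sweden"
"Norway"
end

* Collapse to one row per country; n_users holds the number of users.
contract country, freq(n_users)

* Largest count first; ties are broken alphabetically so the order is reproducible.
gsort -n_users country

* Keep at most 10 rows (the toy data has fewer, hence the min()).
keep in 1/`=min(10, _N)'

* Write the table starting at cell K2 of an assumed sheet "Sheet1".
export excel country n_users using "STATAtest_report.xlsx", ///
    sheet("Sheet1", modify) cell(K2) firstrow(variables)
```

    Note that -contract- replaces the data in memory, so wrap the block in -preserve- / -restore- if the user-level data are needed afterwards.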

    Caveat: If two or more countries are tied for 10th place in number of users, this code will select from among them in a random and irreproducible way, because it always keeps exactly 10 observations. If you want to include all of the countries that are tied for 10th place, thereby allowing for more than 10 observations of results, change the last line of code to -keep if n_users >= n_users[10]-.
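
    To make the tie behavior concrete, here is a small sketch on made-up data, using a cutoff of 3 instead of 10 so that the tie is easy to see. The country names, counts, and the local macro k are all hypothetical.

```stata
* Made-up counts with a tie for 3rd place (C and D both have 30 users).
clear
input str10 country n_users
"A" 50
"B" 40
"C" 30
"D" 30
"E" 10
end

local k = 3

* Adding country as a second sort key makes the order reproducible,
* so -keep in 1/`k'- would always return A, B, C.
gsort -n_users country

* Alternative: keep every country whose count matches or beats the k-th one,
* which retains both C and D here (4 rows rather than 3).
keep if n_users >= n_users[`k']
list
```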

    But MATLAB can handle a limitless number of arrays at the same time, while Stata seems to want only one open at a time?
    Stata has no objects that are called arrays. There are data sets, of which, until recently, only one could be open at a time. But since version 16 we have -frames- which allow you to open up to 100 at a time. And there are matrices--for which, as far as I know, there is no limit other than those imposed by the availability of memory in your system.
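
    As a rough illustration of holding several objects at once, the sketch below keeps the user-level data in the default frame, builds the country counts in a second frame, and also stores the counts in a matrix. It requires Stata 16 or later; the frame name top, the matrix name N, and the toy data are assumptions made up for the example.

```stata
* Toy user-level data in the default frame.
clear
input str12 country
"Sweden"
"Norway"
"Sweden"
"Denmark"
"Sweden"
"Norway"
end

* Copy the country variable into a new frame called "top".
frame put country, into(top)

* Work inside that frame; the default frame is left untouched.
frame top {
    contract country, freq(n_users)
    gsort -n_users country
    * Also hold the counts as a matrix, which lives outside any data set.
    mkmat n_users, matrix(N)
}

frame dir            // shows both frames
count                // still the 6 user-level observations
frame top: list      // one row per country, largest first
matrix list N
```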

    • #3
      Hey, thanks so much for your response! You are entirely correct about what I'm trying to achieve here, and this solution looks like just what I need.
      I'm struggling a little with handling data outside of the data sets/frames. Is there a good resource you could point me to for further reading on this?
      Thanks again, it comes as a godsend!

      • #4
        StataCorp. has many books for learning Stata. See https://www.stata.com/bookstore/books-on-stata/. They differ in style, and in the kind of examples used to illustrate. I'm somewhat partial to Christopher Baum's An Introduction to Stata Programming, but I really think different readers will find different books most suitable for their own needs.
