
  • Collapse command

    Hello,

    I am facing an issue with the collapse command. I am collapsing my dataset of individual observations by year and a subcategory (rural/urban) to find the mean of the variable of interest. My command is:

    Code:
    collapse (mean) variable_of_interest, by(year urban)
    (Urban is a dummy variable that takes the value 1 if the individual observation is from an urban locality and 0 otherwise)

    However, I also need the mean of the variable of interest by year alone, to show how it changes over time both in the entire population and within subcategories of the population (urban or rural in this case).

    Once I have collapsed my dataset by year and subcategory, however, I can no longer calculate how the variable of interest changes over time in the entire population, other than by reloading the dataset, running the collapse command by year, and then merging the two results: one where I collapse the dataset by year and urban/rural, and one where I collapse it only by year.

    Is there a way to do this without going this route?

    I would appreciate any help on this!

    Thank you,
    Kanika

  • #2
    So, if you had the results you wish, what do you plan to do with them?

    You have correctly outlined the steps you would need to take to get it. (Well, almost. You would -append-, not -merge-, the two data sets.) There is a good reason that Stata does not make it easy to do this. That is because for data analysis purposes, at least in Stata, and in every other statistics package I have ever worked with, a data set that contains both aggregated (by year only) and disaggregated (by year and urban/rural) data together is a recipe for trouble. There is almost no reasonable analysis that can be done with such a data set. This kind of data array would really be suitable only for visual display, not for any additional calculation. So if you are planning on additional analysis with this data, all I can say is be extremely cautious--you are standing on a precipice. If you are planning on just displaying the data, then forget about all the work with -collapse- and use, for example, the -table- command.
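
    A minimal sketch of the append route described above, assuming the individual-level data are in memory and the variable names are those from the original question. The use of -preserve-, a tempfile, and the extended missing value .a to mark the all-population rows are illustrative choices, not something prescribed in this thread. The resulting data set is suitable for display only.

    ```stata
    * Assumes year, urban, and variable_of_interest exist in the data in memory.
    preserve
    collapse (mean) variable_of_interest, by(year)
    generate urban = .a            // hypothetical marker: .a flags all-population rows
    tempfile overall
    save `overall'
    restore

    * Means by year and urban/rural, then stack the by-year means underneath.
    collapse (mean) variable_of_interest, by(year urban)
    append using `overall'
    sort year urban
    ```

    Using -preserve- and -restore- avoids reloading the data set from disk between the two -collapse- calls.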



    • #3
      Hello Clyde,

      You are absolutely right: I indeed want to do this only for visual analysis and nothing more.

      I would be grateful if you could please elaborate on how I can use the table command to this end.

      Thank you very much,
      Kanika



      • #4
        As you do not show example data, I illustrate the approach with auto.dta:
        Code:
        . sysuse auto, clear
        (1978 automobile data)
        
        .
        . table (rep78) (foreign), statistic(mean mpg) nformat(%2.1f mean)
        
        ------------------------------------------------
                           |          Car origin        
                           |  Domestic   Foreign   Total
        -------------------+----------------------------
        Repair record 1978 |                            
          1                |      21.0              21.0
          2                |      19.1              19.1
          3                |      19.0      23.3    19.4
          4                |      18.4      24.9    21.7
          5                |      32.0      26.3    27.4
          Total            |      19.5      25.3    21.3
        ------------------------------------------------
        
        .
        And if you wish to have this table in Excel or Word, you can run the -collect export- command after -table-.
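
        Applied to the variables named in the original question, this might look as follows. This is only a sketch: it assumes Stata 17 or later (where -table- has the syntax shown above), and the file name and number format are hypothetical.

        ```stata
        * Mean of the variable of interest by year (rows) and urban (columns);
        * the Total column holds the by-year mean over the whole population.
        table (year) (urban), statistic(mean variable_of_interest) nformat(%9.2f mean)

        * Export the table just produced; the file name is a placeholder.
        collect export mean_by_year.xlsx, replace
        ```

        Changing the file extension to .docx should produce a Word document instead.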



        • #5
          Thank you! I am grateful.

