  • Combining repeated values in a panel into one


    Hi all,


    I am currently working on a panel dataset in which the panel ID is Village, observed over 17 years. I have around 9 million observations, of which around 3 million are repeated.

    I tried "xtset Village Year", but Stata gives the error message "Repeated time values within a panel". I do not want to delete the repeated IDs (villages); rather, I would like to sum the values of the dependent variable, road length (Road_Length), across the repeated observations for each ID. I searched past posts for a solution and came across the following one:

    https://www.stata.com/statalist/arch.../msg00912.html


    In the above link, Mr. Nick Cox gives a command to combine the repeated values into one, and I tried the same, as below:

    bysort Village: replace Road_Length = sum(Road_Length) by Village: keep if _n == _N

    but Stata returns the error message "invalid 'by'". If I instead type it as two separate commands,

    bysort Village: replace Road_Length = sum(Road_Length)
    by Village: keep if _n == _N

    almost all of the observations get dropped.


    Any suggestions on this would be helpful.
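
For reference, here is a minimal sketch of the result being asked for, assuming that "repeated" means the same Village and Year appearing more than once. Grouping by Village alone keeps a single row per village across all 17 years, which would explain why nearly everything is dropped in the two-command attempt. The toy values below are made up for illustration and are not from the real dataset; collapse is one possible approach, not necessarily the one from the linked post.

```stata
* Toy data: hypothetical values, for illustration only
clear
input Village Year Road_Length
1 2001 10
1 2001 5
1 2002 7
2 2001 3
2 2001 4
2 2002 6
end

* Count how many Village-Year combinations occur more than once
duplicates report Village Year

* Sum Road_Length over repeated Village-Year rows, leaving one row per pair
collapse (sum) Road_Length, by(Village Year)

* Each Village-Year pair is now unique, so the panel can be declared
xtset Village Year
list, sepby(Village)
```

Note that collapse keeps only the variables named in the command and in by(), so any other variables that are needed would have to be listed with a suitable statistic, for example (first) or (mean).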