Hello
I wanted to know if anybody has experience importing .sxv4 (SuperCROSS) files into Stata, or has an idea of how I might do this.
I am working with South African national census data, which comes in this format. Currently I am forced to use the SuperCROSS program to create cross-tabulations, export these as a CSV file, and then import that file into Stata. I have a strong preference for working in Stata from the beginning, as I find the SuperCROSS program very clumsy for my purposes.
In case it helps, I have included a link below to one of the Census datasets in .sxv4 form (the compressed file is 73 MB, and I had issues uploading it directly as an attachment to Statalist).
To briefly describe the dataset: the data in the link has 10 "variables" from the national census. The census data is reported at the "Small Area Level" (SAL), which is the finest level of granularity released for public use by South Africa's national statistical agency. Each Small Area contains a number of people, normally around 1000, but this can vary dramatically. In Stata form, I imagine that each Small Area (denoted by a numeric code) would be one observation. I put "variables" in quotation marks above because, while SuperCROSS has a single "Geography" variable, it can be disaggregated from the provincial level all the way down to the Small Area Level, and in Stata I imagine this would be more than one variable.
I have little experience managing data outside of Stata and Excel and would be very grateful for any help offered.
Best,
Josh Budlender