Problem dealing with unbalanced data obtained from the CMIE database

Rajat Gupta

10 Apr 2019, 00:23

Hello everyone! I am a new user here. I'm an undergrad studying in Goa.
I'm studying patterns in the R&D behavior of manufacturing MSMEs, as defined by the government. There are around 6,900 companies, and the period considered is 1992 to 2018.

The issue is that not all the companies existed throughout the time frame, but the downloaded data contains empty rows for the missing years. When I load the data into Stata, it reports the panel as strongly balanced, even though I know it's not.

I don't understand how to work with this unbalanced data.

Any help would be appreciated.
Thank you!
Phil Bromiley

11 Apr 2019, 11:06

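Welcome. For the most part, you can ignore the problem. If you want to, you can delete observations where all the variables are missing:

drop if x1==. & x2==.
or
drop if missing(x1) & missing(x2)

Note that missing(x1,x2) is true when any of its arguments is missing, so drop if missing(x1,x2) would also remove observations where only one of the variables is missing.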

However, the main reason to do this would be to make the data set smaller. When Stata runs a regression or most other estimation commands, it drops any observation that has missing data for any of the variables used in that command.
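
With more than a couple of variables, the same idea can be sketched by counting missing values per row. This is only an illustration for a do-file: company_id, year, x1, and x2 are hypothetical names standing in for the actual CMIE variables, and company_id is assumed to be numeric.

```stata
* Hypothetical variable names: company_id, year, x1, x2
* Count how many of the listed variables are missing in each row
egen nmiss = rowmiss(x1 x2)

* Drop rows where every listed variable is missing (2 = number of variables listed)
drop if nmiss == 2
drop nmiss

* Re-declare the panel; Stata should now report it as unbalanced
xtset company_id year

* Show which years each company is observed in
xtdescribe
```

If the company identifier is stored as a string, it would first need to be converted to a numeric variable (for example with encode) before xtset will accept it.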