Problem with difference-in-differences: how to simulate data with the same distribution of real data

Carola Segale

Join Date: Jul 2019

Posts: 1
#1


17 Jul 2019, 13:12

Dear All,

I ran a difference-in-differences regression on my data to evaluate the impact of a policy. However, I have only a limited number of observations. I would therefore like to simulate additional data with the same distribution as the real data, to see whether and by how much the standard errors shrink.

I am familiar with the Stata commands program and simulate; however, I am not sure how to reproduce the exact distribution of the real data. How can I achieve this?

Thank you for your help.

Best,

CS
Tags: data, difference-in-differences, regression, simulation, syntax
Kye Lippold

Join Date: Jun 2019

Posts: 67
#2

17 Jul 2019, 21:41

It's unclear to me what you are hoping to learn from this exercise: the standard error depends on sample size in a predictable way (it shrinks roughly in proportion to 1/sqrt(n), so increasing n reduces the standard error at a decreasing rate). If you are trying to learn what sample size you should aim for when gathering more data, that would be a power calculation; see -help power-.

But the easiest way to do what you request is

Code:

expand 5

and re-run your model. This replaces each observation with 5 copies, which by definition have exactly the same distribution as your original sample. You can vary the number of copies as needed to test different sample sizes.
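As a rough sketch of how that comparison might look, the loop below re-runs a difference-in-differences regression at several multiples of the sample and prints the standard error of the interaction term. The variable names y, treat, and post, the toy data generated at the top, and the specification itself are assumptions for illustration only; substitute your own dataset and model.

```stata
* Toy data so the sketch runs on its own (hypothetical; use your real data instead)
clear
set seed 12345
set obs 200
generate treat = runiform() < 0.5
generate post  = runiform() < 0.5
generate y     = 1 + 0.5*treat + 0.3*post + 1*treat*post + rnormal()

* Re-estimate the DiD model with k copies of every observation
foreach k in 1 2 5 10 {
    preserve
    expand `k'                       // k = 1 leaves the data unchanged
    quietly regress y i.treat##i.post, vce(robust)
    display "copies = `k'   N = " e(N) "   SE(DiD) = " _se[1.treat#1.post]
    restore
}
```

With k copies, the point estimate stays the same and the reported standard error falls by roughly a factor of 1/sqrt(k). The copies carry no new information, so this shrinkage is mechanical rather than evidence of what additional real data would deliver.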