Dropping Observations and Creating Fertility Variable

Sandipa Bhattacharya

Join Date: May 2020

Posts: 26
#1

29 Aug 2021, 14:34

Question 1:

I am using two fertility variables: age at first intercourse (var name: age1stsex) and age at first birth (var name: age1stbirth).

I need to drop observations if a woman reported having a child at a certain age, but not having had her first sexual intercourse at that age or before. How can I drop those observations by matching the records?

Question 2:

I have another variable: total number of child births (var name: ch_birth).

I want to generate total fertility (i.e., the total number of child births) by a given age. [This will be my main dependent variable. I want to run my regressions using this variable created for different ages, so I will basically have several dependent variables.] Along with that, I would like to restrict the sample to include only observations older than the age specified when calculating the variable. What is the easiest way to code this?
Ken Chui

Join Date: Aug 2014

Posts: 1063
#2

30 Aug 2021, 06:38

Please refer to the FAQ (http://www.statalist.org/forums/help) and use -dataex- to provide some example data. Without that, it is very difficult to offer suggestions because the structure and format of the data are unclear.

I need to drop observations if a woman reported having a child at a certain age, but not having their first sexual intercourse at that age or before. Can you help me how to drop those observations by matching the records?

Assuming the age variables are numeric:

Code:

drop if age1stbirth < age1stsex
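
One thing worth checking before running this: Stata treats a numeric missing value (.) as larger than any number, so a woman with a recorded age1stbirth but a missing age1stsex also satisfies the condition and would be dropped. The following is a minimal sketch on made-up data (the id variable and all values are invented for illustration), assuming both variables are numeric ages in years. It counts the affected cases first and then drops only records where both ages are reported and inconsistent:

```stata
clear
input id age1stsex age1stbirth
1 17 19
2 20 18
3 .  21
4 18 .
5 19 19
end

* inconsistent: birth reported before first intercourse, both ages nonmissing
count if age1stbirth < age1stsex & !missing(age1stsex, age1stbirth)

* birth reported, but age at first intercourse is missing
count if !missing(age1stbirth) & missing(age1stsex)

* drop only the clearly inconsistent records (id 2 here); id 3 is kept
drop if age1stbirth < age1stsex & !missing(age1stsex, age1stbirth)
```

Whether records like id 3 should be kept or dropped is a substantive decision; the unrestricted command above drops them.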

So I basically will have several dependent variables. Along with that I would like to restrict the sample to only include observations older than the age specified while calculating the variable. What is the easiest way to code this?

To answer this, the structure of the data is needed. From what you're describing, you'd like an age-specific number of children, but with just one variable, "ch_birth", and no other information, it is not possible to tell when each child was born.
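
For illustration only, here is a sketch of what would work if the data held one row per birth with the mother's age at that birth. The variables id, age, and agebirth and all values are hypothetical and do not come from the thread. Under that assumption, births by a given age (18 here) can be counted within each woman, and the sample can then be restricted to women older than that age:

```stata
clear
input id age agebirth
1 30 16
1 30 19
1 30 25
2 18 17
3 22 .
end

* number of births each woman had by age 18
* (a missing agebirth counts as 0 because . <= 18 is false)
bysort id: egen births_by_18 = total(agebirth <= 18)

* keep one row per woman, then only women older than 18
bysort id: keep if _n == 1
keep if age > 18 & !missing(age)
```

With a single ch_birth total per woman and no timing information, this kind of age-specific count cannot be reconstructed.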
Sandipa Bhattacharya

Join Date: May 2020

Posts: 26
#3

12 Apr 2022, 17:52

For question 2, I also have the age variable. So would we code it this way:

sort age

by age: egen ch_birth_15 = sum(ch_birth) if age>=15
by age: egen ch_birth_16 = sum(ch_birth) if age>=16
by age: egen ch_birth_17 = sum(ch_birth) if age>=17
by age: egen ch_birth_18 = sum(ch_birth) if age>=18
by age: egen ch_birth_19 = sum(ch_birth) if age>=19

Is that right?
Ken Chui

Join Date: Aug 2014

Posts: 1063
#4

13 Apr 2022, 05:39

Originally posted by Sandipa Bhattacharya:

For question 2, I also have the age variable. So do we code this way:

sort age

by age: egen ch_birth_15 = sum(ch_birth) if age>=15
by age: egen ch_birth_16 = sum(ch_birth) if age>=16
by age: egen ch_birth_17 = sum(ch_birth) if age>=17
by age: egen ch_birth_18 = sum(ch_birth) if age>=18
by age: egen ch_birth_19 = sum(ch_birth) if age>=19

Is that right?

As I said, it's hard to judge whether code is right or wrong without any context. This code creates age-specific totals of children born, so every observation with the same age gets the same number. If that's what the analysis needs, then I assume it would be "correct."

It can be shortened:

Code:

sort age
forvalues x = 15/19 {
    by age: egen ch_birth_`x' = sum(ch_birth) if age >= `x' & age < .
}

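Adding "age < ." prevents cases with missing age from being included: Stata treats a missing value as larger than any number, so "age >= `x'" alone would be true for them.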
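
To see what the loop produces, here is a small sketch on made-up data (all values are invented). It uses total(), which to my knowledge is the currently documented name of the egen function that sum() refers to; treat that substitution as an assumption if you are on an old Stata version:

```stata
clear
input age ch_birth
15 0
15 1
17 2
17 1
.  3
end

sort age
forvalues x = 15/17 {
    * total of ch_birth within each age, only for ages `x' and above
    by age: egen ch_birth_`x' = total(ch_birth) if age >= `x' & age < .
}
list, sepby(age)
```

Every observation with the same age receives the same total, observations below the cutoff get missing values, and the row with missing age stays missing in all three new variables because of the "age < ." guard.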
