Hi!
I have a dataset with many variables in it. My goal is to create a control variable for an OLS regression. The variable I want to create is the longest time that any member of the migrant's household has been living in Portugal (years_sincemig_hh). The duration is measured up to 2020 or 2021; which of the two years applies varies across observations and is stored in the variable "bl_year".
I will put every already coded variable in red.
Therefore I want the earliest year in which any member of the household arrived in PT. For example, if the migrant has 2 household members and one arrived in PT in 2014 and the other in 2004, I want to keep the latter one (2004). I don't mind whether the control variable holds the year itself (like 2014) or the duration they have been in PT (2020 - 2014 = 6); it's the same interpretation, right?
I have:
c2_h* = the year the member arrived in PT
c2_h* = 77 -> the member has always lived in PT; here I need to replace the value with (bl_year - c2_d*), i.e. the birth year, if I use the year rather than the duration (but I guess the purpose is the same)
c2_g* = 2 -> the member of the hh moved before the migrant
c2_g* = 1 -> the member of the hh moved at the same time as the migrant; here I replace the value with f6, which is the year the migrant arrived
c2_g* = 3 -> the member of the hh moved after the migrant
c2_g* = 77 -> the member has always lived in PT; here I need to replace the value with (bl_year - c2_d*), i.e. the birth year, if I use the year rather than the duration (but I guess the purpose is the same)
So I want to combine all of these variables and then, among the migrant's household members, pick the one who has lived in PT for the longest time.
Notes:
10 is the maximum number of household members a migrant can have, not counting the migrant. Some respondents filled in only up to c2_*2 (2 members) and others up to c2_*5 (5 members).
(c2_g1; c2_g2; c2_g3; ...; c2_g10 - whether the member moved before, after or at the same time)
(c2_d1, c2_d2 .... c2_d10 -> member's age)
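The logic described above could be sketched roughly as follows in Python. This is only an illustration under assumptions, not a definitive implementation: each observation is represented as a dict keyed by the variable names from the post, unanswered members are missing (None), and the helper names member_arrival_year and compute_years_sincemig_hh are hypothetical. It also assumes that a 77 in either c2_h* or c2_g* means the birth year (bl_year - c2_d*) stands in for the arrival year.

```python
from typing import Optional

ALWAYS_LIVED_HERE = 77  # code used in both c2_h* and c2_g*
MOVED_SAME_TIME = 1     # c2_g* code: moved together with the migrant
MAX_MEMBERS = 10        # c2_*1 ... c2_*10


def member_arrival_year(row: dict, i: int) -> Optional[int]:
    """Arrival year in PT of household member i, or None if it cannot be derived."""
    h = row.get(f"c2_h{i}")  # reported arrival year (or 77)
    g = row.get(f"c2_g{i}")  # timing relative to the migrant (1, 2, 3 or 77)
    d = row.get(f"c2_d{i}")  # member's age
    if h == ALWAYS_LIVED_HERE or g == ALWAYS_LIVED_HERE:
        # Always lived in PT: use the birth year as the arrival year.
        return row["bl_year"] - d if d is not None else None
    if g == MOVED_SAME_TIME:
        # Moved with the migrant: use the migrant's own arrival year (f6).
        return row.get("f6")
    # Moved before (2) or after (3): use the reported year, which may be missing.
    return h


def compute_years_sincemig_hh(row: dict) -> Optional[int]:
    """Longest stay in PT among household members, measured up to bl_year."""
    years = [member_arrival_year(row, i) for i in range(1, MAX_MEMBERS + 1)]
    years = [y for y in years if y is not None]
    if not years:
        return None  # no usable household member information
    return row["bl_year"] - min(years)


# Example from the post: one member arrived with the migrant in 2014, one in 2004.
example = {"bl_year": 2020, "f6": 2014,
           "c2_g1": 1, "c2_h1": None, "c2_d1": 30,
           "c2_g2": 2, "c2_h2": 2004, "c2_d2": 40}
print(compute_years_sincemig_hh(example))  # 2020 - 2004 = 16
```

Taking the minimum arrival year first and subtracting it from bl_year afterwards gives either version of the control (the year or the duration) from the same intermediate value.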
Thank you so much