Reshape wide with many variables

Andressa Freire

Join Date: Aug 2022

Posts: 33
#1

Reshape wide with many variables

03 Nov 2022, 08:31

Hi guys
I kindly ask for your help , I'm trying to reshape this database. I want to organize the follow-up dates for food consumption and nutritional status consultations (variables "dataacomp_ca" and "dataacomp_en). Notice that I have 12 different dates for "dataacomp_ca" and 2 different dates for "dataacomp_en". all this for the same “id”
I would like to transform the bank to wide to calculate the difference between dates and identify the smallest gap between them.
In addition, I want to point out that the information on food consumption and nutritional status may be different for each "dataacomp_ca" and "dataacomp_en" because it is a follow-up. In addition, in this database I find several possibilities of number of follow-up dates for dataacom_ca and dataacomp_en. Here I put an example of the same person (id) who has 12 food consumption follow-ups and 2 nutritional status follow-ups. But the settings may be different for other people

I would like this configuration:
Id dataacomp_ca1 dataacomp_ca2 dataacomp_ca3 dataacomp_ca4 dataacomp_ca5 dataacomp_ca6 dataacomp_ca7 dataacomp_ca8 dataacomp_ca9 dataacomp_ca10 dataacomp_ca11 dataacomp_ca12 dataacomp_en1 dataacomp_en2

thank you very much for the help

----------------------- copy starting from the next line -----------------------

Code:

* Example generated by -dataex-. To install: ssc install dataex clear input long id float(dataacomp_ca dataacomp_en) 76555256 21769 21651 76555256 21864 21651 76555256 21889 21651 76555256 21830 21651 76555256 21683 21651 76555256 21861 21651 76555256 21801 21651 76555256 21648 21651 76555256 21738 21651 76555256 21560 21651 76555256 21593 21651 76555256 21626 21651 76555256 21830 21770 76555256 21683 21770 76555256 21560 21770 76555256 21861 21770 76555256 21593 21770 76555256 21738 21770 76555256 21769 21770 76555256 21626 21770 76555256 21801 21770 76555256 21864 21770 76555256 21889 21770 76555256 21648 21770 end format %td dataacomp_ca format %td dataacomp_en

------------------ copy up to and including the previous line ------------------

Listed 24 out of 24 observations
Tags: None
Hemanshu Kumar

Join Date: Mar 2015

Posts: 1548
#2

03 Nov 2022, 08:39

Andressa Freire actually if your purpose is to find the smallest gap, or do other similar operations, you might be better off with the data being in long form, as it is already. Could you more clearly explain your final purpose? (smallest gap between which dates, for example? between _ca and _en? between successive _ca dates? between successive _en dates? something else?)
Comment
Andressa Freire

Join Date: Aug 2022

Posts: 33
#3

03 Nov 2022, 09:05

My purpose is to find the smallest gap between which between _ca and _en

I also think about transforming the database to wide format so that I have only one line for each "id" with all the information on food consumption (ca) and nutritional status (en) for that person. Should I then calculate the differences between dates (ca and en) before transforming the bank, choosing the smallest interval and finally transforming into wide only with the record with the smallest interval between ca and en?
Comment
Hemanshu Kumar

Join Date: Mar 2015

Posts: 1548
#4

03 Nov 2022, 09:20

Okay, here is some code that might work for you. I first calculate the gaps and the minimum gap and mark the observation that has the minimum gap, and only then do the reshape to wide form. Doing those manipulations is easier in long form than in wide form.

I am using the absolute value of the gap, because it was sometimes positive and sometimes negative. You should change that if you want it done differently.

Code:

bysort id (dataacomp_ca): gen n = _n gen int gap = abs(dataacomp_ca - dataacomp_en) bys id: egen min_gap = min(gap) gen tag_min = (gap == min_gap) drop min_gap reshape wide dataacomp_ca dataacomp_en gap tag_min, i(id) j(n)

This keeps all the observations. It should be easy for you to change it to just keep the observations with the minimum gap, if that is all you want.
Comment

Andressa Freire

Join Date: Aug 2022
Posts: 33

03 Nov 2022, 10:29

Thank you so much! The code worked fine for the bank summary example I put here, but not for the bank in its entirety. In fact, I should have set a more faithful example to the bank. I'm really sorry

The following message appears when I run the full database:

. reshape wide dataacomp_ca dataacomp_en gap tag_min, i(id) j(n)
(note: j = 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24)
variable datacad_ca not constant within id
variable cnes_ca not constant within id
variable sist_orig_ca not constant within id
variable local_atend_ca not constant within id
variable refeicao_tv_maior2a_ca not constant within id
variable cafe_maior2a_ca not constant within id
variable lanche_manha_maior2a_ca not constant within id
variable almoco_maior2a_ca not constant within id
variable lanche_tarde_maior2a_ca not constant within id
variable janta_maior2a_ca not constant within id
variable ceia_maior2a_ca not constant within id
variable feijao_maior2a_ca not constant within id
variable fruta_maior2a_ca not constant within id
variable verdura_maior2a_ca not constant within id
variable hamburguer_maior2a_ca not constant within id
variable bebida_adocada_maior2a_ca not constant within id
variable macarrao_instantaneo_maior2a_ca not constant within id
variable biscoito_recheado_maior2a_ca not constant within id

I paste another example below:

----------------------- copy starting from the next line -----------------------

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input long id float(idade_ca idade_en) byte(ind_maior2a_refeicao_tv_ca ind_maior2a_fruta_ca) float(alt_en peso_en) double imc_en float(z_pesidade_en z_imcidade_en dataacomp_ca dataacomp_en)
75993982 6.031485  6.20397 1 1 118 22  15.8   .458257  .3174728 21585 21648
75993982 6.031485 6.666667 1 1 111 23 18.67 .40336975 1.6537515 21585 21817
75993982  6.06434 6.666667 1 1 111 23 18.67 .40336975 1.6537515 21597 21817
75993982  6.06434  6.20397 1 1 118 22  15.8   .458257  .3174728 21597 21648
75993982 6.067077  6.20397 1 1 118 22  15.8   .458257  .3174728 21598 21648
75993982 6.067077 6.666667 1 1 111 23 18.67 .40336975 1.6537515 21598 21817
75993982 6.138261 6.666667 1 1 111 23 18.67 .40336975 1.6537515 21624 21817
75993982 6.138261  6.20397 1 1 118 22  15.8   .458257  .3174728 21624 21648
75993982 6.182067  6.20397 1 1 118 22  15.8   .458257  .3174728 21640 21648
75993982 6.182067 6.666667 1 1 111 23 18.67 .40336975 1.6537515 21640 21817
75993982 6.316222  6.20397 1 1 118 22  15.8   .458257  .3174728 21689 21648
75993982 6.316222 6.666667 1 1 111 23 18.67 .40336975 1.6537515 21689 21817
75993982 6.390144 6.666667 1 1 111 23 18.67 .40336975 1.6537515 21716 21817
75993982 6.390144  6.20397 1 1 118 22  15.8   .458257  .3174728 21716 21648
75993982 6.477755 6.666667 1 1 111 23 18.67 .40336975 1.6537515 21748 21817
75993982 6.477755  6.20397 1 1 118 22  15.8   .458257  .3174728 21748 21648
75993982 6.565366 6.666667 1 1 111 23 18.67 .40336975 1.6537515 21780 21817
75993982 6.565366  6.20397 1 1 118 22  15.8   .458257  .3174728 21780 21648
75993982 6.647502 6.666667 1 1 111 23 18.67 .40336975 1.6537515 21810 21817
75993982 6.647502  6.20397 1 1 118 22  15.8   .458257  .3174728 21810 21648
75993982 6.735113  6.20397 1 1 118 22  15.8   .458257  .3174728 21842 21648
75993982 6.735113 6.666667 1 1 111 23 18.67 .40336975 1.6537515 21842 21817
end
format %td dataacomp_ca
format %td dataacomp_en

------------------ copy up to and including the previous line ------------------

Comment

Hemanshu Kumar

Join Date: Mar 2015

Posts: 1548
#6

03 Nov 2022, 12:14

Just put all the other variables (that are "not constant within id") into the i() option of the reshape command, along with id.
1 like
Comment
Andressa Freire

Join Date: Aug 2022

Posts: 33
#7

05 Nov 2022, 03:05

Thank you so much for the explanations!!!
Comment

Announcement