Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Reshape wide with many variables

    Hi guys
    I kindly ask for your help , I'm trying to reshape this database. I want to organize the follow-up dates for food consumption and nutritional status consultations (variables "dataacomp_ca" and "dataacomp_en). Notice that I have 12 different dates for "dataacomp_ca" and 2 different dates for "dataacomp_en". all this for the same “id”
    I would like to transform the bank to wide to calculate the difference between dates and identify the smallest gap between them.
    In addition, I want to point out that the information on food consumption and nutritional status may be different for each "dataacomp_ca" and "dataacomp_en" because it is a follow-up. In addition, in this database I find several possibilities of number of follow-up dates for dataacom_ca and dataacomp_en. Here I put an example of the same person (id) who has 12 food consumption follow-ups and 2 nutritional status follow-ups. But the settings may be different for other people

    I would like this configuration:
    Id dataacomp_ca1 dataacomp_ca2 dataacomp_ca3 dataacomp_ca4 dataacomp_ca5 dataacomp_ca6 dataacomp_ca7 dataacomp_ca8 dataacomp_ca9 dataacomp_ca10 dataacomp_ca11 dataacomp_ca12 dataacomp_en1 dataacomp_en2


    thank you very much for the help

    ----------------------- copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long id float(dataacomp_ca dataacomp_en)
    76555256 21769 21651
    76555256 21864 21651
    76555256 21889 21651
    76555256 21830 21651
    76555256 21683 21651
    76555256 21861 21651
    76555256 21801 21651
    76555256 21648 21651
    76555256 21738 21651
    76555256 21560 21651
    76555256 21593 21651
    76555256 21626 21651
    76555256 21830 21770
    76555256 21683 21770
    76555256 21560 21770
    76555256 21861 21770
    76555256 21593 21770
    76555256 21738 21770
    76555256 21769 21770
    76555256 21626 21770
    76555256 21801 21770
    76555256 21864 21770
    76555256 21889 21770
    76555256 21648 21770
    end
    format %td dataacomp_ca
    format %td dataacomp_en
    ------------------ copy up to and including the previous line ------------------

    Listed 24 out of 24 observations

  • #2
    Andressa Freire actually if your purpose is to find the smallest gap, or do other similar operations, you might be better off with the data being in long form, as it is already. Could you more clearly explain your final purpose? (smallest gap between which dates, for example? between _ca and _en? between successive _ca dates? between successive _en dates? something else?)

    Comment


    • #3
      My purpose is to find the smallest gap between which between _ca and _en

      I also think about transforming the database to wide format so that I have only one line for each "id" with all the information on food consumption (ca) and nutritional status (en) for that person. Should I then calculate the differences between dates (ca and en) before transforming the bank, choosing the smallest interval and finally transforming into wide only with the record with the smallest interval between ca and en?

      Comment


      • #4
        Okay, here is some code that might work for you. I first calculate the gaps and the minimum gap and mark the observation that has the minimum gap, and only then do the reshape to wide form. Doing those manipulations is easier in long form than in wide form.

        I am using the absolute value of the gap, because it was sometimes positive and sometimes negative. You should change that if you want it done differently.

        Code:
        bysort id (dataacomp_ca): gen n = _n
        
        gen int gap = abs(dataacomp_ca - dataacomp_en)
        bys id: egen min_gap = min(gap)
        gen tag_min = (gap == min_gap)
        drop min_gap
        
        reshape wide dataacomp_ca dataacomp_en gap tag_min, i(id) j(n)
        This keeps all the observations. It should be easy for you to change it to just keep the observations with the minimum gap, if that is all you want.

        Comment


        • #5
          Thank you so much! The code worked fine for the bank summary example I put here, but not for the bank in its entirety. In fact, I should have set a more faithful example to the bank. I'm really sorry

          The following message appears when I run the full database:

          . reshape wide dataacomp_ca dataacomp_en gap tag_min, i(id) j(n)
          (note: j = 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24)
          variable datacad_ca not constant within id
          variable cnes_ca not constant within id
          variable sist_orig_ca not constant within id
          variable local_atend_ca not constant within id
          variable refeicao_tv_maior2a_ca not constant within id
          variable cafe_maior2a_ca not constant within id
          variable lanche_manha_maior2a_ca not constant within id
          variable almoco_maior2a_ca not constant within id
          variable lanche_tarde_maior2a_ca not constant within id
          variable janta_maior2a_ca not constant within id
          variable ceia_maior2a_ca not constant within id
          variable feijao_maior2a_ca not constant within id
          variable fruta_maior2a_ca not constant within id
          variable verdura_maior2a_ca not constant within id
          variable hamburguer_maior2a_ca not constant within id
          variable bebida_adocada_maior2a_ca not constant within id
          variable macarrao_instantaneo_maior2a_ca not constant within id
          variable biscoito_recheado_maior2a_ca not constant within id



          I paste another example below:


          ----------------------- copy starting from the next line -----------------------
          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input long id float(idade_ca idade_en) byte(ind_maior2a_refeicao_tv_ca ind_maior2a_fruta_ca) float(alt_en peso_en) double imc_en float(z_pesidade_en z_imcidade_en dataacomp_ca dataacomp_en)
          75993982 6.031485  6.20397 1 1 118 22  15.8   .458257  .3174728 21585 21648
          75993982 6.031485 6.666667 1 1 111 23 18.67 .40336975 1.6537515 21585 21817
          75993982  6.06434 6.666667 1 1 111 23 18.67 .40336975 1.6537515 21597 21817
          75993982  6.06434  6.20397 1 1 118 22  15.8   .458257  .3174728 21597 21648
          75993982 6.067077  6.20397 1 1 118 22  15.8   .458257  .3174728 21598 21648
          75993982 6.067077 6.666667 1 1 111 23 18.67 .40336975 1.6537515 21598 21817
          75993982 6.138261 6.666667 1 1 111 23 18.67 .40336975 1.6537515 21624 21817
          75993982 6.138261  6.20397 1 1 118 22  15.8   .458257  .3174728 21624 21648
          75993982 6.182067  6.20397 1 1 118 22  15.8   .458257  .3174728 21640 21648
          75993982 6.182067 6.666667 1 1 111 23 18.67 .40336975 1.6537515 21640 21817
          75993982 6.316222  6.20397 1 1 118 22  15.8   .458257  .3174728 21689 21648
          75993982 6.316222 6.666667 1 1 111 23 18.67 .40336975 1.6537515 21689 21817
          75993982 6.390144 6.666667 1 1 111 23 18.67 .40336975 1.6537515 21716 21817
          75993982 6.390144  6.20397 1 1 118 22  15.8   .458257  .3174728 21716 21648
          75993982 6.477755 6.666667 1 1 111 23 18.67 .40336975 1.6537515 21748 21817
          75993982 6.477755  6.20397 1 1 118 22  15.8   .458257  .3174728 21748 21648
          75993982 6.565366 6.666667 1 1 111 23 18.67 .40336975 1.6537515 21780 21817
          75993982 6.565366  6.20397 1 1 118 22  15.8   .458257  .3174728 21780 21648
          75993982 6.647502 6.666667 1 1 111 23 18.67 .40336975 1.6537515 21810 21817
          75993982 6.647502  6.20397 1 1 118 22  15.8   .458257  .3174728 21810 21648
          75993982 6.735113  6.20397 1 1 118 22  15.8   .458257  .3174728 21842 21648
          75993982 6.735113 6.666667 1 1 111 23 18.67 .40336975 1.6537515 21842 21817
          end
          format %td dataacomp_ca
          format %td dataacomp_en
          ------------------ copy up to and including the previous line ------------------

          Comment


          • #6
            Just put all the other variables (that are "not constant within id") into the i() option of the reshape command, along with id.

            Comment


            • #7
              Thank you so much for the explanations!!!

              Comment

              Working...
              X