  • Computing bilateral flow using function

    Dear all,

    This is not a statistical question (I apologize) but one for a function.

    I have data for origin and destination (data provided below); I would like to generate a variable based on a function, e.g.

    Code:
    gen y= a*var_o + b*var_d
    I want y to be generated for each origin paired with every destination except the origin itself; e.g. for AFG, I would like to generate y for all the destinations except AFG. Also, while var_d should take the value for each destination, var_o should take only the value for the origin (in this case AFG).

    I realize that I have to use foreach but I am having a difficult time with the condition. I hope I have provided enough information, any help will be highly appreciated. Thanks!

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str3(origin dest) float(var_o var_d a b)
    "AFG" "AFG" 14.3 14.3 .4 1.2
    "AGO" "AGO" 14.7 14.7 .4 1.2
    "ALB" "ALB" 15.4 15.4 .4 1.2
    "AND" "AND" 16.4 16.4 .4 1.2
    "ARE" "ARE" 23.9 23.9 .4 1.2
    "ARG" "ARG" 24.4 24.4 .4 1.2
    "BDI" "AUS"   25   25 .4 1.2
    "BEN" "AUT" 25.5 25.5 .4 1.2
    "BFA" "BDI" 15.9 15.9 .4 1.2
    "BGD" "BEN" 16.5 16.5 .4 1.2
    "BGR" "BFA" 17.2 17.2 .4 1.2
    end
    Sincerely,

    Chiara

  • #2
    I'm completely confused by your explanation.
    Code:
    gen y= a*var_o + b*var_d
    What is it you want that the above command (taken directly from your post) doesn't do for you?


    • #3
      Dear Clyde Schechter,

      Thank you for your reply. This command would generate one value for each observation, but I would like twelve observations (the number of destinations) for each origin.

      Apologies if my original post was not clear. Thanks!

      Sincerely,

      Chiara


      • #4
        OK. But this only makes sense if your data meets certain assumptions:

        a and b must have the same values in every observation of the data set. This is true in your example, but it must be true throughout. (And, if this is true, you would be better off not wasting the memory for variables on constants, which would be better stored as local macros or scalars--but I digress.)

        The value of var_o must be the same for all observations having a given origin. Similarly the value of var_d must be the same for all observations having a given dest. Otherwise it wouldn't be clear how to pair up values of var_o (resp. var_d) with values of origin (resp. dest).

        The following code first verifies those assumptions and then does what I understand you to want:

        Code:
        //    VERIFY NECESSARY ASSUMPTIONS
        assert a == a[1]
        assert b == b[1]
        by origin (var_o), sort: assert var_o[1] == var_o[_N]
        by dest (var_d), sort: assert var_d[1] == var_d[_N]
        
        preserve
        keep origin var_o
        tempfile origins
        save `origins'
        
        restore
        drop origin var_o
        cross using `origins'
        drop if origin == dest
        
        gen y = a*var_o + b*var_d
        By the way, that's not 12 observations per origin. You only have 11 origins and 11 destinations in your example to start with, and in #1 you said you didn't want to pair an origin with itself. If the lists of origins and destinations were the same, that would make 11*10 = 110 observations in all. But the lists are somewhat different, so some origins are matched with all 11 destinations because that origin is never a destination.
        Last edited by Clyde Schechter; 23 Jul 2018, 16:55.
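
A minimal sketch of the scalar suggestion above, assuming the example data from #1 and that a and b really are constant across observations. The scalar names a_const and b_const are made up for illustration. The closing checks reflect the count discussed above: 11 x 11 = 121 pairs, less the 9 codes that appear as both an origin and a destination, which leaves 112.

```stata
clear
input str3(origin dest) float(var_o var_d a b)
"AFG" "AFG" 14.3 14.3 .4 1.2
"AGO" "AGO" 14.7 14.7 .4 1.2
"ALB" "ALB" 15.4 15.4 .4 1.2
"AND" "AND" 16.4 16.4 .4 1.2
"ARE" "ARE" 23.9 23.9 .4 1.2
"ARG" "ARG" 24.4 24.4 .4 1.2
"BDI" "AUS"   25   25 .4 1.2
"BEN" "AUT" 25.5 25.5 .4 1.2
"BFA" "BDI" 15.9 15.9 .4 1.2
"BGD" "BEN" 16.5 16.5 .4 1.2
"BGR" "BFA" 17.2 17.2 .4 1.2
end

* Confirm a and b are constants, then keep them as scalars (hypothetical names)
assert a == a[1]
assert b == b[1]
scalar a_const = a[1]
scalar b_const = b[1]
drop a b

* Set the origins aside, then pair every destination with every origin
preserve
keep origin var_o
tempfile origins
save `origins'
restore

drop origin var_o
cross using `origins'
drop if origin == dest

gen y = scalar(a_const)*var_o + scalar(b_const)*var_d

* 121 pairs minus the 9 self-pairs
assert _N == 112

* BGD and BGR are never destinations, so they keep all 11 partners
bysort origin: gen n_dest = _N
assert n_dest == 11 if inlist(origin, "BGD", "BGR")
assert n_dest == 10 if !inlist(origin, "BGD", "BGR")
```

Wrapping the names in scalar() is optional here, but it guards against a scalar name being confused with a variable of the same name or abbreviation.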


        • #5
          Dear Clyde Schechter,

          Thank you very much for the solution. As my data meets the conditions - this is great!

          I agree that a and b should be stored differently, thanks for the suggestions.

          If I could ask a follow-up question: this data is for one year. Could this code be modified for several years, where the value of var_o for a given origin varies by year? Thank you again.

          Sincerely,

          Chiara


          • #6
            So here's some example data where we have three years. The values of var_o and var_d differ over the years (I just added some random noise to your original values).

            Code:
            clear
            input str3(origin dest) float(var_o var_d a b year)
            "AFG" "AFG" 14.103353 14.217686 .4 1.2 2001
            "AFG" "AFG" 14.237127 14.172062 .4 1.2 2002
            "AFG" "AFG"  14.08275 14.199132 .4 1.2 2003
            "AGO" "AGO" 15.036702 14.619832 .4 1.2 2001
            "AGO" "AGO"  14.83705  14.85298 .4 1.2 2002
            "AGO" "AGO" 14.304163 14.733882 .4 1.2 2003
            "ALB" "ALB" 15.641335 15.530564 .4 1.2 2001
            "ALB" "ALB" 15.661524  15.00287 .4 1.2 2002
            "ALB" "ALB" 15.254153 15.509652 .4 1.2 2003
            "AND" "AND"  16.41256  15.97146 .4 1.2 2001
            "AND" "AND" 16.704914 16.294168 .4 1.2 2002
            "AND" "AND"  16.43757 16.380617 .4 1.2 2003
            "ARE" "ARE"  23.62865 23.839415 .4 1.2 2001
            "ARE" "ARE"  23.72575  24.15434 .4 1.2 2002
            "ARE" "ARE" 23.472576  23.53876 .4 1.2 2003
            "ARG" "ARG"  24.74086  24.46498 .4 1.2 2001
            "ARG" "ARG" 24.487635  24.20134 .4 1.2 2002
            "ARG" "ARG" 24.492886  24.21794 .4 1.2 2003
            "BDI" "AUS"   24.7882  24.96501 .4 1.2 2001
            "BDI" "AUS" 25.015873  25.03942 .4 1.2 2002
            "BDI" "AUS"  24.91072  24.88463 .4 1.2 2003
            "BEN" "AUT"  25.43505  24.90739 .4 1.2 2001
            "BEN" "AUT"  25.88994 25.933987 .4 1.2 2002
            "BEN" "AUT"  25.54256  25.45415 .4 1.2 2003
            "BFA" "BDI" 16.365185 15.733873 .4 1.2 2001
            "BFA" "BDI" 15.975468 15.403328 .4 1.2 2002
            "BFA" "BDI" 16.056576 16.738459 .4 1.2 2003
            "BGD" "BEN"  16.47887  15.97923 .4 1.2 2001
            "BGD" "BEN" 16.640467  16.64486 .4 1.2 2002
            "BGD" "BEN" 16.840734  16.94876 .4 1.2 2003
            "BGR" "BFA" 17.046305 17.732685 .4 1.2 2001
            "BGR" "BFA" 17.569124 17.659914 .4 1.2 2002
            "BGR" "BFA"   16.8588 17.442312 .4 1.2 2003
            end
            
            //    VERIFY NECESSARY ASSUMPTIONS
            assert a == a[1]
            assert b == b[1]
            by origin year (var_o), sort: assert var_o[1] == var_o[_N]
            by dest year (var_d), sort: assert var_d[1] == var_d[_N]
            
            preserve
            keep origin year var_o
            tempfile origins
            save `origins'
            
            restore
            drop origin var_o
            joinby year using `origins'
            drop if origin == dest
            
            gen y = a*var_o + b*var_d
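            Changes to the original code (shown in italics in the original post): year is added to both by-prefixes in the assertions and to the keep command, and cross using is replaced by joinby year using.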
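
To see why joinby year replaces cross, here is a small sketch on toy data. The country codes, values, and scalar names are all made up for illustration, not taken from the thread. With 3 origins and 3 destinations in each of 2 years, joinby year forms 3 x 3 = 9 pairs within each year, and dropping the 2 self-pairs leaves 7 per year. A plain cross would instead pair all 6 rows with all 6 rows (36 combinations), mixing values from different years.

```stata
* Toy data, made up for illustration
clear
input str3(origin dest) float(var_o var_d year)
"AAA" "AAA" 1 10 2001
"BBB" "BBB" 2 20 2001
"CCC" "DDD" 3 30 2001
"AAA" "AAA" 4 40 2002
"BBB" "BBB" 5 50 2002
"CCC" "DDD" 6 60 2002
end
scalar a_const = 0.4    // hypothetical constants
scalar b_const = 1.2

* Set the origin-year values aside
preserve
keep origin year var_o
tempfile origins
save `origins'
restore

* Pair destinations with origins within the same year only
drop origin var_o
joinby year using `origins'
drop if origin == dest

gen y = scalar(a_const)*var_o + scalar(b_const)*var_d

* Each origin-dest-year combination should appear exactly once
isid origin dest year
bysort year: assert _N == 7
assert _N == 14
```

The isid check is a cheap way to catch accidental duplicates in the origins file, which would silently inflate the number of pairs.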


            • #7
              Dear Clyde Schechter,

              Thank you again - this works great!

              Sincerely,

              Chiara
