  • r2 in a dynamic dyadic regression

    I have a network dataset in which individuals can unilaterally send messages, either alone or as part of a group, to other individuals or to groups of individuals. Each observation corresponds to a message sent from one individual to another. Each message is labelled with one of five content categories, and I also have an index of the "saliency" of each message. Unfortunately, the groups of individuals are endogenous. I observe a number of demographic characteristics for each individual, as well as the evolution of the network over time.

    My goal is to estimate the probability that an individual responds to a message as a function of past actions and the evolving structure of the network.

    I understand there are several problems with a dataset of this sort. In particular, the network is endogenous, and there may be many processes running in parallel with the observed evolution of the communication structure. Even with exogenous regressors, the standard errors would be inconsistent (though, if I remember correctly, this could be fixed following Fafchamps and Gubert 2007).

    In an ideal world, I would run a logit regression of a binary variable (call it "response", equal to 1 if a message is sent from an individual to another individual who contacted them before) on a set of dyadic-level measures (such as whether the two individuals live in the same area), a battery of lags capturing whether and when the receiver sent a message earlier, a dynamically adjusting measure of network centrality, and a time polynomial to capture possible time trends.
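
    To make the definitions concrete, here is a minimal Python sketch of how the "response" indicator and one of the lag indicators could be built from a message log. This is only one possible reading of the description above: the tuple layout `(time, sender, receiver)`, the function names, and the `window` argument are all hypothetical and are not taken from my actual data or code.

```python
from typing import Dict, Hashable, List, Set, Tuple

Message = Tuple[float, Hashable, Hashable]  # (time, sender, receiver); hypothetical layout


def flag_responses(messages: List[Message]) -> List[int]:
    """Return 1 for each message whose receiver had earlier written to its sender.

    Flags are returned in chronological order (messages are sorted by time first).
    """
    seen: Set[Tuple[Hashable, Hashable]] = set()  # directed pairs observed so far
    flags: List[int] = []
    for _time, sender, receiver in sorted(messages):
        flags.append(1 if (receiver, sender) in seen else 0)
        seen.add((sender, receiver))
    return flags


def flag_recent_contact(messages: List[Message], window: float) -> List[int]:
    """Return 1 if the receiver wrote to the sender within `window` time units before.

    Only the most recent earlier message in the reverse direction is checked,
    which is enough to decide whether any such message falls inside the window.
    """
    last_sent: Dict[Tuple[Hashable, Hashable], float] = {}
    flags: List[int] = []
    for time, sender, receiver in sorted(messages):
        prev = last_sent.get((receiver, sender))
        flags.append(1 if prev is not None and 0 < time - prev <= window else 0)
        last_sent[(sender, receiver)] = time
    return flags


# Toy log: a writes to b, b replies, a writes again much later, c writes to a.
toy_log: List[Message] = [(1, "a", "b"), (3, "b", "a"), (10, "a", "b"), (50, "c", "a")]
print(flag_responses(toy_log))
print(flag_recent_contact(toy_log, window=5))
```

    Calling `flag_recent_contact` with several window lengths would give a battery of lag indicators of the kind described above; how the windows should be chosen is a separate modelling question.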

    The model's pseudo R^2 jumps from about 0.16 to 0.50 once I introduce the time polynomial. The significance and magnitude of the other regressors remain stable after the polynomial terms are added. I am wondering: is this because the more time passes, the more likely a message is to be received?
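
    To put the jump in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the reported Pseudo R2 is McFadden's measure, 1 - lnL/lnL0 (which, as far as I know, is what Stata's logit reports), and that both specifications are fit on the same estimation sample, so that they share the same constant-only log likelihood. The function names are mine, purely for illustration, and the 0.16 is only the approximate figure quoted above, so the implied numbers are approximate too.

```python
def mcfadden_pseudo_r2(ll_model: float, ll_null: float) -> float:
    """McFadden's pseudo R2: 1 - lnL(model) / lnL(constant-only model)."""
    return 1.0 - ll_model / ll_null


def implied_null_ll(ll_model: float, pseudo_r2: float) -> float:
    """Back out the constant-only log likelihood from a fitted model's lnL and R2."""
    return ll_model / (1.0 - pseudo_r2)


def implied_model_ll(ll_null: float, pseudo_r2: float) -> float:
    """Back out a model's log likelihood from the constant-only lnL and its R2."""
    return ll_null * (1.0 - pseudo_r2)


# Values taken from the output in this post (specification with the time polynomial).
LL_WITH_TREND = -1213.6146
R2_WITH_TREND = 0.5037
# Approximate value for the specification without the polynomial (quoted as "about 0.16").
R2_WITHOUT_TREND = 0.16

# ASSUMPTION: same estimation sample, hence same constant-only log likelihood.
ll_null = implied_null_ll(LL_WITH_TREND, R2_WITH_TREND)
ll_without_trend = implied_model_ll(ll_null, R2_WITHOUT_TREND)
ll_gain = LL_WITH_TREND - ll_without_trend

print(f"implied constant-only lnL:  {ll_null:.2f}")
print(f"implied lnL without trend:  {ll_without_trend:.2f}")
print(f"log-likelihood gain:        {ll_gain:.2f}")
```

    Under those assumptions, the six trend terms alone would account for a gain on the order of 840 log-likelihood points, which is why I am asking what the polynomial is actually picking up.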

    More generally, do I have any hope of extracting anything informative from a regression of this sort?



    Logistic regression                                    Number of obs = 152,160
                                                           Wald chi2(16) =       .
                                                           Prob > chi2   =       .
    Log pseudolikelihood = -1213.6146                      Pseudo R2     =  0.5037
    
    --------------------------------------------------------------------------------------
                         |               Robust
            response_bin | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
    ---------------------+----------------------------------------------------------------
                age_diff |  -.0420463   .0069516    -6.05   0.000    -.0556712   -.0284215
             neigh_match |  -.0746072   .1194708    -0.62   0.532    -.3087657    .1595513
              ethn_match |   1.008889   .1245479     8.10   0.000     .7647799    1.252999
               sex_match |  -.1762324   .1198478    -1.47   0.141    -.4111298    .0586649
     eigen_cent_diff_std |  -6.781157   1.462708    -4.64   0.000    -9.648013   -3.914302
      eigen_cent_sum_std |    6.54639   1.155631     5.66   0.000     4.281394    8.811385
        message_sent_1De |  -2.260556   1.521529    -1.49   0.137    -5.242698    .7215858
        message_sent_5De |   1.283659   .6192781     2.07   0.038     .0698965    2.497422
       message_sent_30De |   .8581493   .2525828     3.40   0.001      .363096    1.353203
                   trend |   .0029432   .0017652     1.67   0.095    -.0005164    .0064029
                  trend2 |  -3.21e-06   1.78e-06    -1.81   0.071    -6.70e-06    2.76e-07
                  trend3 |   1.54e-09   8.00e-10     1.93   0.054    -2.54e-11    3.11e-09
                  trend4 |  -3.67e-13   1.76e-13    -2.08   0.038    -7.12e-13   -2.12e-14
                  trend5 |   4.23e-17   1.87e-17     2.27   0.023     5.73e-18    7.89e-17
                  trend6 |  -1.88e-21   7.58e-22    -2.48   0.013    -3.37e-21   -3.97e-22
                   _cons |  -5.214207   .6676472    -7.81   0.000    -6.522771   -3.905642
    --------------------------------------------------------------------------------------
    Note: 143336 failures and 0 successes completely determined.

    Thanks for your time!
    Last edited by Paola Bertolini; 03 Aug 2023, 13:16.