I have some questions about what the output of margins, dydx() at(x=) means, and about how the test of second differences (i.e., contrasts or comparisons) between the resulting effects does or does not provide evidence for my hypotheses.
Here is an example of a hypothesis in my paper:
Family History of Substance Use will moderate the association between Family Relationship Quality and Past Month Alcohol Use, such that higher Family Relationship Quality is associated with higher Past Month Alcohol Use.
So, essentially, I am hypothesizing that there will be a significant moderation (i.e., interaction) effect. The hypothesis is presented by this model:
Family Relationship Quality ---> Past Month Alcohol Use
                             ^
                             |
           Family History of Substance Use
So, I hypothesize that as the left side (IV) increases, the right side (DV) also increases. I am confused as to whether the increase I interpret here, or the "higher" levels mentioned in the hypothesis above, refers to the higher levels of the ordinal variables (e.g., 2, 3, or 4 on a 0 to 4 scale for family history) or to the difference between variable levels (e.g., 1 versus 0, both low scores).
You mention in your paper that you will know which variable to manipulate. You also suggest that both sides of the interaction are interpreted, so both variables (i.e., IV and moderator) will be manipulated (i.e., assigned representative values); correct? When I calculate and plot the average marginal effects, I am analyzing one side of the interaction. I think the side that best represents my hypothesized interaction is:
mimrgns, dydx(FamRel) at(FamHx=(0(1)4))
Is this correct? The above code produces output with dydx values, which I understand to be the AMEs of Family Relationship Quality on the probability of Past Month Alcohol Use (i.e., first differences) at different values of Family History. These dydx values have p-values associated with them. If they are significant, is it the interaction of Family Relationship Quality and Family History of Substance Use that is significant, or the main effect of Family Relationship Quality? If it's the former, am I interested in all values of family history? If it's the latter, does significance matter at all for answering my hypotheses about the interaction?
I would have to compare non-significant main effects (AMEs) to test for an interaction too, correct? That comparison would be the test of second differences. (I'm not sure of its utility if the above code already provides information on the significance of the interaction.) Should this comparison be between: the lowest (i.e., none, 0) and highest (4) levels of family history; a low (1) and the highest (4) level; or relatively lower (e.g., 2) and higher (e.g., 3) levels of family history? If it's the last option, are the adjacent contrasts (e.g., 1 vs. 0, 2 vs. 1, 3 vs. 2, 4 vs. 3) or other contrasts (e.g., 3 vs. 0, 3 vs. 1, 4 vs. 2...) more relevant to the hypotheses?
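To make the comparison I am asking about concrete, here is a minimal sketch of how I understand the first and second differences could be requested. It is only a sketch under assumptions: AlcUse is a hypothetical name for my binary outcome, FamHx is entered as continuous, and the commands use official margins on a single, non-imputed dataset. Whether mimrgns accepts the same contrast() and pwcompare() options after mi estimate is something I have not confirmed.

```stata
* Sketch only: AlcUse is a hypothetical variable name for Past Month Alcohol Use.
* Assumes a logit model with a FamRel-by-FamHx interaction, both entered as continuous.
logit AlcUse c.FamRel##c.FamHx

* First differences: AME of FamRel on Pr(AlcUse) at each level of FamHx
margins, dydx(FamRel) at(FamHx=(0(1)4))

* Second differences: each AME compared with the AME at FamHx = 0 (reference contrasts)
margins, dydx(FamRel) at(FamHx=(0(1)4)) contrast(atcontrast(r._at) effects)

* Second differences: all pairwise comparisons among the five AMEs
margins, dydx(FamRel) at(FamHx=(0(1)4)) pwcompare(effects)
```

The reference contrasts correspond to comparisons such as 4 vs. 0, while the pairwise output also contains the adjacent comparisons (1 vs. 0, 2 vs. 1, and so on), so my question is which of these rows speaks to the hypothesis.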
It produces this graph:
[Graph: "AME of FamRel with 95% CIs"; y-axis: Effect on Pr(Alcohol); x-axis: FamHx, with ticks at 0, 1, 2, 3, 4]
In order to support my hypothesis, on the plot, do I need to see y values above 0 (i.e., increase in the probability of using alcohol), or a positive slope (i.e., difference in difference of probability of using alcohol) between two or all points? I hypothesize that higher Family Relationship Quality will be associated with higher Past Month Alcohol Use. Does that mean: a positive effect (i.e., positive y values); smaller negative effect of Family Relationship Quality (i.e., less negative, positive slope) between levels of Family History of Substance Use (x values); or a larger positive effect (more positive, positive slope) of Family Relationship Quality?
Thanks in advance for answers to these questions!
Chuck Huber (StataCorp) daniel klein