
  • multilevel model, which levels?

    Hi all,

    I've got a dataset measuring campaign activities (N = 1,850). Candidates are members of parties (party; n = 8) and nested in electoral districts (district; n = 29), and thus also in parties within districts, i.e. party lists (list; n = 8 × 29 = 232; not all lists are in the sample, so n = 195). I have hypotheses on the effects of variables at all of these levels. For a single-level regression, my model is as follows (fake dataex and regression):
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(pers gen ido chance2 nominc) byte DM float exps10 long x float(exps102 poliexp nominwhocat partydummies districtid)
     3 1 2 0 3  9  3 55   9 . 1  4 21
     4 0 2 1 3 10  3 49   9 0 0  6 20
     1 0 1 2 3 29 10 45 100 1 0  7 29
     0 0 0 1 0 22 10 45 100 1 0  7 23
     4 1 1 0 3 10 10 45 100 0 1  2  1
     1 1 2 1 2  9 10 45 100 3 0  6 21
     6 0 1 0 2 12 10 45 100 0 .  7 12
    end
    (The group identifiers are partydummies (party affiliation) and districtid (electoral district).)


    Code:
    reg pers x DM exps10 exps102 i.chance2 i.nominc i.nominwhocat ido gen i.poliexp i.partydummies

    Dependent variable pers: use of personalized campaign activities (individual level). The variables x (list level), exps10 (list level), and DM (district level) are context-level; the rest are individual-level variables.


    However, my concern is that, due to the nested structure, the errors might not be i.i.d., and thus inference from the standard errors would be limited.

    What would be the appropriate levels for a multilevel model?
    For the null model, in Stata notation, I was guessing the following (since a list equals a party-in-a-district): (1)
    Code:
    mixed pers || district: || party:, reml   // pers is measured at the candidate level
    However, when I am also interested in the effects of party affiliation itself, how would I incorporate this into a multilevel model?

    Perhaps like this:
    Code:
    mixed pers i.party || district: || party:, reml
    The ICC of the null model (1) is rather low: 0.07, to be precise.
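    (For reference, the ICC at a given level is the share of total variance attributable to clusters at that level, e.g. var(district) / (var(district) + var(party-in-district) + var(residual)). After fitting the null model, -estat icc- reports these quantities directly; a minimal sketch:)
    Code:
    * null model (1), then the intraclass correlations at each level
    mixed pers || district: || party:, reml
    estat icc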

    Also, full multilevel modelling would be quite disproportionate to the scope of the analysis.

    Can I get any solid inferences from just the pooled OLS model? Perhaps with cluster-robust standard errors? On which level should I cluster?

    Kind regards
    Last edited by Markus Freitag; 01 Jul 2018, 06:10.

  • #2
    The mixed effects model you show has party nested within district. You're the content expert, but this strikes me as implausible. This means that each district has its own separate parties, and that parties in one district, even if they carry the same numerical code in your data, have nothing to do with the parties in any other district. That's a kind of political organization that I have never heard of before. I think it is more likely that parties are crossed with district (i.e. each party is active in every district), or perhaps a multiple membership model (each party runs in several districts, but perhaps not all of them.)
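    (For concreteness: crossed random effects are specified in -mixed- with the _all: R.varname device. A minimal sketch of the crossed specification, using the identifiers from your dataex, would be:)
    Code:
    * district and party crossed: each party may appear in any district
    mixed pers || _all: R.districtid || partydummies:, reml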

    That said, how many different parties are in your data? In your example data, there are four distinct values of partydummies, and those values range from 2 to 7. If altogether there are only a small number of parties, I do not think you will gain much by treating party as a random effect. In your example data, with 4 distinct parties, you have an N of 4 for estimating the variance component at the party level. That's really not an adequate sample for most purposes. Even an N of 7 is pretty weak.
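    (A quick way to check the number of distinct parties in the full data set:)
    Code:
    * frequencies of each party code
    tab partydummies
    * or, after -ssc install distinct-, simply: distinct partydummies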

    My inclination (unless the number of parties is in the dozens) would be to keep party as a fixed effect only, and try to account for the nesting of observations within district with a random intercept at the district level. So something like

    Code:
    mixed pers i.party || districtid:
    (My omission of the -reml- option should not be construed as disagreement with your choice to use it: I'm not taking a position on that, just de-emphasizing the peripheral issues.)

    And apart from everything said above, including both i.party and || party: in the same model does not make sense.



    • #3
      Thanks a lot, Clyde!

      You are absolutely right in your remarks!
      My reading over the past few hours also leads me to the conclusion that the data structure is two-way cross-classified with interaction.

      (1)
      Code:
      mixed pers i.chance2 i.nominc i.nominwhocat ido gen i.poliexp DM x exps10 exps102 || _all: R.district || party: || districtXparty:,  reml
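      (Here districtXparty is meant to be the cell identifier for the district-by-party interaction, i.e. the party lists. If it does not yet exist, one way to create it from the identifiers in the dataex would be:)
      Code:
      * one group per district-by-party combination (= party list)
      egen districtXparty = group(districtid partydummies)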
      The number of parties is quite low indeed (8). Also, checking the ICCs of the null model (intercepts only) shows substantial correlation only at the party level and almost none at the district and party-list levels.

      Hence, I think the model should be much simpler and, as you suggested, keep party as a fixed effect only.

      Perhaps even the pooled OLS model, which does not seem to lead to essentially different results than (1), is sufficient:

      Code:
      reg pers i.chance2 i.nominc i.nominwhocat ido gen i.poliexp DM x exps10 exps102 i.party

      How should I justify this model selection argumentatively? Report the null mixed-effects model and the ICCs, and then...?

      As I would like to treat any remaining cluster effects as a nuisance, and to account for heteroscedasticity (due to the skewed dependent variable) at the same time, perhaps cluster-robust standard errors would be good to report as a robustness check?

      However, only the party-in-a-district level (the party-list level) would have enough groups if one uses the 40-cluster heuristic as a rule of thumb. Should I cluster at this level?

      Kind regards

      Markus Freitag



      • #4
        I agree with all your reasoning here. If the ICC is only 0.008, there is little to be gained from using random intercepts, and I would stick with OLS and fixed effects as you suggest. Using cluster-robust VCE at the party-in-a-district level is also sensible here.
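        (Concretely, and assuming the districtXparty identifier constructed above marks the party-in-a-district cells, that would be something like:)
        Code:
        reg pers i.chance2 i.nominc i.nominwhocat ido gen i.poliexp DM x exps10 exps102 i.party, vce(cluster districtXparty)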



        • #5
          Running the simple model now, I realized I might have made some specification mistakes. The dependent variable is quite skewed (it is interval-scaled, ranging from 1 to 10), and thus heteroscedasticity is present. A simple log transformation of the dependent variable does not alleviate the heteroscedasticity. The residuals, however, look quite normal.

          Is OLS with cluster-robust standard errors still viable as a comparison model? I would like to avoid more difficult models, such as ordered logit, as that is somewhat beyond my statistical knowledge.

          Code:
           reg pers i.chance2 i.nominc i.nominwhocat ido gen i.poliexp DM x exps10 exps102 i.party
          Code:
           hist pers
          [Figure: histogram of pers (hist.png)]




          Code:
           rvfplot
          [Figure: residual-versus-fitted plot from -rvfplot- (rvf.png)]



          Code:
           predict r, residuals   // -residuals- option needed: the default after -reg- is the fitted values
          qnorm r
          [Figure: normal quantile plot of the residuals (resid.png)]



          Kind regards
          Last edited by Markus Freitag; 02 Jul 2018, 07:10.



          • #6
            I would say there is no problem with what you have at this point. The cluster-robust standard errors are, in fact, also robust to heteroscedasticity, so this is not an issue if you use -vce(cluster district_party_identifier)-.

            The distribution of the dependent variable is, itself, irrelevant. It is the distribution of the residuals that matters, if any distribution matters at all. In your case, you have a decent approximation to normality for those. Moreover, with N = 195, the central limit theorem would rescue the situation even if your residual distribution were fairly far from normal. The residual distribution really only matters in small samples.

            Some general remarks about log transformation, which you have wisely abandoned: when the variable contains zero or negative values, a log transformation is not feasible: you will convert all of those non-positive values into missing values, which would, in this case, decimate your sample. Also, on a variable with a limited range like 0-10, log transforming is unlikely to make a noticeable impact on any distributional issues. That really only happens with variables that range over several orders of magnitude.
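            (To see the point about non-positive values, a tiny toy example with a hypothetical variable y:)
            Code:
            * ln() returns missing for zero and negative arguments
            clear
            set obs 3
            gen y = _n - 2        // y takes the values -1, 0, 1
            gen lny = ln(y)       // missing for y = -1 and y = 0
            list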
