
  • Cross-sectional data, World Bank Enterprise Survey, logistic regression with fixed effects

    Hello, I need your help. I want to estimate a simple logit on cross-sectional data from the World Bank Enterprise Survey. My sample consists of firms (i = 1 to 12540) from several countries (k = 1 to 27). Each firm is observed only once, so there are no repeated observations over time. For example, the survey was carried out in Ghana in 2013, Kenya in 2010, the Central African Republic in 2011, Ethiopia in 2011, Cameroon in 2016, Chad in 2018, Gambia in 2018, etc. My dependent variable is binary. I also want to control for industry, country and year fixed effects and to cluster the standard errors by country. I would like to know which command to use, since I can't use xtlogit.

    My equation is:
    \text{Credit access}_{i,k} = \beta_0 + \beta_1 \times \text{Corruption}_{i,k} + \beta_2 \times \text{demographic charact.}_{i,k} + \beta_3 \times \text{firm charact.}_{i,k} + \text{FE}(\text{country}, \text{industry}, \text{year}) + \varepsilon

    Your suggestions are very much appreciated.

    Thanks











  • #2
    "I can't use xtlogit."
    I do not see why you cannot use xtlogit. As your observations are at the firm level, you can condition out the country or industry fixed effects (whichever has more levels) and add year dummies and dummies for either country or industry (with 12540 observations, there will be enough observations within each year and either each country or industry).

    Code:
    xtset country
    xtlogit credit_access ... i.industry i.year, fe
    The equivalent of the above using clogit is

    Code:
    clogit credit_access ... i.industry i.year, group(country)
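
    To see the equivalence on data that anyone can load, here is a minimal sketch using Stata's shipped example dataset union.dta (an assumption for illustration only; it is not the survey data from #1). The group identifier is declared without a time variable, and both commands should report the same conditional fixed-effects logit coefficients.

    Code:
    * illustrative dataset, not the World Bank survey
    webuse union, clear
    * declare only the group identifier, no time variable
    xtset idcode
    xtlogit union age grade not_smsa south, fe
    estimates store xt
    * same conditional likelihood, written with clogit
    clogit union age grade not_smsa south, group(idcode)
    estimates store cl
    * side-by-side comparison of the two sets of coefficients
    estimates table xt cl, b(%9.5f) se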



    • #3
      Thank you, Andrew Musau.



      • #4
        I can't use xtlogit because the observations are not repeated over time, so I think this is not a panel.



        • #5
          You do not have panel data, but you can use xtlogit or clogit because firms are nested in countries/industries. Did you try what was suggested in #2, and did it fail? In the xtset command line, you just declare a panel identifier without a time variable. This should work fine.



          • #6
            When I run the command, here is what I get:

            note: multiple positive outcomes within groups encountered.
            note: 2018.a14ya omitted because of no within-group variance.



            Iteration 0: log likelihood = -3284.7132 (not concave)
            Iteration 1: log likelihood = -3277.6572 (not concave)
            Iteration 2: log likelihood = -3276.9745 (not concave)
            Iteration 3: log likelihood = -3276.8441 (not concave)
            Iteration 4: log likelihood = -3276.8424 (not concave)
            Iteration 5: log likelihood = -3276.8417 (not concave)
            Iteration 6: log likelihood = -3276.8417 (not concave)
            Iteration 7: log likelihood = -3276.8417 (not concave)
            Iteration 8: log likelihood = -3276.8417 (not concave)
            Iteration 9: log likelihood = -3276.8417 (not concave)
            Iteration 10: log likelihood = -3276.8417 (not concave)
            Iteration 11: log likelihood = -3276.8417 (not concave)
            Iteration 12: log likelihood = -3276.8417 (not concave)
            Iteration 13: log likelihood = -3276.8417 (not concave)
            Iteration 14: log likelihood = -3276.8417 (not concave)
            Iteration 15: log likelihood = -3276.8417 (not concave)
            Iteration 16: log likelihood = -3276.8417 (not concave)
            Iteration 17: log likelihood = -3276.8417 (not concave)
            Iteration 18: log likelihood = -3276.8417 (not concave)
            Iteration 19: log likelihood = -3276.8417 (not concave)
            Iteration 20: log likelihood = -3276.8417 (not concave)
            Iteration 21: log likelihood = -3276.8417 (not concave)
            Iteration 22: log likelihood = -3276.8417 (not concave)
            Iteration 23: log likelihood = -3276.8417 (not concave)
            Iteration 24: log likelihood = -3276.8417 (not concave)
            Iteration 25: log likelihood = -3276.8417 (not concave)
            Iteration 26: log likelihood = -3276.8417 (not concave)
            Iteration 27: log likelihood = -3276.8417 (not concave)



            • #7
              Convergence problems are common in maximum likelihood estimations. #2 of the following thread provides some pointers on what you can do:
              https://www.statalist.org/forums/for...elogit-command
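
              As a hedged sketch of two things one might try here: the note in #6 that a 2018 year indicator has no within-group variance suggests that each country was surveyed in a single year, in which case year dummies are collinear with the country groups. The code below counts distinct survey years per country and then refits with the standard maximize options difficult and iterate(). The data are simulated, and the names country, year, x and y are hypothetical stand-ins for the poster's variables.

              Code:
              clear
              set seed 12345
              set obs 600
              * 6 hypothetical countries with 100 firms each
              gen country = ceil(_n/100)
              * one survey year per country, as described in #1
              gen year = 2010 + mod(country, 3)
              gen x = rnormal()
              gen y = runiform() < invlogit(0.5*x)
              * count the distinct survey years within each country
              egen byte first = tag(country year)
              egen nyears = total(first), by(country)
              tabulate nyears
              * if nyears is 1 everywhere, i.year cannot be identified alongside group(country)
              clogit y x, group(country) difficult iterate(50)

              If the log likelihood still stops changing while remaining "not concave", the problem is more likely in the specification (collinear or perfectly predicting regressors) than in the optimizer.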



              • #8
                Thank you.



                • #9
                  I would like to know how to obtain results like the following:

                  Country FE        YES    YES    YES    YES
                  Industry FE       YES    YES    YES    YES
                  Year FE           YES    YES    YES    YES
                  Cluster country   YES    YES    YES    YES
                  Method            LOGIT  LOGIT  LOGIT  LOGIT



                  • #10
                    The following will do it:

                    Code:
                    clogit credit_access ... i.industry i.year, group(country) cluster(country)
                    where you have conditional country fixed effects and unconditional industry and year fixed effects.
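
                    In current Stata syntax, clustering is usually requested with vce(cluster ...), which asks for the same cluster-robust standard errors as the cluster() option above. A minimal sketch on simulated data follows; the variable x and the data-generating process are hypothetical and only serve to make the example runnable.

                    Code:
                    clear
                    set seed 2021
                    set obs 1200
                    * 12 hypothetical countries with 100 firms each
                    gen country = ceil(_n/100)
                    * 4 hypothetical industries
                    gen industry = runiformint(1, 4)
                    gen x = rnormal()
                    gen credit_access = runiform() < invlogit(0.4*x)
                    * conditional country FE, industry dummies, SEs clustered by country
                    clogit credit_access x i.industry, group(country) vce(cluster country)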



                    • #11
                      Hello, I tried the code and got a result. My question may seem silly, but I will ask it anyway: how do I make "Yes" appear in the table?



                      • #12
                        Code:
                        ssc install estout, replace
                        Example:

                        Code:
                        webuse grunfeld, clear
                        gen industry=cond(inlist(company, 1,2,3), 1, cond(inlist(company, 4,5,6), 2, 3))
                        set seed 04252021
                        gen country=runiformint(1,5)
                        bys company (year): replace country=country[1]
                        gen outcome=runiformint(0,1)
                        *country indicators (i.country) will be dropped. Just needed for the output to indicate country FE
                        clogit outcome mvalue kstock i.industry i.year i.country, group(country)
                        esttab, indicate("Country FE=*.country" "Year FE=*.year" "Industry FE=*.industry")

                        Res.:

                        Code:
                         
                        . esttab, indicate("Country FE=*.country" "Year FE=*.year" "Industry FE=*.industry")
                        
                        ----------------------------
                                              (1)   
                                          outcome   
                        ----------------------------
                        outcome                     
                        mvalue          -0.000346   
                                          (-0.85)   
                        
                        kstock           0.000116   
                                           (0.14)   
                        
                        Country FE            Yes   
                        
                        Year FE               Yes   
                        
                        Industry FE           Yes   
                        ----------------------------
                        N                     200   
                        ----------------------------
                        t statistics in parentheses
                        * p<0.05, ** p<0.01, *** p<0.001
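
                        An alternative sketch, assuming the same estout package: instead of adding i.country only so that indicate() can find it, the "Yes" rows can be attached to the estimation results by hand with estadd local and printed through stats(). The macro names country_fe, industry_fe and year_fe are hypothetical labels chosen for this example.

                        Code:
                        webuse grunfeld, clear
                        gen industry=cond(inlist(company, 1,2,3), 1, cond(inlist(company, 4,5,6), 2, 3))
                        set seed 04252021
                        gen country=runiformint(1,5)
                        bys company (year): replace country=country[1]
                        gen outcome=runiformint(0,1)
                        clogit outcome mvalue kstock i.industry i.year, group(country)
                        * attach the "Yes" markers to the active estimation results
                        estadd local country_fe "Yes"
                        estadd local industry_fe "Yes"
                        estadd local year_fe "Yes"
                        * show only the substantive coefficients plus the marker rows
                        esttab, keep(mvalue kstock) stats(country_fe industry_fe year_fe N, labels("Country FE" "Industry FE" "Year FE" "N"))

                        This keeps the model specification free of terms that are only there for the table, at the cost of having to set the markers manually for each model.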



                        • #13
                          Hello Andrew Musau. I have a very similar problem.

                          I have the same database (the WBES). I want to run a logit regression because my dependent variable is binary and measured at the enterprise level. This variable is "fin11" and indicates whether a company has collateral or not. My main explanatory variable is a country-level variable, "n_outcome", which is a binary variable that takes the value 1 if a country has a certain institution and 0 otherwise. I added firm-level controls (such as firm age or size) and other country-level controls (such as GDP normalized by the US value). I would like to add FE at the country level, so my regression would look like this:

                          \text{fin11}_{i,k} = \beta_0 + X_{i,k} + Z_k + \text{FE}(\text{country level}) + e_{i,k}

                          where X is a vector of enterprise-level variables and Z is a vector of country-level variables.

                          This is an extract from my database:


                          * Example generated by -dataex-. To install: ssc install dataex
                          clear
                          input str26 country_01 double firm_id float(fin11 age k11_sales GDP_GDPUSA) long n_outcome
                          "Argentina2017" 622481 1 20 .002667291 .4556047 0
                          "Argentina2017" 622318 1 20 .08750443 .4556047 0
                          "Argentina2017" 623036 . 8 . .4556047 0
                          "Argentina2017" 623248 . 20 . .4556047 0
                          "Argentina2017" 622448 1 24 .05040256 .4556047 0
                          "Argentina2017" 622727 . 17 . .4556047 0
                          "Armenia2020" 708831 . 10 . .2943979 0
                          "Armenia2020" 708699 . 21 . .2943979 0
                          "Armenia2020" 709055 . 5 . .2943979 0
                          "Armenia2020" 708729 . 21 . .2943979 0
                          "Armenia2020" 708889 1 21 . .2943979 0
                          "Armenia2020" 708856 . 16 . .2943979 0
                          "Armenia2020" 708909 . 3 . .2943979 0
                          "Armenia2020" 708827 . 6 . .2943979 0
                          "Armenia2020" 709153 . 6 . .2943979 0
                          "Armenia2020" 708958 . 2 . .2943979 0
                          "Armenia2020" 708676 . 21 . .2943979 0
                          end

                          However, when I use the code you posted in #2, I get the following message:

                          1,617 (group size) take 1,369 (# positives) combinations results in numeric overflow; computations cannot proceed.

                          Furthermore, I do not know whether it makes sense to use country-level FE when I am also using country-level control variables.

                          Any advice?

                          I hope you can help me.

                          Regards,
                          Ibai



                          Last edited by Ibai Ostolozaga Falcon; 24 Sep 2021, 04:51.



                          • #14
                            If you have 30+ observations per country, just use logit and include country dummies (i.country). The numerical overflow problem is a limitation of clogit.
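
                            A minimal sketch of that suggestion on simulated data; the data-generating process is hypothetical, and only the variable names fin11, age and n_outcome are borrowed from #13. In the real data, the string variable country_01 would first need a numeric version, for example via encode country_01, generate(country_id). Because n_outcome is constant within country, it is collinear with the full set of country dummies, so Stata should drop a term and the effect of n_outcome is not separately identified once country dummies are included.

                            Code:
                            clear
                            set seed 2021
                            set obs 3000
                            * 30 hypothetical countries with 100 firms each
                            gen country_id = ceil(_n/100)
                            gen age = runiformint(1, 40)
                            * country-level dummy: constant within each country
                            gen n_outcome = mod(country_id, 2)
                            gen fin11 = runiform() < invlogit(-0.5 + 0.02*age + 0.3*n_outcome)
                            * unconditional logit with country dummies; watch the collinearity note
                            logit fin11 age n_outcome i.country_id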
