I would like some help with coding and generating faultline variable in Stata

Dion Sinack

Join Date: May 2019

Posts: 6
#1

17 May 2019, 11:35

Hello everyone,

I'm a student and I'm currently having some issues with coding and generating the variables for my research. My research is in the area of Strategic Management and is about the moderating effect of CEO-TMT characteristics on the relationship between TMT faultlines and R&D spending.

I have collected the following data on TMT members:

- executive age
- executive gender
- executive tenure
- executive tenure on the top management team
- type of degree of executive

This is the first time I am using Stata to generate variables, so I don't know much about it.

I want to see whether any faultlines exist within the TMT, based on 4 characteristics: age, gender, tenure and education.

I would like to generate a faultlines variable using the following formula:

n = number of group members (In my dataset this is "TMT size")

p = number of characteristics (The characteristics that I have in my dataset are: age, gender, tenure and education)

S = total number of possible splits into two subgroups = 2^(n-1) - 1, where n is the number of TMT members

x_ijk = the value of the jth characteristic of the ith member of subgroup k.

xbar_j = the overall group mean of characteristic j.

xbar_jk = the mean of characteristic j in subgroup k.

g = all possible splits (= S?)

n_k^g = the number of members of the kth subgroup (k = 1, 2) under split g.

Fau = the maximum value of Fau_g over all possible splits g = 1, 2, …, S.
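
The formula itself does not appear in the post. Assuming it is the Thatcher et al. (2003) faultline strength measure cited later in this thread, which the definitions above match, Fau_g for a given split g would be the between-subgroup sum of squares divided by the total sum of squares:

```latex
\mathrm{Fau}_g =
\frac{\sum_{j=1}^{p}\sum_{k=1}^{2} n_k^g \left(\bar{x}_{jk} - \bar{x}_{j}\right)^2}
     {\sum_{j=1}^{p}\sum_{k=1}^{2}\sum_{i=1}^{n_k^g} \left(x_{ijk} - \bar{x}_{j}\right)^2},
\qquad g = 1, 2, \ldots, S,
\qquad
\mathrm{Fau} = \max_{g}\, \mathrm{Fau}_g
```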

This is a preview of my data using dataex:

input int year byte(age gender tenure typeofdegree)
2006 68 1 . 2
2006 59 1 31 2
2006 58 1 36 2
2006 35 1 12 .
2007 69 1 . 2
2007 60 1 . 2
2007 59 1 . 2
2007 36 1 . .
2008 70 1 . 2
2008 61 1 . 2
2008 60 1 . 2
2008 37 1 . .
2009 71 1 . 2
2009 62 1 . 2
2009 61 1 . 2
2009 38 1 . .
2006 50 1 . 4
2006 51 1 . .
2006 51 1 . 3
2006 . 1 36 4
2006 54 1 . 2
2006 53 1 . 4
2006 51 1 26 2
2006 42 0 16 5
2007 51 1 . 4
2007 51 1 . 3
2007 . 1 37 4
2007 44 1 20 4
2007 53 1 . 4
2007 49 1 . 5
2007 43 0 17 5
2007 53 1 29 4
2008 52 1 . 4
2008 52 1 . 3
2008 . 1 38 4
2008 45 1 21 4
2008 54 1 . 4
2008 49 1 . 5
2008 44 0 18 5
2008 54 1 30 4
2009 53 1 . 4
2009 53 1 . 3
2009 . 1 39 4
2009 49 1 24 5
2009 46 1 22 4
2009 55 1 . 4
2009 50 1 . 5
2009 45 0 19 5
2006 48 1 . .
2006 59 1 . .
2006 . 1 . 5
2006 55 1 18 .
2007 49 1 . .
2007 60 1 . .
2007 . 1 . 5
2007 56 1 19 .
2008 50 1 . .
2008 61 1 . .
2008 . 1 . 4
2008 57 1 20 .
2009 51 1 . .
2009 62 1 . .
2009 . 1 . 4
2009 58 1 21 .
2006 68 1 . 3
2006 51 1 . 3
2006 61 1 37 4
2006 43 0 6 4
2006 59 1 . 4
2007 69 1 . 3
2007 44 1 . 3
2007 62 1 . 4
2007 44 0 . 4
2007 60 1 . 4
2007 38 1 9 4
2008 70 1 . 3
2008 45 1 . 3
2008 63 1 . 4
2008 45 0 . 4
2008 61 1 . 4
2008 45 1 . 4
2008 39 1 . 4
2009 71 1 . 3
2009 46 1 . 3
2009 40 1 . 4
2009 46 0 . 4
2009 62 1 . 4
2009 46 1 . 4
2006 57 1 . 4
2006 49 1 . 4
2006 50 1 . 4
2006 41 0 20 4
2006 51 1 30 2
2007 54 1 . 2
2007 51 1 . 4
2007 42 0 21 4
2007 52 1 31 2
2008 55 1 . 2
2008 52 1 . 4
2008 43 0 22 4
end

I would really appreciate any helpful tips on how to deal with this. Thank you in advance.

Dion Sinack
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#2

28 May 2019, 16:03

Welcome to Statalist. You'll increase your chances of a useful answer by following the FAQ on asking questions: provide Stata code in code delimiters, readable Stata output, and sample data using dataex. Look at postings on the Blau index; they have a similar structure.

You can do the means using bysort x: egen meanz=mean(z). Create the squares and then you can do totals with egen and rowtotal.
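
A minimal sketch of that suggestion, on toy data rather than the poster's. The variable team is hypothetical (the posted extract has no firm or team identifier, so one would have to be added), and the result total_ss is only the total sum of squares, i.e. what would go in the denominator of a between/total ratio.

```stata
clear
* Toy data: -team- is a hypothetical team identifier, values are illustrative
input byte team float(age tenure)
1 68 31
1 59 36
1 58 12
2 50 20
2 42 16
end

* Team mean of each characteristic
bysort team: egen mean_age    = mean(age)
bysort team: egen mean_tenure = mean(tenure)

* Squared deviation of each member from the team mean
generate sq_age    = (age    - mean_age)^2
generate sq_tenure = (tenure - mean_tenure)^2

* Add the squares across characteristics, then across members of a team
egen sq_row = rowtotal(sq_age sq_tenure)
bysort team: egen total_ss = total(sq_row)

list team age tenure total_ss, sepby(team)
```

Note that rowtotal() treats missing values as zero, so with as many missing values as in the posted data you would probably want to decide first how to handle incomplete members.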
Ayub UOM

Join Date: Feb 2018

Posts: 83
#3

17 Jan 2020, 10:33

Hello Statalist members, I am interested in constructing faultline strength (FLS) and faultline distance (FLD) following well-known studies in faultline research: Thatcher et al. (2003), Bezrukova et al. (2009), and Hutzschenreuter and Horstkotte (2013). I have previously calculated a board diversity index using the Blau index, and now I want to calculate these faultline measures. I tried to do it in Stata but I don't know how. Some professors recommended SAS or R for calculating it.
I would be happy if someone could guide me on how to calculate it in Stata, because I have no command of R or SAS.
For your information, this concept is the opposite of board diversity.
Here we first divide a top management team into two subgroups, group 1 and group 2 (this division may be based on gender, i.e. male vs. female, or on other attributes). Then we look at the similarity of the members on the other attributes (i.e. age, tenure and education). So there is homogeneity within each group, i.e. similar attributes in group 1 (male) and similar attributes in group 2 (female), and heterogeneity between the two groups.
If there are 4 members (n = 4), then there are S possible splits, indexed by g. S is calculated as 2^(n-1) - 1, so for n = 4 we get 2^(4-1) - 1 = 8 - 1 = 7 ways. k indexes the two subgroups (say, male and female), and j indexes the attributes (age, gender, tenure and education). Most of the attributes used are dichotomous, but continuous attributes are rescaled: tenure (number of years) is scaled by its range (maximum - minimum) among the TMT members of the focal company, and members' share ownership is likewise scaled by its range (maximum number of shares - minimum number of shares).
Then we calculate faultline strength for the other attributes, e.g. based on age (meaning we now divide the group on the basis of age instead of gender), and repeat the process. Finally we add up all the faultline strengths, i.e. the aggregate group-wise value, and take the average. This value ranges between 0 and 1; the higher the value, the more homogeneous the within-subgroup structure. So the faultline strength (FLS) measure captures how cleanly a group splits into subgroups. Furthermore, FLS is computed as the portion of the total variance in TMT members' characteristics explained by the group split found by a clustering algorithm (between-subgroup variance over total group variance).
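
The split count and the variance ratio described above can be sketched in Mata. This is only an illustration under stated assumptions: fau_max() is a hypothetical helper, not an existing command; it searches exhaustively over all 2^(n-1) - 1 two-subgroup splits and takes the maximum (as in post #1) instead of using a clustering algorithm; and it expects a matrix X with one row per member and one column per attribute, already rescaled and with no missing values.

```stata
mata:
// Hypothetical helper: largest between/total sum-of-squares ratio
// over all splits of the rows of X into two non-empty subgroups.
real scalar fau_max(real matrix X)
{
    real scalar n, S, g, i, best, ssbet, sstot
    real colvector in2
    real rowvector m, m1, m2
    real matrix X1, X2

    n = rows(X)
    if (n < 2) return(.)
    m = mean(X)                          // overall mean of each attribute
    sstot = sum((X :- m):^2)             // total sum of squares
    if (sstot == 0) return(.)            // no variation: ratio undefined
    S = 2^(n-1) - 1                      // number of distinct splits
    best = 0
    for (g = 1; g <= S; g++) {
        // bits of g assign members 1..n-1; member n stays in subgroup 1
        in2 = J(n, 1, 0)
        for (i = 1; i <= n-1; i++) {
            in2[i] = mod(floor(g / 2^(i-1)), 2)
        }
        X1 = select(X, in2 :== 0)
        X2 = select(X, in2 :== 1)
        m1 = mean(X1)
        m2 = mean(X2)
        ssbet = rows(X1)*sum((m1 - m):^2) + rows(X2)*sum((m2 - m):^2)
        best = max((best, ssbet/sstot))
    }
    return(best)
}
end

* Four members, two attributes; members 1-2 and 3-4 are identical pairs
mata: fau_max((0,0 \ 0,0 \ 1,1 \ 1,1))
```

In this toy example the split {1,2} versus {3,4} explains all of the variance, so the function returns 1. Because the number of splits doubles with each extra member, exhaustive search is only practical for small teams.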
So we want to calculate two things. One is faultline strength (similarity within subgroups). The other is faultline distance, defined as the average Euclidean distance between subgroup centroids (the vectors of subgroup means of the TMT characteristics), i.e. a measure of the extent of difference between the two subgroups. The formula is FLD = sqrt( sum over j = 1 to p of (xbar_1j - xbar_2j)^2 ), where xbar_1j is the mean of the jth variable for subgroup 1 and xbar_2j is the mean of the same characteristic for subgroup 2. The square root is taken of the aggregated squared attribute differences between the two subgroups. FLD can range from 0 to 3 (the square root of 9).
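
As a rough illustration of the distance formula, here is a small Mata sketch. The function name fld() is hypothetical, and it assumes each subgroup is passed as a matrix with one row per member and one column per (already rescaled) attribute, with no missing values.

```stata
mata:
// Hypothetical helper: Euclidean distance between the centroids
// (vectors of attribute means) of two subgroups.
real scalar fld(real matrix X1, real matrix X2)
{
    return(sqrt(sum((mean(X1) - mean(X2)):^2)))
}
end

* Subgroup 1 has centroid (0,0), subgroup 2 has centroid (3,4)
mata: fld((0,0 \ 0,0), (3,4 \ 3,4))
```

The example returns 5, the distance between the centroids (0,0) and (3,4).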
Finally, we multiply faultline strength (FLS) by the reciprocal of faultline distance (FLD), as FLS*(1-FLD), with range 0 to 1. A maximum value shows more similarity in attributes within subgroups. Reference: Van Peteghem et al. (2018).
If you need further information, I will provide it.
Could you please guide me on how to do this? I hope it will ease my work and save time, and that many people will benefit from it.
Thank you in advance.
Looking forward to your positive response.
Ayub

Last edited by Ayub UOM; 17 Jan 2020, 11:27.