Part V: Creating the EQ-i 2.0 and EQ 360 2.0

South African Norms

EQ-i 2.0 South African Standardization

Standardization

NORMATIVE SAMPLE

Normative data for the South African Professional Norm sample (N = 1,200) were collected from August 2011 to October 2012. Data came from two sources: a data collection initiative intended specifically to create the South African norms (56.7%) and the South African EQ-i 2.0 customer database (43.3%). The demographic composition of the normative sample is shown in Tables F.1–F.6. Demographic data available from customers included age, gender, and in most cases ethnicity. In addition, the South African norms included demographic information on geographic region, education level, and employment status.

The normative data were collected across four age ranges, evenly proportioned by gender within each age interval (see Table F.1), from a variety of geographic regions covering all nine provinces (see Table F.2). Regarding ethnicity (see Table F.3), most respondents indicated that they were Black (43.8%) or White (36.6%), with smaller proportions of Colored (9.7%) and Asian/Indian (9.9%) respondents. The majority of the sample (73.8%) had an education level higher than grade 12 (see Table F.4), and most respondents were employed full-time (81.8%); there were no unemployed respondents in the normative sample (see Table F.5). The sample breakdown by occupation area is shown in Table F.6, with the largest proportions working in the areas of Business/Commerce/Management (24.4%), Human/Social Science (17.6%), Education/Training/Development (13.1%), and Manufacturing/Engineering/Technology (10.6%).

NORMING PROCEDURES

The first step in the preparation of the South African norms was to determine if any age or gender trends existed in the data. Large differences in scores between men and women, or across various age groups, would suggest a need to create an option for separate gender- or age-based norm groups. Conversely, a lack of such differences may dictate the use of a single norm group with genders and age groups combined. A series of analyses of variance (ANOVA; for the Total EI score) and multivariate analyses of variance (MANOVA; for the composites and subscales) were used to examine the relationships of gender and age with EQ-i 2.0 scores. To better control for Type I errors that might occur with multiple analyses, a more conservative criterion of p < .01 was used as the threshold for statistical significance in all F-tests.

The Wilks’ lambda statistic generated from these analyses ranges from 0.00 to 1.00 and conveys the proportion of variance that is not explained by the effect (gender, age, or the interaction between gender and age) in the multivariate analyses. Most of these values were close to 1.00, suggesting that only a small amount of variance could be explained by the effects of these variables. Subscale lambda values for gender and age were .860 and .859, respectively, indicating slightly larger effects. F-tests revealed significant effects of gender and age (see Table F.7). Given these results, the univariate effects are described in detail below.
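As a rough illustration of how the statistic is read, the sketch below computes Wilks' lambda for two dependent variables as det(E) / det(H + E), where H and E are the hypothesis and error sums-of-squares-and-cross-products matrices. The matrices are hypothetical and the helper names are illustrative; only the two reported subscale values (.860 and .859) come from the text.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def wilks_lambda(hypothesis_sscp, error_sscp):
    """Wilks' lambda = det(E) / det(H + E) for two dependent variables."""
    total = [
        [hypothesis_sscp[i][j] + error_sscp[i][j] for j in range(2)]
        for i in range(2)
    ]
    return det2(error_sscp) / det2(total)


# Hypothetical SSCP matrices (not taken from the manual).
H = [[12.0, 4.0], [4.0, 9.0]]
E = [[180.0, 60.0], [60.0, 150.0]]

lam = wilks_lambda(H, E)
print(f"Wilks' lambda = {lam:.3f} (share of variance not explained by the effect)")

# Reading the reported subscale values the same way.
for effect, reported in [("gender", 0.860), ("age", 0.859)]:
    print(f"{effect}: lambda = {reported:.3f}, 1 - lambda = {1 - reported:.3f}")
```

A lambda of 1.00 means the effect explains nothing, so values in the high .80s leave a modest share of variance attributable to the effect.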

Gender Effects. Results of the gender analyses showed that men and women did not differ significantly on the Total EI score, indicating that overall emotional intelligence as measured by the EQ-i 2.0 is about the same for men and women. However, small effects were seen on a number of scales (see Table F.8 for effect sizes and Table F.9 for descriptive statistics and significance test results). The largest gender difference seen in the South African sample was on Empathy, with women scoring higher than men (d = -0.38). Smaller differences were found with women scoring higher than men on Emotional Self-Awareness (d = -0.22) and Emotional Expression (d = -0.25). Men scored higher than women with small effect sizes on the Decision Making (d = 0.21) and Stress Management (d = 0.24) composite scales, and on the Self-Regard (d = 0.25), Problem Solving (d = 0.30), Stress Tolerance (d = 0.32) and Optimism (d = 0.22) subscales.
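The sign convention of the reported d values can be illustrated with one common definition of Cohen's d, the mean difference divided by the pooled standard deviation. This is a minimal sketch: the manual does not state which variant of d was used, the scores are hypothetical, and the men-minus-women ordering is inferred from the signs in the text (negative when women score higher).

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for group_a minus group_b, using the pooled sample SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical standard scores on one subscale (not from the manual).
men = [98, 102, 95, 101, 99, 104]
women = [103, 106, 100, 104, 99, 108]

d = cohens_d(men, women)
print(f"d = {d:.2f}")  # negative: women score higher in this made-up sample
```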

Age Effects. Significant effects were found across age groups for the Total EI score, as well as the Decision Making composite scale and the Interpersonal Relationships, Empathy, Reality Testing, and Stress Tolerance subscales, all with small effect sizes. See Table F.8 for effect sizes and Table F.10 for descriptive statistics and significance test results. For Total EI, the Decision Making composite, and the Empathy, Reality Testing, and Stress Tolerance subscales, scores were lowest in the 18–29-year-old group. For the Interpersonal Relationships subscale, by contrast, the oldest group had the lowest scores.

Gender × Age Interaction. None of the scales showed a significant interaction effect between age and gender, or reached the minimum partial η² criterion for a small effect size. Overall, age effects were largely consistent within men and women, and gender effects were largely consistent across age groups.
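For readers unfamiliar with partial eta squared, the sketch below shows the standard definition, SS_effect / (SS_effect + SS_error), along with the equivalent form computed from an F statistic and its degrees of freedom. The numbers are hypothetical, and no effect-size cutoffs are assumed here, since the criterion used in the manual is not restated in this section.

```python
def partial_eta_squared_from_ss(ss_effect, ss_error):
    """Partial eta squared from sums of squares."""
    return ss_effect / (ss_effect + ss_error)


def partial_eta_squared_from_f(f_value, df_effect, df_error):
    """Same quantity recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Hypothetical ANOVA results (not from the manual).
ss_effect, ss_error = 30.0, 970.0
df_effect, df_error = 3, 1192

f_value = (ss_effect / df_effect) / (ss_error / df_error)
eta_ss = partial_eta_squared_from_ss(ss_effect, ss_error)
eta_f = partial_eta_squared_from_f(f_value, df_effect, df_error)
print(f"F = {f_value:.2f}, partial eta squared = {eta_ss:.3f}")
```

Both routes give the same value, which is convenient when only F and the degrees of freedom are reported in a table.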

Ethnicity Effects. A series of analyses of covariance (ANCOVA; for the Total EI score) and multivariate analyses of covariance (MANCOVA; for the composites and subscales) were used to examine the relationships between ethnicity group and EQ-i 2.0 scores. Age was included as a covariate and gender as a factor in order to control for the effects of these demographic variables. To better control for Type I errors that might occur with multiple analyses, a more conservative criterion of p < .01 was used for all F-tests. Results showed a significant effect for the Total EI score, as well as for most composite scales and subscales; however, all effect sizes were small. See Table F.11 for descriptive statistics, significance test results, and overall effect sizes (partial η²).

Although the pattern of scores varied somewhat from scale to scale, for the majority of scales the Asian/Indian group scored the highest, followed by the Black group. The White and Colored groups scored similarly to each other, but both groups tended to score lower than the Black and Asian/Indian groups. Effect sizes were computed for each pairwise comparison between ethnicity groups (Cohen’s d; see Table F.12). For the Total EI score, the largest differences were between the Asian/Indian and both the Colored (d = 0.47) and White groups (d = 0.38), with higher scores for the Asian/Indian group. For the composite scales and subscales, the largest differences were between the Asian/Indian and Colored groups, again with higher scores for the Asian/Indian group: Decision Making composite (d = 0.67), Problem Solving (d = 0.65), Independence (d = 0.65), and Reality Testing (d = 0.53). There were a number of other differences but they were smaller in magnitude.

Differential Item Functioning across Gender, Age and Ethnicity Groups. Differential Item Functioning (DIF) is said to be present when individuals with the same standing on the latent trait of interest obtain different scores on the same item. This analysis assesses whether gender, age, or ethnicity group membership influences the probability of endorsing a particular response on an item.

Item Response Theory (IRT) was used in combination with ANOVA to investigate DIF. The Rasch model was used to examine probabilities in relation to person and item factors. Results are provided in Table F.13. Probability values (p) were examined for each scale in order to identify items reflecting DIF. Because a Bonferroni correction was applied to account for multiple analyses, the level of significance differs across scales; effect sizes are therefore also presented as an indicator of the magnitude of DIF (values of .05 and higher are considered medium effects).
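The following sketch shows why a Bonferroni correction produces a different significance level for each scale. It assumes the correction divides a family-wise alpha by the number of item-level tests within a scale. Both the alpha of .05 and the per-scale grouping are assumptions, as the manual does not spell them out here, and the item labels and p values are hypothetical.

```python
def bonferroni_alpha(family_alpha, n_tests):
    """Per-test significance level after a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return family_alpha / n_tests


def flag_dif(p_values, family_alpha=0.05):
    """Return the items whose p value falls below the corrected threshold."""
    alpha = bonferroni_alpha(family_alpha, len(p_values))
    return [item for item, p in p_values.items() if p < alpha]


# Hypothetical p values for the items of one scale (not from the manual).
scale_p_values = {"item_a": 0.0004, "item_b": 0.03, "item_c": 0.2, "item_d": 0.011}

threshold = bonferroni_alpha(0.05, len(scale_p_values))
flagged = flag_dif(scale_p_values)
print(f"corrected alpha = {threshold:.4f}, flagged items = {flagged}")
```

A scale with more items is tested against a stricter per-item threshold, which is one reason to read the effect sizes alongside the p values.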

For gender, there were three items that showed statistically significant DIF, one each on the Happiness, Interpersonal Relationships and Self-Regard subscales, which suggests that there might be small variation with regard to the way that men and women respond to these items. However, for each of these items the effect size was small, meaning that the practical impact of these differences is negligible.

With regard to DIF across age groups, there were 11 items reflecting statistically significant DIF on five scales, which suggests that there might be slight differences in the way that participants in different age groups are responding to these items. Once again, however, for all of these items the effect sizes were small, meaning that the practical impact of the DIF is negligible.

For ethnicity, only Black and White respondents were sufficiently represented for inclusion in the DIF analysis. There were 17 items reflecting DIF on nine of the scales. Effect sizes were small, with the exception of one Optimism item (partial η² = 0.06, item 90, “I have good thoughts about the future.”) and one Problem Solving item (partial η² = 0.05, item 75, “I feel overwhelmed when I need to make a decision.”) with effect sizes in the medium range.

To further examine the impact of DIF at the scale level, Test Characteristic Curves (TCCs) were constructed for the EQ-i 2.0 scales showing the most sizable DIF for ethnicity. There were five scales with DIF values large enough to be examined, and the TCCs were superimposed to visually investigate the difference in expected scores for the two ethnicity groups (see Figures F.1 to F.5). The impact shown was small on the Reality Testing and Optimism scales, where Black respondents were expected to score at most 1 raw-score point higher at higher levels of the scale. The TCCs on the Interpersonal Relationships scale were almost identical. The differences in expected scores for the Problem Solving and Self-Actualization scales were somewhat more pronounced: White respondents were expected to score at most 2 raw-score points higher at low levels of the scale and 2 raw-score points lower at higher levels on the scale continuum.
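The idea behind comparing TCCs can be sketched with the dichotomous Rasch model, in which the probability of endorsing an item depends only on the difference between the person's trait level and the item's difficulty. This is a simplification for intuition: items with more than two response options call for a polytomous extension, and the item difficulties below are hypothetical rather than estimates from the South African data.

```python
from math import exp


def rasch_probability(theta, difficulty):
    """Dichotomous Rasch model: P(endorse) for trait level theta."""
    return 1.0 / (1.0 + exp(-(theta - difficulty)))


def expected_score(theta, difficulties):
    """Test characteristic curve value: sum of item endorsement probabilities."""
    return sum(rasch_probability(theta, b) for b in difficulties)


# Hypothetical group-specific item difficulties; two items are harder for group 2.
group_1 = [-1.0, -0.5, 0.0, 0.5, 1.0]
group_2 = [-1.0, -0.3, 0.0, 0.8, 1.0]

# Compare the two curves across the trait continuum.
thetas = [x / 10 for x in range(-40, 41)]
gaps = [expected_score(t, group_1) - expected_score(t, group_2) for t in thetas]
largest_gap = max(gaps, key=abs)
print(f"largest expected-score difference = {largest_gap:.2f} raw-score points")
```

Superimposing the two curves, or taking their difference as done here, shows how item-level DIF accumulates (or cancels out) at the scale level.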

Overall, the results from the above analyses suggest that the scores on some of the EQ-i 2.0 scales may be marginally influenced by gender, age and ethnicity group membership. However, from a practical perspective the impact of the DIF appears to be minimal.

Norm Groups and Norm Construction. The age and gender analyses revealed a number of significant effects that were all small in size. Therefore, both specific Age and Gender Professional Norms and Overall Professional Norms (i.e., collapsed across ages and genders) were developed. In contrast to the South African norm sample for the original EQ-i, in which no ethnicity effects were identified, the ethnicity effects that were revealed in this research were somewhat unexpected. The differences between the White and Black groups were mostly negligible to small in size; however, larger differences were found between the Asian/Indian and Colored samples. Both of these groups had relatively small sample sizes (fewer than 100 participants per group); therefore, more ethnicity data will be collected to further examine this trend.

Results revealed that skewness and kurtosis values were not large enough to suggest that a normalizing transformation was necessary (skewness values ranged from -1.29 to -0.42; kurtosis values ranged from -0.25 to 2.07), and an examination of the scale histograms did not reveal any significant departures from a bell-shaped (Gaussian) curve (Figure F.6 shows a histogram for the EQ-i 2.0 South African Total EI score). Actual construction of the norms was conducted in the same manner as the North American Norms, including the use of statistical smoothing (see Standardization, Reliability, and Validity for more information on the construction of the North American General Population Norms).
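A minimal sketch of the two distribution checks follows, using the simple moment-based definitions of skewness and excess kurtosis (a normal curve has a value of 0 for both). Statistical packages often apply small-sample bias adjustments, so their output can differ slightly. The reported kurtosis values are presumably excess kurtosis, and the scores below are hypothetical.

```python
def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (no bias adjustment)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n
    m3 = sum((x - m) ** 3 for x in values) / n
    m4 = sum((x - m) ** 4 for x in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# Hypothetical raw scores with a longer tail toward low values.
scores = [1, 5, 6, 6, 7, 7, 7]

skew, kurt = skewness_and_excess_kurtosis(scores)
print(f"skewness = {skew:.2f}, excess kurtosis = {kurt:.2f}")
```

Negative skewness, as reported for the scales here, indicates a longer tail toward the low end of the score range.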

Comparison of South African Professional Norms to North American Professional Norms. The South African sample was compared against the North American Professional normative sample by computing standard scores for the EQ-i 2.0 scales with the North American Professional norms, and comparing these scores against a mean of 100. Mean differences ranged between 0.22 (Emotional Expression) and 6.20 (Self-Actualization) standard score points, with the South African sample scoring higher than the North American Professional norms on all but the Impulse Control and Flexibility subscales. Due to the large sample size, significant differences were observed between the South African sample and the North American Professional norms on most scales; however, effect sizes were small: Total EI (d = 0.20), Self-Perception (d = 0.35), Self-Regard (d = 0.24), Self-Actualization (d = 0.43), Emotional Self-Awareness (d = 0.36), Assertiveness (d = 0.37), Reality Testing (d = 0.34), Optimism (d = 0.29), and Happiness (d = 0.25). Results are presented in Table F.14.
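The logic of this comparison can be sketched as follows: raw scores are converted to standard scores using a reference norm group, and the sample mean is then compared against 100. The sketch rests on several assumptions. It uses a simple linear conversion rather than the smoothed norms described above, it assumes a standard-score SD of 15 (not stated in this section), and it divides by that SD to obtain d, which may not be the exact formula behind the reported values. All numbers are hypothetical.

```python
from statistics import mean

REFERENCE_MEAN = 100.0
REFERENCE_SD = 15.0  # assumption: conventional standard-score metric


def to_standard_score(raw, norm_mean, norm_sd):
    """Linear conversion of a raw score using a reference group's mean and SD."""
    return REFERENCE_MEAN + REFERENCE_SD * (raw - norm_mean) / norm_sd


def d_against_reference(standard_scores):
    """Effect size of a sample mean relative to the reference mean of 100."""
    return (mean(standard_scores) - REFERENCE_MEAN) / REFERENCE_SD


# Hypothetical raw scores and reference-norm parameters (not from the manual).
raw_scores = [50, 54, 52, 56, 48]
norm_mean, norm_sd = 50.0, 10.0

standard_scores = [to_standard_score(r, norm_mean, norm_sd) for r in raw_scores]
print(f"mean standard score = {mean(standard_scores):.1f}")
print(f"d versus reference = {d_against_reference(standard_scores):.2f}")
```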

Internal Consistency

Internal consistency, a measure of reliability, conveys the degree to which the items in a set are associated with one another. A high level of internal consistency suggests that the set of items is measuring a single, cohesive construct. Internal consistency is typically measured using Cronbach’s alpha (Cronbach, 1951). Cronbach’s alpha ranges from 0.0 to 1.0 and is a function of both the interrelatedness of the items in a test or scale and the length of the test (John & Benet-Martinez, 2000). Higher values reflect higher internal consistency.

Cronbach’s alpha values for the EQ-i 2.0 scales for the South African normative sample are presented in Table F.15. Although there is no universal criterion for a good alpha level, informal cut-offs for evaluating alpha are typically as follows: .90 is “excellent,” .80 is “good,” .70 is “acceptable,” and lower than .70 is “questionable.” Most of the values found in Table F.15 demonstrate good or at least acceptable reliability, and these values are particularly favorable given the small number of items included in most subscales. For the overall sample, the alpha value of the Total EI scale was .96, values for the composite scales ranged from .84 to .88, and values for the subscales ranged from .71 to .85. Similar patterns were seen across the age and gender normative groups, including a Total EI alpha value of .95 or higher for each age by gender group. The high level of internal consistency found in the Total EI score supports the idea that the EQ-i 2.0 items are measuring a single, cohesive construct, namely emotional intelligence. The same can be said of the individual components of emotional intelligence that make up the EQ-i 2.0 (i.e., the composite scales and subscales).
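The dependence of alpha on both item interrelatedness and test length is visible in its standard formula, alpha = k / (k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The sketch below applies that formula to a small set of hypothetical responses; the function name and data are illustrative only.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha; `responses` is a list of respondents' item scores."""
    k = len(responses[0])
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: five respondents, four items on a 1-5 scale.
responses = [
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [1, 2, 2, 1],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")
```

When items rise and fall together across respondents, the variance of the total score is large relative to the summed item variances, and alpha approaches 1.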

Factor Validity

EXPLORATORY FACTOR ANALYSIS

Exploratory factor analysis (EFA) was used to determine whether the subscales established with the North American EQ-i 2.0 normative data empirically emerge from the South African normative dataset. Five EFAs were conducted, analyzing the items within each composite scale separately. In each EFA, a three-factor solution was forced to examine whether the items corresponding to each subscale within the composite loaded together in the South African normative data. As with the North American normative data, principal axis factoring extraction was used, with direct oblimin (i.e., oblique) rotation, as the factors within each composite are expected to correlate with each other. Reverse scoring was applied to relevant items prior to the analysis. Factor loadings were considered significant if they reached at least ± .30, and an item was defined as cross-loading if it was significant on more than one factor and its loadings on those factors were within .10 of each other.

For the Self-Perception Composite EFA, items for the Self-Regard, Self-Actualization, and Emotional Self-Awareness subscales loaded together as expected by the established factor structure (i.e., items loading significantly onto their respective factors, with no cross-loadings). The only exception was one Self-Actualization item with a factor loading of .29, which is close to the cutoff.

For the Self-Expression Composite EFA, all but two items loaded significantly onto their respective factors for the Emotional Expression, Assertiveness, and Independence subscales, with no cross-loadings. The factor loadings for one Assertiveness item and one Emotional Expression item were each .28, which is close to the cutoff.
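The two decision rules described above (a salience cutoff of ± .30 and a cross-loading margin of .10) can be expressed as a short classifier. This is a sketch under stated assumptions: loadings are compared by absolute value, "within .10" is read as a difference of at most .10, and the example loadings are hypothetical.

```python
def classify_item(loadings, salient=0.30, cross_gap=0.10):
    """Classify one item from its loadings on each factor.

    Returns (status, primary_factor), where primary_factor is the index of
    the largest absolute loading.
    """
    magnitudes = [abs(value) for value in loadings]
    primary = max(range(len(loadings)), key=lambda i: magnitudes[i])
    salient_magnitudes = sorted(
        (m for m in magnitudes if m >= salient), reverse=True
    )
    if not salient_magnitudes:
        return "below cutoff", primary
    if (
        len(salient_magnitudes) > 1
        and round(salient_magnitudes[0] - salient_magnitudes[1], 10) <= cross_gap
    ):
        return "cross-loading", primary
    return "clean", primary


# Hypothetical loadings of four items on a three-factor solution.
examples = {
    "item_a": [0.62, 0.08, -0.03],
    "item_b": [0.29, 0.05, 0.10],
    "item_c": [0.41, 0.36, 0.02],
    "item_d": [0.10, -0.60, 0.32],
}
for name, loadings in examples.items():
    print(name, classify_item(loadings))
```

Under these rules an item with a highest loading of .29 is reported as falling just short of the cutoff rather than as loading on the wrong factor.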

For the Interpersonal Composite EFA, items for the Interpersonal Relationships, Empathy, and Social Responsibility subscales loaded onto their respective factors, with the exception of one Interpersonal Relationships item that loaded onto Empathy, and one Social Responsibility item with a factor loading of .29, which is close to the cutoff.

For the Decision Making Composite EFA, Problem Solving, Reality Testing, and Impulse Control items loaded onto their respective factors with no cross-loadings, with the exception of one Impulse Control item with a factor loading of .29, which is close to the cutoff.

For the Stress Management Composite EFA, all Flexibility, Stress Tolerance, and Optimism items loaded onto their respective factors with no cross-loadings.

To summarize, the EFAs generated solutions that strongly correspond to the established EQ-i 2.0 factor structure, with the items for each subscale empirically grouping together onto the expected factors.

CORRELATIONS AMONG EQ-i 2.0 COMPOSITE SCALES AND SUBSCALES

Correlations among the EQ-i 2.0 composite scales and subscales were examined. These correlations were expected to be generally high, given that the scales all measure the same underlying construct of emotional intelligence, but not so high as to indicate redundancy between the scales. Correlations observed in the South African normative sample are presented in Tables F.16 (composite scales) and F.17 (subscales). These results are similar to what was found with the North American normative sample.

The composite scale correlations ranged from r = .40 (Interpersonal/Decision Making) to r = .70 (Stress Management with both Self-Perception and Decision Making), with an average correlation of r = .59. For the subscales, correlations ranged from r = .04 (Independence/Empathy) to r = .75 (Happiness/Self-Regard), with an average correlation of r = .40. As highlighted in Table F.17, subscale correlations within composite scales ranged from r = .19 (Emotional Expression/Independence) to r = .62 (Self-Regard/Self-Actualization). These results support the notion that a single, underlying dimension is being represented in the EQ-i 2.0, yet the values are not overly high and there is enough variation in the correlations to provide clear evidence of the multidimensional nature of the assessment, and support the existence of composite scales and subscales.
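A summary of this kind (lowest, highest, and average correlation across all pairs of scales) can be sketched as below. The sketch assumes the average is a simple arithmetic mean of the pairwise Pearson correlations; the manual does not state whether a transformation such as Fisher's z was applied. Scale names and scores are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def correlation_summary(scales):
    """Return (lowest, highest, mean) correlation over all pairs of scales."""
    values = [
        pearson_r(scales[a], scales[b]) for a, b in combinations(scales, 2)
    ]
    return min(values), max(values), sum(values) / len(values)


# Hypothetical scores for six respondents on three scales.
scales = {
    "scale_1": [95, 102, 110, 88, 105, 99],
    "scale_2": [97, 100, 108, 92, 101, 103],
    "scale_3": [90, 104, 99, 96, 107, 94],
}

lowest, highest, average = correlation_summary(scales)
print(f"range = {lowest:.2f} to {highest:.2f}, average = {average:.2f}")
```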