Part V: Creating the EQ-i 2.0 and EQ 360 2.0

North American Professional Norms

EQ 360 2.0 North American Professional Norms – Standardization

This section describes the psychometric properties of the EQ 360 2.0 North American Professional Norms, including standardization, reliability, and validity information.

All tables and figures representing detailed depictions of the analyses described in this chapter are available in *Appendix B*.

NORMATIVE SAMPLE

Data collection for the EQ 360 2.0 Professional Norm sample took place over two phases. Phase 1 ran from March 2010 to May 2010, as part of the full standardization process for the EQ 360 2.0. The Phase 1 data are a subset of professional ratees (N = 1,200) drawn from the EQ 360 2.0 General Population Norm sample. Phase 2 ran from January 2013 to March 2014 and comprised data from a randomly selected set of employed/self-employed EQ 360 2.0 customers (N = 1,200).

Respondents (“the raters”) were required to rate an individual (“the ratee”) on the EQ 360 2.0 and provide demographic information. All information about the ratee was provided by the rater and covered the ratee’s gender, age group, geographic region, ethnicity, employment status, organizational level, education level, and occupation area. Raters also reported the type of relationship they had with the ratee (i.e., manager, work peer, direct report, or family/friend/other), how long they had known the ratee, and how often they interacted with the ratee. Note that some demographic information was not available for the Phase 2 customer data.

The EQ 360 2.0 Professional Norm sample includes 2,400 individuals from all regions of the United States (N = 2,147; 89.5%) and Canada (N = 253; 10.5%). The majority of raters had known the ratee for at least one year (91.4%; see Table B.12), and most raters indicated that they interacted with the ratee often or very often (80.1%; see Table B.13). The sample includes an equal number of men and women in each of three age ranges (i.e., 18–39, 40–49, and 50+), for each of the four rater types (i.e., manager, direct report, work peer, and family/friend/other). See Table B.14 for the age group by gender distribution of the sample, and Table B.15 for the distribution by rater type.

The rated individuals in this sample were employed in a variety of professional occupations, with the largest proportions working in the areas of Business, Management, and Related occupations (30.4%), Medical and Health-Related occupations (11.6%), and Education, Training, and Library occupations (10.5%); see Table B.16 for a breakdown of employment areas.

Additional information was available for the ratees in the first phase of data collection. Just over half of these ratees (51.3%) held positions at a management level within their organizations (7.4% Senior Executive, 9.1% Senior Manager, 34.8% Manager, 45.9% Non-Managerial Employee/Staff, and 2.9% Other). Most ratees had at least some post-secondary education (12.5% had completed some college or university, 13.2% had a trade certificate or college diploma, 40.4% had a university bachelor’s degree, and 28.4% had a postgraduate or professional degree; 5.5% of raters indicated that they did not know the ratee’s education level). The sample had representation from various races/ethnicities (73.8% White, 10.2% Black, 7.1% Hispanic, 6.5% Asian, and 2.5% Native, Multiracial, or Other).

NORMING PROCEDURES

Similar to the EQ-i 2.0 Professional Norm, the first step in the EQ 360 2.0 Professional norming procedure was to determine if any demographic trends existed in the normative data. Large score differences between rater types (i.e., managers, work peers, direct reports, and family/friend/other) would suggest a need to create an option for separate rater type norm groups, while a lack of such differences would suggest a need to create a single norm option with the rater types combined. Similarly, large score differences between male and female ratees, or across various ratee age groups, would suggest a need to create an option for separate gender- or age-based norm groups. Conversely, a lack of such differences may dictate the use of a single norm group with genders and age groups combined.

A series of analyses of variance (ANOVA, for the Total EI score) and multivariate analyses of variance (MANOVA, for the composites and subscales) were used to examine the relationships between EQ 360 2.0 scores and ratee gender, age, and rater type. To better control for Type I errors that might occur with multiple analyses, a more conservative criterion of p < .01 was used for all F-tests.
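As a rough illustration of the kind of F-test described above, the following sketch computes a one-way ANOVA F statistic and the η² effect size for a single factor. The scores and the `one_way_anova` helper are hypothetical, and the sketch is far simpler than the analyses reported here, which also covered interaction terms and multivariate tests. Converting F to a p-value for comparison against the .01 criterion requires the F distribution and is left out.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, eta_squared) for a one-way between-groups ANOVA.

    `groups` is a list of lists of scores, one inner list per group.
    """
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    group_means = [mean(g) for g in groups]

    # Between-groups sum of squares: group size times the squared
    # deviation of each group mean from the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-groups (error) sum of squares.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    # With a single factor, eta squared and partial eta squared coincide.
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, eta_squared


# Hypothetical Total EI standard scores for three ratee age groups.
age_18_39 = [98, 102, 105, 95, 100]
age_40_49 = [101, 99, 104, 97, 103]
age_50_plus = [100, 106, 96, 102, 99]

f_stat, eta_sq = one_way_anova([age_18_39, age_40_49, age_50_plus])
print(f"F = {f_stat:.3f}, eta squared = {eta_sq:.3f}")
```

Reporting an effect size alongside the F-test matters in a sample of this size, because even trivial mean differences can reach statistical significance.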

Gender Effects. Overall (and similar to results from the North American General Population EQ 360 2.0 data), gender effects were less pronounced in the EQ 360 2.0 Professional normative sample than they were in the EQ-i 2.0 Professional normative sample. Although there were some significant differences, the results showed that for most scales, there was no meaningful effect of gender; only Emotional Expression (d = -0.35) and Empathy (d = -0.25) reached a small effect size (women were rated slightly higher than men). In sum, gender effects were relatively small and represent only a few absolute standard score points. See Table B.17 for effect sizes and Table B.18 for descriptive statistics and significance test results.
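The d values quoted above are standardized mean differences. The following sketch shows one common way to compute Cohen's d (the pooled standard deviation version) and to label its magnitude using Cohen's conventional benchmarks of 0.20, 0.50, and 0.80. The exact formula used for the tabled values is not stated in this chapter, so the pooled-SD choice, the function names, and the scores are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled SD.

    A negative value means group_b scored higher than group_a.
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def label_effect(d):
    """Conventional labels for the absolute magnitude of d."""
    size = abs(d)
    if size < 0.20:
        return "negligible"
    if size < 0.50:
        return "small"
    if size < 0.80:
        return "medium"
    return "large"


# Hypothetical standard scores on one subscale for male and female ratees.
men = [90, 100, 110]
women = [93, 103, 113]

d = cohens_d(men, women)
print(f"d = {d:.2f} ({label_effect(d)})")
```

With men entered first, a negative d indicates that women were rated higher, which matches the sign convention of the values reported for Emotional Expression and Empathy.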

Age Effects. Results showed that across all scales, there was no meaningful effect of age. Although a significant effect was observed across age groups for the Social Responsibility subscale, the minimum criterion for even a small effect was not reached. See Table B.17 for effect sizes and Table B.19 for descriptive statistics and significance test results.

Rater Type Effects. Although significant differences between rater types were observed on several scales, all effect sizes were small, and many did not reach the minimum criterion for a small effect size. The typical pattern was that the family/friend/other group provided slightly higher ratings than the other rater types. See Table B.17 for effect sizes and Table B.20 for descriptive statistics and significance test results.

Interaction Effects. No meaningful interaction effects were observed between Gender and Age, Gender and Rater Type, Age and Rater Type, or the three-way interaction between these variables. Although significant Gender by Age effects were seen for a few scales, none of these reached the minimum criterion for even a small effect size.

Norm Groups and Norm Construction. Overall, the effect sizes (i.e., d and partial η² values) found in the normative data suggest negligible or small effects of ratee gender and age. The scarcity of meaningful effects suggested that it was not necessary to create specific gender- or age-based norms for the EQ 360 2.0 Professional Norms. Although there were several small differences between rater types, these were not considered large enough to require specific rater-type norms, and on the whole it was preferable to be consistent with the norm options used for other EQ 360 2.0 releases. Therefore, only overall norms were developed. Accordingly, some sensitivity may be required in interpreting results obtained from different types of raters. For instance, ratings provided by family/friend/other raters might be expected to be higher than those obtained from other rater types.

These norms were created using the same procedure as the EQ-i 2.0 Professional Norms. Standard scores for all scales were computed with a mean of 100 and standard deviation of 15. Results revealed that skewness and kurtosis values were relatively small (skewness values ranged from -1.22 to -0.26, indicating a slight negative skew across scales; kurtosis values ranged from 0.00 to 1.37). Examination of the scale histograms did not reveal any significant departures from a bell-shaped (Gaussian) curve. Therefore, artificial transformation of scores to fit normal distributions was deemed unnecessary. A histogram for the EQ 360 2.0 Total EI score is provided in Figure B.2.
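To make the scoring step above concrete, the sketch below applies a linear transformation that places raw scores on a metric with a mean of 100 and a standard deviation of 15, and computes simple moment-based skewness and excess kurtosis. The raw scores and function names are hypothetical. It is also an assumption that the kurtosis values reported above are excess kurtosis (0 for a normal curve), and statistical packages often apply small-sample corrections that this sketch omits.

```python
from statistics import mean, pstdev


def to_standard_scores(raw_scores, norm_mean, norm_sd):
    """Linearly rescale raw scores so the norm group has mean 100, SD 15."""
    return [100 + 15 * (x - norm_mean) / norm_sd for x in raw_scores]


def skewness(values):
    """Third standardized moment; negative values indicate a left tail."""
    m, s = mean(values), pstdev(values)
    return mean([((x - m) / s) ** 3 for x in values])


def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (0 for a normal distribution)."""
    m, s = mean(values), pstdev(values)
    return mean([((x - m) / s) ** 4 for x in values]) - 3


# Hypothetical raw scale scores for a tiny "norm group".
raw = [2, 4, 4, 4, 5, 5, 7, 9]
standard = to_standard_scores(raw, mean(raw), pstdev(raw))

print([round(s, 1) for s in standard])
print(f"skewness = {skewness(raw):.2f}, excess kurtosis = {excess_kurtosis(raw):.2f}")
```

Because the transformation is linear, it changes the location and spread of the scores but leaves skewness and kurtosis untouched, which is why the shape of the raw distributions had to be checked separately.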

Comparison of Professional Norms to General Population Norms. The North American Professional Norm sample was compared against the North American General Population normative sample by computing standard scores for the EQ 360 2.0 scales with the General Population EQ 360 2.0 norms, and comparing these scores against a mean of 100. Significant differences were observed on all scales, and all but one scale (Emotional Expression) reached at least a small effect size (i.e., Cohen’s d ≥ 0.20). Mean differences ranged between 2.3 and 7.8 standard score points; on all scales, the Professional norm sample obtained higher scores compared to the General Population norms. The largest differences were observed on the following scales: Self-Actualization (d = 0.60), the Self-Perception composite (d = 0.54), Stress Tolerance (d = 0.50), Social Responsibility (d = 0.49), Total EI (d = 0.48), the Decision Making composite (d = 0.47), the Stress Management composite (d = 0.46), Problem Solving (d = 0.43), Reality Testing (d = 0.42), Self-Regard (d = 0.41), and Optimism (d = 0.40). The differences observed on the other scales were smaller in magnitude. Results are presented in Table B.21.
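The comparison against a mean of 100 can be pictured as a one-sample effect size. The sketch below divides the difference between the sample mean and 100 by the sample standard deviation. Which standard deviation served as the denominator for the tabled d values is not stated here, so this choice is an assumption (dividing by the normative SD of 15 would be another option), and the scores and function name are hypothetical.

```python
from statistics import mean, stdev


def one_sample_d(standard_scores, reference_mean=100.0):
    """Effect size for a sample mean compared with a fixed reference mean."""
    return (mean(standard_scores) - reference_mean) / stdev(standard_scores)


# Hypothetical professional-sample scores, already scored against
# the General Population norms.
professional_scores = [90, 100, 110, 120]

d_vs_general = one_sample_d(professional_scores)
meets_small_threshold = d_vs_general >= 0.20  # threshold used in the text
print(f"d = {d_vs_general:.2f}, at least small: {meets_small_threshold}")
```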

Internal Consistency

Internal consistency, a measure of reliability, conveys the degree to which the items in a set are associated with one another. A high level of internal consistency suggests that the items are measuring a single, cohesive construct. Internal consistency is typically measured using Cronbach’s alpha (Cronbach, 1951). Cronbach’s alpha ranges from 0.0 to 1.0 and is a function of both the interrelatedness of the items in a test or scale and the length of the test (John & Benet-Martinez, 2000). Higher values reflect higher internal consistency. Although there is no universal criterion for a good alpha level, informal cut-offs are typically as follows: .90 or higher is “excellent,” .80 to .89 is “good,” .70 to .79 is “acceptable,” and lower than .70 is “questionable.”
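For readers who want to see how alpha depends on both item interrelatedness and scale length, the sketch below computes Cronbach's alpha from the standard formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores), and applies the informal cut-offs listed above. The response data and function names are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents' item-response lists."""
    k = len(responses[0])                 # number of items
    items = list(zip(*responses))         # one tuple of responses per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


def label_alpha(alpha):
    """Informal cut-offs for interpreting alpha."""
    if alpha >= 0.90:
        return "excellent"
    if alpha >= 0.80:
        return "good"
    if alpha >= 0.70:
        return "acceptable"
    return "questionable"


# Hypothetical ratings from five raters on a three-item subscale (1-5 scale).
responses = [
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]

alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f} ({label_alpha(alpha)})")
```

The k/(k - 1) term and the total-score variance both grow with the number of items, which is why short subscales tend to yield lower alpha values than long composites built from equally related items.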

Cronbach’s alpha values for the EQ 360 2.0 scales for the Professional Norm sample are presented in Table B.22. These values demonstrate good to excellent reliability, and are particularly favorable given the small number of items included in most subscales. The alpha value of the Total EI scale was .98, values for the composite scales ranged from .89 to .96, and values for the subscales ranged from .79 to .94.

The high level of internal consistency found in the Total EI score supports the idea that the EQ 360 2.0 items are measuring a single, cohesive construct, namely emotional intelligence. The same can be said of the individual components of emotional intelligence that make up the EQ-i 2.0 (i.e., the composite scales and subscales).

Factorial Validity

EXPLORATORY FACTOR ANALYSIS

Exploratory factor analysis (EFA) was used to determine whether the subscales established with the North American EQ 360 2.0 General Population normative data empirically emerge from the Professional normative dataset. Five EFAs were conducted, where the items within each composite scale were analyzed separately. In each EFA, a three-factor solution was forced to examine whether the items that corresponded to each subscale within the composite also loaded together in the Professional normative data. Principal axis factoring extraction was used, with direct oblimin (i.e., oblique) rotation, as the factors within each composite are expected to correlate with each other. Reverse scoring was applied to relevant items prior to the analysis. Factor loadings were considered significant if they reached at least ± .30, and an item was defined as cross-loading if it was significant on more than one factor and had loadings within .10 of each other on these factors.
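The decision rules in the preceding paragraph can be expressed as a small classification routine. The sketch below takes the rotated loadings for one item and reports whether the item falls below the cut-off, cross-loads, or loads mainly on its expected factor or on another factor. The loadings and names are hypothetical, and reading "within .10" as a difference of .10 or less between the two largest significant loadings is an interpretation of the rule, not a detail confirmed by the text.

```python
LOADING_CUTOFF = 0.30      # minimum absolute loading treated as significant
CROSS_LOADING_GAP = 0.10   # significant loadings this close count as cross-loading


def classify_item(loadings, expected_factor):
    """Classify one item from its loadings on each factor.

    `loadings` holds one loading per factor; `expected_factor` is the
    index of the factor the item is supposed to belong to.
    """
    magnitudes = [abs(value) for value in loadings]
    significant = sorted(
        (m for m in magnitudes if m >= LOADING_CUTOFF), reverse=True
    )
    if not significant:
        return "below cut-off"
    # Rounding guards against floating-point noise at the boundary.
    if len(significant) > 1 and round(significant[0] - significant[1], 6) <= CROSS_LOADING_GAP:
        return "cross-loading"
    primary = magnitudes.index(max(magnitudes))
    return "expected factor" if primary == expected_factor else "other factor"


# Hypothetical loadings for four items from a three-factor solution.
examples = {
    "item_a": ([0.62, 0.05, -0.10], 0),
    "item_b": ([0.25, 0.10, 0.05], 0),
    "item_c": ([0.45, 0.38, 0.02], 0),
    "item_d": ([0.20, 0.55, 0.01], 0),
}
for name, (loadings, expected) in examples.items():
    print(name, classify_item(loadings, expected))
```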

For the Self-Perception Composite EFA, items on the three subscales (Self-Regard, Self-Actualization, and Emotional Self-Awareness) loaded significantly onto their respective factors as expected by the established factor structure, with no cross-loading items.

For the Self-Expression Composite EFA, all items on the Emotional Expression, Assertiveness, and Independence subscales loaded onto their respective factors, with the exception of one Assertiveness item that fell slightly below the cut-off with a factor loading of .25.

For the Interpersonal Composite EFA, most items on the Interpersonal Relationships, Empathy, and Social Responsibility subscales loaded onto their respective factors, with the exception of two Interpersonal Relationships items that loaded onto Empathy.

For the Decision Making Composite EFA, all Problem Solving, Reality Testing, and Impulse Control items loaded onto their respective factors, with one Problem Solving item that cross-loaded with Impulse Control, and one Reality Testing item that cross-loaded with Problem Solving.

For the Stress Management Composite EFA, most items on the Flexibility, Stress Tolerance, and Optimism subscales loaded significantly onto their respective factors. One Stress Tolerance item fell just below the cut-off with a factor loading of .29, and two other Stress Tolerance items cross-loaded—one each with Optimism and Flexibility.

To summarize, the EFAs generated solutions that strongly correspond to the established EQ 360 2.0 factor structure, with 114 out of 118 items empirically grouping together onto the expected factors.

CORRELATIONS AMONG EQ 360 2.0 COMPOSITE SCALES AND SUBSCALES

Correlations among the EQ 360 2.0 composite scales and subscales were examined. These correlations were expected to be generally high, given that they all measure the same underlying construct of emotional intelligence; however, they should not be so high as to indicate redundancy between the scales. Correlations observed in the Professional normative sample are presented in Tables B.23 (composite scales) and B.24 (subscales).

The composite scale correlations ranged from r = .67 (Self-Expression/Interpersonal) to r = .87 (Stress Management/Decision Making), with an average correlation of r = .79. For the subscales, correlations ranged from r = .30 (Impulse Control/Assertiveness) to r = .87 (Optimism/Happiness), with an average correlation of r = .61. As highlighted in Table B.24, subscale correlations within composite scales ranged from r = .34 (Emotional Expression/Independence) to r = .81 (Interpersonal Relationships/Empathy). These results support the notion that a single, underlying dimension is being represented in the EQ 360 2.0, yet the values are not overly high, and there is enough variation in the correlations to provide clear evidence of the multidimensional nature of the assessment, and support the existence of composite scales and subscales.
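A summary of this kind (lowest pair, highest pair, and average correlation) can be produced as in the sketch below, which computes Pearson correlations for every pair of scales and averages them. The scores are hypothetical, and taking a simple arithmetic mean of the r values is an assumption, since the averaging method is not described here.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    mean_x, mean_y = mean(x), mean(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


def summarize_correlations(scales):
    """Return (lowest pair, highest pair, mean r) across all scale pairs."""
    pairs = {
        (a, b): pearson_r(scales[a], scales[b])
        for a, b in combinations(scales, 2)
    }
    lowest = min(pairs, key=pairs.get)
    highest = max(pairs, key=pairs.get)
    return lowest, highest, mean(list(pairs.values()))


# Hypothetical scores for five ratees on three composite scales.
scales = {
    "Self-Perception": [1, 2, 3, 4, 5],
    "Self-Expression": [2, 1, 4, 3, 5],
    "Interpersonal": [2, 3, 1, 5, 4],
}

lowest, highest, average_r = summarize_correlations(scales)
print(lowest, highest, round(average_r, 2))
```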