Part V: Creating the EQ-i 2.0 and EQ 360 2.0

Standardization, Reliability, and Validity

EQ 360 2.0 Pilot Study and Standardization

This section describes the EQ 360 2.0 standardization procedure, including the method of data collection, the properties of the normative sample, and the effects of age and gender on the results.

Data Collection

Data collection for the EQ 360 2.0 followed multiple stages between July 2009 and August 2010. More than 4,000 participants completed the EQ 360 2.0 over this time period.

PILOT PHASE

The first stage of data collection, the collection of pilot data, took place between July 2009 and November 2009. Raters were required to provide demographic information about the individuals they rated (i.e., “ratees”) along with EQ 360 2.0 ratings. The ratees (N = 759) were 59.2% female, the majority were White (74.3%), and there was good representation across several age groups (Table A.35). These data were collected to ensure that the basic functionality of the EQ 360 2.0 (e.g., instructions, response options, administration time) was adequate.

NORMATIVE PHASE

The second phase of data collection, which covered the normative sample as well as the reliability and validity data, took place between March 2010 and August 2010. Data were gathered from all 50 U.S. states and the District of Columbia, as well as from all 10 Canadian provinces. Raters were sent an email invitation to participate in the EQ 360 2.0 data collection process. The data collection and authentication procedures were identical to those used for the EQ-i 2.0 (see EQ-i 2.0 – Data Collection – Normative Phase in the EQ-i 2.0 Pilot Study and Standardization section). The following section focuses on a description of the normative samples; see the Reliability and Validity sections on this page for more information on the reliability and validity samples.

To create representative normative samples, specific demographic targets (i.e., age, gender, race/ethnicity, and geographic region), guided by recent Canadian and U.S. Census information (i.e., Statistics Canada, 2006; U.S. Bureau of the Census, 2008), were used during the data collection procedure. Information was collected on each ratee’s gender, age, race/ethnicity (Asian/Pacific Islander, Black/African-American/African-Canadian, Hispanic/Latino, White, Multiracial, and Other), employment status (employed/self-employed, unemployed, retired, and other), and geographic location (state/province and country). For ease of presentation, race/ethnicity groups are referred to in this manual as follows: Black, Hispanic/Latino, White, and Other. For the EQ 360 2.0, this information was provided about the ratee (i.e., the person being rated) by the rater (i.e., the person completing the assessment). Information about the type and strength of the rater-ratee relationship was also collected.

Standardization

The standardization process for the EQ 360 2.0 was similar to that of the EQ-i 2.0. A second normative dataset was collected for the EQ 360 2.0, requiring separate norms and statistical analyses.

NORMATIVE SAMPLE

Normative data for the EQ 360 2.0 were collected concurrently with the EQ-i 2.0, during March 2010 and April 2010. Raters rated an individual (“the ratee”) on the EQ 360 2.0 and provided various demographic information about both themselves and the ratee. During this time period, 3,413 participants provided EQ 360 2.0 data for standardization purposes. From these data, a demographically and geographically representative database of 3,200 ratees was selected as the EQ 360 2.0 normative sample. Statistical analyses showed no strong differences between U.S. and Canadian participants in EQ 360 2.0 scores (Table A.36); therefore, data from both countries were included in the normative sample.

Rater Description. The sample of 3,200 raters (i.e., the participants providing the ratings) was 59.2% female, with a mean age of 46.8 years (SD = 13.5 years). The sample was primarily White (81.2%); 5.2% were Black, 3.7% were Hispanic/Latino, and 9.9% were of other races/ethnicities. Approximately one-third of the sample was from the U.S. South (33.7%), while 22.0% was from the U.S. West, 20.5% was from the U.S. Midwest, 16.1% was from the U.S. Northeast, 5.6% was from Central Canada, 0.9% was from the Canadian West and Prairies, and 0.3% was from the Canadian East. More than half of the raters had at least a college/university education (54.7%), 27.8% had some college/university education, and 17.6% had a high school diploma or less. The majority (90.4%) of raters had known the ratee for over a year (see Table A.37), and over half of the raters stated that they knew the ratee “Well” or “Very Well” on a four-point scale ranging from Not Very Well (0) to Very Well (3; see Table A.38). Therefore, the raters had known the ratees for long enough, and well enough, to provide valid EQ 360 2.0 ratings.

Ratee Sample. The normative sample was stratified to match the Census based on the demographic characteristics of the ratee (i.e., the person being rated). The sample included an equal proportion of males and females, stratified equally across four rater types: direct report (i.e., the ratee is the rater’s manager), manager (i.e., the ratee is the rater’s direct report), work peer, and friend/family member (Table A.39). Participants were distributed similarly across most of the age groups, although there were relatively fewer in the youngest age range because no attempt was made to collect direct-report data for managers under the age of 25, who are relatively rare in the population (Table A.40). Race/ethnicity was stratified by Census figures within rater type, given that these distributions differed slightly across rater type (Table A.41). The normative sample met each of these targets within 3%, and was within 1% in most cases. Finally, there was good representation from all U.S. and Canadian geographic regions (Table A.42).
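The comparison of a stratified sample against its Census targets can be illustrated with a minimal Python sketch. It is not the procedure used for the EQ 360 2.0: the function names, the group labels, and the counts in the usage note are hypothetical, and the sketch assumes that "within 3%" refers to a difference of 3 percentage points between the sample proportion and the target proportion.

```python
def target_deviations(sample_counts, target_proportions):
    """Return sample proportion minus target proportion for each group.

    sample_counts: dict mapping group label -> number of ratees (hypothetical).
    target_proportions: dict mapping group label -> target proportion (0 to 1).
    """
    total = sum(sample_counts.values())
    return {
        group: sample_counts.get(group, 0) / total - target
        for group, target in target_proportions.items()
    }


def within_tolerance(sample_counts, target_proportions, tolerance=0.03):
    """True if every group is within `tolerance` of its target.

    The default of 0.03 assumes "within 3%" means 3 percentage points.
    """
    deviations = target_deviations(sample_counts, target_proportions)
    return all(abs(dev) <= tolerance for dev in deviations.values())
```

For example, with illustrative (not actual) counts of 50, 30, and 20 ratees in three groups and targets of .52, .29, and .19, `within_tolerance` returns `True` at the default tolerance because the largest deviation is 2 percentage points.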

Focus on Effect Size. The effects of gender, age, and rater type were examined in the EQ 360 2.0 normative data. As with the EQ-i 2.0 data, the large EQ 360 2.0 normative sample size dictates that effect sizes should be weighted more heavily than significance tests (see the Effect Size section). Cohen’s d values are reported to describe the size of gender effects, and partial eta-squared (partial η²) values are used to describe the effects of age and rater type.
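For readers who want to see how these two effect sizes are defined, the following Python sketch gives their standard textbook forms. It is an illustration only: the function names are hypothetical, Cohen’s d is computed here with the pooled sample standard deviation (the manual does not state which variant was used), and partial η² is computed from an effect sum of squares and an error sum of squares taken from an ANOVA-style table.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference between two groups.

    Uses the pooled sample standard deviation (an assumption; other
    variants of d use a different denominator).
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def partial_eta_squared(ss_effect, ss_error):
    """Share of variance attributable to an effect after excluding
    variance explained by the other effects in the model."""
    return ss_effect / (ss_effect + ss_error)
```

A positive d means the first group scored higher than the second. Interpretive guidelines for both statistics are given in the Effect Size section rather than encoded in this sketch.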

NORMING PROCEDURES

Similar to the EQ-i 2.0, the first step in the EQ 360 2.0 norming procedure was to determine if any demographic trends existed in the data. Demographic effects were examined using an analysis of covariance (ANCOVA) for the EQ 360 2.0 Total EI score and two separate multivariate analyses of covariance (MANCOVAs) for the composite scales and subscales. Rater type (direct report, manager, work peer, friend/family member), gender, and age group were examined, with race/ethnicity (White vs. non-White) as a covariate. To control for Type I errors that might occur with multiple analyses, a more conservative criterion of p < .01 was used for all F-tests. Results at the multivariate level revealed significant effects of gender, age, and rater type for both the composites and the subscales (Table A.43); the only significant interaction at the multivariate level was between age and rater type for the subscales. Given these results, the univariate effects are described in detail next.

Overall, gender and age effects were less pronounced in the EQ 360 2.0 normative sample than they were in the EQ-i 2.0 sample (see Table A.44 for effect sizes and Tables A.45 through A.47 for descriptive statistics and significance test results). There were no gender differences that reached even a small effect size for the Total EI score or for any of the composite scales. At the subscale level, only Emotional Expression reached a small effect size, with females being rated higher than males. With respect to age, Independence, Social Responsibility, Impulse Control, and Flexibility reached small effect sizes. For Independence, Social Responsibility, and Impulse Control, the effect was attributable to lower scores among 18–29-year-olds. For Flexibility, scores decreased in the older age groups. Very few meaningful differences were found across rater types. None were found for the EQ 360 2.0 Total EI score (i.e., partial η² = .00). Some minor differences were found across rater types for the composite scales and subscales, but all were small effect sizes (i.e., partial η² lower than .06). None of the age × rater type interactions reached significance at the univariate level, with the exception of Problem Solving (F [12, 555.80] = 2.55, p = .002); however, the effect size was very small (partial η² = .01).

Overall, the lack of meaningful demographic effects suggested it was unnecessary to create specific rater type-, age-, or gender-based norms for the EQ 360 2.0. Therefore, only overall (General Population) norms are available for the EQ 360 2.0. These norms were created using the same procedure as the EQ-i 2.0 General norms, but without the smoothing process (given that no age groups were used). Standard scores (with a mean of 100 and standard deviation of 15) were computed for all scales. Skewness and kurtosis statistics (e.g., -0.42 and -0.34, respectively, for Total EI; see Figure B.2) were not large enough to suggest that a normalizing transformation was necessary for the EQ 360 2.0 scores.
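A linear standard-score conversion of the kind described above, together with the distribution-shape statistics, can be sketched in Python as follows. This is an assumption-laden illustration rather than the publisher’s scoring algorithm: the function names are hypothetical, the population standard deviation is used, and skewness and kurtosis are computed as simple moment-based estimates (kurtosis as excess kurtosis, so that a normal distribution has a value of 0). The manual does not specify which estimators were used.

```python
from statistics import mean, pstdev


def fit_norms(normative_raw_scores):
    """Return the (mean, SD) of raw scores in a normative sample."""
    return mean(normative_raw_scores), pstdev(normative_raw_scores)


def to_standard_score(raw, norm_mean, norm_sd, target_mean=100.0, target_sd=15.0):
    """Linearly rescale a raw score so the norm group has mean 100, SD 15."""
    return target_mean + target_sd * (raw - norm_mean) / norm_sd


def skewness(values):
    """Moment-based skewness: the mean of cubed z-scores."""
    m, sd = mean(values), pstdev(values)
    return sum(((x - m) / sd) ** 3 for x in values) / len(values)


def excess_kurtosis(values):
    """Moment-based excess kurtosis: the mean of z-scores to the
    fourth power, minus 3."""
    m, sd = mean(values), pstdev(values)
    return sum(((x - m) / sd) ** 4 for x in values) / len(values) - 3.0
```

Because the conversion is linear, it changes the location and spread of the scores but not the shape of the distribution, which is why skewness and kurtosis are checked separately before deciding whether a normalizing transformation is needed.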

Standardization Summary

More than 4,000 assessments were collected between 2009 and 2010 in the standardization of the EQ 360 2.0. A sample of 3,200 participants was chosen as the EQ 360 2.0 normative sample. The sample was evenly distributed by gender and rater type, and matched to the Census based on race/ethnicity. Statistical analyses revealed a lack of meaningful differences in EQ 360 2.0 scores across gender, age group, and rater type. Therefore, a single normative group was created. The norming process resulted in standard scores with means of 100 and standard deviations of 15 for the Total EI score, composite scales, and subscales. The following sections describe the psychometric properties (i.e., reliability and validity) of the EQ 360 2.0.