Part V: Creating the EQ-i 2.0 and EQ 360 2.0

Standardization, Reliability, and Validity

Effect Size

When analyzing data from an extremely large sample (such as the ones described on this page), it is important to interpret correctly what constitutes a significant result. Tests of significance (e.g., F-tests) are reported in several places on this page. As Thompson (2002) noted, significance tests do not indicate the importance, or practical significance, of a result. Significance tests are greatly influenced by sample size; that is, the larger the sample, the more likely a test will be statistically significant (Thompson, 2002). With normative samples of 4,000 for the EQ-i 2.0 and 3,200 for the EQ 360 2.0, it is therefore necessary to examine the practical significance of all analyses in addition to their statistical significance.
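The dependence of significance on sample size can be illustrated with a minimal sketch. It assumes a simplified two-group comparison with equal group sizes and a normal (z) approximation, which is not necessarily the test used for the analyses on this page; the function name and the example values are hypothetical.

```python
import math

def two_sided_p_for_d(d, n_per_group):
    """Approximate two-sided p-value for a standardized mean difference d.

    Assumption: two equal-sized groups and a normal (z) approximation,
    so that z = d * sqrt(n_per_group / 2).
    """
    z = d * math.sqrt(n_per_group / 2.0)
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2.0))

# The same very small effect (d = 0.10) at two hypothetical sample sizes.
p_small_sample = two_sided_p_for_d(0.10, 100)    # well above .05
p_large_sample = two_sided_p_for_d(0.10, 2000)   # below .05
print(p_small_sample, p_large_sample)
```

The effect is identical in both calls; only the sample size changes, which is why the effect size needs to be reported alongside the test result.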

To accomplish this, estimates of effect size (e.g., Cohen’s d), which quantify the strength of an effect, are provided for analyses where appropriate. Effect sizes permit the comparison of results across studies in which sample sizes may differ dramatically. For example, Cohen’s d expresses the difference between two means in units of the pooled standard deviation (i.e., a value of 1.00 means that the mean scores of the two groups differ by one pooled standard deviation). Standard criteria, which are not influenced by sample size (Cohen, 1988), are available for determining small, medium, and large effect sizes. For instance, marker values for interpreting small, medium, and large effects with Cohen’s d are .20, .50, and .80, respectively.
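As a minimal sketch of the calculation described above, the following computes Cohen’s d from two sets of scores using the pooled standard deviation. The function names and the scores are hypothetical. Treating the marker values as lower bounds of each category, and calling anything below .20 "negligible", are assumptions made for illustration.

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Difference between two means in pooled standard deviation units."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) +
                  (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

def label_d(d):
    """Label |d| using the marker values .20, .50, and .80 (assumed cutoffs)."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "negligible"  # assumed label; not taken from the source

# Hypothetical scores: means differ by 10, pooled SD is 10, so d = 1.0.
d = cohens_d([100, 110, 120], [90, 100, 110])
print(d, label_d(d))
```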

Correlations are also commonly reported on this page. Although the interpretation of correlation coefficients varies depending on how they are used, for the data reported on this page, marker values for interpreting small, medium, and large effects with the correlation coefficient (r) are .10, .30, and .50 (absolute values), respectively.
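A short sketch of how these marker values might be applied to a correlation follows. The function names and data are hypothetical, and treating the markers as lower bounds of each category (with "negligible" below .10) is an assumption for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dev_x = [v - mean_x for v in x]
    dev_y = [v - mean_y for v in y]
    cross = sum(a * b for a, b in zip(dev_x, dev_y))
    ss_x = sum(a * a for a in dev_x)
    ss_y = sum(b * b for b in dev_y)
    return cross / math.sqrt(ss_x * ss_y)

def label_r(r):
    """Label |r| using the marker values .10, .30, and .50 (assumed cutoffs)."""
    size = abs(r)  # the sign gives direction, not strength
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "negligible"  # assumed label; not taken from the source

# Hypothetical paired scores.
r = pearson_r([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(r, label_r(r))
```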

Partial eta-squared (partial η²) is used to summarize differences between multiple categorical groups or to summarize non-linear differences between groups (e.g., age groups). This statistic is preferable to d in analyses where differences between more than two groups are examined (e.g., racial/ethnic groups), or where a non-linear effect is expected, such as the EI age trends, where scores increase up to a point and then decrease over the life span. Partial η² is also used to quantify interaction effects between multiple variables (e.g., between age groups and gender). Cutoffs for evaluating partial η² as small, medium, and large are .01, .06, and .14, respectively (Cohen, 1988).
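For reference, partial η² is conventionally defined as the effect sum of squares divided by the sum of the effect and error sums of squares. The sketch below applies that definition to a one-way design with three groups, where it coincides with ordinary η². The function names and the scores are hypothetical and are not drawn from the normative data.

```python
from statistics import mean

def one_way_sums_of_squares(groups):
    """Return (SS_between, SS_within) for a list of score lists."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)
    return ss_between, ss_within

def partial_eta_squared(ss_effect, ss_error):
    """Share of (effect + error) variability attributable to the effect."""
    return ss_effect / (ss_effect + ss_error)

# Hypothetical scores for three age groups: the means rise and then fall,
# a non-linear pattern that a single d value would not capture.
age_groups = [[98, 100, 102], [103, 105, 107], [99, 101, 103]]
ss_between, ss_within = one_way_sums_of_squares(age_groups)
print(partial_eta_squared(ss_between, ss_within))
```

In designs with more than one factor, the error term comes from the full model, so partial η² for a given effect generally differs from ordinary η².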