Pass Points - What You Should Know
In the past few years, there have been a relatively large number of significant public safety court cases and rulings regarding selection exams. Most recently, a federal appeals court ruled that the Chicago Fire Department must hire 111 African American candidates who passed the entry-level firefighter exam back in 1995. The department must also pay tens of millions of dollars to over 6,000 applicants who passed the exam but were not hired. As someone involved in testing, it is important for you to understand what happened in this case.
In 1995, a test was developed and administered to entry-level firefighter candidates. A cut score was initially set at 65 and above; 65 was chosen because it fell one standard deviation below the mean score of 75. This cut score placed 22,000 candidates in the "passing" pool, and it was determined that those who scored at this level were qualified to be firefighters. The City also determined that those who scored 89 or higher were more highly qualified, and it then randomly selected for hire only those candidates who scored 89 or above. The effect was that the 89 pass point had severe disparate impact on African American candidates.
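The arithmetic behind the original cut score can be sketched as follows. The mean of 75 and cut score of 65 come from the case as described above; the implied standard deviation of 10 is an inference from those two figures, not a number stated in the ruling.

```python
# Sketch of the cut-score arithmetic described above.
# Mean (75) and resulting cut score (65) are from the article;
# the standard deviation of 10 is inferred (75 - 65), not stated.

def cut_score_one_sd_below_mean(mean, std_dev):
    """Return a cut score set one standard deviation below the mean."""
    return mean - std_dev

mean_score = 75
implied_sd = 10  # inferred from the figures above

print(cut_score_one_sd_below_mean(mean_score, implied_sd))
```

Note that this choice only anchors the minimum passing score to the score distribution itself; it says nothing about how scores above the cut relate to job performance, which is the issue the case turned on.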
The City was handed such a strong verdict because there was no proof that candidates scoring 89 or higher were more qualified to be firefighters than candidates scoring between 65 and 88. It might seem intuitive that a person scoring higher on a test is more qualified, but this is not always true or defensible. This dovetails closely with setting minimum standards. Tests can be designed to evaluate the minimum standards or minimum competence of a candidate, or they can be designed to predict job performance across the full range of performance, as is needed to justify setting higher passing scores.
An easy example is reading ability. While firefighters need to read at a high level, a doctoral level of reading ability is not required and would not necessarily predict performance. Constructs such as teamwork and customer service, however, often affect job performance throughout the entire range of performance. Any such claim must be supported by documented evidence of the relationship between test scores and job performance, which leads us to the familiar process of criterion validation. A demonstrated, statistically significant relationship between test scores and job performance (established during criterion validation) strengthens the defense of using a test in a top-down fashion, in bands, or under another methodology with different cut scores.
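The statistical heart of criterion validation is a correlation between test scores and a job-performance criterion. A minimal sketch, using entirely hypothetical scores and supervisor ratings (a real study would use actual data from incumbents or hired candidates):

```python
# Illustrative sketch of criterion validation: does the exam score
# correlate with a later measure of job performance? All data below
# are hypothetical, for demonstration only.
import statistics

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical exam scores and supervisor performance ratings (1-5 scale).
scores = [65, 70, 74, 78, 82, 85, 89, 93]
ratings = [2.1, 2.8, 2.6, 3.4, 3.1, 3.9, 4.2, 4.5]

print(f"validity coefficient r = {pearson_r(scores, ratings):.2f}")
```

A strong, statistically significant coefficient across the full score range is the kind of documented evidence that supports using higher cut scores or top-down selection; a coefficient near zero above the minimum cut score would not.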
So let's revisit the Chicago case. The test was not criterion validated, so there was ultimately no theoretical or documented basis for concluding that candidates scoring 89 would perform better than those scoring 65. Again, the outcome of this case does not mean that higher pass points can't be used; it means that they must be justified when they are. Higher pass points can be justified when there is documented proof that those who scored higher on the exam are more qualified than those who scored below them.
All of Ergometrics' video-based simulation exams are criterion validated. The criterion validation studies for each exam demonstrate strong correlations between test scores and supervisor evaluations of performance, supporting the exams' ability to predict actual job performance. Further, Ergometrics tests are designed to be maximally predictive of performance, as is the intent of a job simulation. Theoretical and statistical defensibility of cut scores is strongest with job simulation testing.