Conducting Empirical Validation and Adverse Impact Analysis
Yellow Blaze Candle Shops sells a full line of candles and accessories, such as candleholders. Yellow Blaze has 150 shops in shopping malls and strip malls throughout the country. Over 600 salespeople staff these stores, each of which has a full-time manager. By policy, the manager's position must be staffed by promotion from within the sales ranks. The organization is interested in improving its identification of the salespeople most likely to be successful store managers. It has developed a special technique for assessing and rating the suitability of salespeople for the manager's job.
To experiment with this technique, the regional HR department representative reviewed and rated the promotion suitability of each store's salespeople. They reviewed sales results, customer service orientation, and knowledge of store operations for each salesperson and then assigned a 1–3 promotion suitability rating (1 = not suitable, 2 = may be suitable, 3 = definitely suitable) on each of these three factors. Customer service orientation was rated based on supervisor and coworker observations of work behavior. These ratings incorporated evaluations of how often salespeople asked customers how they could help, how effectively salespeople were able to suggest products that matched customer requests, and how well salespeople ensured that customers were happy with their intended purchases at the end of the encounter. In most cases, ratings of customer service orientation were similar across managers and assistant managers, but there were some discrepancies. Knowledge of store operations was evaluated based on a standardized exam, consisting of a variety of questions related to facts ranging from managerial practices and procedures to refund and exchange policies to record-keeping requirements. A total promotion suitability (PS) score, ranging from 3 to 9, was then computed for each salesperson.
The PS scores were gathered for all salespeople but were not formally used in promotion decisions. Over the past year, 30 salespeople were promoted to store manager. Now it is time for the organization to preliminarily investigate the validity of the PS scores and see whether their use results in adverse impact against women or minorities. Each store manager's annual overall performance appraisal rating, ranging from 1 (low performance) to 5 (high performance), was used as the criterion measure in the validation study. The following data were available for analysis:
Using the data above, calculate:
1. Average PS scores for the whole sample, males, females, nonminorities, and minorities.
2. The correlation between PS scores and performance ratings, and its statistical significance (r = .37 or higher is needed for significance at p < .05).
3. Adverse impact (selection rate) statistics for males and females, and nonminorities and minorities. Use a PS score of 7 or higher as a hypothetical passing score (the score that might be used to determine who will or will not be promoted).
4. Average performance rating scores for the whole sample, males, females, nonminorities, and minorities. For each comparison (males versus females, and nonminorities versus minorities), evaluate whether the average performance ratings differ between the subgroups. Also evaluate whether the magnitude of any difference is large enough to warrant concern for Yellow Blaze.
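The four calculations listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records below are hypothetical placeholders, not the Yellow Blaze data, and the names `RECORDS`, `subgroup`, `column_mean`, `pearson_r`, `selection_rate`, and `impact_ratio` are invented for this sketch. The impact ratio (lower selection rate divided by higher selection rate) and the 0.80 benchmark reflect the common four-fifths rule of thumb, which the exercise itself does not name.

```python
from math import sqrt
from statistics import mean

# Hypothetical placeholder records, NOT the Yellow Blaze data.
# Each tuple: (ps_score, performance_rating, sex, minority_status)
RECORDS = [
    (9, 5, "M", "NM"),
    (8, 4, "F", "NM"),
    (7, 4, "M", "MIN"),
    (6, 3, "F", "MIN"),
    (5, 2, "M", "NM"),
    (4, 3, "F", "NM"),
    (6, 3, "F", "MIN"),
    (3, 1, "M", "MIN"),
]
PS, PERF, SEX, MINORITY = 0, 1, 2, 3  # column positions in each tuple
CUT_SCORE = 7      # hypothetical passing score stated in the exercise
CRITICAL_R = 0.37  # significance threshold the exercise gives for its own sample


def subgroup(rows, col, value):
    """Return only the rows whose column `col` equals `value`."""
    return [r for r in rows if r[col] == value]


def column_mean(rows, col):
    """Average of one numeric column (PS or PERF)."""
    return mean(r[col] for r in rows)


def pearson_r(rows, x_col, y_col):
    """Pearson correlation between two numeric columns."""
    xs = [r[x_col] for r in rows]
    ys = [r[y_col] for r in rows]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def selection_rate(rows, cut=CUT_SCORE):
    """Share of the group scoring at or above the cut score."""
    return sum(1 for r in rows if r[PS] >= cut) / len(rows)


def impact_ratio(rows_a, rows_b, cut=CUT_SCORE):
    """Lower selection rate divided by higher; None if nobody passes."""
    low, high = sorted([selection_rate(rows_a, cut), selection_rate(rows_b, cut)])
    return None if high == 0 else low / high


groups = {
    "whole sample": RECORDS,
    "males": subgroup(RECORDS, SEX, "M"),
    "females": subgroup(RECORDS, SEX, "F"),
    "nonminorities": subgroup(RECORDS, MINORITY, "NM"),
    "minorities": subgroup(RECORDS, MINORITY, "MIN"),
}
for name, rows in groups.items():
    print(name, column_mean(rows, PS), column_mean(rows, PERF), selection_rate(rows))

r = pearson_r(RECORDS, PS, PERF)
# CRITICAL_R applies to the exercise's sample; it is shown here only for form.
print("r =", round(r, 2), "meets threshold:", r >= CRITICAL_R)
print("sex impact ratio:", impact_ratio(groups["males"], groups["females"]))
print("minority impact ratio:", impact_ratio(groups["nonminorities"], groups["minorities"]))
```

To apply the sketch to the exercise, replace `RECORDS` with the actual data for the promoted managers. An impact ratio below 0.80 is the conventional signal that a selection rate difference deserves closer scrutiny; it is a rule of thumb rather than a statistical test.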
Using the data, results, and description of the study, answer the following questions:
1. Is the PS score a valid predictor of performance as a store manager? Do you see any potential reasons why either the customer service orientation or knowledge of store operations measures might be problematic? In answering, consider issues related to reliability and validity.
2. With a cut score of 7 on the PS, would its use lead to adverse impact against women? Against minorities? If there is adverse impact, does the validity evidence justify use of the PS anyway?
3. What limitations do you see in the current study design? Do you think that the conclusions you would reach based on this sample of individuals who were promoted to store manager would generalize to the population of all salespeople who are being evaluated for promotion potential? Do you think that the method of rating performance is sufficient as a criterion, and if so, why? If not, what additional steps would you take to ensure that performance is measured adequately?
4. Would you recommend that Yellow Blaze use the PS score in making future promotion decisions? Why or why not? If you do think the company should use the PS score system, can you think of anything they could do to make these measures even better than they are already? If you do not think the PS score system should be used, can you think of any ways that this system might be improved?
5. One employee has raised questions regarding whether the performance ratings themselves are biased. This employee has not yet made a formal legal complaint against Yellow Blaze, but the organization wants to evaluate whether there is reason for concern. Based on your calculations of the differences in performance evaluation ratings for women relative to men, and for minorities relative to nonminorities, do you believe that there is reason for the organization to be concerned about this issue? In other words, do the data suggest that there is, in fact, a substantial difference in performance evaluation ratings for different groups of employees? How should the organization respond to this individual employee's concerns?
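One way to judge whether a subgroup difference in performance ratings is "substantial" is to express it in standard deviation units. The sketch below uses Cohen's d with a pooled standard deviation; this is a technique chosen for illustration and is not prescribed by the exercise. The ratings shown are hypothetical placeholders, not the Yellow Blaze data, and `cohens_d` is a name invented here.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference (a minus b) using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    if pooled_var == 0:
        raise ValueError("no variation in ratings; d is undefined")
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical performance ratings on the 1-5 scale, NOT the case data.
men_ratings = [5, 4, 3, 4]
women_ratings = [4, 3, 3, 2]

raw_gap = mean(men_ratings) - mean(women_ratings)
d = cohens_d(men_ratings, women_ratings)
print("raw difference in means:", raw_gap)
print("standardized difference (d):", round(d, 2))
```

With only a small number of promoted managers, any such estimate is imprecise, so it is better read as a prompt for further review of the rating process than as proof of bias or of its absence.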