College Assessment

Inter-rater Reliability

You and each IA will participate as raters. As the lead faculty member in the course, you will be the primary rater. Each IA will be listed as Rater 1, Rater 2, etc.

Choose the assessment in your current course that you wish to evaluate through the IRR process. Pull two student submissions from a previous semester: one that illustrates superior student performance and one that illustrates poor student performance. You and each IA should evaluate both samples using the rubric for your current class. After all the sample scores have been entered and an agreement percentage has been generated, identify any samples rated under 90% agreement. If your assessment agreement falls below the 90% threshold, meet with each of the raters. Look for disparities in scoring, even small ones, offer feedback to the rater, and clarify any parts of the rubric that may be confusing. Continue to consult with your raters until you reach agreement of at least 90%.

In cases of substantial disagreement, any disputed rubric construct should be revisited and possibly revised based on rater feedback.
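The agreement check described above can be illustrated with a short calculation. This is a minimal sketch, not the worksheet's own formula: it assumes agreement means the share of rubric rows on which a rater's score exactly matches the primary rater's score. The function name, the scores, and the five-row rubric are hypothetical.

```python
def percent_agreement(primary, rater):
    """Percentage of rubric rows where `rater` matches `primary` exactly.

    Assumption: exact-match agreement; the worksheet may define it differently.
    """
    if len(primary) != len(rater) or not primary:
        raise ValueError("score lists must be non-empty and the same length")
    matches = sum(1 for p, r in zip(primary, rater) if p == r)
    return 100.0 * matches / len(primary)


# Hypothetical scores for one student sample on a five-row rubric.
primary_scores = [4, 3, 4, 2, 3]
rater_scores = {
    "Rater 1": [4, 3, 4, 2, 3],
    "Rater 2": [4, 3, 3, 2, 3],
}

THRESHOLD = 90.0  # agreement target stated in the process above

for name, scores in rater_scores.items():
    pct = percent_agreement(primary_scores, scores)
    status = "consult with rater" if pct < THRESHOLD else "ok"
    print(f"{name}: {pct:.0f}% agreement ({status})")
```

In this made-up sample, one mismatched row out of five is enough to fall below the 90% threshold, which is why small disparities are worth discussing.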

Consider these questions:

  1. Was the rubric able to differentiate between high and low student performance?
  2. Did the rubric provide actionable feedback for students?
  3. How many consultations did it take for you to come to consensus with your raters?

for each course in the part of term in which you are teaching.

Trustworthiness and Fairness

Trustworthiness and fairness are addressed before determining content validity. They can also be addressed after determining content validity to help explain scores. Use the assessment instructions and rubrics to examine:

  • Internal credibility (i.e., truth value, applicability, consistency, neutrality, dependability, and credibility of interpretations/conclusions)
  • External credibility (i.e., the degree to which the results can be generalized across the candidates in the program). This is accomplished using Guba and Lincoln's (1989) authenticity criteria.

For each item, review the following and present evidence/method to increase credibility. Also list any comments you might have regarding your assessment. Fill out the provided worksheet and upload your findings.

for each assessment you are evaluating.

Content Validity

This will help you and a review team work through content validity for your assessment rubric (it could also be used to vet content validity on an objective exam). Your review team (at least 3 but no more than 8 members) should examine the rubric and rate how essential each item is. You should then enter their ratings into the worksheet for each rubric line or concept, and the worksheet will calculate a content validity ratio (CVR) value.

A score below 0 (a negative value) is not valid and indicates that the reviewers do not agree that the item is essential. Such an item should be revised or eliminated. A value from 0 to .75 indicates that at least half of the reviewers thought the item essential. While such an item is valid, it is best to discuss it to see whether it can be refined to reach better agreement. A value from .76 to 1 indicates a valid item. The values will be color coded by level of validity.
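For readers who want to see how a CVR value arises, the sketch below uses Lawshe's standard formula, CVR = (n_e - N/2) / (N/2), where n_e is the number of reviewers rating the item essential and N is the panel size. The worksheet's exact formula is not shown here, so treating it as Lawshe's is an assumption, although it is consistent with a value of 0 meaning exactly half of the reviewers rated the item essential. The function names and the treatment of values between .75 and .76 are also assumptions.

```python
def content_validity_ratio(n_essential, n_reviewers):
    """Lawshe's CVR: (n_e - N/2) / (N/2). Assumed to match the worksheet."""
    if not 3 <= n_reviewers <= 8:
        raise ValueError("review team should have 3 to 8 members")
    if not 0 <= n_essential <= n_reviewers:
        raise ValueError("n_essential must be between 0 and n_reviewers")
    half = n_reviewers / 2
    return (n_essential - half) / half


def interpret_cvr(cvr):
    """Map a CVR value to the bands described above."""
    if cvr < 0:
        return "not valid: revise or eliminate"
    if cvr <= 0.75:  # assumption: values between .75 and .76 fall in this band
        return "valid: discuss and refine"
    return "valid"


# Hypothetical panel of 6 reviewers; counts of "essential" ratings per rubric line.
essential_counts = {"Thesis": 6, "Evidence": 4, "Formatting": 2}
for line, count in essential_counts.items():
    cvr = content_validity_ratio(count, 6)
    print(f"{line}: CVR = {cvr:+.2f} ({interpret_cvr(cvr)})")
```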

Once you have completed the worksheet, please complete and upload your results.

EAC Data Submission

Use to run assessment data in Blackboard using the EAC Visual Data tool. Download the overall Excel sheet from EAC containing the data for your assessment.

You will be evaluating four primary areas in the EAC Data:
  • Overall Cronbach's Alpha score
  • Point Biserial Correlation
  • Cronbach's Alpha with Deletion
  • Student performance for any aligned goals

The overall Cronbach's Alpha score for the instrument should be +.50 or higher for high-stakes mastery.
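EAC reports this value for you, but it can help to see how it is computed. The sketch below uses the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), where k is the number of questions or rubric lines. The score table is made up, and EAC's internal calculation may differ in detail.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table with one row per student and one
    column per question or rubric line (standard textbook formula)."""
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two students")
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have no variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical scores: 4 students, 3 rubric lines.
scores = [
    [1, 1, 2],
    [2, 2, 1],
    [3, 3, 3],
    [4, 4, 2],
]
alpha = cronbach_alpha(scores)
print(f"Overall alpha: {alpha:.2f}")
print("Meets +.50 target" if alpha >= 0.50 else "Below +.50 target")
```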

Point Biserial Correlation (PBC) is a reliability measure. Students who score well on the exam should do well on the question or rubric line, and students who struggle on the exam should struggle on it. Aim for a value greater than +.30 for high-stakes assessments. Values less than +.09 are poor. Values ranging from +.09 to +.30 are acceptable to good.
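As an illustration of what the PBC measures, the sketch below computes it for a single right/wrong question using the standard formula, r = (M1 − M0) / s × sqrt(p × q), where M1 and M0 are the mean exam scores of students who got the question right and wrong, s is the population standard deviation of exam scores, and p and q are the proportions right and wrong. The data are made up, the function names are hypothetical, and EAC may compute a corrected variant (for example, excluding the question from the total), so treat this only as a rough check.

```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(item_correct, total_scores):
    """Point-biserial correlation between a 0/1 item and total exam scores."""
    if len(item_correct) != len(total_scores):
        raise ValueError("inputs must be the same length")
    right = [t for c, t in zip(item_correct, total_scores) if c == 1]
    wrong = [t for c, t in zip(item_correct, total_scores) if c == 0]
    if not right or not wrong:
        raise ValueError("item needs both correct and incorrect responses")
    sd = pstdev(total_scores)
    if sd == 0:
        raise ValueError("total scores have no variance")
    p = len(right) / len(total_scores)
    return (mean(right) - mean(wrong)) / sd * sqrt(p * (1 - p))


def rate_pbc(pbc):
    """Bands taken from the guidance above."""
    if pbc < 0.09:
        return "poor"
    if pbc <= 0.30:
        return "acceptable to good"
    return "meets high-stakes target"


# Hypothetical data: 1 = answered correctly, paired with each student's exam total.
item = [1, 1, 0, 1, 0, 0]
totals = [92, 85, 70, 78, 60, 74]
pbc = point_biserial(item, totals)
print(f"PBC = {pbc:+.2f} ({rate_pbc(pbc)})")
```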

Cronbach's Alpha with Deletion values indicate whether the assessment would be more reliable without a given question or rubric line. If the Alpha with Deletion value for an item is greater than the overall Alpha score, the exam is more reliable without that item.
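The comparison described above can be sketched as follows: recompute alpha with each item left out in turn and flag any item whose removal raises alpha above the overall value. The alpha formula is the standard one; the data and function names are hypothetical, and EAC's own output should be treated as authoritative.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Standard Cronbach's alpha; rows are students, columns are items."""
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have no variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def alpha_with_deletion(scores):
    """Alpha recomputed with each item removed, keyed by item index."""
    k = len(scores[0])
    return {
        i: cronbach_alpha([[v for j, v in enumerate(row) if j != i] for row in scores])
        for i in range(k)
    }


def items_to_review(scores):
    """Indexes of items whose deletion would raise alpha above the overall value."""
    overall = cronbach_alpha(scores)
    return [i for i, a in alpha_with_deletion(scores).items() if a > overall]


# Hypothetical scores: 4 students, 3 rubric lines (the third tracks the others poorly).
scores = [
    [1, 1, 2],
    [2, 2, 1],
    [3, 3, 3],
    [4, 4, 2],
]
print(f"Overall alpha: {cronbach_alpha(scores):.2f}")
for i, a in alpha_with_deletion(scores).items():
    print(f"Alpha without item {i + 1}: {a:.2f}")
print("Review items:", [i + 1 for i in items_to_review(scores)])
```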

EAC can show student performance for goals aligned to a rubric or test. The pass threshold defaults to 60% but can be adjusted up or down.
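To see how the pass threshold affects the result, the sketch below computes the share of students who meet a goal at the default 60% and at a stricter 70%. It assumes a student passes when the percentage earned on goal-aligned points is at or above the threshold; whether EAC treats a score exactly at the threshold as passing is not stated here. The scores and the function name are hypothetical.

```python
def goal_pass_rate(earned, possible, threshold=0.60):
    """Share of students whose goal-aligned score is at or above `threshold`.

    Assumption: a score exactly at the threshold counts as passing.
    """
    if possible <= 0 or not earned:
        raise ValueError("need a positive point total and at least one score")
    passed = sum(1 for e in earned if e / possible >= threshold)
    return passed / len(earned)


# Hypothetical points earned by 5 students on 20 goal-aligned points.
earned = [18, 12, 11, 20, 9]
for threshold in (0.60, 0.70):
    rate = goal_pass_rate(earned, 20, threshold)
    print(f"Threshold {threshold:.0%}: {rate:.0%} of students met the goal")
```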

Complete  for the data set that you have run. Evaluate each data point according to the parameters of the form. Also discuss what changes you have made (or plan to make) as a result of your analysis.