Educational Psychology Interactive: Assessment, Measurement & Evaluation

ASSESSMENT, MEASUREMENT, EVALUATION & RESEARCH
Evaluation

Citation: Huitt, W. (2004, August). Assessment, measurement, evaluation, & research: Evaluation. Educational Psychology Interactive. Valdosta, GA: Valdosta State University. Retrieved [date] from http://www.edpsycinteractive.org/topics/measeval/evaluation.html

Evaluation includes the process of making judgments about the value of data collected through observations and descriptions. It is closely related to the concept of assessment, which is defined as "the process of collecting, interpreting, and synthesizing information in order to make decisions" (Gage & Berliner, 1992, p. 568). However, evaluation involves the comparison of data to a standard in order to make judgments about worth, value, goodness, etc. It is generally agreed that it is better to base judgments and decisions on quantitative data as much as possible.

Hummel and Huitt (1994) use the term WYMIWYG (What You Measure Is What You Get) as a way to discuss the importance of evaluation. That is, whatever it is that you intend to measure becomes the most important aspect of your actions.

In general, two issues, in addition to reliability and validity, are important in the area of evaluation. The first is the timing of evaluations; the second is the standard used to make judgments. Evaluation, relative to instruction, can be done before, during, or after instruction. If it is done before or during instruction (that is, the assessment is done in order to make decisions about instruction), it is referred to as formative. A diagnostic evaluation is a formative evaluation done before instruction begins for the purpose of determining a student's prerequisite skills or readiness for instruction. Formative evaluations can be either informal or formal and could include asking questions during lectures, giving pop quizzes, having students write in journals, or any other activity designed to elicit information for the purpose of guiding instruction.

Summative evaluation, on the other hand, is meant to provide data relative to past learning and instruction with no intention of reteaching that material. End-of-unit exams, projects, and other forms of evaluation that serve as summations of learning are examples of this type of evaluation.

There are three types of validity associated with test scores.

Content validity: The degree to which an achievement test's content contains a representative and appropriate sample of the content (subject matter) contained in the instructional objectives whose attainment the test is intended to measure.
Criterion (predictive) validity: The degree to which the score on a test predicts the individual's score or performance in some other area. Example: If scores on a test of scholastic achievement correlate with later job success, the test can be used to predict job success.
Construct validity: The degree to which a test measures the construct, or psychological concept or variable, at which it is aimed (e.g., intelligence, anxiety). It is inferred from all of the logical arguments and empirical evidence available.
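
One common way to quantify criterion (predictive) validity is to correlate test scores with a later measure of the criterion. The sketch below is only an illustration under stated assumptions: the function name pearson_r and all of the scores and ratings are hypothetical and do not come from any study cited here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cross / (dev_x * dev_y)


# Hypothetical data: achievement test scores and later job-performance ratings
test_scores = [52, 61, 70, 75, 88]
job_ratings = [2.1, 2.8, 3.0, 3.9, 4.5]

validity_coefficient = pearson_r(test_scores, job_ratings)
print(f"Predictive validity coefficient: {validity_coefficient:.2f}")
```

A coefficient near 1 would suggest that the test predicts the criterion well for this sample; a coefficient near 0 would suggest that it does not. Real validity studies rely on much larger samples than this toy example.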

Classroom Assessment

Three issues are important for classroom assessment, that is, data collection with regard to student learning that is under the control of the teacher. The first relates to what data we will use for making judgments (assessment and measurement). The second revolves around the reference to be used for making evaluations, and the third relates to how we will communicate our judgments to others (authentic assessment and grading).

Standardized Testing

One of the most important issues related to standardized testing, or evaluation where someone other than the classroom teacher is responsible for developing the test, is the difference between criterion- and norm-referenced testing. In general, criterion-referenced testing is done when we want to know how much a student knows relative to a standard, and norm-referenced testing is done when we want to know how one student or group of students compares to other students in terms of the content being tested.
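
The difference between the two references can be illustrated with a small sketch that interprets the same score in both ways. The cut score, the norm-group scores, and the function names are hypothetical, and the percentile rank used here (percent of the norm group scoring strictly below the student) is only one of several definitions in use.

```python
def meets_criterion(score, cut_score):
    """Criterion-referenced: compare the score to a fixed standard."""
    return score >= cut_score


def percentile_rank(score, norm_group):
    """Norm-referenced: percent of the norm group scoring below this score."""
    if not norm_group:
        raise ValueError("norm group must not be empty")
    below = sum(1 for s in norm_group if s < score)
    return 100.0 * below / len(norm_group)


# Hypothetical norm group and student score
norm_group = [55, 60, 62, 68, 70, 74, 79, 81, 85, 90]
student_score = 74

# Criterion-referenced judgment against a hypothetical cut score of 80
meets_standard = meets_criterion(student_score, cut_score=80)

# Norm-referenced judgment against the hypothetical norm group
rank = percentile_rank(student_score, norm_group)

print(f"Meets standard: {meets_standard}")  # prints "Meets standard: False"
print(f"Percentile rank: {rank}")           # prints "Percentile rank: 50.0"
```

The same score of 74 falls short of the fixed standard yet sits in the middle of the comparison group, which is why the choice of reference matters when judgments are made.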

References

Gage, N. L., & Berliner, D. C. (1992). Educational psychology (5th ed.). Boston: Houghton Mifflin.
Hummel, J., & Huitt, W. (1994, February). What you measure is what you get. GaASCD Newsletter: The Reporter, 10-11.

All materials on this website [http://www.edpsycinteractive.org] are, unless otherwise stated, the property of William G. Huitt. Copyright and other intellectual property laws protect these materials. Reproduction or retransmission of the materials, in whole or in part, in any manner, without the prior written consent of the copyright holder, is a violation of copyright law.