Assessment, Measurement and Evaluation
(Undergraduate Version)

Source: Huitt, W. (1996). Assessment, measurement and evaluation: Undergraduate version. Educational Psychology Interactive. Valdosta, GA: Valdosta State University. Retrieved [date], from http://www.edpsycinteractive.org/edpsyc/edpmsevl.html




Overview

There are a variety of ways of knowing whether something is true. Science is one such way: a set of rules and methods has been established by which truth is verified. The process of science generally follows a paradigm that defines the rules and describes the procedures, instrumentation, and methods for interpreting data. The results of science are organized into a hierarchy of increasing complexity of knowledge: facts, concepts, principles, theories, and laws. When engaged in the process of science, scientists formulate hypotheses, or educated guesses, about the relationships between or among different facets of knowledge.

Assessment, measurement, evaluation, and research are all part of the processes of science, and issues related to each topic often overlap. Assessment refers to the collection of data to better understand an issue; measurement is the process of quantifying assessment data; evaluation refers to the comparison of those data to a standard for the purpose of judging worth or quality; and research refers to the use of those data for describing, predicting, and controlling as a means toward better understanding the phenomena under consideration. Measurement is done with respect to "variables" (phenomena that can take on more than one value or level). For example, the variable "gender" has the values or levels of male and female, and data could be collected relative to this variable. Data on variables are normally collected by one or more of four methods: paper/pencil, systematic observation, participant observation, and clinical.
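The distinction between assessment (collecting data) and measurement (quantifying those data) can be made concrete with a small sketch. The observations below are hypothetical, invented only to illustrate tallying the levels of a categorical variable such as "gender":

```python
from collections import Counter

# Assessment: hypothetical observations of the variable "gender"
# recorded during classroom data collection
observations = ["female", "male", "female", "female", "male"]

# Measurement: quantify the assessment data as frequency counts per level
counts = Counter(observations)
print(counts["female"], counts["male"])  # 3 2
```

The same pattern applies to any variable with discrete levels; continuous variables (e.g., test scores) are instead quantified directly as numbers.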

Collecting data (assessment), quantifying those data (measurement), and developing understanding of the data (research) always raise issues of reliability and validity. Reliability concerns the consistency of the information (data) collected, while validity concerns its accuracy or truth. The relationship between reliability and validity can be confusing because measurements (e.g., scores on tests, recorded statements about classroom behavior) can be reliable (consistent) without being valid (accurate or true). The reverse, however, is not true: measurements cannot be valid without being reliable. The same applies to findings from research studies: findings may be reliable (consistent across studies) without being valid (accurate statements about relationships among variables), but they cannot be valid if they are not reliable. At a minimum, for a measurement to be reliable it must produce a consistent set of data each time it is used; for a research study to be reliable it should produce consistent results each time it is performed.

When we discussed the Transactional Model of the Teaching/Learning Process, we also discussed the importance of focusing instruction on the outcomes you intend to measure. Dr. Hummel and I refer to this concept as "What You Measure Is What You Get" (WYMIWYG). In American education this generally means achievement in basic skills such as reading and mathematics. Although other outcomes are also important, given the changing requirements of the information age, basic skills achievement remains central, and this discussion of measurement and evaluation issues will deal primarily with that topic.

Classroom Assessment

Four issues are important for classroom assessment (data collection regarding student learning that is under the control of the teacher). The first relates to what data teachers will use for making judgments (qualitative or quantitative); the second, to when they will collect data (formative versus summative assessment); the third, to the reference used for making evaluations (criterion- versus norm-referenced); and the fourth, to how teachers will communicate their judgments to others (authentic assessment, portfolios, and grading).

Standardized Testing

One of the most important issues related to standardized testing, or evaluation where someone other than the classroom teacher is responsible for developing the test, is the difference between criterion- and norm-referenced testing. In general, criterion-referenced testing is done when we want to know how much a student knows vis-a-vis a standard, and norm-referenced testing is done when we want to know how one student or group of students compares to other students in terms of the content being tested.
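The two reference frames can be contrasted in a minimal sketch. The class scores and the 70-point cutoff below are hypothetical, chosen only to show that the same raw score yields two different interpretations:

```python
# Hypothetical class scores on a 100-point test
scores = [55, 62, 68, 70, 74, 77, 80, 83, 88, 95]
student_score = 77

# Criterion-referenced: compare the score to a fixed standard (cutoff)
cutoff = 70
meets_standard = student_score >= cutoff

# Norm-referenced: compare the score to the other test takers
# (percentage of scores falling below this student's score)
percentile = 100 * sum(s < student_score for s in scores) / len(scores)

print(meets_standard, percentile)  # True 50.0
```

Here the student clears the criterion but sits at only the 50th percentile of the group, which is why the two kinds of tests answer different evaluative questions.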



