A key indicator of institutional effectiveness is the level of attention given to, and the quality of, student learning (Chun, 2002). A variety of metrics are commonly used to evaluate institutional effectiveness, such as student persistence and graduation rates, student satisfaction, and faculty credentials. However, student attainment of stated course and program outcomes remains the most pivotal measure of institutional effectiveness because it is inextricably linked to higher education’s principal mission of enabling students to be successful in their chosen field of study.
While efforts to assess and advance student learning have yielded a wide range of assessment strategies, relatively few institutions articulate a conceptual framework and operational guidelines for their assessment initiatives (Ewell, 1991). Consequently, assessment efforts are often hampered by imprecise, inefficient, or inconsistent practices. An effective assessment framework must address the processes surrounding assessment as well as the instruments used to assess student learning. The purpose of this five-part article series is to outline the model underlying performance assessments and to describe recommended guidelines for their development.
Outcomes-based education emphasizes successful demonstration of learning as the criterion for advancement and graduation (Spady, 1994). The two most integral aspects of outcomes-based education are the outcome statements that define what students should be able to demonstrate, and the assessment of the quality of student learning against those outcomes. Effective outcomes embody integrative, enduring knowledge and skills that are essential to a given profession or field of study. The second aspect, assessment, is a function of outcomes: reliance on outcomes as a condition for advancement necessitates reliable, valid assessment methods. While objective methods play an important role in outcomes-based education, they are generally less capable of directly evaluating integrative applications of knowledge, skill, and ability. As such, performance assessments play an important role in the battery of assessment methods used in higher education.
Performance assessment is an evaluative instrument in which students apply knowledge, skills, and abilities in a setting designed to emulate real-life contexts or conditions (National Council on Measurement in Education, 2013). The most common form of performance assessment involves the creation of an artifact for evaluation, such as a document or multimedia presentation. In other cases, however, students may be observed and evaluated while performing a specific activity.
Research suggests that properly designed performance assessments improve student motivation to learn (Hmelo-Silver, 2004; Norman & Schmidt, 1992), increase transfer of learning (Hmelo-Silver, 2004; Wiggins, 1998; Wiggins & McTighe, 1998), accommodate a wider range of learner and instructor characteristics (Darling-Hammond & Pecheone, 2009; Niemi, Baker, & Sylvester, 2007), and offer a more accurate picture of student abilities in ill-structured, highly dynamic situations where knowledge and skill must be applied in an integrative way (Newmann & Wehlage, 1995; Wiggins, 1998).
Newmann and Wehlage (1995) characterized performance assessments along three dimensions: construction of knowledge, use of disciplinary inquiry, and value beyond the educational setting. The first dimension refers to the process of building upon prior knowledge by organizing, synthesizing, interpreting, or evaluating information. Disciplinary inquiry refers to the use of a base of knowledge as a foundation for exploring, analyzing, interpreting, and communicating ideas and findings. The third dimension, value beyond the educational setting, speaks directly to the fidelity and authenticity of the task. The hallmark of effectively designed performance assessments is the degree to which they emulate real-world processes, activities, or performances (Kolb, Boyatzis, & Mainemelis, 2000; Newmann & Wehlage, 1995; Wiggins, 1998). Higher education has long valued the idea that knowledge and skill should be applied to real-world problems both as a product of and a means for facilitating learning (Dewey, 1938). Various iterations of this philosophy have manifested in education over the past century, the most recent of which appear in calls for more problem-based learning, experiential learning, cognitive apprenticeship, and authentic assessment (Hansman, 2001; Kolb, Boyatzis, & Mainemelis, 2000; Miettinen, 2010).
The term authentic assessment was first used by Wiggins (1989) to describe assessments with a high degree of fidelity to real-world activities and tasks. However, the term is often inappropriately used to refer to written or constructed-response assessment methods in general. This confusion is particularly problematic because, while many real-world activities entail the development and use of text-based communication media, relatively few rely upon the generation of a written paper in the academic sense. This confusion also has the effect of reducing the overall battery of assessment methods at one’s disposal when designing an assessment. The mechanism by which the assessment is delivered and submitted is of less importance than the characteristics of the activity itself (Wiggins, 1998). For example, a computer-scored objective assessment requiring students to accurately code various medical transcripts would represent a highly authentic assessment because it accurately embodies an activity that a medical transcriptionist would be expected to perform. Similarly, the use of a computer simulation for assessing a pilot’s ability to make effective decisions is a far more authentic and reliable form of assessment than a written-response assessment. Consequently, evaluations of the authenticity of an assessment should be made in reference to the alignment of the assessment delivery method with the stated outcomes and should not privilege any specific assessment instrument.
In 1990, George Miller proposed a model for assessing medical students and residents that closely resembles Dale’s (1969) cone of experience in its emphasis on high-fidelity experiences. The model is commonly depicted as a pyramid of four levels: knows, knows how, shows how, and does. Miller advocated focusing assessment on the top two levels, which he characterized as the domains of action or performance. He contended that demonstration of competence in these domains strongly implies that a student has acquired the prerequisite knowledge (knows) and the ability to apply that knowledge (knows how). Miller was not suggesting that lower-level knowledge and skills should not be assessed, but rather that inferences drawn about students’ overall competency are more valid when drawn from the performance domain. Some would argue that validity is the central challenge facing assessment professionals because the concept acknowledges that there is some degree of inaccuracy in any given assessment event (Messick, 1996). No measurement instrument can capture a construct with absolute precision or certainty because of the many confounding variables that affect performance on the assessment. The goal of assessment development is therefore to select and construct assessment instruments with higher levels of validity. In cases where the constructs under investigation represent performances, the validity of the assessment improves significantly when the task more closely approximates the activity as it is performed in the professional setting.
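The point about measurement imprecision can be illustrated with the standard classical test theory decomposition; this formulation is offered here only as a sketch of the underlying logic and is not drawn from the sources cited above.

```latex
% Classical test theory: an observed score X combines the examinee's
% true score T with an error component E arising from confounding variables.
X = T + E

% Reliability is the share of observed-score variance that reflects
% true-score variance; it bounds how much confidence any single
% assessment event can support.
\rho_{XX'} = \frac{\sigma_{T}^{2}}{\sigma_{T}^{2} + \sigma_{E}^{2}}
```

Framed this way, improving validity means designing tasks in which less of the observed variation stems from construct-irrelevant factors and more of it reflects the intended performance, which is precisely what closer approximation of the professional activity is meant to accomplish.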
- Chun, M. (2002). Looking where the light is better: A review of the literature on assessing higher education quality. Peer Review, 4(2/3), 16-25.
- Darling-Hammond, L., & Pecheone, R. L. (2009). Reframing accountability: Using performance assessments to focus learning on higher-order skills. In Meaningful measurement: The role of assessments in improving high school education in the twenty-first century. Washington, DC.
- Ewell, P. T. (1991). Assessment and public accountability: Back to the future. Change: The Magazine of Higher Learning, 23(6), 12-17.
- Hmelo-Silver, C. E. (2004). Problem-based learning: What and how do students learn? Educational Psychology Review, 16(3), 235-266.
- Kolb, D. A., Boyatzis, R. E., & Mainemelis, C. (2000). Experiential learning theory: Previous research and new directions. In R. J. Sternberg & L. F. Zhang (Eds.), Perspectives on cognitive, learning, and thinking styles (pp. 2-40). Mahwah, NJ: Lawrence Erlbaum.
- Miettinen, R. (2010). The concept of experiential learning and John Dewey’s theory of reflective thought and action. International Journal of Lifelong Education, 19(1), 54-72.
- Miller, G. E. (1990). The assessment of clinical skills/competence/performance. Academic Medicine, 65(9), S63-S67.
- Messick, S. (1996). Validity of performance assessments. In G. Phillips (Ed.), Technical issues in large-scale performance assessment (pp. 1–18). Washington, DC: National Center for Education Statistics.
- National Council on Measurement in Education (2013, July). Glossary of important assessment and measurement terms. Retrieved from http://ncme.org/resource-center/glossary/
- Newmann, F., & Wehlage, G. (1995). Successful school restructuring: A report to the public and educators by the Center on Organization and Restructuring of Schools. Madison, WI: Wisconsin Center for Education Research.
- Spady, W. G. (1994). Outcome-based education: Critical issues and answers. Arlington, VA: American Association of School Administrators.
- Wiggins, G. (1989). A true test: Toward more authentic and equitable assessment. Phi Delta Kappan, 70(9), 703-711.
- Wiggins, G. (1998). Educative assessment: Designing assessments to inform and improve student performance. San Francisco, CA: Jossey-Bass.
- Wiggins, G., & McTighe, J. (1998). Understanding by design. Alexandria, VA: Association for Supervision and Curriculum Development.