Liberal Education

Facilitating Innovation in Science Education through Assessment Reform

About the PKAL Series

Intended to challenge the higher education community to think strategically about how best to advance the learning and success of all students in science, technology, engineering, and mathematics (STEM), this series of articles presents a broad array of perspectives on cutting-edge issues affecting contemporary undergraduate education in the STEM fields.

The full series is available online at

For over two decades, two pervasive themes have informed the discourse on undergraduate science education in the United States. The first emphasizes the role of the nation’s science and technology enterprise in meeting critical economic and societal challenges in the twenty-first century. Science is better positioned than ever before to address important societal issues such as food security, environmental health, and sustainable energy (National Research Council 2009). However, realizing this potential to address pressing societal problems requires attracting and retaining new generations of creative and versatile scientists who are well prepared to participate in fast-paced, information-rich, collaborative forms of science that are increasingly pursued on the cusps between disciplines. The foundation of the “sci-tech” enterprise is its well-trained workforce, which is sustained by tapping the broad and diverse talent pool of students who are interested in science (National Research Council 2011). Addressing twenty-first-century challenges also requires a citizenry that is equipped to understand the science that informs controversial issues—such as climate change and alternative energy development—that directly affect their lives and communities.

The role of science education in this regard is clear, yet seemingly contradicted by the second pervasive theme: undergraduate science education in the United States is not as effective as it needs to be in translating student interest in science into optimal preparation either to enter the science workforce or to participate as literate citizens in an increasingly global society. Given the potential for science to address important problems, undergraduate programs ought to be functioning as busy portals for engaging students’ innate fascination and developing their understanding of the nature and practice of science. Instead, recent studies suggest, the opposite is true: over half of the students who enter college with an interest in science do not persist in their training beyond the first year or two of introductory coursework (National Research Council 2011). Further, while underrepresented minority (URM) students aspire to major in science at rates similar to those of white and Asian students, their completion rates are even lower than those of their non-URM counterparts. Students who transfer out of science programs report reasons ranging from lack of preparation to perceptions that science courses are unengaging, impersonal, or irrelevant to their interests (Aronson 2002; Felder, Felder, and Dietz 1998; Sevo 2009). Whatever the reason, the implications are clear: negative experiences in undergraduate science courses may have the effect of either eroding student motivation or turning students away from science altogether.

In recent years, multiple efforts within the academic, federal, and private sectors have focused attention on the state of undergraduate science education with respect to success in developing science literacy and preparing students to pursue advanced careers in science fields. The need to engage the “minds and talents” of all Americans in order to improve science literacy and to support scientific research and innovation in the twenty-first century is well documented (AAAS 2011). Most recently, the President’s Council of Advisors on Science and Technology (PCAST) concluded that, in order to meet US science workforce needs, one million additional STEM-capable graduates will be needed in the next decade (PCAST 2012).

The question remains how to improve science education in order to increase student persistence and success. An emerging convergence of purpose and strategy among concerned scientists and educators is evident in several published reports. A decade ago, the landmark Bio2010 report called for more active learning approaches, greater emphasis on quantitative skills, and improved connections among biology, chemistry, and physics (National Research Council 2003). The 2011 Vision and Change report called on higher education institutions and scientific communities to support reforms that lead to the broad adoption of student-centered learning approaches, and that organize biology education around core concepts, competencies, and skills (AAAS 2011). In 2009, a joint committee of the Association of American Medical Colleges and the Howard Hughes Medical Institute was convened to define specific scientific competencies for medical education. The committee recommended that undergraduate science curricula support the development of scientific competencies for premedical students (AAMC-HHMI 2009). The recent PCAST report (2012) proposes broad changes to undergraduate science education, including specific improvements in science instruction and the integration of research experiences for all science students. The recommendations made by these reports converge on three key points:

  • Align what is taught with the leading edges of disciplinary knowledge and research.
  • Develop learning objectives focused on conceptual understanding instead of content, and provide context for learners to develop skills and habits of mind that are required for disciplinary practice.
  • Practice pedagogies and empirically tested instructional strategies that are based on research on how learners learn.

Improving teaching and learning in undergraduate science classrooms

So, what about undergraduate science instruction needs to change? The central critique is that lecture-based instruction, which is by far the most common mode of science instruction in our colleges and universities, does not effectively engage student interest or help students develop the conceptual understanding and skills they need. Current standards that inform instructional practice privilege the retention of content over the development of skills. Standard teaching practices in undergraduate classrooms emphasize didactic lectures delivered by faculty experts, often in large lecture halls that present challenges to implementing interactive engagement. The instructor’s role is to present large amounts of information in a way that is efficient for the presenter, but not necessarily effective for the learner. Since conceptual organization is performed by the instructor, students may not get the chance to practice this skill themselves, and they may not have many opportunities to develop additional skills beyond memorization.

This “teacher-centered” approach to science instruction emphasizes tasks in which learner success is based primarily on the instructor’s ability to organize and present information in ways that enable students to learn it. Students demonstrate their ability to access the content they have retained and to apply it in context-appropriate ways. Typical learning assessments rely on summative examinations that are heavy on the testing of content knowledge and light on the evaluation of analytical skills and higher-order activities, such as evaluation or synthesis. Effective formative assessments that could provide ongoing feedback about student progress are used infrequently or not at all. Yet, if students are to develop the skills and habits of mind appropriate to science, they must have opportunities to practice them, and they need to get feedback on their progress. How might standards for undergraduate science instruction be changed in order to help students develop the concepts and competencies needed for twenty-first-century science practice?

In contrast to the traditional teacher-centered classroom, a “learner-centered” context challenges students to organize knowledge and develop skills through forms of active engagement. A variety of active learning strategies have been shown to be effective in engaging students in both knowledge construction and skill development (AAAS 2011). Problem-based learning and case studies challenge students to apply their knowledge in order to solve problems or analyze data in the context of authentic scientific, environmental, or societal situations. Active learning approaches are optimally supported by the studio-style classroom. However, various strategies have been developed to incorporate active forms of learning into the traditional lecture-based classroom as well. Instructional and information technologies can be used to support “learning before lecture” approaches, which free up class time for students to engage in interactive exercises with instructor feedback (Moravec et al. 2010). Learner-centered teaching methods are now more commonly practiced by undergraduate science instructors, thanks, in part, to faculty development opportunities, such as those provided by the National Academies of Science Summer Institute, the American Society for Microbiology’s Biology Scholars Program, the National Science Foundation–sponsored Faculty Institutes for Reforming Science Education, and the American Association of Physics Teachers’ New Faculty Workshop. Even so, these research-validated approaches do not yet inform widespread institutional approaches to undergraduate science instruction. Thus, attention is now shifting to focus on the identification of barriers to the broad adoption of research-validated teaching approaches in undergraduate science education. Here, we focus on the assessment of student learning as a key lever for overcoming the stagnation in science learning and student success.

About Project Kaleidoscope

Since its founding in 1989, Project Kaleidoscope (PKAL) has been a leading advocate for building and sustaining strong undergraduate programs in the fields of science, technology, engineering, and mathematics (STEM). With an extensive network of over seven thousand faculty members and administrators at over one thousand colleges, universities, and organizations, PKAL has developed far-reaching influence in shaping undergraduate STEM learning environments that attract and retain undergraduate students. PKAL accomplishes its work by engaging campus faculty and leaders in funded projects, national and regional meetings, community-building activities, leadership development programs, and publications that are focused on advancing what works in STEM education.

In 2008, the Association of American Colleges and Universities (AAC&U) and PKAL announced a partnership to align and advance the work of both organizations in fostering meaningful twenty-first-century liberal education experiences for all undergraduate students, across all disciplines. This new partnership represents a natural progression, as nearly 75 percent of campuses with PKAL community members are also AAC&U member institutions. Together, AAC&U and PKAL apply their collective expertise in undergraduate learning, assessment, leadership, and institutional change to accelerate the pace and reach of STEM transformation.

For more information, visit

A brief history of assessment in science education

Over the last fifty years, the prevalent approach to assessment in science education has had a very particular form. The emphasis has been on tests designed to elicit brief, exact answers that focus on factual information in a format that requires the student to choose an answer from a set of options within a specified time frame (Klassen 2006). These tests are usually situated at the end of an educational time period (such as a semester) as final evaluations of knowledge acquisition. This approach to assessment reflects a definition of knowledge as composed of small, static, and explicit pieces of information that can be acquired through memorization as discrete, decontextualized items organized in series and accumulated progressively. Learning is defined as the ability of the student to recall specific items of information at a given time. Historically, this understanding of learning is associated with behaviorist models that posit that knowledge should be divided into small pieces to be mastered individually and organized into a logical order of acquisition (Resnick and Resnick 1992).

Critiques of these concepts of knowledge and learning assessment have paralleled and informed the discourse on science education reform over the past two decades. The critiques focus on the limitations of traditional approaches in measuring conceptual understanding, higher-order thinking, the ability to perform novel tasks and solve problems, the ability to reason, the applicability of acquired knowledge, and the integration of information (Gardner 1992; O’Neill 1992; Reeves and Okey 1996; Resnick and Resnick 1992). The actual form of the test, with its emphasis on what are termed selected-response questions (e.g., multiple-choice questions) or short written answers to specific questions, dictates the types of knowledge that construct the test. But learning in science is not best understood as a decontextualized activity focused on the memorization of factual information organized in a linear progression; rather, like the generation of scientific knowledge itself, it is a creative, constructive process that builds upon the student’s prior understanding and evolves beyond what is explicitly taught (Glasersfeld 1990). This cognitive and constructivist approach characterizes the progression of learning in terms of an active, dynamic interaction between the student and the knowledge that is being taught. Students are not seen as passive recipients of established knowledge, but rather as active agents in the construction of their own knowledge.

With this shift in the characterization of learning, new approaches to assessment have emerged that aim to find ways of collecting data on the higher-order-thinking aspects of science activity. Authentic approaches to assessment allow science educators to capture the complexities of thinking and action that are inherent to science. Within these contexts, learning is defined as the ability to think and act like a science professional as demonstrated through actual science activity. These approaches encompass diverse knowledge types, such as those defined by Bloom et al. (1956) as a hierarchical taxonomy of levels of learning objectives (knowledge, comprehension, application, analysis, synthesis, and evaluation), by Miller (1990) as a pyramid of competence (knows, knows how, shows how, and does), and by Hanauer, Hatfull, and Jacobs-Sera (2009) as types of scientific knowledge (physical knowledge, representational knowledge, cognitive knowledge, and presentational knowledge).

A much broader range of potential learning objectives for science education can be addressed once the form of assessment is changed. For instance, approaches such as performance assessment (the collection of data on the student’s ability to conduct various specified scientific tasks under test conditions) and portfolio assessment (the collection of a series of student products and documents from within the educational context over a period of time) take a far more contextualized and naturalistic approach to the collection of data on student learning in science. These approaches provide evidence on how students actually conduct, apply, reason about, and understand different aspects of scientific activity (Berenson and Carter 1995; Ruiz-Primo and Shavelson 1996). The basic principle of these forms of assessment is that they are closely related to the actual activities of science and try to model authentic, real-life aspects of being a scientist. This is a very different basis from which to consider evidence of learning outcomes from that produced by multiple-choice tests of factual knowledge recall.

Educational assessment attempts to gather information concerning student learning in terms of the acquisition of knowledge within a defined educational setting. At its core is a decision concerning the choice of assessment tools and the ability of those tools to collect data that are relevant to the types of knowledge and learning that the educational program aims to develop. Thus, the choice of assessment tools is not an educationally neutral choice to be based on historical or pragmatic reasons, but rather it must be informed by an inherent understanding of the nature of knowledge and learning. It is this aspect of assessment that is ultimately crucial for deciding how assessment should be conducted in science education.

Reforming learning assessment in undergraduate science education

When learning objectives are refocused on the development of skills and conceptual understanding, a different instructional context supports student learning. In contrast to the traditional lecture in which the student’s role is limited to knowledge acquisition through listening, students are actively engaged in using information and practicing defined skills in a context that provides feedback to guide their progress. The role of the instructor shifts from lecturer to facilitator of student participation in activities that enable them to make progress toward defined learning objectives. Just as faculty must develop new instructional skills to support active learning in the science classroom, so too must they develop new capacity when it comes to the active assessment of student learning. In a “scientific teaching” approach, instructional practice both draws on evidence-based teaching methods and is informed by ongoing assessment of student progress. The instructor’s role in active learning assessment becomes central.

The direction of innovation and reform in science education is moving toward a student-centered learning paradigm that is organized around core concepts and competencies, and engaged through empirically supported instructional practices. What forms of assessment would support these educational aims? Several characteristics of a compatible assessment program may be considered:

  • Constructively Aligned Assessment. A very basic requirement of any form of assessment is a close relationship between the choice of assessment tools and the desired learning outcomes of the educational program. The assessment program needs to yield systematic data that provide evidence that can be used to assess whether specific learning outcomes have been achieved.
  • Summative and Formative Assessment. Any educational program geared toward student-centered learning must have both summative and formative assessment components so as to allow statements concerning final outcomes and an informed feedback loop for the student and instructor to emerge. While summative assessment necessarily defines the final status of student learning, ongoing formative assessment is also crucial, though often underutilized in science classrooms. Formative assessment provides feedback that enables individual students to improve their learning outcomes, as well as information that enables teachers to modify their instruction to affect student learning.
  • Assessment of a Range of Knowledge Types. An educational program designed to develop understanding of core concepts and competencies in science will need to be able to assess a range of different knowledge types. By definition, competencies integrate knowledge of different types through their emphasis on the application of scientific knowledge within problem-solving settings. Accordingly, the assessment program needs to enable the collection of evidence on the different types of knowledge that are being developed through the educational program and on the integration of these knowledge types.
  • A Range of Assessment Tools. In order to assess systematically the different types of knowledge involved in the teaching of core concepts and competencies, a range of assessment tools will be needed. Rather than thinking of assessment based on a single tool (such as a test), assessment should be thought of as a collection of different tools, each serving a specific purpose in the collection of evidence concerning student learning.
  • Ability to Address Higher-Order Thinking. A central aspect of the revision of many science education programs is the aim to teach and facilitate those higher-order thinking skills that are crucial to the working life of the science professional. Accordingly, the most appropriate forms of assessment are those that capture evidence relating to the student’s ability to use higher-order thinking skills.
  • Ability to Address Real-World Contexts. Science education ultimately aims to produce students who not only know science, but more importantly students who will go on to use this knowledge within real-world, professional settings. Accordingly, meaningful and authentic assessment should be able to make statements about the ability of a student to function in the real world while conducting professional tasks.

Forms of assessment determine the types of inferences that can be made concerning student learning and, more importantly, inform the design of the educational program itself. Both students and faculty are familiar with the notion that assessment parameters define the boundaries of what is to be learned. “Will this be on the test?” is a common refrain in undergraduate courses, and it reflects a succinct (and aggravating) student understanding of how knowledge is valued in a course. Furthermore, washback effects—the reverse engineering of an educational program to match the boundaries of a standardized, high-stakes test—mean that educational programs ultimately tend to transform themselves in order to model the type of learning and knowledge inherent in the test itself (Alderson and Wall 1993; Hamp-Lyons 1997). Thus, curricular reform is dependent on transitions in relation to assessment as well as instructional practice.

Assessment development as a lever for reform

There is broad consensus among stakeholders that undergraduate science education needs to be reformed in order to address student persistence and to help students develop the skills needed for twenty-first-century science practice. Improved instructional practices should enhance student-centered teaching, which is focused on the development of higher-order thinking skills through scientific inquiry, utilizes scientifically informed educational methods, and is organized around core concepts and competencies. Reforming undergraduate science education in this way involves the alignment of effective instructional practices with assessment methods that are appropriate to the objectives for student learning.

Several experiments in active assessment design are currently underway. For example, through the National Experiment in Undergraduate Science Education (NEXUS), a project of the Howard Hughes Medical Institute, teams of university faculty are developing instructional approaches for interdisciplinary, competency-based curricula for premedical students. Through a collaborative approach, NEXUS teams are building capacity to develop learning assessments that are aligned with the goals of competency-based instruction. NEXUS faculty are also developing assessment programs that move beyond testing students’ knowledge of facts to formal assessments of their ability to analyze data, integrate information in interdisciplinary contexts, and design experimental approaches. The Pittsburgh Phage Hunters Integrating Research and Education (PHIRE) program is an innovative model in which students learn the process of scientific inquiry through participation in research in order to discover and characterize novel mycobacteriophages (Hanauer et al. 2006). Active assessment strategies are employed to evaluate the impact of this authentic science environment on student learning. Both NEXUS and PHIRE have integrated faculty development in assessment in order to support curricular reform.

Effective science teaching at the undergraduate level requires that faculty learn and adopt effective instructional practices as well as methods for the active assessment of student learning. Yet, formal training in learner-centered instruction and assessment is seldom part of the preparation science faculty receive prior to entering the undergraduate classroom. Faculty development in learning assessment can enhance the effectiveness of efforts to reform science education, and it is a crucial component of institutional capacity for science education reform.


References

AAAS (American Association for the Advancement of Science). 2011. Vision and Change: A Call to Action. Washington, DC: American Association for the Advancement of Science.

AAMC-HHMI (Association of American Medical Colleges-Howard Hughes Medical Institute). 2009. Scientific Foundations for Future Physicians. Washington, DC: Association of American Medical Colleges.

Alderson, J. C., and D. Wall. 1993. “Does Washback Exist?” Applied Linguistics 14 (2): 115–29.

Aronson, J. M., ed. 2002. Improving Academic Achievement: Impact of Psychological Factors in Education. San Diego, CA: Academic Press.

Berenson, S. B., and G. S. Carter. 1995. “Changing Assessment Practices in Science and Mathematics.” School Science and Mathematics 95 (4): 182–6.

Bloom, B. S., M. D. Engelhart, E. J. Furst, W. H. Hill, and D. R. Krathwohl. 1956. Taxonomy of Educational Objectives: The Classification of Educational Goals; Handbook I: Cognitive Domain. New York: Longmans, Green.

Felder, R. M., G. N. Felder, and E. J. Dietz. 1998. “A Longitudinal Study of Engineering Student Performance and Retention. V. Comparisons with Traditionally-Taught Students.” Journal of Engineering Education 87 (4): 469–80.

Gardner, H. 1992. “The Rhetoric of School Reform: Complex Theories vs. the Quick Fix.” The Chronicle of Higher Education 38 (35): B1–B2.

Glasersfeld, E. von. 1990. “Environment and Communication.” In Transforming Children’s Mathematics Education: International Perspectives, edited by L. P. Steffe and T. Woods, 30–38. Hillsdale, NJ: Erlbaum.

Hamp-Lyons, L. 1997. “Washback, Impact, and Validity: Ethical Concerns.” Language Testing 14 (3): 295–303.

Hanauer, D. I., G. F. Hatfull, and D. Jacobs-Sera. 2009. Active Assessment: Assessing Scientific Inquiry. New York: Springer.

Hanauer, D. I., D. Jacobs-Sera, M. L. Pedulla, S. G. Cresawn, R. W. Hendrix, and G. F. Hatfull. 2006. “Teaching Scientific Inquiry.” Science 314: 1880–1.

Klassen, S. 2006. “Contextual Assessment in Science Education: Background, Issues and Policy.” Science Education 90 (5): 820–51.

Miller, G. E. 1990. “The Assessment of Clinical Skill/Competence/Performance.” Academic Medicine 65 (9): 63–67.

Moravec, M., A. Williams, N. Aguilar-Roca, and D. K. O’Dowd. 2010. “Learn before Lecture: A Strategy that Improves Learning Outcomes in a Large Introductory Biology Class.” CBE-LSE 9 (4): 473–81.

National Research Council. 2003. BIO2010: Transforming Undergraduate Education for Future Research Biologists. Washington, DC: The National Academies Press.

———. 2009. A New Biology for the 21st Century. Washington, DC: The National Academies Press.

———. 2011. Expanding Underrepresented Minority Participation: America’s Science and Technology Talent at the Crossroads. Washington, DC: The National Academies Press.

O’Neill, J. 1992. “Putting Performance to the Test.” Educational Leadership 49 (8): 14–19.

PCAST (President’s Council of Advisors on Science and Technology). 2012. Engage to Excel: Producing One Million Additional College Graduates with Degrees in Science, Technology, Engineering, and Mathematics. Washington, DC: Executive Office of the President, files/microsites/ostp/pcast-engage-to-excel-final_2-25-12.pdf.

Reeves, T., and J. Okey. 1996. “Alternative Assessment for Constructivist Learning Environments.” In Constructivist Learning Environments: Case Studies in Instructional Design, edited by B. Wilson, 191–202. Englewood Cliffs, NJ: Educational Technology Publications.

Resnick, L. B., and D. P. Resnick. 1992. “Assessing the Thinking Curriculum: New Tools for Educational Reform.” In Changing Assessments: Alternative Views of Aptitude, Achievement and Instruction, edited by B. R. Gifford and M. C. O’Connor, 37–75. Boston: Kluwer.

Ruiz-Primo, M. A., and R. J. Shavelson. 1996. “Rhetoric and Reality in Science Performance Assessment.” Journal of Research in Science Teaching 33 (6): 569–600.

Sevo, R. 2009. “Literature Overview: The Talent Crisis in Science and Engineering.” Assessing Women and Men in Engineering.

David I. Hanauer is professor of English at Indiana University of Pennsylvania and assessment coordinator for the Phage Hunting Integrating Research and Education Program at the University of Pittsburgh. Cynthia Bauerle is senior program officer in precollege and undergraduate science education at the Howard Hughes Medical Institute and a member of PKAL’s Faculty for the 21st Century (’98).

To respond to this article, e-mail, with the authors' names on the subject line.
