Curated Collection: Assessment

The COVID-19 pandemic has shaken loose the cobwebs in all corners of the discussion about assessment in tertiary mathematics education. Faculty are being creative about assessments in new media, are being forced to acknowledge long-standing inequities and harm in the status quo, and are being given or are demanding the freedom to make changes that may have seemed impossible a year ago.

While the pandemic has certainly been horrific, I hope that one silver lining of 2020 will be a permanent shift in our assessment practices as a community as well as a sustained openness to critiquing and revising our practices for many years to come. In service of this vision, I offer this first PRIMUS Curated Collection, a new editorial format that allows me to tie together existing PRIMUS papers with current themes of interest, much like an archival Special Issue of the journal. The goal of this piece is to help you find useful resources as you reflect on your assessment practices, and to push you to continue being daring in your future efforts to design more productive and just learning experiences.

I built this collection by rereading all titles and many abstracts of papers in PRIMUS and searching the journal website for keywords. I took an open stance toward including papers in this first pass and then narrowed it a bit to give this editorial more focus. Amusingly, the most promising title in the first pass, “Portfolio Analysis”, is not even about assessment; it’s about assessing stock portfolios. Not every paper that could be mentioned will be listed explicitly, but the list at the bottom of this post is close to comprehensive.

Several major themes related to rethinking assessment emerged from my searching. Authors explored ways to change the stakes of assessments, often with outcomes-based grading and testing structures. They explored changing the structure of assessments by considering media beyond written exams and by configuring them as group activities. They explored ways to change the focus of their assessments by asking students to synthesize and engage messy applications/projects and by making metacognition part or all of the assessment activities. And they explored issues around assessment as part of a larger system by considering approaches to grading in general, alignment with other course activities, workload, and perceptions and expectations about assessment in mathematics.

Stakes

PRIMUS has published a significant number of papers about building courses without one-time, high-stakes testing. Many of these papers use terms such as “mastery grading” or “mastery testing”, “specifications grading”, and “outcomes-based assessment”. Educators have expressed some concern with the term “mastery” because the word has unwanted implications and connections, so I will use outcomes-based assessment (OBA) as the umbrella term, though the existing terms do not all mean exactly the same thing. A common axiom of outcomes-based assessment is that grades should represent only demonstrated learning. In context, this statement is speaking back to at least two other options. One concern that brings faculty to OBA is the experience of students being passed through courses in which they never really understood anything because of an accumulation of partial credit; instead, these faculty want course grades to represent only the understanding and skill that students can employ accurately and completely. Another concern that brings faculty to OBA is the observation that course structures that accumulate “points” in a weighted average over a term reinforce privilege, both the privilege of having the resources to “participate” consistently and the privilege of entering the course with more aligned preparation, which leads students to excel earlier in the term. In particular, these faculty want grades to represent students’ learning by the end of the term rather than differentiating students based on how or when that learning happened.

The outcomes-based assessment papers in PRIMUS offer excellent examples of assessment and course designs to meet these concerns. I suggest you start by reading the editorial for the recent PRIMUS Special Issue on Mastery Based Grading, which lays out the significant terms and differences in OBA in more detail than I can here and which organizes and describes the papers in the Special Issue; those papers are also included in the lists below. This Special Issue is a large collection of OBA papers, but there are papers outside this collection, including papers about developing professional mindsets, benefits of grading without points, and more examples of OBA in the context of intro-to-proofs, abstract algebra, and across the curriculum. Many of these papers show impressive impacts on student achievement, including in ways that level the playing field for students, as the motivation for OBA hoped; examples include differences in re-assessment practices related to gender and stereotype threat and impacts on anxiety.

While I value these kinds of results, I struggle with OBA in two ways. First, while I agree that these kinds of shifts engage issues of privilege in our classrooms, they seem to shift them into new mechanisms. For example, while it can open doors for students who might otherwise have had no productive pathways forward after a rocky start, this learning can take a lot of time, so students who have that time during the semester, especially residential students who are not working full-time, and students who enter the course already able to demonstrate understanding and skill with the course objectives are still at a significant advantage. Second, in the models of OBA that I know, course goals tend to need to be discretized into the kinds of outcomes that are measurable in quizzes, and this does not match well with the way that I view these understandings and skills as a by-product of our work on more holistic (often cross-cutting and epistemological) goals. Taken together, I’m saying that outcomes-based assessments are an excellent and highly pragmatic step away from a broken system, but for me many OBA structures maintain the assumption that test-like environments are the context in which we must assess students. The remaining sections will question other assumptions about assessment.

Structure

PRIMUS has published a significant number of papers about building courses without individual, written tests. These papers are more diffuse than the outcomes-based assessment papers in part because the ways that faculty have approached this goal are quite diverse; moreover, these kinds of ideas are often embedded in papers that are not overtly about assessment, so the list of papers for this section’s theme is likely less complete.

Connected to work of changing the stakes, authors have suggested ways to change students’ access to resources during testing with open-book exams, or the more provocative “Let your students cheat”. This change in structure emphasizes assessing what students can do with tools rather than in isolation, often motivated by observations that students will mostly apply their learning in future contexts where they can access resources. Readers will have noticed that this is likely to feel like it lowers the stakes for students, and it certainly also shifts the focus of the assessment, showing how the themes I identified above are interrelated. Similarly, authors have suggested making homework and mid-term exams comprehensive, rather than reserving that high-stakes structure for a final exam; both of these papers found especial benefits for students not well-served by structures that reserve comprehensive assessment for final exams.

Multiple authors discussed other media for assessments. The most common medium was oral exams (A, B, C). These papers resonate with my personal experience that oral exams allow an instructor to seek student competence rather than feeling bound to take their static, written work as a representation of their best understanding. Another paper explored the use of posters with presentations, like those at academic conferences, for assessment. These choices bring communication, and more diverse modes of communication, into the assessment process more overtly, which can allow students to shine who would otherwise have struggled to demonstrate their skills.

Other authors discussed ways to make assessments collaborative, such as with think-share-write group quizzes and group exams. These last two sub-themes (other media for communication and group assessments) both require a caveat. Simply changing the medium or group structure is not enough on its own to achieve the change in assessment that these authors are seeking. On top of past traumatic experiences in mathematics courses that leave many students anxious, students can be apprehensive about oral communication and about group work. I believe that we need to build two other elements into our course design for these choices to be more equitable and effective than our past practices. First, we need to make sure that our courses are not now designed so that some new skill is a bottleneck for students to demonstrate their learning. And second, we have to teach the associated skills, specifically the oral communication and group collaboration that are needed in these two respective sub-themes. As I mentioned above, I find oral exams effective because I can seek student competence directly through the live interaction. I tell students this when we are talking about a potential oral exam. A few students express some anxiety about talking about math, but these students regularly express relief after the first oral exam, saying that they get what I mean about seeking competence. But more importantly, we spend every class period of the term practicing and getting feedback on oral communication in mathematics and preparing for the oral exam task explicitly. Similarly, if we are building toward group assessments, we need to practice every day. For papers related to group course learning structures, I suggest that you read about process-oriented guided inquiry learning (POGIL), team-based learning (TBL), and team-based inquiry learning (TBIL).

I believe that we also need to teach the associated skills for solo-written exams, but I would assert that this is under-discussed in the conversation about assessment because of the normative power given to solo-written exams in mathematics. Imagine being asked in a faculty review: “It looks like you are using high-stakes, solo, written exams. Can you point to the ways that you are teaching students to do their best work in this context in class, with a focus on how students get to do this in low-stakes contexts, get detailed individual feedback, and get to revise that work?” My real point here is that much of the discussion around assessment lets this kind of assessment through uncritically by treating it as the “default”. A vital step toward more just and effective assessment practices is requiring that even practices labelled “traditional” be subject to critical analysis.

Focus

PRIMUS has published a significant number of papers about building courses that focus on knowledge and skills beyond a shallow version of computation. As Kung and Speer documented, students can arrive at accurate computational results without having developed the ways of thinking that we are hoping to build in our courses. As a result, many authors in PRIMUS have written about ways that they shifted the focus of assessment toward these ways of thinking and away from declarative knowledge and computational results.

One large sub-theme is a focus on assessing students explaining their thinking and process. Many of these authors direct students: don’t just tell me what the computer told you. This focus on explaining computations is common with lower-division courses (A, B, C). Video and audio threading technologies are helping facilitate this kind of work beyond writing, and I expect to see more of this in the journal moving forward. This approach has also been used in complex analysis and in courses for future teachers, with a focus on “attending to precision” (one of the Common Core State Standards for Mathematical Practice) and on the role of definitions. Within this sub-theme, two other interesting points emerge. First, some mathematicians feel that teaching writing is outside of their training. If you share this concern, I suggest you explore the PRIMUS Special Issue on Writing and Editing in the Mathematics Curriculum, parts I and II, as well as these discussions of evaluation of writing in mathematics courses (A, B). Second, readers will have noticed that the line between learning experiences and assessment practices is blurring, as students write to learn while simultaneously providing evidence of that learning.

A second large sub-theme is a focus on synthesis and application. Some papers discuss summative portfolios (A, B), which allow the focus to shift toward connections between course ideas; portfolios can also lower the stakes and encourage ongoing revision work by asking students to curate the best examples of their work. Other papers bring collaboration into these portfolios by asking students to write something like a reference textbook for the course together (A, B); I expect that other faculty will continue expanding on these ideas because of remote collaboration technologies and the need for asynchronous but interactive elements of courses. More broadly, the PRIMUS Special Issue on Capstone Courses is filled with ideas about course design related to synthesis.

Similarly, there are a large number of papers about projects that serve as integrative learning and assessment experiences. Some are smaller in scale, growing out of weekly journals or challenge problems in courses, while others envision organizing a course or the whole curriculum around projects. For more like this, I suggest you explore the PRIMUS Special Issue on Project-Based Curricula; furthermore, many of the ideas in the PRIMUS Special Issues on Service Learning in Mathematics and on Mathematics for Social Justice bring in integrative learning experiences that also engage issues of justice. It is common to see claims that these integrative experiences require higher-order thinking skills; it is my experience that both faculty designing these kinds of projects for the first time and students being asked to do this work for the first time need to clarify what they mean by higher-order. Thankfully, PRIMUS has a paper about using Bloom’s Taxonomy to do this clarification.

The last large sub-theme is a focus on metacognition. Authors asked students to do exam corrections, to do self-reflective grading, and to decenter their thinking with critiquing tasks, while others asked students to write the assessment problems themselves. One paper focuses on concept tests, which are often multiple choice tasks that help students self-assess about concepts without lengthy computation. For a connection between synthesis and metacognition, I recommend this paper about self-assessment tasks to be included in portfolios. Stepping outside of students thinking about their own work, many authors ask students to learn about concepts and learn about writing by participating in peer review work, such as in these papers that offer examples of cross-institution peer review, student-led feedback protocols in history of math, using a research-based framework called “Peer-Assisted Reflection”, and peer review to learn about problem-solving.

Systems

Assessment takes place as part of a system inside each course and as part of education as a system. PRIMUS has published a significant number of papers about the interaction between assessment and other pieces of these systems. To change the stakes, structure, and focus of assessments, many of the ideas discussed above require other course structures to function appropriately. Three connections loom large in the papers in PRIMUS: time and space for formative feedback, grading, and workload.

If a course is conceived as long periods in which an expert transmits information to novices and then opportunities for novices to prove whether or not the information was received, most of the ideas in this post will seem useless. Instead, the authors in PRIMUS generally see learning as something that students do with the support of their instructors. This shifts the goals of assessment dramatically, and it makes most instructors focus on the learning experiences that they have built into courses that lead up to these assessments.

For ideas about lower-stakes course work, I suggest reading the PRIMUS Special Issue on The Creation and Implementation of Effective Homework Assignments (parts I and II). The journal is also packed with ideas for learning experiences that allow for student growth and formative feedback. A common concern with employing these ideas is making the time for them in class, and one approach to this challenge is flipped course design, in which students encounter ideas before class meetings in order to reserve synchronous sessions for higher-order, collaborative tasks. For readers new to the ideas of flipping a course, I suggest starting with the PRIMUS Special Issue on the Flipped Classroom (parts I and II). And here’s an example of a way to embed formative assessment in videos as part of a flipped course. Of course, just wanting to make time for learning does not mean that it will work, as this analysis of a lethal mutation of a flipped course design demonstrates. We’ll return to themes of making time and space as well as themes of perception and affect below.

It will not surprise readers that grading is strongly connected to themes of assessment, in terms of both evaluating individual pieces of student work and combining those assessments into grades. Both of these themes were seen in the discussion of outcomes-based assessment, with the evaluation of individual pieces often being simplified into a binary of “demonstrated/not yet demonstrated”, and with course grades often being determined by the collection of objectives that students had demonstrated by the end of the term.
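For readers who think in code, here is a minimal sketch of the grading logic described in the previous paragraph. It is only an illustration under stated assumptions: the objective names, the number of objectives, and the letter-grade cutoffs are all hypothetical and are not taken from any of the PRIMUS papers. Each attempt is recorded as a binary result, and the course grade depends only on which objectives a student has demonstrated by the end of the term.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical cutoffs: the minimum number of demonstrated objectives
# needed for each letter grade. These values are illustrative assumptions.
GRADE_CUTOFFS: List[Tuple[int, str]] = [(4, "A"), (3, "B"), (2, "C"), (1, "D")]


def demonstrated_objectives(attempts: Dict[str, List[bool]]) -> Set[str]:
    """Return the objectives with at least one successful attempt.

    Each attempt is binary: True means "demonstrated" and False means
    "not yet demonstrated". Earlier unsuccessful attempts carry no penalty.
    """
    return {objective for objective, results in attempts.items() if any(results)}


def course_grade(
    attempts: Dict[str, List[bool]],
    cutoffs: List[Tuple[int, str]] = GRADE_CUTOFFS,
) -> str:
    """Map the count of demonstrated objectives to a letter grade."""
    count = len(demonstrated_objectives(attempts))
    for minimum, letter in cutoffs:  # cutoffs are listed from highest to lowest
        if count >= minimum:
            return letter
    return "F"


# One student demonstrates most objectives on the first attempt.
early = {
    "limits": [True],
    "derivatives": [True],
    "integrals": [True],
    "series": [False, True],
}

# Another student needs several re-assessments but gets there by the end.
late = {
    "limits": [False, False, True],
    "derivatives": [False, True],
    "integrals": [False, True],
    "series": [False, False, True],
}

# A third student has demonstrated only two objectives so far.
partial = {
    "limits": [True],
    "derivatives": [False, False],
    "integrals": [True],
    "series": [False],
}

print(course_grade(early), course_grade(late), course_grade(partial))
```

In this sketch the first two students receive the same grade even though one of them needed several more attempts, which is the design choice at the heart of OBA: the grade records what was learned by the end of the term, not how or when the learning happened.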

Many of the papers discussed above related to writing discuss ways of evaluating student writing. In addition, PRIMUS has published papers about evaluating student presentations and proofs, tools that are also useful for metacognition.

Other papers focus on grading systems more directly. Faculty acknowledge that grades have significant impacts for students and as a result grading systems often serve as frameworks for motivation by incentivizing certain behaviors (A, B). Many faculty also acknowledge that grading systems often fail to incentivize approaches we value. In particular, here is a paper that notes some friction between assessments and student inquiry and seeks to address it by engaging students in the meaning-making part of the grading. Here are two examples of papers that seek to design grading systems that support equitable outcomes for diverse student populations (A, B). Another paper goes further, seeking the abolition of grading until the larger university system requires it at the end of a course.

Some readers may be concerned about the time and energy it takes to lead classes with these grading and feedback systems. In PRIMUS, you will find papers about benefits to student learning from collaborative online grading, evaluating writing-intensive assignments, discussing homework with students at the board to shift the responsibility and focus to feedback, how to grade 300 essays and live to tell the tale, and how formative assessment saved one author from a midlife crisis. And for an older paper that critiques these foundations of grading, I recommend “A written test that promotes good teaching: Is that possible?”.

Most of this section has focused on courses as systems, but courses live inside the system of an institution, and people move between these course contexts. And frankly, universities are conservative places that value continuity, and faculty are often evaluated by systems that assume that teaching means lecturing and giving high-stakes assessments. Faculty, especially junior faculty, need the support of department and institutional leadership as they learn to teach with more student-centered pedagogies; senior faculty will find ideas in the PRIMUS Special Issue on Leading a Department in the Mathematical Sciences. Similarly, students need support in learning to participate in student-centered classrooms, especially where their expectations about assessments are subverted (A, B, C).

The last piece of this system that I’d like to discuss is placement of students in their first mathematics course at the institution. Just as the discussion around assessment within courses has shifted away from high-stakes tests, so has the discussion around course placement. PRIMUS has published a significant number of papers on this topic, and I would like to draw your attention to two of them. First, there has been a shift away from asking “What are students ready for?” and toward “How are we ready for students?” at the same time as a shift away from identifying “gaps” in students’ skills and toward identifying their assets. These shifts are critical for inclusive placement practices that avoid reifying fixed mindset messaging. Second, increased attention is being paid to pathways and supports for incoming students, replacing more fatalistic approaches that take the broken “weed-out” systems in older lower-division course structures as given or necessary.

And to end this section with a meta-comment, in addition to assessing students, we need to develop sources of formative feedback for our departments and individual faculty. If we assert that it must be part of learning for students, how can we not apply it to our own growth? If this use of the term “assessment” is a dirty word in your department, I suggest starting by reading this paper that engages common objections to program evaluation.

Conclusion

Change has been coming for years to assessment practices in tertiary mathematics, as can be seen in sources like the recent MAA Instructional Practice Guide, the publications in PRIMUS, and discussions of ungrading. I have avoided defining “assessment” in this editorial because I think this definition is where most of the change is happening. When I was an undergraduate, the implicit message was that assessment was a way to determine for sure, at the end of a course, if I had learned what my instructors had tried to tell me. Now, assessment is about gathering information about learning in progress and about focusing course work on activities that support that learning. Viewed this way, assessment must be low-stakes and valuable, must be structured by a theory of how learning happens, must be focused on our highest priority goals, must help learners and the teachers who support them make choices about how to proceed, and must take into account the systems in which we teach. As a community, we are still experimenting with approaches that help achieve this loftier new vision, and I hope this summary of some of the resources in PRIMUS will help you on your next steps in this journey.

=========

References

Taylor & Francis has agreed to make all of the papers linked below free for anyone to read through the end of April 2021.

These papers are listed in the same order as their links appear above.

Changing Stakes:

Changing Structure:

Changing Focus:

Changing Systems:
