In recent years, generative artificial intelligence (GenAI) has had a transformative and disruptive effect on higher education. For educators, AI challenges the foundations of how we design, deliver and assess learning. This presents difficulties for universities and teachers as both try to navigate policy changes and apply regulatory frameworks, while maintaining high standards of academic integrity and assessment authenticity.
In a recent paper “Integrating Artificial Intelligence into Higher Education Assessment” (Williams, 2025), I explored these challenges from the perspective of a teacher and presented a case for supporting further integration of AI technologies into assessment design.
In another recent paper, Corbin, Dawson and Liu (2025) explore these challenges from a distinct but complementary perspective – “Talk is cheap: why structural assessment changes are needed for a time of GenAI”.
Corbin et al. argue that most institutional responses to AI have been “discursive”. In other words, they focus on rule-setting and policy without altering the mechanics of assessment. This, the authors contend, cultivates an “enforcement illusion”, since such rules are largely unfeasible to police. They advocate a different approach: a shift towards structural assessment redesign.
This supports my contention that AI can be integrated constructively into assessment design, promoting authenticity, critical thinking and digital literacy. My aim here is to outline the key themes from both papers, highlight some recent contributions to this topic, and propose practical recommendations for educators navigating the current AI challenge.
Validity and the Limits of ‘Chat’
The extent to which an assessment measures genuine student accomplishment is now seriously threatened by recent GenAI models, such as ChatGPT o3, Gemini 2.5 and Claude Opus 4. As Corbin et al. emphasise (and others have noted before), students can now generate submissions using AI without demonstrating genuine capability (or indeed much effort!).
The threat to academic integrity is clear – if ChatGPT can write a first-class essay, what does the grade actually represent? Why are tutors even bothering to mark students’ work?
To combat this, institutions have introduced “traffic light” systems that designate where AI use is prohibited (red), conditionally allowed (amber) or embraced (green).
UCL has one such system, under which every module and assessment must be assigned to a category:
Category 1: GenAI tools cannot be used
Category 2: GenAI tools can be used in an assistive role
Category 3: GenAI has an integral role
While these frameworks aim to provide clarity, Corbin et al. argue that they rely too heavily on student compliance and therefore create an “enforcement illusion”. Such discursive rules on GenAI use are easily ignored, misapplied or simply too difficult to police.
Even when a student is suspected of submitting substantial GenAI-generated content as their own work, it is very difficult to prove unequivocally.
In contrast, structural modifications to the design of the assessment itself may offer a better solution to the GenAI problem. For example, oral defences (vivas), real-time problem-solving and iterative assessments could reduce AI’s utility as a substitute for students’ own work.
Corbin et al. therefore advocate structural reform of assessment. While I applaud structural reform in university assessment, the approach championed by Corbin and colleagues perhaps falls a little short of the full potential AI can offer modern higher education.
To me, this is just another form of mitigation, rather than a comprehensive integration of AI technologies into assessment design. This mitigation strategy sits within the ‘curious but cautious’ mindset, rather than the ‘optimistic and progressive’ mindset that fully embraces AI within assessment frameworks.
However, Corbin et al. introduce a very welcome conceptual distinction between discursive changes to assessment (modifications relying solely on instructions that students can ignore) and structural changes (modifications that reshape the underlying mechanics of the assessment task). This is a valuable distinction and provides a robust framework that institutions and educators can use to implement assessment reform.
From Threat to Tool: AI as a Learning Partner
Where Corbin et al. emphasise assessment redesign to protect validity, my philosophy encourages embracing AI as a catalyst for deeper learning. If the traditional essay is no longer fit for purpose, let’s stop lamenting its decline and start adopting better alternatives.
In a postgraduate immunology course, I piloted an AI-integrated assessment in which students generated a draft outline using ChatGPT, engaged in peer critique, and then refined their work independently. This blended, iterative approach raised students’ awareness of academic integrity, encouraged critical thinking and built AI literacy.
Students appreciated AI as a tool that helped with idea generation, structure and grammar, but noted its lack of specificity, nuance and subject-specific depth. Importantly, they developed their AI literacy, acknowledged the technology’s limitations and came to value their own academic judgment.
This model moves away from an “avoid or mitigate” strategy and toward an “embrace and educate” strategy, which I consider more sustainable and future-facing.
Instead of banning AI, mitigating against its capabilities or policing its use via institutional “discursive” frameworks, an integrative approach promotes reflective learning and the skills of analysis and evaluative thinking. Despite nuanced differences in emphasis, there are obvious parallels between Corbin et al.’s approach and my own: we both advocate an assessment-design approach rather than a top-down AI-policing approach.
That said, institutional regulation of permitted AI use in assessment and assessment reform that integrates AI are not mutually exclusive; the two can, and should, coexist.

Academic Integrity: Beyond Policing
Another parallel is that we both challenge the efficacy of AI-detection tools. Research has shown that popular platforms such as Turnitin and GPTZero are often inaccurate, flagging genuine work as AI-generated (false positives) and failing to catch AI-generated content (false negatives). Policing AI use with unreliable tools risks false accusations and undermines trust.
Entering a digital arms race with AI technologies is also likely to be a futile exercise: detection tools will never keep pace with advances in GenAI models.
Instead, both parties agree that the more sustainable and progressive direction for university assessment is redesign; AI surveillance alone is never going to be the solution. For example, combining AI-assisted output with oral justifications, peer feedback or reflective commentary promotes accountability through design.
Furthermore, requiring students to disclose AI use openly can help mitigate academic-integrity issues, provided it sits within a supportive framework that gives students opportunities to practise responsible AI use. In my case study, students were trained to reflect on their AI use. Rather than being penalised, they were evaluated on how effectively they understood and critiqued their AI-generated drafts. In this way, academic integrity becomes a normal part of pedagogy.
Higher-Order Learning in an AI World
Can students develop critical thinking, creativity and problem-solving skills if ChatGPT does all the hard work?
Again, assessment design is key.
Corbin et al. argue that we must move away from one-shot summative tasks and toward iterative, process-driven assessments. I agree: if integrated thoughtfully, AI can support higher-order learning by allowing students to focus on analysis, interpretation and critique.
AI performs best on factual or procedural questions but less well on complex, context-rich problems (for now!). Assignments that require judgment or multi-modal reasoning (e.g., combining sources, visuals and data) remain firmly human strengths.
Recent commentary from Owen and Kay (2025, HEPI) on engaging with AI in assessment also highlights the potential to promote critical thinking, problem-solving, creativity and entrepreneurialism through the critique and adoption of AI technologies. They also question the defensive strategies employed by some HE institutions and acknowledge that students are using AI anyway. If students are using it, why not embrace it?
The ultimate goal is not to prohibit AI use but to ensure students use AI as a positive learning tool.
As Mackenzie and Spaeth recently commented in a European University Association (EUA) post, AI can also promote a more inclusive and accessible educational environment, although this comes with risks (check out the recent EU AI Act!). Students are increasingly exploring AI tools for a variety of study tasks, including note-taking, text summarisation, translation and knowledge acquisition.
AI-literate graduates must know when to trust the tool, when to question it and how to intervene when it goes wrong. Misinformation, bias, shallow subject-specific recall and hallucinations (we should call them bullshit) remain significant issues for large language models (LLMs).

AI Literacy and Graduate Competency
There is a collective emphasis on the urgent need to embed AI literacy into curricula as a core competency. As GenAI becomes ubiquitous in professional contexts, students need guidance and practice in using it ethically, strategically and appropriately.
AI literacy includes:
- Understanding how LLMs generate text (and their limitations)
- Knowing how to prompt effectively (prompt engineering)
- Knowing how to effectively evaluate output
- Recognising bias, misinformation and hallucinations (i.e. bullshit!)
- Crediting AI contributions appropriately (referencing)
- Reflecting on the role of automation in human reasoning
In addition, academics require professional development in AI-literacy skills. Successful design of new or transformed assessments demands adequate training and know-how.
Professional retooling takes time and is often hampered by technophobia or institutional barriers. These challenges must be overcome, and rapidly, or higher education will fall short.
Ultimately, universities have a responsibility to train their students in AI technology, otherwise we will be sending graduates into AI-rich workplaces ill-prepared.
Structural Change: What Does This Look Like in Practice?
So what does a structurally sound, AI-cognisant assessment look like?
Examples include:
- Viva-style defences: students discuss how they used AI, what they changed and why.
- Patchwork assignments: staged tasks where earlier AI use is scaffolded and critiqued in later phases.
- Authentic simulations: real-world projects where AI is a tool but not the answer (e.g., medical case analyses, problem-solving tasks).
- Comparative analysis tasks: students critique AI-generated outputs versus their own.
- AI reflection statements: integrated into assessment cover sheets (not just to declare use, but to analyse its impact on learning).
Such models blend Corbin et al.’s structural principles with my integrationist approach. These are only select examples; many more possibilities exist for integrating AI into assessment practice.
Final Thoughts: Designing Assessments for the Future
Corbin et al. warn us not to confuse policy frameworks with pedagogical redesign.
I echo their call for structural innovation but argue that this redesign should fully embrace AI as a positive learning tool.
Instead of building assessments that merely avoid or mitigate against AI, I am championing a more progressive approach: fully integrating AI into assessment design, assessment practice and assessment output.
This represents an authentic assessment strategy – students get to use AI in a safe and controlled environment; they receive guidance and training on its appropriate use; and the assessment fulfils the course learning outcomes within an assessment-for-learning framework.
As Owen and Kay point out, such assessment designs could include collaborative, iterative and group projects, which are not easily reproduced by current GenAI tools. I also agree with the authors that most students are curious and want to explore concepts and themes; AI can help build on this ‘enquiring mindset’.
It’s our responsibility to prepare students to use AI responsibly and critically. Assessments that integrate AI are authentic and can support students to succeed in an AI-driven society.
References
Williams, A. (2025). Integrating Artificial Intelligence Into Higher Education Assessment. Intersection: A Journal at the Intersection of Assessment and Learning, 6(1), 128–154. https://doi.org/10.61669/001c.131915
Corbin, T., Dawson, P., & Liu, D. (2025). Talk is cheap: why structural assessment changes are needed for a time of GenAI. Assessment & Evaluation in Higher Education, 1–11. https://doi.org/10.1080/02602938.2025.2503964
Owen and Kay (2025). Transforming higher education learning, assessment and engagement in the AI revolution: the how. Higher Education Policy Institute (HEPI). https://www.hepi.ac.uk/2025/07/14/transforming-higher-education-learning-assessment-and-engagement-in-the-ai-revolution-the-how/
Mackenzie and Spaeth (2025). Using AI to serve inclusive education. European University Association (EUA). https://www.eua.eu/our-work/expert-voices/using-ai-to-serve-inclusive-education.html