Good afternoon, a Chathaoirligh, and esteemed members of the committee. I am a 17-year-old student attending Coláiste Chiaráin secondary school in Croom, County Limerick. I am honoured to present my project, "VerifyMe: A new approach to authorship attribution in the post-ChatGPT era", which I presented at the BT Young Scientist and Technology Exhibition 2024, where I was fortunate to be named the BT Young Scientist and Technologist 2024.
The rapid advancements in AI, particularly in natural language processing, have significantly blurred the lines between human- and AI-generated texts. This challenge became the focus of my project, as conversations with my teachers revealed their struggle to distinguish between student-written work and text generated by AI tools like ChatGPT. They often claimed they could recognise if a student used AI because the writing did not match their unique writing style or ability. This insight highlighted a central flaw in current AI content detection systems in that they do not verify against the author's individual writing style.
VerifyMe addresses this gap by employing a structured methodology to analyse stylometric "signatures" for authorship verification. Stylometry, the study of linguistic style, leverages an author's distinct word choice, syntax and sentence structure patterns to create a textual "fingerprint". My project extracts these stylistic markers from submitted texts, including lexical, syntactic, readability and vocabulary richness features, and uses an AI model trained to evaluate the similarities between sample texts and the presented piece of writing. Unlike existing AI content detection systems, VerifyMe is not designed to identify AI-generated text. It is an authorship verification system that predicts whether a presented text matches the specific writing style of the given author, regardless of whether the text was written by a friend, plagiarised, AI-generated, or a mixture of these.
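To make the idea of a stylometric "fingerprint" concrete, the following is a minimal sketch in Python and not VerifyMe's actual implementation. The feature set, the `FUNCTION_WORDS` list and the function names are all assumptions chosen for illustration, readability scores are left out for brevity, and a plain cosine similarity stands in for the trained AI model described above.

```python
import math
import re

# Hypothetical, hand-picked function words; a real system would use a far larger list.
FUNCTION_WORDS = ("the", "and", "of", "to", "in", "that", "but", "which")


def stylometric_features(text):
    """Return a small, illustrative feature vector for one text."""
    words = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    n = len(words)
    features = [
        sum(len(w) for w in words) / n,  # lexical: mean word length
        n / len(sentences),              # syntactic: mean sentence length in words
        len(set(words)) / n,             # vocabulary richness: type-token ratio
        text.count(",") / n,             # punctuation habit: commas per word
    ]
    # Relative frequency of each function word.
    features.extend(words.count(fw) / n for fw in FUNCTION_WORDS)
    return features


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def style_similarity(known_texts, candidate_text):
    """Compare a candidate text with the averaged profile of known samples."""
    profiles = [stylometric_features(t) for t in known_texts]
    mean_profile = [sum(column) / len(profiles) for column in zip(*profiles)]
    return cosine_similarity(mean_profile, stylometric_features(candidate_text))
```

A teacher-facing tool built on this pattern would call `style_similarity` with a student's earlier essays and the newly submitted piece. Because the features here are unscaled, the longer-valued ones (such as sentence length) dominate the score, which is one reason a trained model is preferable to a fixed similarity threshold.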
The urgency of this issue is underscored by recent developments in AI content obfuscation tools such as Humanizer, which makes AI-generated text undetectable by existing AI content detectors. These tools highlight the need for a robust authorship verification system such as VerifyMe, which can discern true authorship by comparing texts to the unique writing style of the author presenting the work.
Current AI content detectors have been shown to be unreliable, with instances of human-written work being falsely flagged as AI-generated and vice versa. This has caused significant issues in academic settings, as evidenced by cases at the University of California, Davis, where students faced false accusations of cheating. Moreover, the absence of independently audited evaluation data sets or official oversight further questions the reliability of these systems.
As a further example, on 20 July 2023 OpenAI decommissioned its AI content detection system, Classifier, due to low accuracy. When evaluated on its English text challenge set, Classifier correctly identified 26% of AI-written text as likely AI-written (true positives), while incorrectly labelling human-written text as AI-written 9% of the time (false positives). Additionally, it was discovered that Classifier disproportionately disadvantaged writers who had learned English as a second language and people whose writing followed a more predictable and algorithmic structure. This research stands in stark contrast to the numerous AI content-detection systems that claim evaluation accuracies upwards of 95%.
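To show what those two rates mean in a classroom, here is a small worked calculation in Python. The 26% and 9% figures are the ones quoted above; the class size and the share of AI-written submissions are hypothetical values chosen only for illustration, and the function name is my own.

```python
def flagged_breakdown(n_texts, ai_share, tpr=0.26, fpr=0.09):
    """Estimate how many flagged texts are really AI-written.

    n_texts and ai_share are hypothetical inputs; tpr and fpr default to
    the true-positive and false-positive rates quoted for Classifier.
    """
    ai_texts = n_texts * ai_share
    human_texts = n_texts - ai_texts
    true_flags = ai_texts * tpr       # AI-written texts correctly flagged
    false_flags = human_texts * fpr   # human-written texts wrongly flagged
    total_flags = true_flags + false_flags
    # Share of flagged texts that really are AI-written.
    precision = true_flags / total_flags if total_flags else 0.0
    return true_flags, false_flags, precision


true_flags, false_flags, precision = flagged_breakdown(n_texts=1000, ai_share=0.2)
print(f"correct flags: {true_flags:.0f}, false flags: {false_flags:.0f}, "
      f"share of flags that are right: {precision:.0%}")
```

Under these assumed numbers, 200 of 1,000 submissions are AI-written. The detector flags 52 of them correctly but also flags 72 of the 800 human-written submissions, so only about 42% of its accusations would be justified.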
VerifyMe demonstrates consistent performance in authorship verification, even against advanced AI systems such as GPT-4, by analysing multiple texts from a given author. This stylometric approach holds promise for maintaining academic integrity in the face of evolving AI technologies. Looking forward, it is crucial to address the challenges posed by rapid AI advancements in education. This includes enhancing AI literacy among teachers and students, addressing resource gaps in schools and promoting ethical AI usage. Additionally, developing innovative assessment methods and fostering industry partnerships can help integrate AI effectively into education while ensuring academic honesty.
In terms of future opportunities, AI-enhanced learning tools can greatly assist in personalising education, providing custom-tailored learning experiences for students. Teachers can benefit from AI by reducing their administrative workload, allowing them to focus more on teaching and student interaction.
Furthermore, AI can be a powerful tool for creative collaboration and ideation, aiding students in developing critical thinking and problem-solving skills. However, we must also consider the ethical implications of AI in education. It is essential to teach students about the responsible use of AI and emphasise the importance of academic honesty and integrity, especially in subjects such as English where it is crucial for students to be able to cultivate their own unique writing style.
I also believe that by collaborating with AI industry leaders and integrating their insights into policy and research we can create further opportunities for the application of AI in education. I recommend that we pilot new approaches to examinations and assessments that address AI-related challenges and initiate research projects aimed at developing authorship verification systems such as VerifyMe, trained on more extensive datasets. In my evaluations, VerifyMe achieved an accuracy of 85%. It was trained using the British Academic Written English (BAWE) corpus and further fine-tuned using the Reuters_50_50 corpus. For background, the BAWE corpus consists of approximately 3,000 samples of student writing, collected between 2004 and 2007 in the UK from approximately 1,000 unique authors. I used the Reuters_50_50 corpus as part of a curriculum-learning training approach. It is more specialised and focused on financial writing from the Reuters news agency, with 50 authors and 1,000 unique texts. With more extensive datasets, we would be able to train a more capable system in authorship verification, especially as these AI content detection systems continue to advance.
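As a rough illustration of how a corpus of texts grouped by author can be turned into training data for a verification model, the sketch below builds labelled "same author" and "different author" pairs and scores predictions by accuracy. This is an assumed, simplified setup and not a description of how VerifyMe was actually trained; the function names, the one-negative-per-positive balancing and the toy corpus are all hypothetical.

```python
import itertools
import random


def build_verification_pairs(texts_by_author, seed=0):
    """Return (text_a, text_b, label) triples: 1 = same author, 0 = different.

    Assumes every author has at least one text. Each same-author pair is
    matched with one different-author pair so the two classes stay balanced.
    """
    rng = random.Random(seed)
    authors = sorted(texts_by_author)
    if len(authors) < 2:
        raise ValueError("need at least two authors to build negative pairs")
    pairs = []
    for author in authors:
        for text_a, text_b in itertools.combinations(texts_by_author[author], 2):
            pairs.append((text_a, text_b, 1))
            # Pair the same text with a randomly chosen text by someone else.
            other = rng.choice([a for a in authors if a != author])
            pairs.append((text_a, rng.choice(texts_by_author[other]), 0))
    rng.shuffle(pairs)
    return pairs


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Toy corpus standing in for a real dataset keyed by author.
toy_corpus = {
    "author_a": ["a1", "a2", "a3"],
    "author_b": ["b1", "b2", "b3"],
}
pairs = build_verification_pairs(toy_corpus)
```

The number of same-author pairs grows with the number of texts per author, which is one reason a corpus with more texts and more authors gives a verification model more to learn from.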
The advancements in AI present both challenges and opportunities for education. By embracing these changes thoughtfully and responsibly, we can enhance the learning experience while safeguarding the principles of academic integrity. I thank the committee for its time and attention. I look forward to discussing how VerifyMe can contribute to the future of education and academic integrity in our changing world.