Key Takeaways
- Focus on Human Skills: Collaboration, creativity, and critical thinking are now essential in hiring.
- AI Cheating Risks: Tools like ChatGPT challenge assessment integrity, especially in text-based and video formats.
- Assessment Vulnerability: Logical reasoning is low-risk; verbal, numerical, and situational assessments are high-risk.
- Mitigation Strategies: Use video-based delivery, proctoring tools, and final verification testing.
- Adapt to AI: Hiring processes must evolve to balance AI advancements with fair candidate evaluation.
Introduction
GenAI is changing what we need to measure and how.
Amidst a world already changed forever by digitalisation, we stand on the precipice of the latest wave of change driven by Generative AI (Gen AI).
As roles evolve and AI takes on repetitive and formulaic tasks, the quintessentially human aspects of work, such as collaboration, creativity, and critical thinking, not only become paramount but also become the focal point of assessment in hiring processes.
This leads us into a paradox, where the human edge battles to assert its worth amidst AI's relentless tide.
Alongside this, Gen AI affords candidates much greater opportunities for help, and indeed cheating, when completing assessments. In talent acquisition, this gives us a great deal of change to manage both in what we need to look for in candidates, and how we maintain the integrity of decision-making in the face of Gen AI.
The opportunity for a 15-30% productivity surge across various roles isn’t merely a statistic; it points to the need for a seismic shift in our perception of ‘talent’ and its acquisition.
Recognising and Valuing Uniquely Human Traits
In this burgeoning AI age, the human edge derives its potency from inherent skills and attributes that elude even the most advanced algorithms.
The legacy mindset for ‘what good looks like’, once tethered to academic prowess and linear problem-solving, is not just being revised but overhauled, giving precedence to 21st-century skills that are irreplaceable by AI. Beneath this transformation lie five critical human differentiators: Collaboration, Creativity, Adaptability, Complex Problem Solving, and Multi-domain Critical Thinking. These five pillars have catapulted into prominence, emerging as critical dimensions of talent in the new AI-infused work environment. Let's consider each:
Collaboration
With technology becoming part of the glue that connects teams, the capability to navigate across diverse groups, forge connections, and cultivate collective ingenuity becomes vital. It’s not merely about cooperation but about fostering an environment where varied perspectives intertwine to spark innovation.
Creativity
As AI shoulders the more repetitive tasks, the human ability to envisage, innovate, and traverse beyond the conventional pathways becomes vital. Creative thinking propels organisations into novel territories, uncovering unprecedented solutions and pathways.
Adaptability
An attribute interlaced with resilience and foresight, adaptability encompasses the capacity to morph strategies, learn and unlearn swiftly, and steer through the ebbs and flows carved out by technological advancements.
Complex Problem Solving
While Gen AI excels in linear problem navigation, human discernment becomes paramount in disentangling complex challenges, weaving through intricacies, and sculpting balanced, innovative solutions.
Multi-domain Critical Thinking
In an era drenched in information, dissecting, evaluating, and forging decisions becomes pivotal. Multi-domain critical thinking facilitates ethical, informed decision-making, navigating through informational chaos with acumen.
As AI simplifies tasks, the consequential human skills that become critical for performance are embedded in these attributes, necessitating a recalibration of assessment strategies in talent acquisition around ‘what we measure’.
The Assessment Conundrum: A Battleground against AI Cheating
However, as assessments pivot towards identifying these human traits, the dark side of AI creeps into the arena, threatening the authenticity of assessment processes. Technologies like ChatGPT, while designed to simulate human-like responses, inadvertently provide a mechanism to help ‘cheat’ the system, allowing individuals to parrot sophisticated, AI-generated responses in assessments and recorded one-way interviews.
The dilemma here isn't merely about the existence of technology that can replicate human responses but the ethical and procedural conundrums that arise in talent assessment and acquisition.
Two primary challenges surface: ensuring the credibility of assessments and safeguarding them from the menacing shadow of AI-generated cheating. The industry, caught between denial and panic, with some asserting the robustness of their existing test content and others capitulating to unvalidated alternatives, is desperately seeking enlightenment through an evidence-based approach.
ChatGPT and similar LLM chatbots pose tangible risks to some types of talent assessment. We have faced new technology and concerns around cheating before, but the scalability offered by Gen AI is the big risk. Previously a candidate would need to find a suitably talented friend to complete an assessment for them, but now there is an incredible resource available to everyone on their phone or laptop via Gen AI.
To put it plainly, ChatGPT is constantly evolving and improving, and the pace at which it does so will only accelerate. For example, October 2023 saw an update to the way in which it can interpret images, a change with huge implications for assessment.
So, to shed genuine light on this matter, we need to be evidence-based to understand the true scale of the challenge. To achieve this, the Sova team conducted their own ‘hackathon’ of different types of assessment content to understand the risks and identify mitigations.
So What Did Our ‘Hackathon’ Discover?
Verbal reasoning and situational judgement
Using ChatGPT-4 as the Gen AI tool, it was clear that certain types of content present a high level of risk. Verbal reasoning and situational judgement questions are, unsurprisingly, highly verbal in nature, and where an image is included, ChatGPT can now read and interpret it. For frontline roles such as customer service, Gen AI was very capable of providing good answers to these question types. This is clearly a risk for volume assessments for simpler roles. For graduate-level questions, the Gen AI answers were smart enough to pass, often scoring in the medium percentile range.
Numerical reasoning
With regard to numerical reasoning, ChatGPT is now adept at recognising and drawing accurate, detailed observations from graphical information, the most significant assessment impact of the October 2023 update. In these question types, Gen AI is capable of providing good-quality answers, and these assessments should be classified as medium risk.
Logical reasoning
Conversely, Gen AI is not yet competent at logical reasoning assessments. For now, ChatGPT struggles to correctly recognise patterns presented in an image format, and therefore has difficulty drawing any accurate observations from them. It would be challenging for candidates to describe the information verbally, and as such, logical reasoning assessments remain low risk.
Video interviews
Video interviews were also problematic. We have already heard anecdotal evidence from the real world of employers receiving multiple near-identical answers from different candidates, delivered almost verbatim. When cross-checked, these answers proved very similar to those produced by ChatGPT.
Recorded one-way interviews may make it easier for candidates to seek ChatGPT’s help in answering interview questions, as they can swiftly enter the question into a Gen AI tool and generate a script to read out. This can be partially managed by limiting interview preparation time, recording any preparation time as well, and using proctoring tools, but it nonetheless presents significant challenges.
The focus needs to be on taking the right steps to make it impossible or very difficult to cheat, so the confidence of hiring managers in talent acquisition can be maintained. The key step for organisations is to harden and secure their assessment journeys.
Moving to a Seamless, Digitised Approach
In summary, the lowest risk assessment modalities include logical reasoning, video-based situational questions, and live observed assessments. Personality assessments represented a moderate risk because, although they are verbal, sophisticated prompting is required to identify the answers that best fit a role.
Conversely, the highest risk assessment types were text-based assessments including verbal or numerical reasoning, situational judgement, and recorded one-way video interviews. The key to managing these risks therefore lies in robust mitigation and the hardening of assessment processes. There are four key ways in which this can, and needs to, be addressed:
- Minimising risk on the primary device: Assessments can include periodic one-way video checks on what candidates are doing, a copy/paste block in the assessment application, and other tools such as keystroke analysis and proctoring software. All of these make life hard for anyone trying to cheat, but none fully solves the issue.
- Minimising risk on a second device: This is an important second step, given that someone can simply take a photo of the first device’s screen (e.g. a laptop) using a second device (e.g. their phone). The image can be quickly interpreted by ChatGPT and prompted for an answer. More sophisticated proctoring can help here, but the key mitigations lie in assessment design.
- Redesigning the way assessments are presented: The key vulnerability lies in text and static images. Pragmatic mitigations include switching situational judgement and verbal content to video-based delivery, with any captioned text embedded in the video itself so that it cannot simply be copied into ChatGPT. Additionally, recording the assessment session and limiting preparation time on video interviews is key.
- Reshaping the assessment process: Many who remember the first introduction of internet-based testing will recall the concept of final verification testing. Within hiring processes, there is likely to be more focus on verifying someone’s capabilities at the final selection or induction stage, to be sure the candidate really can demonstrate the performance seen remotely. This may only be needed for higher-stakes roles, but it is relatively scalable, as it would only be required for the final hires.
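To make the risk tiers above concrete, a hiring platform could encode them as a simple lookup when deciding which mitigations to apply. This is a minimal illustrative sketch, assuming hypothetical type names and a hypothetical `needs_hardening` helper; the tier assignments mirror the findings described in this section:

```python
# Illustrative only: a toy mapping of assessment types to the Gen AI
# cheating-risk tiers discussed above. Names are hypothetical; the tier
# assignments follow the hackathon findings summarised in this section.
RISK_TIERS = {
    "logical_reasoning": "low",
    "video_situational": "low",
    "live_observed": "low",
    "personality": "moderate",
    "verbal_reasoning": "high",
    "numerical_reasoning": "high",
    "situational_judgement": "high",
    "one_way_video_interview": "high",
}

def needs_hardening(assessment_type: str) -> bool:
    """Return True when an assessment type warrants extra mitigation
    (proctoring, video-based delivery, or final verification testing).
    Unknown types default to high risk as a cautious fallback."""
    return RISK_TIERS.get(assessment_type, "high") != "low"
```

In practice such a rule would feed assessment-journey configuration, for example enabling proctoring and session recording only where `needs_hardening` returns True, keeping friction low for candidates taking low-risk formats.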
Conclusion: Adjusting to an AI-Driven World
Gen AI has undoubtedly redefined work, amplifying the need to reassess and recalibrate what is sought in candidates. Technological change is driving a revolution in professional and managerial work, and what good looks like has fundamentally shifted. It is critical to refresh the profile of early hiring (the future leaders of tomorrow) and experienced hiring alike, with the right capabilities for this new landscape.
Alongside this, how we assess must change, given the utility of ChatGPT in helping candidates pass some assessment types. It is now an urgent priority for employers to refocus and harden their approaches to satisfactorily mitigate these risks.
The future, while potentially bringing many exciting opportunities, mandates a meticulous, ethical, and philosophical exploration into how we evaluate and understand human potential amidst the ubiquity of artificial intelligence.