
AI hiring legislation: What you need to know! With Dr. Richard Landers & Mark Girouard

In this episode I welcome two different but complementary perspectives on the legal aspects of employment testing in an ambiguous and seemingly ever-changing regulatory environment. While the conversation is oriented around the ever-controversial and somewhat frustrating NYC algorithmic hiring law, which still does not have a firm date for enforcement, it is really a much deeper and headier conversation about the use of AI models in employment decision making, and more specifically about the legal responsibilities of those who provide and use these tools.

There is no one better to hash over these important topics than my two guests: Dr. Richard Landers, a professor of I/O psychology at the University of Minnesota who also specializes in computer science and legal audits of selection systems, and Mark Girouard, a lawyer at Nilan Johnson Lewis who specializes in employment law as it relates to hiring and the use of predictive decision-making tools.

I have worked with both of my guests on various projects related to our topics today and can assure you that they are at the top of the heap when it comes to expert, practical, and credible advice on these extremely complex and ever-evolving issues.

To absorb all the wisdom on the controversial but critical topic of legal compliance in today's crazy world, you really have to listen to this episode. So, what are you waiting for? Tune in now!

Who Will be Affected by This New Legislation?

It is worth noting here how important these topics are for all parties involved: vendors, employers, and job seekers. We are all in this together, and as technology evolves we will continue to find things moving increasingly into Gordian-knot territory. But at the end of the day, we are (mostly) sharing the same goal: to help ensure people have a fair chance to be hired into jobs for which they are qualified, using accurate and efficient tools, which results in more productive organizations with happier workers.

Which is exactly why entities such as New York City have begun the herculean task of creating oversight and regulation. Here in 2023, Pandora's box has been ripped right open and we are now in the thick of it, whatever "it" really is.

Today we seek to offer our listeners some practical advice to help them cut through the fog. We begin by discussing the NYC law itself, as we have all been inundated with nervous inquiries from vendors and employers alike about what it means and what they should be doing.

What Are the Next Steps?

When it comes to the NYC law, comprehension and planning are a moving target; even in the weeks since this episode was recorded, new changes have emerged. Listeners who are interested can track the dialogue happening among experts and those impacted in many places online.

But the ambiguity remains a constant, and we really can't be sure when the law will actually go into effect or what it will say.

For now, what should we do? The answer is a bit different for vendors and employers.

For vendors: no amount of expert whitepapers or third-party audits of your own tools will satisfy the law's audit requirement, because one part of this law that we don't expect to change is that the audits must be done locally by employers. The best thing vendors can do is make the effort to build good tools based on sound training data and ensure their teams include credible I/O psychologists and other experts on fair and unbiased hiring practices relative to the EEOC's Uniform Guidelines.

For employers: this means following the same best practices we have been mandated to follow for decades, including regular localized validation studies and bias audits.
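For readers who want a concrete feel for what the core of a bias audit measures, here is a minimal sketch, in Python, of an adverse impact calculation: it compares each group's selection rate to the most-selected group and flags impact ratios below the commonly cited four-fifths rule of thumb. The group labels and counts are hypothetical, and a real audit under the NYC law involves specific category definitions, intersectional breakdowns, and reporting requirements beyond this illustration.

```python
# Minimal adverse impact sketch with hypothetical data; not a substitute for a
# compliant bias audit under the NYC ordinance or the Uniform Guidelines.

def impact_ratios(selected, applicants):
    """Return each group's selection rate and its impact ratio relative to
    the group with the highest selection rate."""
    rates = {g: selected[g] / applicants[g] for g in applicants}
    reference = max(rates.values())
    return {g: {"selection_rate": round(rate, 3),
                "impact_ratio": round(rate / reference, 3)}
            for g, rate in rates.items()}

# Hypothetical applicant and pass counts for two illustrative groups.
applicants = {"group_a": 200, "group_b": 180}
selected = {"group_a": 90, "group_b": 54}

for group, stats in impact_ratios(selected, applicants).items():
    # Four-fifths (80%) rule of thumb: ratios below 0.80 warrant closer review.
    flag = "review" if stats["impact_ratio"] < 0.80 else "ok"
    print(group, stats, flag)
```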

For both parties, the difficulty is that things are now far more complex and there are more stakeholders involved, as an explosion of predictive hiring vendors enters the scene with a litany of AI-based tools that are much harder to understand and that now have computer and data scientists at the helm.

The Importance of Transparency

Beyond following best practices, my guests and I agree that the common thread of compliance with the law, whatever it ends up saying, is transparency. For employers this means informing job applicants of what tools are being used, and for vendors it means being fully open about how their tools were created, how they work, and the results they have shown, to whomever is interested. This is nothing more than an extension of the trend toward data privacy and ethical use of AI that has permeated pretty much every sector of the global economy.

The key nowadays, and the reason I convened these guests, is to bring all these parties to the table to hammer out a shared responsibility that can be placed into practical action. I see this as transcending any discussion of the specifics around the NYC law.

Key Takeaways from Our Discussion

Mark suggests that issues such as the ones raised by the NYC law will continue to drive federal efforts around compliance and enforcement.

But now in the Biden administration, we've seen the EEOC really focus again on artificial intelligence, and they just released their strategic enforcement plan for the next three years. And artificial intelligence and machine learning is woven through that entire plan. So, it's pretty clear that that's going to be a significant area of focus for the EEOC as well.

At the same time, Dr. Landers explains why such oversight is a real challenge and why it is not going to get easier anytime soon.

I don't think that there's really ultimately anything different, other than it's more complicated than it used to be. A lot of what we're seeing now is what I would call a moral panic, like many other technology-related moral panics we've had over the years, where even at the dawn of the internet there was concern: oh, children will never learn ever again because all information is on the internet. It's almost like an AI uprising concern. That's not how this will end, but the path to get to wherever it's going to end is murky and winding and unclear. The questions for me right now are: what do we do in the meantime as we're working toward that? How do we ride the waves?

Mark explains the critical responsibility vendors carry on their side of the equation.

The OFCCP has said, we don't really care what the new technology is. You've still got to validate it consistent with the Uniform Guidelines. Now, there are practical challenges to doing that for some of these tools. And I think the biggest challenge is that many of the tool providers will say, we don't need to do a job analysis study because basically we'll feed it all your data and it'll figure out what to use. The trouble with this type of thing is that it's going to make correlations, and effectively it's valid because it's valid. In general, there has to be some local study and then some sort of theoretical basis for why the characteristics that the tool is screening for are valid characteristics. If the response they get from the vendor is, 'We don't have to do that because our tool is valid because it's valid, and we don't have to do local studies because it's valid for all jobs,' that's when I raise the red flags and say, I don't want to have to stand up in court and defend that, because that's not what the Uniform Guidelines say.

Transparency and compliance also come down to the lens through which they are viewed, a critical aspect that is often not considered. Richard explains how differences in the language computer scientists and I/O psychologists use to interpret the world are contributing to the current state of things.

Part of the cause of the trouble that people run into is, again, that difference in language. When you ask, "Is this valid?" to the average data scientist, that means: did it predict the thing it was supposed to predict, end of problem. When they create a model that reaches some threshold that they've internally defined as being a successful model, it is by definition valid within many data science frameworks. There was a distinct educational component to a lot of that auditing work: these words mean different things to the people you're trying to sell this product to, and you need to have a better understanding of that, as well as of how it interacts with the legal requirements that you're trying to operate within.

Mark weighs in on related tactics that can have major, and unfortunate, results for vendors when it comes to compliance around hiring bias.

They're actually talking about not creating a tool that has less potential for bias, but creating a system where, once someone has completed the tool, you'll change their score to make sure that the score is not biased. And I haven't seen a lot of that, but I've seen a few where there's a little bit of a thumb on the scale post use, as opposed to a refinement. Then you're crossing over from a tool that may have disparate impact to a tool that has built-in disparate treatment. Because you've said, we have decided to change the scores of non-diverse candidates, or to change the scores of male candidates, to make sure that they are not creating bias in our system.

People in This Episode

Catch Dr. Richard Landers and Mark Girouard on LinkedIn.

Read the Transcript

Announcer:

Welcome to Science 4-Hire with your host, Dr. Charles Handler. Hiring is hard. Pre-hire talent assessments can help you ease the pain. Whether you don't know where to start or you just want to stay on top of the trends, Science 4-Hire provides 30 minutes of enlightenment on best practices and news from the front lines of the employment testing universe. So, get ready to learn as Dr. Charles Handler and his all-star guests blend old-school knowledge with new wave technology to educate and inform you about all things talent assessment.

Dr. Charles Handler:

Hello and welcome to the latest edition of Science 4-Hire. I am your host, Dr. Charles Handler, and I have two guests today, not just one, but two guests that are experts in the field of litigation support and bias audits and all that good stuff. And we're going to talk about a smorgasbord of topics, but we're going to focus on the New York City algorithmic hiring legislation as kind of the first point of contact, a jumping-off point into what I expect to be a really interesting conversation. So, I'll go ahead and let my guests introduce themselves and then we'll get going. So, Richard?

Dr. Richard Landers:

Sure. Hi. Yeah, I'm Richard Landers. So, I'm a professor of industrial-organizational psychology at the University of Minnesota, and I conduct research on technology broadly, but particularly AI at the moment. I'm also president of Landers Workforce Science, and we do mostly audits of products that are used for hiring using AI.

Dr. Charles Handler:

Cool. And Mark.

Mark Girouard:

Great, thanks Charles. So, I'm Mark Girouard. I'm an attorney at Nilan Johnson Lewis. We're a law firm based here in Minneapolis. My focus is on labor and employment law, in particular representing management in employment law issues including a range of issues around selection and screening of applicants for employment, which touches on this artificial intelligence space. So, I've been advising both employers on their compliance with the New York City ordinance and other AI focused laws. And then I also have some clients who are assessment providers, and so advising them as well on what they have to do to help their clients comply with the laws.

Dr. Charles Handler:

Very cool. And I kind of sit in the middle of these guys and connect. I've actually worked with both of you on various projects for little touchpoints and have really admired both your opinions and work. So, I feel lucky to have you on here today. And we're just going to take a little journey, and I think the journey's going to start with the New York City legislation stuff. My phone started to ring six, eight months ago, with all the clients that I advise saying, what in the heck are we supposed to do with this? What does it mean for us? How can we make sure we're doing things correctly? And I came up with some answers, but the answers were just so vague, because the ordinance itself was relatively vague on some pretty critical topics. I'm curious, from both your perspectives, what your various clients and constituents had been thinking about and how you were able to advise them. Again, we know that there was enough gray area that things have been, thank goodness, tabled while we kind of "work it out," air quotes; what that means, who knows. I was not surprised to see it suspended for a minute, and I'm actually pretty happy, because when I tell people how to comply, certainly there's a prescription there, but you can't know for sure if you're getting it right. So, I'm just opening it up. What did you see about it and what were your thoughts?

Dr. Richard Landers:

I'd open by saying that there are very different concerns among the different constituencies here. My major clients tend to be consulting firms that are providing these hiring tools to other companies.

Dr. Charles Handler:

Okay.

Dr. Richard Landers:

Certainly, the ones that are being asked, what are we supposed to do to soothe the fears of these companies that are using our products? And so, I have less experience on the internal side, but in general there's certainly been a lot of frustration trying to figure out how to do that. The original framing of the New York City law seemed to include pretty much anything that used numbers. It was kind of my

Dr. Charles Handler:

Interpretation. <laugh> Exactly, me too.

Dr. Richard Landers:

So, it seems the most recent revision, and there are still more revisions probably coming, but the most recent is at least a little bit tighter. But it definitely is making people nervous.

Mark Girouard:

Picking up off of Richard's point there, most of the questions I've been getting from clients, and this is both clients who are employers and clients who are assessment providers, are: are our tools covered? And as Richard alluded to, under the ordinance as drafted, basically if math was used in the decision-making process, or conceivably in the tool development process, that was enough to call it an automated employment decision tool. Yeah, I think one of the positive changes we saw with the first set of proposed rules, which has carried over to the second set of proposed rules issued in December, is there's now a definition of what is machine learning or artificial intelligence that really does seem to narrow the field and frame in on the things that I think most of us would think of as AI or the use of AI in hiring. So, it's where the tool itself is defining the inputs, it's using sets of training data, those kinds of things.

That's where you think of it as artificial intelligence, as opposed to a traditional personality test that may have been validated using regression analyses to determine and refine the scoring algorithms, which under the broader definition of the ordinance could have been swept in; basically any pre-employment test could have been swept in. So, I think that's been a positive change and has given clients some direction. I also think there's been more clarity, which was really welcome, on what the threshold is for something being a decision tool. So, there was some question: if your tool is one data point in the decision-making process, is that enough to get you under the ordinance? The rules and revised rules make clear that it's really when it's determinative, or at least the most important factor that would trump other factors or trump human judgment. So, I think that that's been a positive change as well. Now I should say, from the hearing that happened just this Monday, January 23rd, there was feedback from commentators saying they felt that that has narrowed the ordinance too much now and they want to broaden it back out. So, there's definitely a push and pull going on here between what I would call the plaintiffs' counsel, the folks who are really focused more on the, I'd say, openness issues on the one side, and then the employers and consultants on the other side.

Dr. Charles Handler:

And I would say those are the exact points I would bring up when I advised folks. I was basically saying, look, a conservative interpretation here means your tool, or the tool being used, does fall under this for both of the two things you were talking about, even though it may not be an automated, non-compensatory hurdle, and even though it may be a locally validated thing that is all set up by humans. So, I think that's good motion. I would honestly prefer it to land somewhere right around the direction it is heading right now, based on the latest round of changes. The interesting thing I'll add from my experience, too, now being part of a vendor, which is a new thing for me, with people reaching out, is that the vendor themselves can do nothing.

I mean, basically having somebody come in and do an audit of how we do things and say, this is how they do things, won't let you pass that law. It has to be a third party that comes in and takes the data and looks at the data and says, yeah, this is kosher here, this is going to work. So, the vendors themselves are somewhat powerless, and you can get all the position papers and opinions and things you want, and they certainly can't hurt. But in my opinion, if you're actually in court or in a litigious situation, those papers don't really mean that much other than maybe making you a credible witness or something. So that's an important thing, I think, to think about. And at my old firm, we did third-party audits all the time; that was a big piece of what we did.

Dr. Richard Landers:

Yeah, it's something that's really changed. So, audits of course are a much older concept. We traditionally did I/O psychology-based psychometric audits, more or less concerned with the validity of a measure: does it really do the things you're claiming? And then we help provide white papers and documentation of that, and that fundamentally is still useful for assessment development. So that's not really,

Dr. Charles Handler:

Yeah,

Dr. Richard Landers:

Always. But yeah, the question now is: is there a legal burden to prove something more specific than that, rather than just a service to show a customer, oh look, we have science behind what we're doing? So that change is certainly an important one. I will say that I'm actually not quite so optimistic that regression-based tools are not covered by the current version. It depends a little bit on the interpretation of some specific words that are not defined very clearly. So, the language differences between the data science community and the traditional assessment development community become a really critical issue here. The last draft of the law that I saw has, for example, the term cross-validation: this only applies if cross-validation is used. Well, that means very different things to different people. If you take a pure data science perspective, then any amount of using a dataset to regress performance on our tool when developing an algorithm, and then applying it with some clients, wouldn't really be cross-validation in the data science sense. But it depends on who's reading that line, and it's just not defined very clearly.

Dr. Charles Handler:

So yeah,

Dr. Richard Landers:

I think there's still a lot of ambiguity, and the interplay between all of the different people who have a vested interest in these tools is fascinating to watch. I will say, if you read through the comments that have been left on the various versions of the law, you can see the constituencies clearly, with groups saying, we absolutely should define very carefully exactly what AI is, and the law should only apply to that. You often see that message from the more traditional assessment kinds of companies and groups, but you also see these more ethics-watchdog groups saying this is just the first step toward having this kind of transparency with literally every hiring instrument, every assessment used anywhere. And they're clearly pushing for a broadening of implications. So, I don't know where it'll end up, but it is certainly not settled at this point.

Dr. Charles Handler:

And again, from my standpoint of advising, and this is an interesting bias I probably have, I advised, look, you should have your audit done by a firm that understands testing and this paradigm of I/O psychology. But I know there are other firms. Cathy O'Neil, from ORCAA, she was a guest here, and she takes a very different approach. It's stats heavy, but it's also very much just looking at the big picture. And so, auditing can also mean a lot of different things, and there are a lot of different ways to do things. It'd be so interesting to see the same data set audited by a more pure data-science-type third-party auditor versus someone like myself or yourself, Richard. Well, you have way more computer science skills and stuff than I do, so I'm a lot more limited in how I do things.

Dr. Charles Handler:

You might, I mean, you might trip over both.

Dr. Richard Landers:

Now there are certainly examples of companies that have done that. Some of my clients, actually a few, have even had Cathy do audits for them, and I also did audits for them. I don't know if I can reveal any of that information, but I know it's been done, precisely because the companies are in the same position. They don't know which of these is really the better information for them, or maybe they're complementary; it's just unclear at this point.

Dr. Charles Handler:

But Mark, what's your experience been with audits, and what thoughts or advice do you have related to that kind of thing?

Mark Girouard:

So, getting back to Richard's point, I think there are what we traditionally think of as audits, which is looking at: is the tool valid? Is it doing what it says it's supposed to do? The audits in the context of the New York City ordinance are really more about looking at: is there adverse impact? And this is where I think the ordinance is unique, in then publishing that information online,

Dr. Charles Handler:

I know

Mark Girouard:

And I think what's particularly concerning for many of our clients is, well, first of all, there are a lot of gray areas still, despite the second set of proposed rules, about what data you're using to assess adverse impact. If you're using historical data, when is that appropriate? If you're using test data, which is not really defined other than the fact that it's not historical data, when is that appropriate? But at the end of the day, employers are going to be putting information online that may suggest that a tool doesn't have adverse impact where it does, because of the way that the ordinance has defined that bias audit reporting requirement. So, it's, I think, not dissimilar from a lot of the legislative action we're seeing. There's really a focus on transparency, and I think that runs through the Illinois Video Interview Act, a lot of the biometric information privacy legislation we've seen that folds in artificial intelligence, and then this act as well. New Hampshire just proposed a law that basically mirrors the New York City ordinance and will have a bias audit component. All of them have this idea of transparency floating through them. But what I worry is that the information that the New York City ordinance is requiring to be published in these bias audits isn't going to lead to any greater transparency unless and until there's more direction about what that actually looks like.

Dr. Charles Handler:

Maybe the transparency thing is just kind of a reaction to the black box idea. It's like, if you've got a neural network or a giant black box, we don't know what's going on in there. We want transparency, dammit. So, show us what you can get. I think also, from the audit side, I mean, we've all done that kind of work, and for companies that is some of the most tightly held, privileged information. Because let's face it too, a lot of times when we do these things, we don't have enough people to fill these cells, if you know what I'm saying. There aren't enough combinations of people to be able to get a read on whether there's adverse impact at all, because you then look at that and go, well, maybe there's a systemic problem here because we don't have any plaid people in here to look at. So, the idea that they would be publishing that stuff is probably pretty abhorrent to them. And then there's the burden: when we do validation studies, it's so hard to even get the data, it takes forever, and it's a lot of cajoling to get the data set or even to get people to sit for the test in a concurrent situation; to think they have to do that yearly, sure, organizations would feel like, oh my god, that's a lot of extra crap that we have to do. So, I can see those things. So, what about the groundswell from this, right? Mark, you started mentioning other states. I mean, I'm waiting for California, kind of the leader in liberal, person-protecting laws, to come up with something. What have y'all seen in a broader sense? These are harbingers of potentially more stuff to come.

Mark Girouard:

Yeah, and I think we've been talking about the New York City ordinance. As I mentioned, New Hampshire has some proposed legislation. The District of Columbia is considering an ordinance, although that's much more focused on personally identifiable information than on privacy concerns. The California Department of Fair Employment and Housing has proposed regulations that basically take their current regulations and say, when we say X, it also includes artificial intelligence. So, they're just making that clear. And then I think most importantly, this is an issue that the EEOC had been interested in during the Obama administration. It was the subject of a couple of hearings, and then things came to a halt during the Trump administration. But now in the Biden administration, we've seen the EEOC really focus again on artificial intelligence, and they just released their strategic enforcement plan for the next three years. And artificial intelligence and machine learning is woven through that entire plan. So, it's pretty clear that that's going to be a significant area of focus for the EEOC as well. In terms of commonalities between these different schemes, they do vary from state to state and locality to locality, but I'd say some common themes are, again, transparency. And that's both letting the candidate know that AI is being used in the first instance, and then many of them also say you have to say what the characteristics are that the tool is screening for.

And that opens up a really interesting can of worms in terms of artificial intelligence that I'm sure Richard can speak to better than I can, but oftentimes employers and even tool providers don't know what the characteristics are that their tools are screening for. The third part I'm seeing often running through these is the idea of informed consent. So, this is the flip side of transparency: not only do you have to be informed that AI is being used, but you either have to consent to its use or be given an opportunity to opt out of its use. And then finally, I would say, earlier it seemed that the focus was much more similar to the focus under the Uniform Guidelines on Employee Selection Procedures for sort of more traditional selection procedures, that the focus would be on disparate impact based on race and gender. We have seen, I'd say, an increasing focus on disability issues. The EEOC issued some guidance this year on the use of artificial intelligence and disability. Part of that really comes back to transparency, to employers providing enough information about the tools they are using so candidates can make an informed decision about whether they need to request an accommodation or not. But it gets really complicated when you're talking about sort of neurodiverse candidates who may have a medical condition or disability that impacts, for example, how they would perform on a timed test.

Dr. Charles Handler:

Right.

Mark Girouard:

Does that mean you now have to say this is a timed test? The EEOC would say yes. In fact, the EEOC goes farther and says that employers should affirmatively identify the disabilities that the tests they're using may implicate, which I think is a step too far. But I think it's another good indication that this transparency theme runs through all of these statutes: letting folks know that AI is being used, but increasingly also letting them know how AI is being used, so they can make an informed decision either to opt out, if the statutory scheme allows that, or, in the disability setting, whether they need to request an accommodation

Dr. Charles Handler:

Or a suitable alternative; the New York thinking was they get a suitable alternative. So, I'm like, okay, what's that, a structured interview? That's what I said. So, Richard, one of the differences I think between yourself and Mark and me is that you do research. So have you done some interesting research lately that ties into any of this, anything you want to share from that, or

Dr. Richard Landers:

So, well, my focus has mostly been on how AI can actually help or hinder, or is even different from, what we've already been doing. And the conclusion that we generally come to is that this is really just an evolution of techniques that we already have and that we've been using for some time. I mean, many of the problems that we've just been talking about aren't really new. There have been concerns, for example, even about personality testing: how do my answers on this questionnaire result in something that tells you something about me? And there's already sort of an obscurity layer, where the people consuming the tests don't have a complete understanding of exactly how it works, but the difference being that you could still simplify to that level and say, oh, you have numbers, we calculated an average, and now you have a score.

And that's at least understandable even without any expertise. Now, when we're moving to things like computer vision models, where you're trying to predict from videos or pictures, there is a much more complicated process to get to that point: instead of questions, we're now breaking down your videos into blocks of color, and there are color patterns being used to create predictions. It's just much harder to understand from a layperson perspective, but it's just an evolution of a problem we've already had, at the end of the day. So that challenge is one that I don't think is going to go away. We tend to conceptualize in terms of classes of things. So, if we're looking at computer vision, we ask questions like, well, are we capturing facial expressions in this computer vision model? Is there evidence that we're capturing whether the person has glasses or facial hair?

What is actually being picked up by these models? And the only way to diagnose it is often to send fake data or sample data, or, for example, faces that have been generated with and without glasses, to see what it does. There are these ways to poke at models to try to understand what they do, but we can't use the explanations we used to, of: you put numbers on a page and we've added them together. It's always going to be more complex than that. Yeah, I don't think that there's really ultimately anything different, other than it's more complicated than it used to be. A lot of what we're seeing now is what I would call a moral panic, like many other technology-related moral panics we've had over the years, where even at the dawn of the internet, there was concern, oh, children will never learn ever again because all information is on the internet.

That's clearly not how it's worked out. So, the same sort of situation's happening here: there's now this concern that, oh, there are going to be computers magically making decisions and no one will know what they're doing. It's almost like an AI uprising concern. That's not how this will end, but the path to get to wherever it's going to end is murky and winding and unclear. Right now, we're going to see more and more state laws, maybe see some federal action. The EEOC is going to get increasingly involved, and that will all come to a head at some point in the near future. The questions for me right now are: what do we do in the meantime as we're working toward that? How do we ride the waves?

Dr. Charles Handler:

Yeah, I think that's a good point. I mean, dustbowl empiricism, just empirically relating things and then making predictive models, has been around for a really long time. Like you said, it's more complicated now. I think I fall back on an easy answer even in these complex times, which is: do things by the formula that we know works and that we know is good for compliance. So, what are you measuring, your job analysis, your content validation, your empirical validation, just following the rules, and they're not bad rules. The flip side of it is predictive accuracy and lack of bias. I mean, they're not arbitrary rules that people just made up. So, from that perspective, it's somewhat easy. I think it just gets harder and harder when the predictors are so insanely complicated and different than what we might have seen before. And it gets cold.

When you think about it, if you look at facial recognition and the error rate of that in any place they use it, that's a very easy poster child to say, look, this is how wrong this can be. So, in any other place it can be wrong too. And I think that's an important thing to keep in perspective. I have found companies being pretty hesitant. The ones I've worked with that are global enterprise companies often have an innovation lab or something that wants to trial some of these tools, but they typically are a little bit shy of using this type of stuff. What is your experience, Mark? I know you counsel a lot of companies. I mean, they come to you and ask, should we be using this? Is it dangerous? What kind of dialogue in general do you typically have with folks in that regard?

Mark Girouard:

It's a great question, Charles, and I'd say it's evolved over time. Let's go back maybe five or six years: I was getting calls from clients saying, we're putting our assessments out for RFP, and we've got these two providers that are AI-based tools, and their pricing is coming in at a quarter of what the traditional assessments are, and so we really want to look at them. And I would say, that's great. Can you get a copy of what their Uniform Guidelines-compliant validation study reports look like? And they would ask the vendor, and there would be basically a deer-in-the-headlights moment where the vendor would have no idea what they were talking about, because the vendor was coming entirely from a data science space, didn't have any I/O psychologists, and if they had attorneys, they were concerned about protecting their intellectual property.

They weren't knowledgeable about this space. The evolution I've seen is that most of these firms are now savvy to the fact that there is a compliance regime out there and that there are things they are going to have to do to help their clients comply with their legal obligations. I've seen many of the AI-based selection firms now bring I/O psychologists in house so they can do that work, and I'm getting a lot less of that deer-in-the-headlights look. And to be honest, Charles, coming back to your point, the things we've been doing do work. The OFCCP has said, we don't really care what the new technology is. You've still got to validate it consistent with the Uniform Guidelines. Now, there are practical challenges to doing that for some of these tools. And I think the biggest challenge is that many of the tool providers will say, we don't need to do a job analysis study because basically we'll feed it all your data and it'll figure out what to use. Again, that sort of dustbowl empiricism: it's going to make correlations, and effectively it's valid because it's valid.

So, trust us. Just based on my own experience with EEOC investigators and my reading of the guidelines, I say we've got to do some sort of local study. We have to know what characteristics are important for success in the role. And the guidelines, as you say, do allow that there are some types of things you can kind of assume are important, things like attendance, but in general, there has to be some local study and then some sort of theoretical basis for why the characteristics that the tool is screening for are valid characteristics. And at that point, if you can plug into that kind of framework and write something that looks like a Guidelines-compliant validation study report, I find my clients are much more comfortable adopting an AI-based tool. If the response they get from the vendor is, we don't have to do that because our tool is valid because it's valid, and we don't have to do local studies because it's valid for all jobs, that's when I raise the red flags and say, I don't want to have to stand up in court and defend that, because that's not what the Uniform Guidelines say.

Dr. Richard Landers:

I'll say that matches my experience precisely <laugh>. A big source of clients in the beginning for me, actually, for the auditing work, was when a vendor would go to a company that had some I/Os in it, and they would ask, where are your white papers? Where are your technical manuals? And the vendor would say, we've never heard of those things, what are you talking about? And so, they would say, maybe you should go talk to Richard. So that was the source of a lot of people in the beginning; that has become a little less common. But I wanted to throw in that part of the cause of the trouble <laugh> that people run into is, again, that difference in language. When you ask, is this valid, to the average data scientist, that means: did it predict the thing it was supposed to predict, end of problem. When they create a model that reaches some threshold that they've internally defined as being a successful model, it is by definition valid within many data science frameworks. There was a distinct educational component to a lot of that auditing work to say, well, these words mean different things to the people you're trying to sell this product to. You need to have a better understanding of that, as well as of how that interacts with the legal requirements that you're trying to operate within.

Mark Girouard:

Yeah. The other thing I'd say is that I think a big part of the push for these tools has aligned with the increased focus on DE&I for many employers, and the vendors are saying, this tool is great because it is not biased. Unlike a human decision maker, it doesn't have any implicit biases. This is going to help with your DE&I goals, et cetera, et cetera. I think, as we all know, depending on the training data you feed to a tool and who's in it, it's entirely possible to create a terrific bias replication machine. And for some of the vendors, I'm really concerned that when they talk about de-biasing, they're actually talking about not creating a tool that has less potential for bias, but creating a system where, once someone has completed the tool, you'll change their score to make sure that the score is not biased. And I haven't seen a lot of that, but I've seen a few where there's a little bit of a thumb on the scale post use, as opposed to a refinement. Then you're crossing over from a tool that may have disparate impact to a tool that has built-in disparate treatment. Because you've said, we have decided to change the scores of non-diverse candidates, or to change the scores of male candidates, to make sure that they are not creating bias in our system,

Dr. Charles Handler:

Kind of tantamount to quotas and that kind of stuff that you're not really supposed to do. So, one of the other things I think is important is what's used in lieu of job analysis sometimes by these things: they just Hoover up job descriptions and try to say they know everything about a job from the job description, which we know is famously garbage most of the time. And so, when you're starting from that situation, it's not necessarily really good. The other thing: the assessment market's always been very fragmented, and now I'd open it up to anything that's trying to predict how someone will do in a job and advise on that. Some of those are just more of the recruitment process automation systems, which a lot of times have a recommendation engine built in, where recruiters or whoever are clicking, I like this candidate, I like this candidate, I like this candidate, and that's teaching the thing. So, talk about baking bias into the whole process. I mean, you're just opening it up to people's subjective views on who's good. Maybe there's truth in that, but it's no different than the problem we've already had; it's just amplified, and then it just learns and perpetuates itself. So those are the things that kind of scare me, not the formal assessments as much.

Dr. Richard Landers:

That also interacts with some of the requirements. Even in the New York City law, if you have an AI tool that does who knows what and gives a prediction at the other end, as long as the human on the other end of that is fully making an independent decision, that seems to not be an automated tool. So, it really depends on how these things are defined.

Dr. Charles Handler:

Yeah, yeah. No, there's a lot of gray area. Another interesting thing, just coming back to the EEOC: I remember, I think it's been a couple of years, at SIOP, talking to folks from the EEOC who said, we're now ready to force you to open up your black box. We could come after you and say, show us what's in the black box at any time. So beware. I don't know how much that scared people or if they ever even really did that, not that it's a bad idea. And I guess from a computer science standpoint too, can you open up a black box and really figure out what's going on in there? Can an external person audit that and understand it? Sometimes I've heard they're so complicated that that's impossible.

Dr. Richard Landers:

I mean, the unsatisfying answer to that is: it depends. There are many choices that a modeler can make that render the algorithms they generate more or less interpretable. The good news, too, is that when you have the kind of dataset sizes that we typically have in selection, when you're working with data sets under 500 people, then you're not going to find a lot of value in the extraordinarily complicated approaches that you don't find in the less complicated approaches. And the less complicated ones are much more interpretable. So, if that decision is made prudently, then they are quite interpretable, at least as interpretable as regular regression-based approaches are. You have coefficients, you have numbers tied to test scores, and you can interpret that. But as soon as you get into the vision space, as soon as you get into natural language processing, things like that, the number of what are more or less arbitrary decisions by modelers as they pursue good prediction just skyrockets, and it becomes increasingly difficult to trace back what's going on.

Especially when you get into more complex modeling like so-called deep learning approaches, which are really kind of advanced neural network models, these sequences of very complicated models that feed into each other. It's realistically impossible, for any given job candidate, to try to reverse engineer how each prediction was made. And so that's when we move into the space I mentioned earlier, where you're basically sending test data or false data in order to just see what comes out the other end. And that's about as good as you can do when you get to that kind of model complexity.

Dr. Charles Handler:

And then what happens if someone says, hey, by the laws, or not really the laws, but by the governing body here, we need you to produce this so we can look at it, and it's that complicated, then it's impossible. So, does that get them off the hook? Then what happens? Is it a hung jury <laugh> there? It's hard to know. It's hard to know.

Mark Girouard:

And I think, Charles, just to pick up on that point quickly, I do think, from my experience with EEOC investigators, that that's when they would say, okay, explain to me what the tool is doing. What are the characteristics it's screening for? Why are those characteristics important to success in this job? And if you can't explain that, we've got a problem here.

Dr. Charles Handler:

Yeah.

Dr. Richard Landers:

Yeah. There's a big difference between a computer vision model that just takes people's faces and one that decodes faces into specific emotional expressions and then ties those emotional expressions to specific questions. The way that it's done, the specific engineering process used, really can influence what you get out of the thing. So, if the vendor makes no effort to do that, then you just have no options. But it's a decision made very early in the process when trying to make these models.

Dr. Charles Handler:

And I mean, I come back to, well, when you talk about sample size too, I think that's where you get into the people we talked about earlier who were just trying to generalize their model to anybody in a bucket. So, they may have a product that's been used on a hundred or two hundred thousand people across all these situations. Then you've got a lot of data, but it's so varied that I don't know that there's anything truthful there. Again, I don't have experience with that large a set of data, but in general, these are the things we're going to continue to be talking about, and they're not simple. It was a lot simpler when you could basically create some questions, look at their psychometrics, and predict with them; that seems kind of stone age now, although it works.

So that's the thing. And I don't know that I've seen a lot of superior things out of some of these more complicated approaches at this point, but we're on a precipice, and I'm just going to open the can of worms: ChatGPT, GPT-3, everybody's going to be talking about that, and I don't want to talk about it here. But that to me is just opening the door to even more complicated things that we've got to deal with, just when we're thinking we're getting our footing here. And the regulation, the attention, all that just lags behind the technological advances, which, again, is something in our field we've complained about for a long time. It takes us a long time to do these studies and this research, whereas when you're following the money, when you're trying to innovate, it goes quickly. So, we'll do this again in a year and see what's changed, et cetera. But I really appreciate both you guys. As we play out here, I'd love for you both to let our audience know how they can find you, how they can follow you, all that good stuff, and then we'll close it out.

Dr. Richard Landers:

Sure. Yeah. I mean, I'm accessible via email; anybody can reach out. I'm also on Twitter, and especially there I often share some of the research work that we're doing on AI. We have an auditing paper coming out soon, appearing in American Psychologist, and we've got some other machine learning work coming up soon. So, all sorts of news will be coming forward about that.

Dr. Charles Handler:

Awesome. And Mark?

Mark Girouard:

Yeah, sure. So, I think probably the easiest way to find me is on our firm's website, which is nilanjohnsonlewis.com. Nilan is N-I-L-A-N, and then I'm also on LinkedIn. So, Mark Girouard, last name is G-I-R-O-U-A-R-D.

Dr. Charles Handler:

Awesome. So, these guys are at the top of their fields in what they're doing, and I feel really lucky to have had your input. I wish we could walk away with all the answers, but the answer is: be careful out there. Be careful with what you're doing, think it through, and if anything, default to what we know works.

Science 4-Hire is brought to you by Sova Assessment Recruiting Software powered by science. Sova's Unified Talent Assessment Platform combines leading-edge science with smart, flexible technology to make hiring smarter and easier. Visit sovaassessment.com to find out how your organization can provide an exceptional candidate experience while building a diverse, high performing and future ready workforce.

May 28, 2023
Dr. Charles Handler