AI Audits Are Coming to HR

What HR Tech companies and professionals need to know.
Phil Dawson
February 17, 2023
5 min read

Automation is a growing trend in HR

Companies are increasingly adopting automated systems to support, and sometimes even replace, humans in key steps of their hiring and employee management processes.

According to Forbes, all Fortune 500 companies report using some form of automation in their HR pipelines. And it’s not just large companies that are jumping on this trend: industry surveys have shown that at least 25% of all US-based companies plan to increase their use of automated systems in hiring and talent management over the next few years.

From an economic perspective, it’s not difficult to understand why companies are embracing automation. Automated systems offer a highly scalable way to add efficiency to HR and remove many critical bottlenecks when it comes to identifying and hiring new talent. The significance of these benefits is clearly exemplified by the growing number of highly successful startups that have specialized in developing AI-powered talent-management software.

Concerns about HR automation

Despite their clear benefits, a number of concerns surrounding the increasing use of these systems remain.

Many critiques highlight the potential for automated systems to exhibit unintentional biases, which may disparately impact certain populations. One high-profile example is the case of Amazon, whose resume-screening algorithms were discovered to systematically favor male applicants, due to an over-representation of men in their training data.

Critics also point out that these systems often suffer from a lack of transparency: algorithmic systems are intrinsically difficult to test and evaluate for fairness, and job applicants who are adversely impacted by them may never know how, or why.

Moreover, even companies that want to build ethically responsible systems face a practical obstacle: comprehensive, well-integrated toolkits for testing and debugging algorithms with sufficient precision are scarce, which makes responsible AI difficult to achieve even when software developers know what kinds of pitfalls to look out for.

Governments are taking notice

Concerns about the potential adverse effects of automated decision-making are becoming pervasive enough that many governments are starting to take action.


Over the past several years, a number of US jurisdictions have proposed or enacted legislation that places limits on how automated systems can be used in hiring decisions. Examples include Illinois (2019; 2022), Maryland (2020), Washington DC (2021), and California (2022). At the federal level, the proposed Algorithmic Accountability Act would require companies to assess the impacts of algorithmic systems used in decisions that affect the terms or availability of employment, or employee management practices.


These regulatory developments aren’t limited to the US, either. The EU’s draft Artificial Intelligence Act (AIA) requires companies to assess the impacts of AI systems on health, safety and rights. What’s more, AI systems used in recruitment, promotions, and performance evaluations have been pre-emptively classified as “high-risk” and therefore subject to stringent regulatory obligations. Canada has taken a similar approach in a recently proposed AI law, requiring companies to conduct AI impact assessments and adopt measures to identify, assess and mitigate AI bias in “high-impact” systems on an ongoing basis.

Enter NYC Local Law 144

Many of the legislative proposals put forth so far either have yet to be finalized and passed into law, or else pertain only to specific, narrowly defined use cases (such as banning the use of facial analysis algorithms in video-based job interviews).

However, New York City is poised to become the first jurisdiction to require independent bias audits of automated tools used to perform or support HR decisions affecting NYC-based job candidates and employees. NYC Local Law 144, known as the Automated Employment Decision Tool (AEDT) Law, was passed in November 2021 and took effect on January 1, 2023.

Top-level take-aways

Before delving into some of the more nuanced requirements and impacts of the bill, there are a few high-level take-aways that all organizations should be aware of.

Firstly, NYC’s legislation formally defines an AEDT as:

“…any computational process, derived from machine learning, statistical modeling, data analytics, or artificial intelligence, that […] is used to substantially assist or replace discretionary decision making for making employment decisions…”

What’s important to note is that this definition doesn’t apply only to systems that employ AI/ML-based technology. Rather, nearly any software-based candidate- or skill-assessment system could fall under the purview of this law. Nor does the definition cover only systems that fully automate decisions: it also includes AEDTs that “substantially assist” human decision-making at any point in an organization’s HR pipeline.

As such, the broad scope given to the definition of AEDTs is likely to have wide-ranging impacts across many different businesses and industrial sectors.

Secondly, the bill makes one very important stipulation: it is the end-users of AEDTs, not the makers of these software tools, who will be held responsible for complying with the requirements set out in the legislation. This means that any company that uses automated systems in its HR pipeline will have to demonstrate compliance independently, and won’t be able to rely purely on assurances from the software’s manufacturer as a shield against regulatory consequences.

Although the law is a municipal ordinance, the large number of American and international businesses with operations in New York City means its impact will reach well beyond the city. With AI regulations coming down the pipeline worldwide, international companies based in NYC may view the new requirements as an opportunity to prepare for compliance with regulatory regimes globally. Meanwhile, software vendors are likely to find themselves under new market pressure to develop tools that have been more comprehensively vetted for bias, simplifying downstream compliance audits for their clients.

As such, HR software developers, employers and recruitment agencies that don’t operate in NYC should still be paying attention, as the precedents now being set are likely to shape the regulatory outlook for the use of algorithmic systems in HR in other American cities and beyond.

Regulatory requirements & responsibilities

So, what are the formal benchmarks that companies will have to demonstrate compliance with? The bill lays out a number of key requirements:

  1. All AEDTs must be independently audited for bias prior to January 1, 2023, and audited annually thereafter (an illustrative sketch of a core audit metric follows this list).
  2. Summaries of these audits must be released and made easily accessible to the public (for example, via a company’s website).
  3. Organizations using AEDTs for hiring or assessment must inform applicants that these tools are being used. They must also provide information about the specific traits or skills the tools are intended to measure.
  4. Job applicants must be explicitly informed that they are allowed to opt out of the AEDT assessment, and that they can request an alternative assessment process or accommodation.
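
To make requirement 1 concrete, here is a minimal sketch, in Python, of the kind of core metric a bias audit typically reports: selection rates by demographic category and the resulting impact ratios. The decision log, category labels, and the four-fifths threshold used to flag results are assumptions for illustration only; the law itself does not prescribe a particular methodology or threshold.

    from collections import Counter

    # Hypothetical decision log from an automated screening tool:
    # each record is (demographic_category, was_selected).
    decisions = [
        ("group_a", True), ("group_a", True), ("group_a", False), ("group_a", True),
        ("group_b", True), ("group_b", False), ("group_b", False), ("group_b", False),
        ("group_c", True), ("group_c", True), ("group_c", False), ("group_c", True),
    ]

    # Selection rate per category: number selected divided by number assessed.
    totals = Counter(category for category, _ in decisions)
    selected = Counter(category for category, was_selected in decisions if was_selected)
    selection_rates = {cat: selected[cat] / totals[cat] for cat in totals}

    # Impact ratio: each category's selection rate divided by the highest
    # selection rate observed across all categories.
    best_rate = max(selection_rates.values())
    impact_ratios = {cat: rate / best_rate for cat, rate in selection_rates.items()}

    for cat in sorted(selection_rates):
        # The four-fifths rule is a common heuristic, not a threshold set by the law.
        flag = "  <-- potential adverse impact" if impact_ratios[cat] < 0.8 else ""
        print(f"{cat}: selection rate {selection_rates[cat]:.2f}, "
              f"impact ratio {impact_ratios[cat]:.2f}{flag}")

Run against a real decision log, a report like this would be broken down by each protected category the audit covers; the point here is only to show how simple the underlying calculation can be.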

Companies that fail to comply face fines that accrue daily, with penalties increasing for repeated violations. They could also be exposed to legal action from individuals affected by these systems.

Important unanswered questions remain

To be sure, there are a number of important question marks and ambiguities surrounding the current legislation.

For instance, while the bill requires audits to assess disparate impact on certain protected classes — including gender, race, and ethnicity — it does not address other important attributes, such as age, disability status, or sexual orientation.

Other ambiguities concern the geographic scope of the law. As written, the bill applies to jobs located in NYC, but it is unclear whether the requirements will apply in certain edge cases, such as an NYC-based company hiring for remote positions.

However, perhaps the single most important limitation is that the law provides no specific definition of an “audit” of AEDT software. By not setting clear criteria for measuring bias or acceptable thresholds, the law leaves the door open to a range of testing and assessment methodologies that may prove insufficient.

How should companies respond?

To prepare for the new regulatory landscape, employers and recruitment agencies can take the following proactive steps:

First, ask your HR software provider about its model and bias assessment practices, and whether these have been independently verified by a third party. Software partners that cannot provide information about their model audits may pose a greater risk of disruption to your ongoing HR operations.

Second, it’s time to prepare for your independent bias audit. The law’s requirement for “independent” auditing, combined with the deep technical expertise and tooling it takes to assess complex AI models for bias, makes it likely that employers and recruitment agencies will have to partner with third parties to meet their new regulatory obligations.

AI assessments

Navigating the thorny world of AI assessment services can be a difficult task, and not all service providers are created equal. With this in mind, what sorts of features and competencies should businesses look for when shopping for AI assessment services?

  • Comprehensive risk management: The ethical and legal ramifications of AI/ML systems can be complex to navigate. Therefore, organizations should ensure that AI assessment service providers have a deep understanding of how model behavior connects to emerging legal and regulatory requirements as well as best practices on AI risk management.
  • Diagnostic granularity: AI/ML models are complex, and their behavior is rarely fully captured by broad, generic accuracy tests. To thoroughly check for possible biases and improve model fairness, assessments need to employ an extensive range of tests, often specialized for specific model and data types. Businesses should ensure that service providers have the technical expertise and tools to administer comprehensive suites of tests tailored to each unique business scenario, providing fine-grained insights into the strengths and weaknesses of your software (a minimal illustration of this kind of slice-level testing follows this list).
  • Cross-functional actionability: Designing and fine-tuning AI/ML systems to ensure that they meet business objectives requires close collaboration between a wide range of teams within your organization. An ideal assessment service should provide summary measures and recommendations that are intelligible to both technical and non-technical stakeholders, translating the nuances of your models into clear action items that can be communicated across multiple levels of your organization. This is crucial not only for meeting your regulatory requirements, but also for developing more robust AI/ML models that better serve your core business needs.
  • A globally harmonized framework: As AI assessments become a fixture of regulatory regimes, businesses should ensure that any technical assessment service they procure is rooted in a globally harmonized framework, such as independent assessment and certification schemes. This will help legal, risk & compliance and product teams scale assessment activities and recommendations in other jurisdictions as new AI regulations come into force.
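
As a minimal illustration of the slice-level testing mentioned under “Diagnostic granularity”, the Python sketch below computes the same accuracy metric separately for each demographic slice, and for intersections of slices, so that weaknesses hidden by the overall average become visible. The records, attribute names, and slicing scheme are hypothetical and are meant to illustrate the general approach, not any particular vendor’s methodology.

    from itertools import groupby
    from statistics import mean

    # Hypothetical evaluation records for a screening model:
    # (gender, age_band, model_prediction, ground_truth_label)
    records = [
        ("f", "under_40", 1, 1), ("f", "under_40", 0, 0), ("f", "40_plus", 0, 1),
        ("f", "40_plus", 0, 1), ("m", "under_40", 1, 1), ("m", "under_40", 1, 0),
        ("m", "40_plus", 1, 1), ("m", "40_plus", 0, 0), ("f", "under_40", 1, 1),
        ("m", "40_plus", 1, 1),
    ]

    def accuracy(rows):
        # Fraction of records where the model's prediction matches the label.
        return mean(1.0 if pred == label else 0.0 for _, _, pred, label in rows)

    def slice_report(rows, key_fn, name):
        # Compute the metric separately for every slice defined by key_fn.
        for key, group in groupby(sorted(rows, key=key_fn), key=key_fn):
            group = list(group)
            print(f"{name}={key}: accuracy={accuracy(group):.2f} (n={len(group)})")

    print(f"overall: accuracy={accuracy(records):.2f} (n={len(records)})")
    slice_report(records, lambda r: r[0], "gender")                # single attribute
    slice_report(records, lambda r: r[1], "age_band")              # single attribute
    slice_report(records, lambda r: (r[0], r[1]), "gender x age")  # intersectional slice

In practice an assessment would cover many more metrics (false-positive and false-negative rates, calibration, and so on) and far more slices, but the pattern of disaggregating every metric by slice is the core idea.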

The bigger picture

Ultimately, the purpose of an AI model assessment doesn’t have to be limited merely to avoiding fines or protecting your organization from legal and regulatory action. Getting into the practice of routinely carrying out in-depth technical assessments can help you gain crucial insight into the inner workings of your software, which is the best way to ensure that your AI/ML models are as accurate, reliable, and trustworthy as possible.

Partnering with model assessment providers can play a key role in helping businesses capitalize on their AI investments. In this sense, increased calls for regulatory oversight may turn out to be a blessing in disguise — at least for companies that adopt a rigorous and well-informed approach to AI/ML testing.

Remember that this doesn’t have to be a complex process. With the right expertise and technology, independent assessments can be straightforward.

About Armilla AI

Armilla AI has developed the first comprehensive quality assurance platform for automated and AI systems. Its technology enables users to oversee every phase of model development, detect hidden biases and vulnerabilities, and receive automated alerts about abnormal behaviors. Armilla leverages its platform to help clients accelerate time to production for trustworthy AI systems, and lead independent AI assessment, audit and certification activities in line with emerging regulations and industry standards.