The question every organization asks when evaluating accessibility tools seems simple: if we choose this particular solution, how much manual testing will we still have to do?
This has been difficult to answer industry-wide for a few reasons, not least because many players in the industry (including us) maintain anti-benchmarking clauses in their terms of service. It’s also true that there have been few third-party data sets available for analysis.
Looking at audits
To address this gap, we decided to ask friends in the industry to send us a copy of an audit that was recently done for one of their websites. We have assumed that these audits were performed entirely manually, but even if some tools were used, the point is that they were performed on a production website, to the highest standards in the auditing business.
How do we know? Because to date, we’ve collected 35 of these audits, and virtually all were done by the two most reputable leaders in the auditing business. Here are some basics of the data set as of December 2025:
Descriptive statistics for the audit set
| Statistic | Value |
| --- | --- |
| n | 35 |
| Avg. number of issues reported | 178.6 |
| Avg. number of critical issues | 43.5 |
| Avg. number of serious issues | 105.4 |
Now that they’re in hand, we’re able to run some helpful analysis on them.
Because each issue reported in an audit is listed and described, we can assess whether that issue would have been automatically detectable by our tools. Our ability to detect problems comes from a code base of checks we’ve developed, which we call our validation set. And because our very large validation set includes the much smaller set of axe-core validations, we can also assess which of these issues would have been detectable by a tool running axe-core. (Because we include the axe-core validations, it isn’t possible, even in theory, for a tool running only axe-core to detect issues that we don’t.)
So we can “backcast,” if you will, the performance of Evinced vs. axe-core over the range of the issues reported in the manual audits we have gathered to date.1
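The backcasting logic described above can be sketched as a simple set computation: axe-core’s rules form a subset of the larger validation set, so every audit issue is checked against both. The rule IDs and audit issues below are illustrative placeholders, not real Evinced or axe-core rule names.

```python
# Hedged sketch of "backcasting" tool coverage over manually reported audit
# issues. Rule IDs and issue labels are hypothetical, for illustration only.

AXE_CORE_RULES = {"missing-alt-text", "low-contrast"}  # hypothetical rule IDs
EVINCED_RULES = AXE_CORE_RULES | {"keyboard-trap", "broken-aria-relationship"}
# The axe-core rules are a subset of the larger validation set, so anything
# axe-core detects, the superset necessarily detects too.

def coverage(audit_issues, rules):
    """Fraction of manually reported issues matched by an automated rule."""
    detected = [issue for issue in audit_issues if issue in rules]
    return len(detected) / len(audit_issues)

# A toy "manual audit": each entry is one reported issue, by type.
audit = [
    "missing-alt-text", "keyboard-trap", "focus-order",
    "low-contrast", "broken-aria-relationship", "ambiguous-link-text",
]

axe_pct = coverage(audit, AXE_CORE_RULES)    # 2 of 6 issues
evinced_pct = coverage(audit, EVINCED_RULES)  # 4 of 6 issues
assert EVINCED_RULES >= AXE_CORE_RULES        # the subset property from the text
```

Running the same computation over every issue in every collected audit, then averaging, yields the coverage percentages reported below.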
What we found
Across the audits gathered, we found that Evinced could detect nearly 63% of the issues reported, whereas axe-core could detect roughly 23%.
Percentage of accessibility issues detectable in our sample of manual audits
| Tool | % Detected |
| --- | --- |
| Axe-core only | 22.6% |
| Evinced | 62.8% |
That’s nearly three times the coverage of axe-core alone, on actual production websites. This is in line with our prior work comparing our coverage, but the data set here is particularly simple and easy to understand.
But the raw percentages, while clear, aren’t the most important takeaway. What truly matters is what those numbers mean for your program’s workload, and for its costs.
What this means for accessibility programs
Audits, even manually performed audits, have their place. They are an important part of understanding the accessibility of a production website.
But our analysis suggests something important: the opportunity to optimize a manual audit process is substantial. This data raises the question: if you can detect a problem automatically, why wouldn’t you?
To us, if an accessibility issue can be detected earlier in the development process, it should be. Automated tooling can run at a developer’s or designer’s desk, in context, without slowing delivery or requiring specialized accessibility expertise. When those issues are caught early, they never need to appear in an audit report at all.
That shift matters because every issue found late represents work that had to wait: triage, reproduction, scheduling, and remediation long after the original developer’s context has faded. By contrast, audits are for spot-checking the automation and, most importantly, for deploying human talent against what’s new: new designs, new approaches, new components, new products.
It also frees human accessibility experts to focus on the critical and, frankly, much-needed work of driving adoption.
This opportunity, while substantial, can also be stated in the negative: if you spend substantial resources having humans do what automation could do, then by simple economics you are starving your organization of resources for work like driving adoption.
There is no free lunch in accessibility.
Our next blog post will tackle just how expensive this lunch really is. Stay tuned!

