In this episode Joe Devon, co-founder of Global Accessibility Awareness Day (GAAD), talks about how generative AI can both help and harm digital accessibility — and what it will take to tilt the balance toward inclusion. Joe shares his personal motivation for the work, real-world stakes for disabled users across web, mobile, and developer tooling, and compelling stories that illustrate why accessible design is a human-rights issue as much as a compliance checkbox. He digs into AI’s current and future roles: from improving caption quality and auto-generating audio descriptions to evaluating how well code-gen models produce accessible UI by default. Joe introduces AIMAC (AI Model Accessibility Checker), a new benchmark comparing top models on accessibility-minded code generation, shares what the results reveal, and explains how model providers and engineering teams can practically raise the bar with linters, training data, and cultural change. He closes with concrete guidance for leaders, why involving people with disabilities is non-negotiable, and how solving for edge cases makes AI—and products—better for everyone.
Announcements
- Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems
- When ML teams try to run complex workflows through traditional orchestration tools, they hit walls. Cash App discovered this with their fraud detection models - they needed flexible compute, isolated environments, and seamless data exchange between workflows, but their existing tools couldn't deliver. That's why Cash App relies on Prefect. Now their ML workflows run on whatever infrastructure each model needs across Google Cloud, AWS, and Databricks. Custom packages stay isolated. Model outputs flow seamlessly between workflows. Companies like Whoop and 1Password also trust Prefect for their critical workflows. But Prefect didn't stop there. They just launched FastMCP - production-ready infrastructure for AI tools. You get Prefect's orchestration plus instant OAuth, serverless scaling, and blazing-fast Python execution. Deploy your AI tools once, connect to Claude, Cursor, or any MCP client. No more building auth flows or managing servers. Prefect orchestrates your ML pipeline. FastMCP handles your AI tool infrastructure. See what Prefect and FastMCP can do for your AI workflows at aiengineeringpodcast.com/prefect today.
- Unlock the full potential of your AI workloads with a seamless and composable data infrastructure. Bruin is an open source framework that streamlines integration from the command line, allowing you to focus on what matters most - building intelligent systems. Write Python code for your business logic, and let Bruin handle the heavy lifting of data movement, lineage tracking, data quality monitoring, and governance enforcement. With native support for ML/AI workloads, Bruin empowers data teams to deliver faster, more reliable, and scalable AI solutions. Harness Bruin's connectors for hundreds of platforms, including popular machine learning frameworks like TensorFlow and PyTorch. Build end-to-end AI workflows that integrate seamlessly with your existing tech stack. Join the ranks of forward-thinking organizations that are revolutionizing their data engineering with Bruin. Get started today at aiengineeringpodcast.com/bruin, and for dbt Cloud customers, enjoy a $1,000 credit to migrate to Bruin Cloud.
- Your host is Tobias Macey and today I'm interviewing Joe Devon about opportunities for using generative AI to improve the accessibility of digital technologies
Interview
- Introduction
- How did you get involved in AI?
- Can you start by giving an overview of what is included in the term "accessibility"?
- What are some of the major contributors to a lack of accessibility in digital experiences today?
- Beyond the web, what are some of the other platforms and interfaces that struggle with accessibility?
- What role do/can generative AI utilities play in improving the accessibility of applications?
- You recently helped create the AI Model Accessibility Checker (AIMAC) to benchmark which coding agents produce the most accessible code. What are the goals of that project and desired outcomes from its introduction?
- What were the key findings from AIMAC's initial benchmarking results? Were there any surprises in terms of which models performed better or worse at generating accessible code?
- The automation offered by using agentic software development toolchains reduces the manual effort involved in building accessible interfaces. What are the opportunities for using generative AI utilities to act as an assistive mechanism for existing sites/technologies?
- Beyond code generation, what other aspects of the AI development lifecycle need accessibility considerations - training data, model outputs, user interfaces for AI tools themselves?
- You co-host the Accessibility and Gen AI Podcast. What are some of the common misconceptions you encounter about AI's role in accessibility, either from the AI community or the accessibility community?
- There's often tension between moving fast with AI adoption and ensuring inclusive design. How do you advise engineering teams to balance innovation speed with accessibility requirements?
- What specific accessibility issues are most amenable to AI solutions today, and which ones still require human judgment and expertise?
- As AI models become more capable at generating code and interfaces, what guardrails or validation processes should engineering teams implement to ensure accessibility standards are met?
- How do you see the role of accessibility specialists evolving as AI tools become more prevalent in the development workflow? Does AI augment their work or change it fundamentally?
- For engineering leaders building platform and data infrastructure, what accessibility considerations should be baked into foundational systems that AI applications will be built upon?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on accessibility awareness?
Contact Info
Parting Question
- From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?
Closing Announcements
- Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email hosts@aiengineeringpodcast.com with your story.
- To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
Links
- AIMAC
- Global Accessibility Awareness Day (GAAD)
- GAAD Foundation
- AltaVista
- Cursor
- Accessibility
- Braille Display
- Ben Ogilvie
- VT-100
- Ghostty
- Warp Terminal
- LLM-as-a-Judge
- FFMPEG
- Aria Tags
- Axe-Core
- MiniMax M1
- Codex Mini
- Qwen
- Kimi
- Google Lighthouse
- GitHub Copilot
- Be-My-Eyes
- WebAIM
- XRAccess
- XR == Extended Reality
- Deque University
- Fable accessibility feedback organization
The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
Hello, and welcome to the AI Engineering Podcast, your guide to the fast moving world of building scalable and maintainable AI systems. Unlock the full potential of your AI workloads with a seamless and composable data infrastructure. Bruin is an open source framework that streamlines integration from the command line, allowing you to focus on what matters most, building intelligent systems. Write Python code for your business logic and let Bruin handle the heavy lifting of data movement, lineage tracking, data quality monitoring, and governance enforcement. With native support for ML and AI workloads, Bruin empowers data teams to deliver faster, more reliable, and scalable AI solutions.
Harness Bruin's connectors for hundreds of platforms, including popular machine learning frameworks like TensorFlow and PyTorch. Build end to end AI workflows that integrate seamlessly with your existing tech stack. Join the ranks of forward thinking organizations that are revolutionizing their data engineering with Bruin. Get started today at aiengineeringpodcast.com/bruin. And for dbt Cloud customers, enjoy a $1,000 credit to migrate to Bruin Cloud. Your host is Tobias Macey, and today I'm interviewing Joe Devon about opportunities for using generative AI to improve the accessibility of digital technologies. So, Joe, can you start by introducing yourself?
[00:01:23] Joe Devon:
Yes. My name is Joe Devon. I am the co-founder of Global Accessibility Awareness Day, and this is a day that goes viral every year to the tune of at least 200,000,000 people. We stopped counting once we hit that number on social media. And as a result of this going viral, I found myself square in the accessibility movement, but I started out as a developer. I started coding at 13 years old. Right now, I'm chair of the GAAD Foundation, but I did have an interesting history with a variety of different jobs. I worked for a search engine company called Compassware that predated AltaVista, let alone Google; I was a backend programmer for americanidol.com; and I started my own dev shop and built it to about 100 employees before COVID had some other ideas for how to exit that company. And I guess last but not least, I'm the chief vibe coding officer for my family.
So that's where we end up in 2025, 2026.
[00:02:26] Tobias Macey:
And do you remember how you first got started working in the AI space?
[00:02:30] Joe Devon:
It depends on what you call AI. I was asked to start a group called the SemWeb Group, the Semantic Web Group, in LA. A lot of the AI researchers were very disappointed that artificial intelligence didn't take off all those decades ago, so a lot of them moved to the Semantic Web space, which was just as successful, at least in terms of the marketing. And so I got to know lots of people in that space, a lot of people dealing with graph stores and such, and then I really didn't touch it, though I was friends with the people that ran the SemTech Biz group. I was on the board there. And then as that became a little bit less active, I kind of forgot about it for a while. And then all of a sudden ChatGPT comes out with that version that just goes totally wild.
I found Cursor and the rest was history. You see, I hadn't touched code for about fifteen years. I'd become one of those people running a company, and I'm not good at the context switching between dealing with human beings, being the head janitor, head salesperson, head everything of the company, and what's much nicer, dealing with code. But Cursor and AI really helped me get back into the code. So now I'm coding morning to night, but it's not really coding. It's just telling the AI what to do and trying to keep up, basically.
[00:03:58] Tobias Macey:
And so you have found yourself very much enmeshed in this space of accessibility. And for purposes of this conversation, I'm wondering if you can give your definition of what that term encompasses.
[00:04:12] Joe Devon:
Yeah. It's just about making technology work for people with disabilities. And what prompted me to write the blog post that became Global Accessibility Awareness Day was personal, which is how most people get into it. My dad was in his mid-80s and he couldn't bank anymore. He'd gotten phished. To get to the bank took all day, and the bank's website was inaccessible. And it sort of felt like a human rights thing to me because, even though banking is a business, you can't really survive in today's society without banking. And there was no reason that my dad, who spoke 11 languages and was brilliant, should be locked out of it. And so it's really about making technology work for people like my dad, for people that might be blind and can't see the mouse on the screen, so they can't use a mouse.
They can't see the screen, so they need a screen reader to speak it out to them. If you're deaf, you need captions. If you're deaf blind, you use a braille interpreter, and so you need a transcript for the braille display. And then you have color contrast and visual clarity for people that are getting older or that just have low vision. So it runs the gamut, but it's really about making sure that technology is usable. It's usability, but for more people.
[00:05:27] Tobias Macey:
Yeah. Absolutely. I mean, it's definitely one of those terms that becomes all encompassing, but it has a lot of breadth and it's obviously a very important aspect, actually, for my day job work in higher education. So every piece of technology that we produce is federally mandated to have a high degree of accessibility, so it's something that I deal with quite frequently. And I know that you mentioned some of the different types of disabilities that will be a hindrance to interacting with digital technologies, and also there is a lot of affordances for people with varying levels of capability in the physical world as well, notably curb cuts when you're going through a crosswalk.
But I'm wondering if you can talk to some of the ways that having those impairments while interacting with digital technology can act as a substantial hindrance to people's day-to-day activities, especially as the world continues to be digitized.
[00:06:29] Joe Devon:
Well, let's take a simple example of if you're deaf, right? And think about the time of COVID where everybody moves to doing things on Zoom or one of these screen-sharing tools. I can't remember the word, but you know what I'm talking about, the virtual meetings. If you are deaf, you would probably prefer email, writing, sending things that you can read as opposed to people speaking on a screen. So even if you can read lips well, the best you'll get is maybe 70%, and it'll depend on the person. So you need to have these captions. And if the captions aren't good, then you're really out of luck because all of a sudden, you went from your way of communicating to everybody being on a screen. And now imagine that you're reading the captions, right, and you've got five, six, seven people on the screen. Who is speaking? Did the captions come out right, especially live captions, or were they more like craptions, as they're affectionately called in the business? And imagine you've just had a one-hour meeting and you had to read a book, like literally read the text of a book because people are speaking so quickly. So the quality of those captions is really vital, and it really can impact your day-to-day activity. And I specifically brought up somebody who's deaf not only because I have friends who have told me about this experience, I didn't make up that phrase about reading a book, that was something that a friend of mine and former employee of mine said, but because most people tend to think of screen readers, and they don't even realize that something like Zoom can be really difficult for somebody who's deaf, let alone if you're deaf blind and you need to read a transcript because you have a refreshable braille display. So I'd say that's just an example, but you can take that to any individual disability.
[00:08:22] Tobias Macey:
Your point about some of these video conferencing utilities as one of the major touch points for the need for accessibility is very enlightening, because most of the time when people think about accessibility beyond the physical world, they're probably thinking about a website. And I'm wondering if you can just talk to some of the various types of digitally mediated interfaces and experiences that need this level of investment in accessibility improvements to allow people to actually take part in experiences that are so broadly shared?
[00:08:58] Joe Devon:
Yeah. So let's think about mobile apps. Right? The mobile app has empowered so many activities that were previously impossible. And take for an example, and I use this one because I was at a restaurant, we had a whole big party for GAAD, Global Accessibility Awareness Day, which from now on I'll just call GAAD for short. And there were a whole bunch of people that were blind there, and they were all taking Ubers at the end. And we were out on a street that was very dodgy, and I wanted to make sure everybody got in their Uber. So imagine that you're at a restaurant, and before you had Uber or Lyft or one of these apps, you had to trust somebody to call up the cab company. You had to know when the cab was there, so you had to trust a stranger to tell you that they were there. You had to trust the cab driver. You had to trust them with the money that you were paying them. And there are so many stories of people that were abused in various ways as a result of this situation, and the mobile app changed all of that. And now imagine that you're in a restaurant and there was an update to the software that made it inaccessible.
Five minutes ago, it was accessible and you got to the restaurant, but now you're sitting outside, the restaurant's closing, you're in a bad part of town, and your mobile app doesn't work anymore. That is a life-threatening situation, and that's why this is so important, and it might not be obvious if you've never been in that situation. Oh, and let me just add one more thing there. My mentee, Ben Ogilvie, who heads up accessibility for ArcTouch, created the State of Mobile App Accessibility report, where they took the top mobile apps from about five different industries and graded them on the most common user journeys. We're not talking about edge cases here. And 72% of the common user journeys had a poor or fail rating in their report. So those numbers are pretty bad. And for the web, it's even worse. So this is not theoretical. These are real problems. And now let's take another example, since I know you've got a more technical audience. I created this benchmark for accessibility called AIMAC, and I built an entire CLI for it because I was sure on the command line it would be accessible. Why wouldn't it be accessible? You just have words on a screen. There's so little visual stuff that should trip you up. So after building this entire CLI, I stupidly waited till the end to turn on VoiceOver or a screen reader to test it, and I was shocked that almost nothing came out of it. I used all the common tools people use to build those CLIs, and literally, you could hear nothing, and I had to throw out the entire CLI and all of that work because there's almost no CLI that's accessible. There's like one app in Go that will help you sort of build something somewhat accessible. And Claude Code, while I'm doing testing and I'm on Claude Code so much, I have to turn it off because it's just all so much noise. And I have a friend who's blind who does code with Claude Code, but he says it's very inaccessible and hard, but he's able to do it. I don't know how, but he's obviously better at using a screen reader. So those are just a couple of different examples of the variety of ways that the different technologies interface with people with disabilities.
[00:12:19] Tobias Macey:
On the command line utility, I also wonder if it's a factor of which terminal you're using as well, because so many different terminals have varying levels of support and features even for some of the various control codes. And then you get into some of the fancy terminals such as Warp that even change a lot of the paradigms that we're used to from the VT-100 terminals that we've been using for the past, I don't know, four or five decades.
[00:12:43] Joe Devon:
Yeah. No doubt that that's a big part of it. But I've used Ghostty, the new one, and that's just a passion project from Mitchell Hashimoto, who co-founded HashiCorp. And I kind of touched base with him about seeing if it could become a little more accessible, but it's very hard to ask somebody who's just building a passion project in their spare time. There's no monetary thing, so what lever do you really have there to push it too hard? But he said he'd like to see it get there. But honestly, like, they're all a mess.
[00:13:16] Tobias Macey:
So now digging into some of the role of AI in this overall accessibility space, you touched briefly on the the idea of captions or transcripts for virtual meetings. Obviously, there are closed captioning for commercially produced media, but not necessarily for some of the more mass media. So thinking in terms of YouTube, TikTok, all these various video platforms. I know that YouTube will auto generate transcripts, but they're obviously going to be of highly variable quality, especially depending on the level of sound quality and the recording, etcetera. Curious if you can just talk to some of the ways that you see AI playing a role in helping to improve the accessibility of digital properties as well as helping to circumvent some of the shortcomings of accessibility for some of those same media, whether that's websites, mobile apps, etcetera?
[00:14:12] Joe Devon:
Yeah. So I call this AI crossing the Rubicon. Essentially, in any field, when AI gets better at something than a human, it has crossed that Rubicon. And when it comes to captions, it's about when is AI going to do it better than humans? And I don't know if we're there or close to there, but I think I could build within a day or so a system that'll probably be better than most humans, because what I would do very simply is try five different, you know, ASR tools to create captions, have them time-coded, and then compare the captions. And if you compare the captions across three or five different systems and you see where they agree or disagree, and then you pull out the diffs where they disagree and maybe make it semi-automated, you'll probably get almost perfect captions. But even if you automate it and use an LLM-as-a-judge at the end to decide between the differences, you can probably get really far in terms of automated captions, but that's just you hacking it up yourself. Right? But it can go so much deeper than that. So I have a story where I had a group chat with two friends who are both blind, and I had three videos I wanted to show them that were in a series, and they literally only had visuals. There was no sound in them at all, so it would have been completely useless to send them. So I'm like, oh, why don't I try to vibe code a solution?
So I vibe coded it, because, again, I hadn't touched code in fifteen years. I just used, I think it was Claude Code, to have FFmpeg download the video, and then sent it to another model to analyze every frame and describe it, and then had another model summarize all of those frames and turn it into an audio description track. Then I used another model for text to speech that turned that into an audio file. And then again, I used AI with FFmpeg, which in the past would have taken forever to learn exactly the right commands and all that, to stitch the audio together with the video. And an hour later, I had three videos with a perfect audio description. It was just sublime. And now it'd take me about five minutes to get that script to take a new video. And this was, like, six months ago. Now the models could have probably done it in ten minutes.
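As a rough, hypothetical sketch of the caption-consensus idea Joe describes above (compare several ASR systems' time-coded captions, keep the segments they agree on, and route disagreements to an LLM-as-a-judge or a human), here is a minimal Python example. The `transcripts` layout and the `adjudicate_with_llm` helper in the usage comment are assumptions, not any particular vendor's API.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Rough text similarity between two candidate captions for the same segment."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def consensus_captions(transcripts: list[list[str]], agree_threshold: float = 0.9):
    """transcripts: one list of time-aligned caption segments per ASR engine.

    Assumes at least two engines and that every engine produced the same
    number of time-coded segments. Returns (accepted, disputed).
    """
    accepted, disputed = [], []
    for segment_candidates in zip(*transcripts):
        # Pick the candidate most similar to all the others (the "medoid").
        scores = [
            sum(similarity(c, other) for other in segment_candidates)
            for c in segment_candidates
        ]
        best = segment_candidates[scores.index(max(scores))]
        # Average agreement with the *other* engines (drop the self-similarity of 1.0).
        agreement = (max(scores) - 1.0) / (len(segment_candidates) - 1)
        if agreement >= agree_threshold:
            accepted.append(best)
        else:
            # Disagreements go to an LLM-as-a-judge or a human reviewer.
            disputed.append(segment_candidates)
    return accepted, disputed

# Hypothetical usage with three ASR engines' output for the same clip:
# accepted, disputed = consensus_captions([engine_a_segs, engine_b_segs, engine_c_segs])
# final = accepted + [adjudicate_with_llm(d) for d in disputed]  # placeholder helper
```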
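And the audio-description workflow he walks through might look roughly like the following Python outline. The FFmpeg commands are standard, but `describe_frame`, `summarize_descriptions`, and `synthesize_speech` are placeholder stubs for whichever vision, summarization, and text-to-speech models you choose; this is a sketch of the workflow's shape, not the exact script Joe built.

```python
import subprocess
from pathlib import Path

def run(cmd: list[str]) -> None:
    """Run an external command and fail loudly if it errors."""
    subprocess.run(cmd, check=True)

def describe_frame(path: Path) -> str:
    """Placeholder: call your preferred vision model on a single frame."""
    raise NotImplementedError

def summarize_descriptions(descriptions: list[str]) -> str:
    """Placeholder: condense per-frame descriptions into a narration script."""
    raise NotImplementedError

def synthesize_speech(script: str, out_path: str) -> None:
    """Placeholder: turn the narration script into an audio file with a TTS model."""
    raise NotImplementedError

def build_audio_description(video: str, out: str = "described.mp4") -> None:
    frames_dir = Path("frames")
    frames_dir.mkdir(exist_ok=True)

    # 1. Sample one frame every 5 seconds with FFmpeg.
    run(["ffmpeg", "-y", "-i", video, "-vf", "fps=1/5",
         str(frames_dir / "frame_%04d.jpg")])

    # 2. Describe each frame, then 3. summarize into a narration script.
    descriptions = [describe_frame(p) for p in sorted(frames_dir.glob("*.jpg"))]
    script = summarize_descriptions(descriptions)

    # 4. Turn the script into an audio track.
    synthesize_speech(script, "narration.mp3")

    # 5. Mux the narration with the original video, copying the video stream as-is.
    run(["ffmpeg", "-y", "-i", video, "-i", "narration.mp3",
         "-map", "0:v", "-map", "1:a", "-c:v", "copy", "-shortest", out])
```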
[00:16:33] Tobias Macey:
And one of the, I think, main driving factors of why accessibility on the web, on mobile devices, etcetera, is generally so poor is that it is a non-zero amount of effort and investment to add and support those various accessibility features. I know on the web there are things like ARIA tags to assist the screen reader, and the need for appropriate descriptions for some of the different content, such as alt text for images for people who are visually impaired. And there are also issues as far as keyboard navigability, making sure that the tab ordering on the various elements is correct so that it maps to the user experience. And because a lot of these visual mediums are going to be changing so rapidly as designers come up with new layouts, as you add new pages, etcetera, there's a decent amount of effort involved.
But with the fact that these LLMs have gotten so good at code generation, I'm wondering how you're seeing that change some of that calculus for developers and engineers and some of the people who own these businesses, how that changes the ways that they think about that investment and whether you're seeing any overall improvement in the level of accessibility across the board.
[00:17:48] Joe Devon:
Definitely not across the board. I would say that it's generally pretty bad if you just ask it without any kind of steering to just write some code for you. It's pretty good at alt text. It actually gets that right pretty often, most of the time, but it'll make all kinds of mistakes just like humans do and just like most of the code that it's trained on. And as a result of that, as I had been doing all of this Vibe coding, I really realized that this was going to be a make or break moment for accessibility. If we got the AI model companies to pay attention to accessibility, then we would have a much more accessible web and world. And if we don't, then it's just gonna be an order of magnitude worse.
And so that's why I created AIMAC, which is the AI Model Accessibility Checker. I vibe coded the first version in three days, which I'm sure was very inaccessible, but it did get the concept across, which was: ask the top AI models to generate a web page and then use axe-core, which is an automated testing tool, to grade all of them and see how well all of the AI models did. So that first version was easy to write, but very hard to make accessible, and there were lots of bugs in it. And as anybody that's been playing with these AI coding tools knows, the models keep getting better, but in those first versions, every time I fixed a bug, it introduced two more bugs.
And before releasing it publicly, I found so many fundamental errors that it really required becoming a coder again and really looking at the data model especially. Once I defined the data model myself and did not allow the AI to decide it, and I also pushed everything to the database because I was more of a database guy back in the day, then I was able to steer it. But as time goes on, if you don't learn how to steer the model, you're not gonna get good results. So now it's great to see who does well and who does poorly, and I was really surprised with the results.
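For readers who want to see the mechanics, here is a minimal sketch of that generate-then-grade loop, assuming Playwright for Python and axe-core loaded from a CDN. `generate_page_with_model` is a placeholder for whatever model API you use, and this is not the actual AIMAC implementation.

```python
from playwright.sync_api import sync_playwright

AXE_CDN = "https://cdn.jsdelivr.net/npm/axe-core@4/axe.min.js"  # assumed CDN path

def generate_page_with_model(model: str, prompt: str) -> str:
    """Placeholder: ask the given model to generate a standalone HTML page."""
    raise NotImplementedError

def axe_violation_count(html: str) -> int:
    """Render the HTML, inject axe-core, and count accessibility violations."""
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.set_content(html)
        page.add_script_tag(url=AXE_CDN)
        results = page.evaluate("() => axe.run()")  # evaluate awaits the promise
        browser.close()
    return len(results["violations"])

def benchmark(models: list[str], prompt: str) -> dict[str, int]:
    scores = {}
    for model in models:
        html = generate_page_with_model(model, prompt)
        scores[model] = axe_violation_count(html)
    # Lower violation counts mean more accessible generated code.
    return dict(sorted(scores.items(), key=lambda kv: kv[1]))
```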
[00:19:56] Tobias Macey:
Now, you mentioned this AIMAC, or AI Model Accessibility Checker, utility that you've built, and I'm wondering if you can give an overview of what your goals are with that utility and some of the ways that you're hoping to see it used to improve the distribution of accessibility across the web and various other digital properties?
[00:20:14] Joe Devon:
Yeah. Well, we literally just came out with the new version, which compared all of the top models. And my goal is that the model companies take a look at the results, and since they compete on benchmarks, to have a benchmark out there that gets noticed and picked up by them. And then they're going to compete on it, and they're going to do a great job, because it's not that hard if you do the training for it. The data is known, the mistakes are known. This is not something that should be all that difficult for them to achieve compared to everything else they've done. And I bring up the results only because it will help illustrate why this is gonna make a difference. So OpenAI killed the competition.
The top four models are all GPTs, and they've got five models in the top 10, and their open source model came in 11th. So they have a lot to be proud of. And then you've got MiniMax M1 tied for fourth place with Codex Mini. So here you have, I think it's a Chinese model, a small model that is doing extremely well and way better than other models that cost a lot more and that are really well known. Even Qwen got two models in the top 10, and Kimi came in ninth, and Anthropic only got to the tenth spot. So here you have an expensive model that everybody's talking about, and for accessibility it was, I think, Haiku that just squeaked into the top 10, and Opus 4.5 is something like 19th on the benchmark, and it's very expensive. But the big surprise is Google, which came in with Gemini 3.0 Pro. They came in dead last out of 36 models. And I think that when something like that happens, all of a sudden it takes the shine off of an amazing model. And I don't think that they're gonna come in last again. So the real goal here is that AI model companies pay attention to accessibility.
Honestly, as much work as I put into this benchmark, I should mention that it is a GAAD Foundation activity and that we partnered with ServiceNow to come out with it. I should at least give them props. They're not gonna make this mistake again. And our mission is to make sure that digital technology is accessible. So what better way to do it than this?
[00:22:28] Tobias Macey:
I think it's interesting too that all of these models, especially the big frontier models, have been trained on effectively the entire Internet. And the fact that a large portion of that Internet is not accessible is likely one of the contributing factors to why a lot of these frontier models do so poorly at generating accessible code: they don't have enough in their training corpus, or at least a large enough portion, for that to be the de facto means of code generation. And I'm curious, as you have gone through this work of benchmarking and published the results, what are some of the ways that you can imagine, whether it's the large model providers or somebody fine-tuning an open-weights model, building up training corpora that provide good examples of how best to achieve accessible interfaces, and just some of the outcomes that you're hoping to see from this benchmark and some of the ways that you're hoping to maintain that set of benchmarks and results as these models continue to evolve and improve?
[00:23:30] Joe Devon:
Well, one thing that's interesting is WebAIM does a yearly survey of the top million websites on the Internet, just their homepages. And the data there, at least in terms of the worst errors that come out, points to color contrast. Something like 80% of those websites' top issues were color contrast, and we're finding the same thing. Maybe even a higher percentage of the issues are color contrast. If all of them got color contrast right, they would be well on their way to way more accessible code. So that's super interesting. But what I forgot to add earlier was that this is just the first benchmark, which is very easy to do. If they go in and, in their training, they put in linters, axe-core at the base, and they have a linter for accessibility, then on all the automated stuff they should get a perfect score. This is not difficult for them to set up. They just have to actually try it. Right? And Google has Lighthouse, right there in their developer tools, and Lighthouse is based on axe-core. They literally have the tools to do all of this. They just need to put the attention on it. Right? But in future versions, we're going to do some manual testing and we're going to make it a little bit harder. But in terms of them having enough data, they have the data. It's all in the latent space. They have every piece of data they need.
They just need to steer the model toward the right corners of the code, the same way that they're gonna steer the models to the right corners of the web that they've all slurped up. They clean it up; you've just gotta clean up that data. And I did hear the other day that there's a corpus of data where, I think, what they've done is they took the accessibility bugs for all the open source projects on GitHub, and they took the before and after. So you have a ton of data that shows you what the fixes look like. So this is not a very challenging problem to solve. You just need some attention to it, and then we're gonna be in a way better place.
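Since color contrast dominates both the WebAIM data and these results, here is a small Python sketch of the WCAG 2.x contrast-ratio math that automated checkers like axe-core and Lighthouse apply, checking one foreground/background pair against the AA thresholds.

```python
def relative_luminance(hex_color: str) -> float:
    """WCAG 2.x relative luminance of an sRGB color like '#1a73e8'."""
    r, g, b = (int(hex_color.lstrip("#")[i:i + 2], 16) / 255 for i in (0, 2, 4))

    def channel(c: float) -> float:
        # Linearize each sRGB channel per the WCAG definition.
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    return 0.2126 * channel(r) + 0.7152 * channel(g) + 0.0722 * channel(b)

def contrast_ratio(fg: str, bg: str) -> float:
    lighter, darker = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)

def passes_aa(fg: str, bg: str, large_text: bool = False) -> bool:
    """WCAG AA: 4.5:1 for normal text, 3:1 for large text."""
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)

# Example: light gray text on a white background fails AA for normal-size text.
print(round(contrast_ratio("#999999", "#ffffff"), 2))  # roughly 2.85
print(passes_aa("#999999", "#ffffff"))                 # False
```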
[00:25:30] Tobias Macey:
One of the other interesting aspects of thinking about AI in the context of accessibility is how accessible the tools that are powered by AI are themselves. So for some of these various chatbots, are they accessible? Do they work well with screen readers? For some of the AI coding tools, how do they rank in terms of accessibility? You mentioned Claude Code, and one of your friends was having a poor time with that. And I'm curious how you're seeing some of these model providers and various models and coding tools either lean into or fail to lean into some of those questions of accessibility so that they can be used by as broad of an audience as possible?
[00:26:14] Joe Devon:
Yeah. I haven't personally tested it, so it's hard for me to give a robust answer, but just generally from my friends in the space. I've co-presented at GitHub Universe with some folks at GitHub. I know them pretty well. So I know that there are accessibility people on the teams that are building some of their AI products. And from what I understand, if you use Copilot on GitHub, it is accessible. I know the people that have built that, and they assure me it is, so I don't think they would lie about that. And from what I've heard from some others, most of the others aren't, or they are to various degrees. But I assume that people have figured out how to get around it, because most of my friends that are blind or have different disabilities are using ChatGPT and some of these other consumer products, so they found ways to do it. And then you also have to think about specialized services like Aira and Be My Eyes, which came out with a Be My AI product. And they were a launch partner with OpenAI, which is probably why OpenAI did so well in the benchmark, to tell you the truth, because I know that they've got a relationship there and that they're paying attention to accessibility, at least to some degree. And it's also been my experience using their models that it's been a lot easier to steer them. But with Be My AI, what it is, is it started out as a volunteer group where people who can see would connect up with people who are blind, who say, okay, here's my camera. They can see the screen and then say, does what I'm wearing match? Or, can you find my keys on the floor? Or whatever they wanted.
And the selling point was the anonymity. Whereas Aira was another company, and they went the other way, where they trained people to be professional about how they are being helpful, which is just another way to go. So some people don't like the fact that they might know, or sort of know, the person on the other end that acts as their eyes, whereas in the other group, they want that professionalism. So it all depends. I don't know as much about Aira, but I think they've also started to build an AI product. But with Be My AI, now you don't even need a human for a lot of the tasks. You have an AI that's looking at it. And I got a demo when it first came out, with a friend of mine who's blind. I said, tell me what's on this menu. Like, let's say I wanna get a lemon meringue pie. Can I check if that's on the menu? And he took a picture of it and it showed it right away. Today, I think we're already far enough along where that's not that impressive, but at the time it was like, oh my. And so if you're blind and you're using these tools, it's really big. And the Meta Ray-Ban glasses, I know a lot of people that use them, especially for airports, because at the airport you can have an AI tell you where to go, where the gate is, you know, where to sit, like everything. It really can kind of direct you pretty well. So it's been a bit of a game changer for some.
[00:29:09] Tobias Macey:
And so digging now into some of the perception of accessibility, the role of AI there, but also just generally the importance of accessibility. I know that you also work in the podcast space. And so I'm wondering, as you've been doing your own podcast and discussing some of these aspects of accessibility, the role of AI, what are some of the misconceptions or falsehoods that you've encountered as you've discussed some of these topic areas with folks?
[00:29:43] Joe Devon:
Well, I mean, nobody that comes on the podcast is really a doomsayer when it comes to AI, at least not yet. But in the industry there have definitely been plenty of doomsayers from the very beginning, and I still see a lot of it around. And I'm told sometimes not to say what I believe is gonna come because it's not very popular in the accessibility space. I think part of it is just fear, and there's definitely legitimate fear about job loss, which I think is already happening, because if a CFO can cut payroll even on the perception that AI is going to change things, they're gonna do it, and they have been doing it. And that part is upsetting and concerning. But from the beginning, they were saying this is going to be bad, it's going to be bad. And I was like, parts of it might be bad, but a lot of it is going to be a game changer and it's going to help. You know, the WebAIM Million that I mentioned earlier, that report showed that 97% of the web was inaccessible.
And after six years, we've now gotten it down to 95 or 94% inaccessible among the top million sites. We have an industry here that's been around for some decades, and that's the best we can do, improve it by a few percent. I mean, the report isn't 100% perfect, because some of the issues are not as bad as other issues, but it's still not a great situation. And so if AI can help improve that, then I'm all for it. And what I try and tell people is you're not gonna stop AI. It's here, and it's gonna grow. And if you just leave it to others, if nobody comes out with a benchmark, if nobody talks about it, and nobody goes out and, in essence, sells that this is important, right? Because I hate sales in one way, but on the other hand, you have to do it in order to get people to understand your mission and your vision.
And if we don't get the people building these products to understand how they impact folks, then we're gonna leave the game up to them. And we already know what happens there. You're gonna have that 97% inaccessibility. We have to engage. We have to be a part of this world, and we can't go in and just naysay everything. You have to be positive about it, even though there are negative aspects. I'm terrified that there might be job loss and that this time is different. Technology always created more jobs, and this time feels different. It does. Maybe it'll create more jobs. Maybe it'll take away more jobs. So far, it has probably taken more jobs. Right? You know, I worked for an organization and AI decimated their main income driver, and they had mass layoffs, including me. And I had AI in my title.
So there you go.
[00:32:26] Tobias Macey:
Yeah. No. It's definitely a strange and interesting time that we live in right now. And I think that, to the point of accessibility in particular, AI acts as an amplifier of whatever you put into it. And as we mentioned, a lot of the web is inaccessible, and so the code that we generate from these tools is also inaccessible because we're amplifying the inputs. And for people who are invested in improving accessibility, as you said, they need to make sure that they're guiding these tools in a direction that enforces some of the accessibility requirements. You mentioned the need for incorporating a linter into that feedback loop for the AIs. What are some of the elements of education and training, and just some of the ways that engineering teams should be thinking about incorporating accessibility as a first-order requirement into the work that they're doing, as a means of amplifying its reach and capability as we turn the corner into this new and uncertain era that we're entering?
[00:33:33] Joe Devon:
Number one is actually working with people with disabilities. If you actually see what your product looks like from the perspective of someone with a disability, you will never look at your product the same way. I'm officially a public speaker. I've brought people to tears talking about this; I've had moms come to me crying because of the impact that it's had on their kids and how moved they were about all of this. But nothing that I've ever done or said compares to this one time where my co-founder of GAAD, Jennison Asuncion, who is blind, went into a travel company, and he is a world traveler. When I met him the first time, he had just done a red-eye from Canada. I think it was from Toronto to Vegas, did a red-eye, went to a wedding of a YouTuber, by the way, Tommy Edison, the blind film critic. And then he took another red-eye the same day. So he didn't even stay overnight at the hotel in Vegas. He went to another city, went right off the plane to keynote, and then did sleep over that night.
So it's like the third country, and then he took another trip to Chicago the next day. This is exactly the kind of person that is this travel company's audience. And when he came in, I remember we were in a room full of people, of engineers, mobile developers, and then he got stuck in a keyboard trap just trying to register. Before he could even do anything else, he was like, I can't get out of this. What's going on here? I can't see anything. I don't know what's happening. And the entire room went silent. And the impact that he had there is something I could never possibly replicate, because that's your work, and you're looking at the impact on somebody that's your user, your core user base. It's not that, because they're blind, they can't perfectly be the persona that you are after.
[00:35:30] Tobias Macey:
Yeah. That's a very powerful vignette and representation, and I think that it is those personal stories that have the most lasting impact, because it's all well and good to say, oh, you need to make sure your website's accessible because you're losing out on x percent of your audience or potential revenue streams. But everybody, I'm sure, either has someone in their day-to-day orbit who has some form of impairment, whether it's something that you're aware of or not, or has at least encountered somebody who has those impairments.
[00:36:04] Joe Devon:
Can I add something here, since you bring that up? Well, first of all, their accessibility team then told us that they heard from their team: oh, I used to not really think much of those accessibility bugs and kind of ignored them, but now I get it, and now I'm going to pay attention to it. So that impact was so important. But I wanted to also mention that I had somebody that worked for me who was a designer who was colorblind. And it took years for me to find out, even though it's an accessibility company, but no designer wants to be known as, oh, I'm the colorblind designer, because it's gonna impact their career. So you just never know what people are facing in their day-to-day life. Right? And he was a great designer, by the way.
[00:36:45] Tobias Macey:
And the other piece of it too, one of the common follow-ons from all these accessibility conversations that I've experienced, is that beyond opening up the possibility for a certain portion of your audience to be able to interact with your properties, it also improves the user experience for non-impaired people. So one of the examples that I've often heard is that making it easier to do keyboard navigation makes it simpler for the busy parent who's juggling a kid in one arm and a phone or tablet, or, you know, a laptop, in the other, trying to get things done. It makes it so that their life is easier.
[00:37:25] Joe Devon:
Yeah. And you really hit the nail on the head there too, because with the AI coding, the little secret that people are beginning to realize is that CLIs are better than APIs. If you can build a CLI first, you're going to be able to interact with your agent much better. And this is because computers are kind of looking at it the same way as humans do with a screen reader. So essentially people are like, oh, wait, the computer needs to understand what's going on, and the CLIs are not very accessible. And it's like the first time that people started to pay attention to this, because it's their agents that are communicating and they benefit from that very same case. I didn't explain that very well, but hopefully all the words eventually
[00:38:13] Tobias Macey:
arrange themselves well enough to be clear. That's a very valid point too: making it more accessible for humans also makes it easier for the computer to introspect and be able to improve the inputs to its feedback loop to make sure that it's producing appropriate outputs. And so I was curious as well, what are some of the areas of improvement that you would like to see in terms of the types of tooling for doing some of that measurement of accessibility? You mentioned axe-core as a means of determining whether a website is accessible, but what are some of the elements that still require human judgment to be able to make an appropriate determination of accessibility, and what are some of the ways that we can improve the tools that are available to these agents to accelerate their feedback loops and make accessibility at a broader scale much more part of that inner loop?
[00:39:13] Joe Devon:
The biggest gap is computers being able to launch a screen reader or any assistive technology. There is no cloud platform that has all the major screen readers, for example. And I think that there are some licensing issues that will stop that from ever happening. But even if you could, say you buy the licenses for yourself or for your organization, right, and you allow an agent to do that testing directly, that would be a game changer, I think, in terms of automating the testing. You still need that human being to look at, well, what does this sound like to a human being? What is it? How does it come off? I mean, accessibility professionals can get a lot of the way there, but it's not the same if you're able to navigate without a screen reader. If you're 100% a screen reader user and you have no choice to turn that off and understand what is going on without it, you're gonna catch things that someone popping up a screen reader just for the purpose of testing won't catch.
And so that human element, I don't know if it will ever go away. But again, are we gonna hit a Rubicon moment? I'm not supposed to say this, but we are. We are. There will be a time where AI will be able to run every single screen reader, as long as you've bought a copy for yourself, and look at the code from every single angle and identify the differences between every type of AT. It'll get better than the human at doing that, and then we're gonna be in a much better place.
[00:41:25] Tobias Macey:
And I know that there are a number of standards for the web for various accessibility technologies. I'm curious what you're seeing in other digital arenas, whether that is things such as augmented reality or virtual reality. Mobile devices are obviously a key area. But as the entire world gets more and more digitized, what are some of the areas where we're lacking appropriate or widely adopted standards for the accessibility technologies to be able to interact with?
[00:41:28] Joe Devon:
Well, when it comes to XR, you have a group called XR Access that is really good. And I don't know that we're ready for standards at this point, because, I don't know, is XR ever gonna take off? I have every single version of the Quest, and the few times that I've used it, I love it. But that big thing on me just is not comfortable. What was the Apple product called again? The Vision Pro, I think it's something like that. Yeah, I'm not sure. I tried it in the Apple Store. It's an amazing technology.
These things are just fantastic, but I don't know how anybody's gonna use them. Even the Ray-Bans, they're really cool, but you have the weirdness factor from the people on the other side who don't wanna be filmed by everybody all the time. It's certainly helpful if you can say, okay, let me do some visual recognition and then go to LinkedIn and find the entire history of the person that I'm looking at. That can be useful. Like, you know, as a public speaker, you meet so many people, and they often will come up to you and you can't tell, is this somebody that I should know and should be embarrassed that I don't know? Or did I once meet them ten years ago because they gave a talk? And so it'd be super useful to be able to do that visual recognition and pull up the LinkedIn, but it's also super creepy. And we've come to accept a lot of things I never thought we would, but that one's a tough one. And in a way, I hope we don't accept it. And I don't know if this technology is going anywhere. Blockchain was fascinating, but all I've seen is honest people getting cheated a lot. I have no interest in it. I know some people that love it and that are really good at it technically, but I can't stand it, so I just don't engage in that so much. So I don't know that I can give you as good an answer as you might want. Sorry.
[00:43:22] Tobias Macey:
Oh, that's absolutely fine. And then another dimension of this as well is the ability for software engineers and technologists to improve the accessibility of the products that they're creating because these LLMs accelerate the development cycles. It also lowers the barrier to entry for non-engineers to actually get involved in the creation process. And I'm curious how you're seeing accessibility professionals, people who are more on the testing and validation side, benefit: how is the improved accessibility, to misappropriate the word, of these code generation tools improving their ability to get more involved in the creation process and the remediation process for accessibility shortcomings?
[00:44:16] Joe Devon:
I don't know too many people that are really doing much. I know maybe two, three, four people. I haven't seen it have the impact that I'd like it to, but it's still early. I think the problem is you have to put in the hours. You have to put in the miles, and if you don't, then you can't steer these models. It took me forever to get it to a point where I could ask a question about, is this accessible, and steer it to hopefully give me a good answer. For example, I got Claude Code to the point where I told it which sources I wanted it to look up. Anytime I asked a question, I said, I want you to look it up, scrape an article from these sources, and then tell me whether this is accessible or not, or how to build it accessibly, or whatnot. And it worked for like two weeks, and then something changed, and a lot of people have complained about Claude Code's quality dipping. But all of a sudden, it started to quote all of these sources, but it lied. It consistently wanted to defend the code that was there. It didn't wanna change anything. It didn't want to evaluate if it was or was not accessible, so it would change what was on the page and then say, this is what it says.
So I would keep steering it more and more and say, give me a direct quote. You've lied to me so often that I'm going to double-check it, and it would still lie. Right? And then I started to move to Codex recently, especially after I saw how well OpenAI did, and that's not happening on Codex. I'm actually getting really solid answers, but you have to put in the miles of playing with all these different models. And I don't know too many people who are, or if they are, they're not being very loud about it. So I wouldn't say that I've seen the impact yet, but now that I've gotten this benchmark where I want it to be, I definitely have some ideas for products that either I'll open source or, depending on what it does, sell and try and help people get there. But they'll get there. I think it's another year or two, and then it'll start to happen.
[00:46:23] Tobias Macey:
And so for people who are on engineering teams, particularly in leadership roles, what are some of the ways that they should be thinking about accessibility, their responsibilities for improving it, and the core technologies that they should be investing in to help improve the ability for as broad of an audience as possible to interact with what they're building, particularly in the context of the various AI capabilities that are being sprinkled through virtually everything these days?
[00:47:01] Joe Devon:
Every organization has their own quirks and their own data, and one of their differentiators is the data. So if I were part of an engineering organization and my task was to focus on accessibility, I would start with taking the entire corpus of data, of code, and the accessibility bugs that are being tracked, and then generate data sets that show, for my particular use case, what the before and after looks like. What is good code? What is bad code for my code base? And then use that to train models, or to fine-tune models, and improve whatever they put in front of the engineers. And then, of course, another aspect is you do need to have some kind of chatbot that is going to give you some good direction. You should be able to ask your chatbot accessibility questions and have it answer them. Most of the organizations that are doing accessibility testing and such definitely have some chatbots. The one that I have access to is Deque's, and it does a pretty good job. They have, I think it's called, the Deque Bot. And if you buy a subscription, it's totally worth it; if you're a developer and you wanna learn accessibility, I would definitely go to Deque University.
WebAIM has some good articles. You know, the W3C, there are several organizations with good articles, but Deque University will take you through it in a thorough manner, and they've got a really good chatbot, which is probably for sale to organizations. So, on Slack or Teams or whatever tool you use, you need to be able to ask a chatbot questions and get reasonably good answers. And then you need to have some agents that are sitting on your code base, that are looking for accessibility bugs on every commit, that are paying attention to the user issues that come up, and that try to vet and validate them and automatically do a PR.
And it really requires some effort to be robust and to steer the model well, but it's definitely doable. I think we finally reached the point, and we're just talking the last few weeks, where you can build really good bots that will write the code to fix it as soon as the bug comes out. It shouldn't be that hard. And what I have in mind is building something that really sources what it says and proves it. So you say, okay, here's a quote, but it's a real quote, not a lie. This is the source. This is the fix. All you have to do is vet it and accept it. And once you know how to fix that kind of bug, you're not gonna have to look it up again.
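As a hedged illustration of the before-and-after training-data idea, here is a small Python script that mines a repository's git history for commits whose messages mention accessibility and writes each commit's message and diff to a JSONL file. The keyword filter and output schema are assumptions, and a human or an accessibility linter should still vet what lands in the dataset.

```python
import json
import subprocess

def git(args: list[str], repo: str) -> str:
    """Run a git command in the given repository and return its stdout."""
    return subprocess.run(["git", "-C", repo] + args,
                          capture_output=True, text=True, check=True).stdout

def build_a11y_dataset(repo: str, out_path: str = "a11y_fixes.jsonl") -> int:
    # Find commits whose messages mention accessibility fixes (assumed keywords).
    hashes = git(["log", "--all", "-i", "-E",
                  "--grep=a11y|accessibility|aria|wcag",
                  "--pretty=format:%H"], repo).split()
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for commit in hashes:
            message = git(["show", "-s", "--format=%B", commit], repo)
            diff = git(["show", "--format=", commit], repo)  # the before/after patch
            out.write(json.dumps({"commit": commit,
                                  "message": message.strip(),
                                  "diff": diff}) + "\n")
            count += 1
    return count

# Hypothetical usage:
# n = build_a11y_dataset("/path/to/your/repo")
# print(f"wrote {n} before/after examples")
```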
[00:49:36] Tobias Macey:
And as you have been working in this overall space of accessibility technologies, and in particular investing in the benchmarking of these code generation agents and just experimenting with the various ways that AI can improve the accessibility of digital experiences, what is one of the most interesting or unexpected or challenging lessons that you've learned in the process?
[00:50:01] Joe Devon:
It's really about the culture. You can sit and come up with a regulation that you have to be accessible, and then the organization will focus around the legality of it and try and hit the letter of the law, or you can focus on the culture. And there are two aspects to it. If you can get buy-in, top-down buy-in, and bottom-up and middle-out, to reference that wonderful TV show Silicon Valley, you suddenly can have an organization that will operate on all cylinders. But in most organizations, you don't have the finance department behind you. You are underfunded. You've got one to 10 people supporting an organization of often 100,000 people or more, with end users sometimes in the millions, and you've got a handful of people. That is overwhelming.
So what you have to do is focus on the culture. And the example I love to give is: suppose you're a designer and you can't get a budget for accessibility, but you do the learning yourself and you care about your craft. Then you're gonna know, if you're designing a chat interface with an online/offline indicator, that if you make the online indicator a green button and the offline indicator a red one, you have made something that is gray and gray to a color-blind user. If instead you keep that green and red but you also write "online" and "offline", because you don't use color alone to convey information, now you're good at your craft. And because you're good at your craft, your users are going to benefit. And you do not need to ask the CFO for permission to write the words "online" and "offline" while you're designing, because that's part of your culture: you care about your craft, and you're gonna build something beautiful that also works.
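As a concrete sketch of that example, here is what pairing the color with a visible label might look like; the class names, element id, and function are hypothetical.

```typescript
// Presence indicator that never relies on color alone: the visible text
// carries the same information as the green/red styling.
function renderPresence(container: HTMLElement, online: boolean): void {
  const badge = document.createElement("span");
  badge.className = online ? "status status-online" : "status status-offline";
  badge.textContent = online ? "Online" : "Offline";
  container.replaceChildren(badge);
}

// Usage: renderPresence(document.getElementById("presence")!, true);
```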
That goes far beyond trying to convince people, because everybody asks me, "How do you prove that there's an ROI?" Well, I don't know. How do you prove that usability is gonna improve your company? It's so obvious, but talk to the CFO about a usability budget. It's really hard. They don't care. They really don't care. But the people who are writing software and the people who are designing software, most of them care. And if they can understand the impact of what they're doing, and that it costs no money to learn how to do it well, then you're gonna have an impact.
[00:52:19] Tobias Macey:
Absolutely. And particularly when code generation is becoming the way that software is written and there is effectively zero marginal cost for adding those accessibility features, it makes it even more of a no-brainer to just start including them as a de facto standard. Are there any other aspects of the work that you're doing on accessibility awareness, your AI model benchmarks, or the overall role of generative AI in improving the utility of digital experiences that we didn't discuss yet that you'd like to cover before we close out the show?
[00:52:57] Joe Devon:
Yes. This is a message to the AI practitioners. Artificial intelligence: what is it we're trying to do? We're trying to emulate human beings. And how do you emulate a human being? You have to get personal. And as you're getting personal, you have to understand the differences in what our capabilities are. If you want to build the best AI models, focus on the edge cases. And where are you gonna find the edge cases when it comes to human beings? People with disabilities. I guarantee you, if you make something usable by someone who's blind, someone who's deaf, someone who's deaf-blind, you are going to make a way better product. You're gonna solve problems that you didn't even know existed because you solved for those edge cases. You are going to be the best AI researcher and engineer if you pay attention, if you go to the edge cases. And I always love the people that look for the edge cases. They're always the best at what they do.
[00:53:53] Tobias Macey:
Absolutely. Well, for anybody who wants to get in touch with you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest gaps in the tooling, technology, or human training that's available for AI systems today.
[00:54:10] Joe Devon:
That's a big question. I mean, if we're just looking purely at the benchmark, color contrast is currently the number one issue. But I think the biggest gap is one that persists because it's expensive and it's hard, and that is getting testing with people with disabilities. There's a great organization called Fable that will give you that feedback. And once you've seen someone with a disability at an organization like Fable go through your website or your mobile app or whatever your product is and show you the gaps, you just can never look at things the same way. But it costs money. I think once you do it, you feel like you haven't fully gotten your product to where it could be, and you'll always want to have that as long as there's a budget for it. So it's not an easy answer. I wish I could come up with an easy answer, like AI will just solve this for you, but you really need to speak to your users, and that has not changed since the dawn of commerce. Right? You need to speak to your users. And there are a lot of users with disabilities, so you need to speak to them and get their feedback.
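For reference, since color contrast tops the benchmark's failure list, this is a small sketch of the WCAG 2.x contrast-ratio math those failures come down to; the sample colors and the 4.5:1 threshold (normal text, level AA) are just for illustration.

```typescript
// WCAG 2.x relative luminance and contrast ratio for sRGB colors (0-255 channels).
function relativeLuminance(r: number, g: number, b: number): number {
  const linearize = (channel: number): number => {
    const c = channel / 255;
    return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b);
}

function contrastRatio(
  fg: [number, number, number],
  bg: [number, number, number]
): number {
  const l1 = relativeLuminance(...fg);
  const l2 = relativeLuminance(...bg);
  const [lighter, darker] = l1 >= l2 ? [l1, l2] : [l2, l1];
  return (lighter + 0.05) / (darker + 0.05);
}

// Light gray text on white comes out around 3:1, which fails AA for normal text.
const ratio = contrastRatio([150, 150, 150], [255, 255, 255]);
console.log(ratio.toFixed(2), ratio >= 4.5 ? "passes AA" : "fails AA");
```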
[00:55:15] Tobias Macey:
Alright. Well, thank you very much for taking the time today to join me and share the work that you're doing on improving awareness of accessibility challenges and helping to push the industry further into investing in that space. It's definitely a very important mission, so I appreciate all of the time and effort that you're putting into it, and I hope you enjoy the rest of your day.
[00:55:38] Joe Devon:
Well, thank you so much. I really enjoyed our conversation.
[00:55:42] Tobias Macey:
Thank you for listening. Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management, and Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email hosts@aiengineeringpodcast.com with your story.
Intro and episode topic: AI for accessibility
Guest intro: Joe Devon's journey into accessibility
From Semantic Web to modern AI and coding with Cursor
Defining digital accessibility and personal motivation
Real-world barriers: captions, craptions, and meeting fatigue
Beyond websites: mobile apps, rideshare risks, and CLI pitfalls
Terminals and the accessibility mess
AI's role: better captions and auto audio descriptions
LLMs and accessibility: risks, AIMAC benchmark idea
AIMAC results: model rankings and why benchmarks matter
Fixing the basics: color contrast, linters, and training data
Are AI tools themselves accessible? Copilot, Be My AI, Ray-Ban use cases
Doomsaying vs opportunity: AI's impact and urgency to engage
Make accessibility first-class: learn from users and lived experience
Accessibility helps everyone: CLIs, agents, and machine understanding
Tooling gaps: automating screen reader testing and the Rubicon ahead
XR, wearables, privacy concerns, and uncertain standards
Upskilling testers with AI: steering models and reliability issues
What leaders should do: datasets, bots, and continuous remediation
Culture over compliance: craft, ROI debates, and zero-marginal-cost code
Final message: design for edge cases to build better AI
Biggest gaps today and the need for real user testing
Closing remarks and outro