In this episode of the AI Engineering Podcast Will Vincent, Python developer advocate at JetBrains (PyCharm), talks about how AI utilities are revolutionizing software engineering beyond basic code completion. He discusses the shift from "vibe coding" to "vibe engineering," where engineers collaborate with AI agents through clear guidelines, iterative specs, and tight guardrails. Will shares practical techniques for getting real value from these tools, including loading the whole codebase for context, creating agent specifications, constraining blast radius, and favoring step-by-step plans over one-shot generations. The conversation covers code review gaps, deployment context, and why continuity across tools matters, as well as JetBrains' evolving approach to integrated AI, including support for external and local models. Will emphasizes the importance of human oversight, particularly for architectural choices and production changes, and encourages experimentation and playfulness while acknowledging the ethics, security, and reliability tradeoffs that come with modern LLMs.
Announcements
- Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems
- When ML teams try to run complex workflows through traditional orchestration tools, they hit walls. Cash App discovered this with their fraud detection models - they needed flexible compute, isolated environments, and seamless data exchange between workflows, but their existing tools couldn't deliver. That's why Cash App relies on Prefect. Now their ML workflows run on whatever infrastructure each model needs across Google Cloud, AWS, and Databricks. Custom packages stay isolated. Model outputs flow seamlessly between workflows. Companies like Whoop and 1Password also trust Prefect for their critical workflows. But Prefect didn't stop there. They just launched FastMCP - production-ready infrastructure for AI tools. You get Prefect's orchestration plus instant OAuth, serverless scaling, and blazing-fast Python execution. Deploy your AI tools once, connect to Claude, Cursor, or any MCP client. No more building auth flows or managing servers. Prefect orchestrates your ML pipeline. FastMCP handles your AI tool infrastructure. See what Prefect and FastMCP can do for your AI workflows at aiengineeringpodcast.com/prefect today.
- Unlock the full potential of your AI workloads with a seamless and composable data infrastructure. Bruin is an open source framework that streamlines integration from the command line, allowing you to focus on what matters most - building intelligent systems. Write Python code for your business logic, and let Bruin handle the heavy lifting of data movement, lineage tracking, data quality monitoring, and governance enforcement. With native support for ML/AI workloads, Bruin empowers data teams to deliver faster, more reliable, and scalable AI solutions. Harness Bruin's connectors for hundreds of platforms, including popular machine learning frameworks like TensorFlow and PyTorch. Build end-to-end AI workflows that integrate seamlessly with your existing tech stack. Join the ranks of forward-thinking organizations that are revolutionizing their data engineering with Bruin. Get started today at aiengineeringpodcast.com/bruin, and for dbt Cloud customers, enjoy a $1,000 credit to migrate to Bruin Cloud.
- Your host is Tobias Macey and today I'm interviewing Will Vincent about selecting and using AI software engineering utilities and making them work for your team
- Introduction
- How did you get involved in machine learning?
- Software engineering is a relatively young discipline, but it does have several decades of history. As someone working for a developer tools company, what is your broad opinion on the impact of AI on software engineering as an occupation?
- There are many permutations of AI development tools. What are the broad categories that you see?
- What are the major areas of overlap?
- What are the styles of coding agents that you are seeing the broadest adoption for?
- What are your thoughts on the role of editors/IDEs in an AI-driven development workflow?
- Many of the code generation utilities are executed on a developer's computer in a single-player mode. What are some strategies that you have seen or experimented with to extract and share techniques/best practices/prompt templates at the team level?
- While there are many AI-powered services that hook into various stages of the software development and delivery lifecycle, what are the areas where you are seeing gaps in the user experience?
- What are the most interesting, innovative, or unexpected ways that you have seen AI used in the context of software engineering workflows?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on developer tooling in the age of AI?
- When is AI-powered the wrong choice?
- What do you have planned for the future of AI in the context of JetBrains?
- What are your predictions/hopes for the future of AI for software engineering?
Parting Question
- From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?
- Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email hosts@aiengineeringpodcast.com with your story.
- To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
- JetBrains
- Simon Willison
- Vibe Engineering Post
- GitHub Copilot
- AGENTS.md
- Kiro IDE
- Claude Code
- JetBrains QuickEdit
- Claude Agent in JetBrains IDEs
- Ruff linter
- uv package manager
- ty type checker
- pyrefly
- IDE == Integrated Development Environment
- Ollama
- LM Studio
- Google Gemma
- DeepSeek
- gpt-oss
- Ollama Cloud
- Gemini Diffusion
- Django Annual Survey
- Co-Intelligence by Ethan Mollick (affiliate link)
Hello, and welcome to the AI Engineering podcast, your guide to the fast moving world of building scalable and maintainable AI systems. When ML teams try to run complex workflows through traditional orchestration tools, they hit walls. Cash App discovered this with their fraud detection models. They needed flexible compute, isolated environments, and seamless data exchange between workflows, but their existing tools couldn't deliver. That's why Cash App relies on Prefect. Now their ML workflows run on whatever infrastructure each model needs across Google Cloud, AWS, and Databricks. Custom packages stay isolated.
Model outputs flow seamlessly between workflows. Companies like Whoop and 1Password also trust Prefect for their critical workflows, but Prefect didn't stop there. They just launched FastMCP, production ready infrastructure for AI tools. You get Prefect's orchestration plus instant OAuth, serverless scaling, and blazing fast Python execution. Deploy your AI tools once. Connect to Claude, Cursor, or any MCP client. No more building auth flows or managing servers. Prefect orchestrates your ML pipeline. FastMCP handles your AI tool infrastructure. See what Prefect and FastMCP can do for your AI workflows at aiengineeringpodcast.com/prefect today.
[00:01:29] Tobias Macey:
Unlock the full potential of your AI workloads with a seamless and composable data infrastructure. Bruin is an open source framework that streamlines integration from the command line, allowing you to focus on what matters most, building intelligent systems. Write Python code for your business logic and let Bruin handle the heavy lifting of data movement, lineage tracking, data quality monitoring, and governance enforcement. With native support for ML and AI workloads, Bruin empowers data teams to deliver faster, more reliable, and scalable AI solutions. Harness Bruin's connectors for hundreds of platforms, including popular machine learning frameworks like TensorFlow and PyTorch.
Build end to end AI workflows that integrate seamlessly with your existing tech stack. Join the ranks of forward thinking organizations that are revolutionizing their data engineering with Bruin. Get started today at aiengineeringpodcast.com/bruin. And for dbt Cloud customers, enjoy a $1,000 credit to migrate to Bruin Cloud. Your host is Tobias Macey, and today I'm interviewing Will Vincent about selecting and using AI software engineering utilities and making them work for your team. So, Will, can you start by introducing yourself?
[00:02:36] Will Vincent:
Yes. Thank you for having me on. I'm a Python developer advocate at JetBrains working on the PyCharm IDE, and I have a long background in web development with Django.
[00:02:46] Tobias Macey:
And do you remember how you first got involved in this overall space of ML and AI?
[00:02:51] Will Vincent:
Oh, so I think probably about two years ago. I mean, it feels like it ages in dog years because it's only been about two years, but it feels like twenty. Yeah. So I've been using AI tools, yeah, for at least two years every day. I think initially, like many people, ChatGPT, but then Claude, you know, in the browser. And then in, I guess, this year especially, agentic use, which we're gonna touch upon. So for me, initially, it started off as, I think, a better search or putting in snippets, but as the context windows, which we'll talk about all this, have expanded, yeah, now it's a little more YOLO. It's not fully, but, you know, you just hop into the command line and shove everything in and ask questions. And I've been working on techniques to kinda get the most out of these agents, but I'm fully on board with, like, it's way more than autocomplete. I don't know if I would call it intelligence, but there's something there. Like, we'll talk about this. I have everyday moments where I'm like, how could it possibly be that clever? And then it'll also just, you know, hallucinate and make total mistakes. Right? So it's completely untrustworthy.
[00:03:49] Tobias Macey:
Absolutely. And the other interesting juxtaposition as well is that in the initial introduction, as you said, it was fancy autocomplete, and it was helpful sometimes. And sometimes it would just suggest things that were completely pointless and just got in the way. And now we're at a point where there's a much higher degree of consistency in terms of its actual utility if you steer it right. And there's also the difference between just code generation, where, you know, the common trope this past summer has been the idea of vibe coding: just give it a prompt, and it'll spit out some code. And you don't even care what the code is as long as it does what you said. But then you jump over into the principle and practice of software engineering, which is an admittedly young career and occupation in terms of the history of the world and the history of human work. But we do still have several decades of research and practice and techniques that have been built up. And I'm wondering what you see as the overall impact that these AI utilities are having on software engineering as an occupation and as a discipline.
[00:04:58] Will Vincent:
Sure. So a lot to unpack there. Simon Willison, one of the Django creators, who writes a lot, put a post out recently about the term vibe engineering, which I think is more how I'm thinking about it. So there's vibe coding, which is someone who's never coded before using these tools, you know, Cursor especially, to create something, and they think, oh, now I'm a coder, you know, over a weekend. And that's one level, and I think it's good that it lowers the bar to creating things. In practice, there are things you need to know, and having a background does help. And so vibe engineering is more hopefully how I'm doing it and my friends are doing it, where we're combining these tools with some knowledge and guiding it and treating it like a, yeah, sometimes a junior-level developer, but also one who's read everything there is to read and can do a lot of things.
So I guess I would just say I think of it much more as vibe engineering, and that's kind of the more interesting place for me because vibe coding, you know, you try it out, but, like, you're gonna hit walls pretty quickly around trying to build anything remotely complicated or anything with a database. But if you already have some background and then you add these tools in and you work on how to get the most out of them, which we'll talk about, I mean, I use the agents all the time now. In fact, the big thing for me is I wonder how anyone's gonna learn to code in a way, because if you already have some background and some fundamentals, you can just fly with these things, but you kinda don't have to type anymore. You know, some people who are very tip of the spear talk about not even really using IDEs, just a command line interface, reviewing in Git, and yeah. So the nature of coding, I think, for sure has changed, but you still need to kinda know what you're doing. And I think it's just easier than ever to skip those stages of, like, typing out a for loop, and, you know, yeah, I sort of wonder about that. That to me is an unresolved question. But if you already know what you're doing and you use a modern model, and, again, we'll talk about how to get the most out of it, 100% it works. Like, you can fly, especially if you're using something like Python or Django, which are super well documented. If you're using modern things or newer things that are less documented, obviously, they won't work quite as well.
[00:07:04] Tobias Macey:
Yeah. To the point of the terminology around it, one of the other phrases that I've heard come up recently is the idea of AI native engineering where, rather than just throwing a prompt in, you do more of that spec driven development, which is what Claude Code and Kiro are very focused on: rather than just going straight from prompt to implementation, it uses an intermediate step of let's negotiate on what are the actual details and scope of the problem to be solved. We'll write that out into a markdown file, and then we can either farm that out into sub agents to do that work or just use that as an iterative, checkpointing tool to make sure that you and the agent are on the same page as it goes through and so that it doesn't diverge from the initial scope too far.
[00:07:54] Will Vincent:
Yeah. I like that term negotiate. I mean, it's so tempting to anthropomorphize these tools and think of them as humans when, of course, they're not. But I find anecdotally, the more I do treat it as a human, like, I'll be like, try harder, or, like, nope, again, you kinda get more out of them. Like, I think we shared this before we started recording, I had something where I was playing around with CSS with an agent on my personal site, which is written in Ruby and Jekyll, which I'm not super knowledgeable on, so I feel fine sort of sitting back and letting the agent go. And there was something with responsiveness where I just couldn't get it right. It'd say, I fixed it. I was like, it didn't fix it. It said it fixed it four times. And finally, I was like, you know what? Let's just switch to Bootstrap or Tailwind. And it literally said, no, no, no, no, let me try one more time, and then it got it right, which sort of begs the question of, like, what is going on in there? Right? Like, why didn't it just get it right the first time? Like, I didn't provide new information. I just sort of threatened it in a way. Like, hey, you're not doing it right. So, you know, that's, I think, what was early on termed emergent properties, you know, somewhere, like, was it GPT-2 to 3, where it's just like, oh, hey, it sounds a lot more human. But, like, we don't know. Like, even people working on these, we don't really know what's going on. Right? I mean, that's kind of machine learning. Like, it can't tell us what it's thinking. But, yeah, that's a long way of saying I find if I treat it like a person, you know, with respect, but kind of push it a bit, I get better results. Though, again, that can change. Right? They are always updating the filter on how much it agrees with you and not. But, certainly, with the modern version, try harder, and it seems to do better.
[00:09:24] Tobias Macey:
Another addendum that I found useful in working with some of these tools is explicitly saying, if you are unsure of something, don't guess. Ask me for clarification and direction. And that, I found, has helped to prevent it from just going off on its own and trying lots of things when I already know better than what it's trying to do.
[00:09:46] Will Vincent:
Right. I mean, that presumes it sort of knows when it doesn't know something, which is almost a philosophical thing. But I like what you said earlier about basically having a spec for it. It's ironic that, like, you know, humans would never write a spec for another engineer on the team, but we're happy to do it for an LLM to do a better job. But to your point, just last week, I was at a gathering, and someone was talking about using LLMs for the first time, and they didn't have a good experience. But they just tried to one shot it. So, you know, I think at a minimum, you need to have a spec sheet basically.
And, you know, people often don't think of the fact you can ask the LLM. You can ask the model to help you write the sheet, you know, to write the spec. You know, you don't have to just do it from scratch. In fact, I would recommend to someone new to this, put in the whole code base using an agent and then say, I wanna create some guidelines or rules, help me figure it out, and have some back and forth. Like, you don't have to do it all yourself. But at a minimum, if you don't use rules or guidelines, how could it possibly know what you want it to do?
[00:10:49] Tobias Macey:
Absolutely. And even in the GitHub Copilot documentation, they actually have a prompt that you can use to generate an AGENTS.md or copilot-instructions.md where it will tell it to look through the code base, try to identify common patterns, common utilities, and then generate that into an AGENTS.md that you can then later use for bootstrapping those conversations. But beyond just the code generation capabilities, broadening it out into software engineering, what are some of the ways that you're seeing the different permutations of AI imbued in the broader ecosystem of tools that a software engineer relies on to do more than just write code? Because, obviously, there are things like code review, deployment, testing, monitoring, etcetera, that all play into the overall space of actually writing code that provides value to end users.
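For anyone who hasn't set one of these up, the end result is just a markdown file checked into the repository that the agent reads before it starts working. A minimal sketch might look like the following; the file name follows the AGENTS.md convention mentioned above, but every section and rule here is illustrative rather than a prescribed format:

```markdown
# AGENTS.md (illustrative sketch, adapt to your project)

## Project overview
Django web application; Python 3.12; dependencies managed with uv.

## Conventions
- Follow the existing app layout; do not create new top-level packages.
- Run the linter (ruff check .) and the test suite (pytest) before calling a task done.

## Working agreement
- Propose a plan first; do not write code until the plan is approved.
- Keep changes small: one focused change per commit, touching as few files as possible.
- If requirements are ambiguous, ask for clarification instead of guessing.
```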
[00:11:44] Will Vincent:
Yeah. Yeah. Big question there. I mean, everywhere. I think you have to, if I make the question more granular, I'm seeing it, yeah, how do we tackle that question? I guess what I would start by saying, I'm seeing it still not used in a lot of places. Sorry to change the question slightly, I wanna get to that. I'm still seeing a lot of places either not use it for privacy reasons, and they don't know they can use local open models, you know, through Ollama or LM Studio, or I'm seeing people just not put any effort into them. But you can just fly if you put a teeny bit of effort into it. You know, you don't have to run parallel things. You don't have to have a suite of agents. You don't have to spend thousands of dollars a month. I think you just, you know, use a modern agent, pull in the whole code base, have a spec sheet, and really just constrain it. And, sorry to change your question, I'll often tell it, like, don't write code. You know, like, tell me what you wanna do. And there are some new tools, like, for example, in JetBrains, we have one called, I think it's QuickEdit, that kinda does this for you, because it's unreasonable to expect people to sort these out on their own. I think in a year, there'll be much more hand holding on this. But right now, when I talk to it, I say, okay, here's the code base. Let's do some guidelines.
Let's tackle this task, but, yeah, don't write code. Try to make it as small as possible, because it'll just go ahead and rewrite a hundred files if I let it. I just have to keep it really constrained so I can reason about it and spot things, because I can't review more than a few files at a time. So for myself, I try to keep it really small, git commit as I go, and that way I can spot something. So yesterday, I was working on a project where a colleague wrote something in Java, a Kotlin back end with a Next.js front end, and wanted to have a Django piece on it. So he's like, hey, can you come in and just do the Django piece? And, okay, I don't know Java and Kotlin really at all, but I was able to hop in with an agent, well, I used Claude Code in this case, and say, hey, analyze the code base, look in the database. I think it was a Prisma database, which, again, I didn't really have any experience with, and just map out and go step by step. But because we went step by step and I have some knowledge of Django, I could spot where it, like, used a technique that's 10 years old that's really bad, for example. And I was like, oh, what about that? Like, switch that. But if I just, like, tried to one shot it or did it quickly, we wouldn't have caught it. So I guess it's a long way of saying it took about two hours of, like, dedicated focus. But even if I just typed it by hand, it would have taken me almost two hours. And I flew through, and it got 80% of it right. And I was able to monitor it, so I could figure out the 20% it got wrong. So, really, I see it as, like, these rearing horses that just wanna write code and do stuff, and you just have to, like, pull back on them and control them. That's how I get the most out of them.
[00:14:21] Tobias Macey:
And the other interesting piece too is, even just using GitHub Copilot as the example because that's one of the ones that I've been using actively at work, you have Copilot in terms of being able to generate the code, but then you also have Copilot in the context of the GitHub UI where you can use it for generating code reviews. And I find it kind of hilarious when you have Copilot writing the code and you're doing some monitoring and maybe some minor editing of it, but then you have Copilot reviewing the code. Right. And then, one, it's just kind of funny that you're having the AI check its own work, but also the fact that the review agent often doesn't know anything about the work that's being done and will just gloss over pretty substantial areas of consideration, particularly if you're doing any sort of structural refactoring or you're trying to do net new architectural decisions.
And it'll just find the most obscure little thing to say, oh, this is what I'm going to comment on. But then you run it again, and it'll find something totally different. And just the lack of continuity and context between those different portions of the life cycle, I find interesting. And then, also, if you're using different tools for those different points, then it's even more obvious that they're not going to share any context or information beyond the text that is, at the end of the day, the thing that you care about, the final artifact of those interactions. And then there's the deployed environment: most of the time, and this is true of humans too, you write the code, but you have no conception of where it's actually going to be running. And maybe if you know that it's going to be running on a virtual machine versus in a Docker container, you'll make different decisions about how you think about subprocessing or threading, etcetera.
Or maybe you're going to have to use an object store instead of a local file. And, obviously, we have abstractions to be able to handle those types of things, but there's often not any real consideration of them by the agents unless you guide it that way. But those are things that need to be considered in the implementation path. And, again, this is true of humans as well. But just thinking through, like, as we try to bring these agentic capabilities into the same space as human operators and treat them effectively as coworkers, or at least as employees in some abstract sense, we need to be thinking about how do we manage that continuity of context and informing the implementation details based on the realities of the operating environment that we're targeting.
[00:17:01] Will Vincent:
Yeah. And I think part of it is so much comes back to the context window. So if you load in a code base and you have a back and forth, all of that is included with future responses. If you then hop into a totally different context to do a code review or something else, it doesn't have any of that knowledge. So on top of the fact they're nondeterministic, it's just gonna hop in and make other assumptions. I mean, I wonder, have you? I'm sure you've, like, put in a doc of, like, this is the deployment flow, and this is how we do it. Like, have you found that gets your agent any better responses if you say, like, here is what I'm trying to do here?
[00:17:34] Tobias Macey:
So for cases where the deployed environment does matter, I'll often guide it and provide that context to say, this is actually going to be running in this situation, or this is the overall architecture, and then that will help to guide it. Often, what I find most helpful, and this points again to that divide between people who can be massively accelerated by AI agents because they already have all of the necessary context in their own head to be able to understand where the LLM is missing that context, versus people who are earlier in their career path and maybe haven't already built up that domain knowledge and context and don't know to provide that guidance, is that I will oftentimes know exactly what I'm targeting. I just don't want to go through the process of typing it all out or internalizing the details of a particular API interface. I just know this is broadly what I want. Here are all the documentation references. Here's all the code that I know that you need to be looking at. Go ahead and do the thing that I said to do. Versus the other side of the fence of, I am a new developer on this team or even in my career. I don't even really know what I'm trying to do yet. I just know that this is the thing that I was told to do.
[00:18:46] Will Vincent:
Right. Well, and I think even, I find if I'm, you know, sort of leaning back and agentically coding something, even if I wanna jump in and just do a small amount of coding myself, it feels, there, like, needs to be a word for that, it feels weird. It's sort of like, really? Like, I have to type it myself? You know? So I'll find myself telling the agent to do exactly the code I want, but I just can't be bothered to hop into the IDE myself sometimes, or to, like, you know, switch over to the IDE. Yeah. So, I mean, there's also, I'm curious what you think. I'm struck by this term also, someone came up with the blast radius of agentic code, where, you know, sometimes you try to one shot something and there's, like, this huge bomb and all this code, where you could do smaller ones. Or sometimes, you know, if you ask it a task and it's taking an extra long time, it's like, it shouldn't have taken that long. You know, you can, like, escape and then ask it, like, hey, what's up? I know there's a whole new vocabulary that's coming around with how to use these tools. And I guess to the point I first started with, I'm still trying to figure out how do I actually write code, use the agent. Because for me, it's helpful to, and this is not just because I work there, like, it's helpful to have it all within an IDE, which JetBrains and, you know, VS Code and other companies are doing. But there are friends who are tip of the spear who tell me that, like, IDEs are going away. All they do is use a Git client to review. They use the command line, because they can just tell the agent to run the code. So I don't fully agree with that, but I can sort of see how you get in that direction where you're just harnessing a fleet of agents doing stuff. And if you have the capacity to review it, I don't know. Maybe that is the brave new world. I still think people want it kind of all in one place and, like, the ability to hop in and out a little more, but I don't know. Open question.
[00:20:19] Tobias Macey:
Absolutely. And to the point of blast radius, an interesting point that came up in another conversation I had recently is that in order to mitigate that, one of the architectural patterns that arose a few years ago as a result of the potential for blast radius, even just with the human developers, is the idea of microservices, where you split apart the overall application into smaller independently deployable units because it was easier to manage the organizational structures and who was working on what, more so than it was any sort of architectural or deployment optimization. Yeah. It was like a huge justification. It's just, like, just bump up the responsibility a level. The interesting thing is that it's the idea of Conway's Law and how the communication structures of the organization are manifested in the software that you develop, and these LLMs are now a new participant in that communication structure, which then changes the ways that we need to be thinking about structuring our software to be able to manage those communication patterns.
[00:21:23] Will Vincent:
Yeah. I mean, there's also just, once you've been coding for a while and you've seen it, I've seen it, these pendulum shifts, they kinda become less interesting, you know. Like, what is interesting? Like, is it arguments about, like, single page applications versus server rendered templates? That's one. Like, there's just a host of these ones that on the second or third time through, it's harder to get worked up about it. But you do need to have, I think, especially with these agents, you have to have your own taste. Like, maybe this is an optimistic take, but, like, you have to have an opinion to guide it, because it will do kind of whatever you want it to do, but it doesn't, by definition, have good taste. It's just, you know, you have to say, hey, I like to do my code this way, or, you know, look at this. You know, if you wanna be super lazy, you can say, look at this code base and try to mimic those styles. You know, there are sort of shortcuts, but you still have to provide some degree of taste on it. But, again, I wonder, how do you have taste if you don't write code yourself? You know? I mean, maybe that's me being an old fuddy duddy, but it's like, well, if you've never really, really written code. Yeah. I don't know. I mean, but, you know, they can do so much more than write code. I mean, I use them all the time to just review stuff, you know, to just, you know, hunt for bugs, but, like, think of ways I could optimize this. I mean, just, sort of, again, like a colleague who doesn't get tired, who has infinite patience, who I never fully trust. But, you know, yeah, I think you said earlier on, like, maybe if I say, hey, give me five ideas on this, then I keep pushing it a little bit. Like, I wish you could find a better way to have it review itself.
You know? Like, are you sure? Are you sure? Are you sure? Because I feel like if you asked it the right way or enough times, maybe it would find the issue. But, like, why do I need to ask in the first place? Right? I mean, that's the theory, that these agents can sort of do that, but that hasn't been fully ironed out. Because if it was, it would just, like, get it right more of the time.
[00:23:12] Tobias Macey:
And circling back on the question of the role of IDEs in this new world, particularly given your position at JetBrains, which is one of the prominent developers of IDEs. Obviously, you're very invested in that question, and I'm just curious how you're seeing people address that consideration where the IDE as a concept and as a principle came about because of the need for context management at the human level of you want to be able to have information at your fingertips and be able to have the sidebars that have your tree layout and the symbols and your terminal and test results and build status all in one place and be able to jump through and have helpers to automate some of these different tasks. And as we move into a world where many of these agents are very focused on the command line because of the fact that it gives access to a broader set of utilities, how do you see people tackling that question of where, when, why, and how do I use an IDE versus just letting the agent do it for me?
[00:24:12] Will Vincent:
No. It's a big question. I mean, our CEO earlier in the year sent out an internal, I forget if it was a video or a memo, basically saying, we don't know what the future is gonna be, but, like, the company has always been building tools for developers. It just so happened that an IDE was one of the best ways to do that in the last twenty some years. But even if that shifts, like, the company, JetBrains, has a lot of different products. Like, it's not necessarily wedded to the IDE. I mean, it is in terms of revenue and stuff now, but the goal is to empower developers. So even just integrating these tools, you know, one thing that's happened to us is we've been trying to integrate AI features in, as has everyone. But I think there's a consensus now that it's very difficult to make money, like, reselling tokens, as opposed to just having a really tight interface, because everyone was losing money off this. Right? Like, people didn't realize how expensive these tokens were, and you could do tricks around optimization and caching that Cursor in particular was trying to do. But if you're selling something for a dollar and only making 30¢ or whatever the economics are, like, that's not sustainable.
So internally, it's been super interesting. Like, we had this tool, Junie, which is our own agent, and then external agents are now even more powerful. And so, like, we've added Claude into PyCharm and the other IDEs. So we're, I guess, a long way of saying, we're adapting as well to what does this world look like. I still think in some ways you need an organizing layer, for the command line, to write some code, to do search, but I suspect it'll be quite a bit different than it is today, you know, in five years' time. But I think most people don't wanna just have four or five different tabs open to do different things. I think there's value in integrating it all, but, yeah, it's changing. I mean, I'd also add, in the Python space, there's even better tooling. You know? So Astral, right, with Ruff, with uv, that's really helpful. That changes things. And now with the types, you know, there's ty, there's pyrefly. I think there's probably another one. Yeah. It's a brave new world, where JetBrains has done, like, so PyCharm has had strong typing support for a very long time because these tools didn't exist. But now that there are open source tools, there's the question of, okay, do we integrate them? Do we modify them? Do we fork them? Do we sponsor the people doing them? You know? Ultimately, JetBrains, PyCharm, wants to just have the best experience for the developer, and part of that is being humble enough to change things around as AI has changed for us. Again, because we had our own agent, we still have it, but you can use Claude just integrated in. You can also hop in the terminal and use Codex, whatever you want. You know, there's no restrictions on what you can use. I believe VS Code is similar. You know, you can use whatever models you want. I mean, personally, just to finish this long thread, I'm excited about local models. I think those are really, really interesting. So you can already, certainly in PyCharm and the JetBrains ones, hook AI chat and features into local models, or you could just use Ollama. Like, I like to play around with that a lot. And some of the newer models on there: the Gemma models from Google, DeepSeek has an 8-billion-parameter model, OpenAI, I'm blanking on the name, they have their own one. But then you can even do things like, there's a 671-billion-parameter DeepSeek model that you can run with Ollama Cloud.
So this is the idea that they'll run the inference for you and charge you presumably less than if it was a proprietary model. That's super, super interesting to me, the idea that you can have almost frontier level, presumably secure, less cost. So, yeah, long answer to a short question, but JetBrains is adapting with whatever helps developers. And I suspect there'll be some role for an IDE, but what we call an IDE will probably look a lot different in five years.
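As a concrete illustration of that local-model workflow, the open source Ollama Python client can send a chat request to a model served on your own machine. A minimal sketch, assuming the Ollama daemon is running and you have already pulled a model; the model name and prompt below are placeholders, not recommendations:

```python
# Minimal sketch: chat with a locally served model through the Ollama client.
# Assumes the Ollama daemon is running and the named model has been pulled;
# the model name and prompt are illustrative placeholders.
import ollama

response = ollama.chat(
    model="gemma3",  # substitute whichever local model you have pulled
    messages=[
        {
            "role": "user",
            "content": "Summarize what this Django view does and flag obvious bugs.",
        }
    ],
)

# Print the assistant's reply from the chat response.
print(response["message"]["content"])
```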
[00:27:37] Tobias Macey:
I think too that one of the patterns that is starting to develop around the use of these agentic development utilities is that the challenge then becomes managing the parallel work streams, where one aspect is that the formerly obscure usage of Git worktrees is now becoming a fairly common application, where you can actually make a copy of your Git repository, stash it into a subdirectory, and have an independent flow of work happening there that's not going to mutate the state of your default branch. And so you're able to then actually have multiple agents working on different paths of implementation in parallel without having to conflict with each other as far as the status of your Git repository, but then being able to merge it all back in. And then there's just the question of, well, I've got four different terminals running with different agents doing different things, and maybe the IDE then becomes just a means of managing the context engineering for those agents, the view of what their status is, determining what are the appropriate permission scopes for different use cases, and it just becomes more of an orchestration hub than an implementation hub.
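For anyone unfamiliar with the mechanics, the pattern rests on plain git worktree commands. A hypothetical helper like the sketch below (not a feature of any particular agent tool) creates an isolated checkout and branch per task, so an agent can edit files there without touching the primary working copy:

```python
# Hypothetical helper for the parallel-agent pattern described above:
# give each task its own Git worktree and branch so an agent can work
# in isolation from the main checkout.
import subprocess
from pathlib import Path


def add_worktree(repo: Path, branch: str) -> Path:
    """Create ../<repo-name>-<branch> as a new worktree on a new branch."""
    target = repo.parent / f"{repo.name}-{branch}"
    # `git worktree add -b <branch> <path>` creates the branch and checks it
    # out into a separate directory that shares the same repository history.
    subprocess.run(
        ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(target)],
        check=True,
    )
    return target


if __name__ == "__main__":
    repo = Path.cwd()  # assumes you run this from the repository root
    for task in ["fix-login-redirect", "add-healthcheck-endpoint"]:
        print("agent workspace:", add_worktree(repo, task))
```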
[00:28:52] Will Vincent:
Yeah. I mean, that's super interesting. I mean, it also presumes that agents stay at the current speed that they are. I mean, if they were two x, five x, 10 x faster, would you need to run them in parallel as much? Maybe. Or maybe you need to toss harder problems at them. I think it also comes down to how your brain works. I mean, I'm a single-monitor kind of person. I don't want six windows up. Like, I just can't deal with that. But some people can. Right? So for me, I'm curious. I'm aware of these parallel agent approaches, and some people I know do them. I barely can keep up with one agent, let alone six. Though, you know, there's that old xkcd about, like, developers sword fighting and, like, you know, get back to work, my code's compiling. It does feel like waiting on an agent is sort of like the new compiling, where it's, you know, long enough that it breaks you out of the flow, but it's not so long that you, you know, take thirty minutes and go run an errand or something. So I would love to find a way to be productive during that time I'm waiting for the agent, but I think, for me anyways, spinning up more agents just further muddies my own limited human context. But maybe your experience is different.
[00:29:58] Tobias Macey:
Yeah. So generally speaking, I have largely been single process, particularly given that with the Copilot command line agent, anyway, it will often ask for permission or confirmation as it does things, versus with, like, the Gemini or Claude Code, you can say, just do whatever you want and let me know when you're done, which is a little bit scary.
[00:30:22] Will Vincent:
You can change that, though, I think. Right? Like, I think it was Armin Ronacher, he has, like, a custom YOLO mode. I guess he uses that with Claude, but, like, he wrote a little script to just be, like, just don't ask me. Like, just Yep. I can't speak to, like, the GitHub implementation, but I'm sure there's a way to also just tell it to, like, you know, go for it. Yeah. But also, is that really what you want? Because then you're ceding even more control to the agent and its
[00:30:46] Tobias Macey:
decisions that it makes. But, also, I think it's interesting to think through, to your point about speed of execution, what happens if it is a lot faster. So, for instance, Google had their diffusion model that they were experimenting with that was substantially faster than the corresponding Gemini model that was released as, I think, the Gemini 2.0 version. And so as the speed of the models improves and as the context windows expand, maybe we do give them more complicated tasks with more trust engendered, presuming that we've built up the workflows and the guardrails that make us feel comfortable with that.
[00:31:26] Will Vincent:
Right. But you still, you know, you still need a human to look at it who knows what they're doing. Right? I think that's the problem. This idea that, like, one agent writes something, another one reviews it, and you can truly sit back, you can't get there. But, I mean, another metric I've seen recently, I forget exactly what the term was, was basically, like, the cost of performance per year. Like, so models are increasing in how they hit on these benchmarks, but the cost just keeps dropping. And it was something like a three to four x per year improvement for, you know, similar levels of responses from agents. There are the OpenAI scaling laws, which, like, hey, it's not a law, it's just, like, a thing that happened, and now it's leveling off. But it is crazy. I think about, like, the cost coming down and also the speed improving through various techniques. Yeah. What does that look like? I mean, I think you still need to have some human component in there, but are we all just, yeah, sitting back and reviewing pull requests? But maybe we don't even need to. Right? I mean, especially GitHub is doing new stuff with AI reviewing PRs. Like, I suspect a lot of stuff is gonna really, really break before people go, hold on a second. I mean, we're seeing that at companies. Right? Some companies make a big swing and, like, fire most of their developers and then have to rehire them. You still need developers. I just, yeah, I think about, like, who wants a junior developer now? Especially if a mid to senior level developer can go three, five times as fast, whatever the number is. Yet at the same time, the younger developer is probably more AI native. Right? That's the other tension. Right? I mean, I'll go into companies with professionals, super intelligent programmers, and they're barely using AI tools because they're busy just solving real world problems. And it's not easy enough for them to jump in and use and create a spec and guide it and fight with it. They're like, I already have a backlog of tickets. I can do the work. Like, I just need time. I kinda, yeah, I wonder how all that plays out. Absolutely. Well, I mean, the first way it plays out is that we never end up with senior engineers, because they retire, and the juniors never get hired in the first place to take over. But I think too that that also speaks to the broader question of a lot of these AI tools are executed on a single person's machine,
[00:33:26] Tobias Macey:
largely driven by a single person, and that makes it a little difficult to manage that shared context and shared techniques at the team level. We have utilities such as Google's Jules or Copilot's agent mode where you can give it an issue, and it'll generate a pull request. And so that brings it a little bit into a shared space. But one thing that I've been planning on experimenting with is actually writing a little utility to mine the history of interactions with these agent utilities to be able to extract what are the techniques, what are the prompts and styles, what are the pieces of information that were used to be able to produce a given result, both for positive and negative cases, and then trying to generate some sort of shared corpus of best practices that other teammates can build from, as well as being able to populate the various AGENTS.md files for guidance of these systems. But I'm just curious how you and other people you're interacting with are thinking about the transition from I, as a single engineer, am able to move faster, to how do I then turn this into multiplayer mode where everybody can benefit from the things that I'm learning in my own interactions?
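A rough sketch of the kind of utility being described might look like the following. Every path and log format here is an assumption made for illustration, since each agent CLI stores its session transcripts in its own location and schema; the core idea is just walking the logs and collecting the prompts for later review:

```python
# Rough sketch of mining agent session transcripts for reusable prompts.
# The log directory and JSONL schema below are assumptions for illustration;
# tools like Claude Code, the Copilot CLI, and the Gemini CLI each store
# sessions in their own location and format.
import json
from pathlib import Path

LOG_DIR = Path.home() / ".my-agent" / "sessions"  # hypothetical location


def extract_user_prompts(log_dir: Path) -> list[str]:
    """Collect every user-authored prompt from JSONL session logs."""
    prompts: list[str] = []
    for session_file in sorted(log_dir.glob("*.jsonl")):
        for line in session_file.read_text().splitlines():
            try:
                event = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip partial or non-JSON lines
            if event.get("role") == "user" and event.get("content"):
                prompts.append(event["content"])
    return prompts


if __name__ == "__main__":
    for prompt in extract_user_prompts(LOG_DIR):
        print("-", prompt[:120])
```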
[00:34:39] Will Vincent:
I think that's a great question. I haven't seen a ton of people with the space to ask that question. I think it's more about trying to get an entire team to use these agents effectively, and that's a big step. Again, because there are no agreed upon guidelines of how to do it, other than just, like, hey, just play around and get a feel and work on stuff that doesn't really matter. And then, like, when do you know that you can work on production level code? I really like what you said about sharing context across teams. I mean, it is sort of wild in a way that there are these, like, silos of context with these conversations. And even as the context window expands, it'd be nice to share that in some way. Because even if you found these prompts, these techniques worked, we're getting new models two, three times a year from the leading providers. So there's no guarantee that the same things will work. You know, you're gonna rerun it each time. Like, that feels undoable. The area I would love to play in, if I had the time and expertise, is, like, fine tuning models on, you know, a code base. So, I don't know, MIT, where you work, right? If you just take all the code and dump it in and fine tune, you know, a DeepSeek model, like, what does that look like? Is it better? I mean, it could be. Maybe it is. Maybe it isn't. I think there's gonna be a lot more of these tools to, like, just drag and drop in an entire company's information and fine tune and then go off of that, both for security reasons, but also, you know, we shouldn't have to boil the ocean to get a good response if we only care about a small little sliver. And I believe that with fine tuning, as you fine tune around a specific area, the model loses capacity in the other areas. But if you only care about, I don't know, Python coding for this code base, like, do I care that now the JavaScript it writes is bad? Like, not really, if I know that it's not tuned for that. That's kinda what I wonder. Like, as simple as, can you just drag and drop in all of JPMorgan's code? Poof. Here you go. Secure model, DeepSeek, you know, in an abstract sense. I suspect we'll get something like that. I know they're trying.
[00:36:30] Tobias Macey:
Another interesting approach that I think could help address some of that question of bringing junior engineers into the mix and helping them be productive, particularly with the caveat that generally they don't know all of the built up domain knowledge that people who have been on the team longer have, is with that approach of spec driven AI engineering where the first step is that you iterate with the model to determine what is the scope of changes that need to be made for a given solution. Maybe that is the interface that the junior engineers onboard with, is that they maybe even pair with a senior engineer to iterate on developing the spec. And then the junior engineer is left to guide the model through the implementation, watch the way that it's doing it, maybe ask questions of why it's being done that way, ask questions of the senior engineers of, is this going in the direction that you want it to, and just being able to generate some incremental pull requests as the model makes progress on the spec, to get some early feedback on the implementation to help guide future iterations. And maybe that's the way that we bridge this divide of senior engineers being multiplied in terms of their impact, but also then being oversubscribed in terms of what they're asked to do. Maybe they can be the ones who are moving a little bit higher up the level of abstraction, where they're doing that spec development and some of the review, but they're not the ones who are just babysitting the models as they go through the implementation and churn out code.
[00:38:02] Will Vincent:
Right. Like, we're all software architects now. I mean, of course, it's like, well, why not just spin up another parallel agent? Why even hire and train a human? But, you know, it's not like it was perfect before. I mean, I find I use them sometimes, you know, sometimes to sit back and code stuff if it's a language or a framework or something I'm not as familiar with, or if, like, I'm tired at the end of the day. But, like, they're great learning tools with the caveat that they lie. You know? So, like, just using it to explore. I mean, because back in the day, if I had a bug, what would I do? You know, back fifteen years ago, I would try to, like, think of a great question. I'd go look online. Right? I'd look on Stack Overflow. I'd look for blogs, and you sort of snippet and piece things together. It wasn't like there was a free, high quality, easy to find resource before. You know? So I don't wanna romanticize the past too much, but I think there's something in that process of, like, just manually typing things out, having things fail. I mean, that's one of the great things about agents, is they'll try something, it fails, and then they sort of fix it themselves. But, like, how do our brains keep up with that? I don't know. I'm working on it. It's also possible you just need to learn, like, the syntax of programming and, again, treat it like your own private teacher. I mean, teachers aren't infallible. Teachers make mistakes. You know? Maybe we're looking at it the wrong way, but I don't wanna, like, over romanticize how it was before. Like, you know, you and I can fly with these tools as is, and they're just getting better. So I think of, like, my younger self, you know, I'd hope that what took me two years to learn, I could do in, I don't know, six months now, you know, just because I'm not hunting around for answers. I'll get answers, maybe not the right ones, but at least I get something, and then that can take me down the rabbit hole. The harder challenge is, you know, I'll speak for myself. Like, I've written a ton of tutorials and content and books and stuff that have all been gobbled up by LLMs. So for me and every other content creator, I wouldn't even say author, the economic incentive goes away. You know? I can see traffic to my sites falling off a cliff. So I don't feel as inclined to write, like, a detailed tutorial that no one will find, because Google's not gonna send me link juice anyways, and then the LLM will just consume it, and I won't get any attribution. So I can't then sell any books or anything. So that's an issue.
I don't know who's gonna take responsibility for that one, but why post on Stack Overflow? Like, what's the point? That's a philosophical one.
[00:40:12] Tobias Macey:
Yeah. Well, it's one that Stack Overflow is wrestling with at the moment as well, particularly because, generally, the posts that I've posted, at least, are the questions, because I don't have an answer. But if I have an LLM to help me iterate more quickly to resolving it, then I don't need to ask the question in the first place.
[00:40:29] Will Vincent:
Right. I mean, and so Django puts out an annual survey, which I've worked on and JetBrains sponsors, and that's coming out soon. And one of the questions we ask is how do you learn Django? And Django has unusually good documentation, so it's still, I think it was, like, 79% of people who said the docs were their first choice. But then second place was YouTube, Stack Overflow, and AI. And these results are a number of months old at this point, so I'm sure that's risen. You know, books and blogs, which I would trust because it's, like, the author's reputation, were something like 20%, and, like, that will probably only continue to decline. So, yeah, I wonder. At the same time, though, I think that brings up a separate point, which is that Python and Django in particular are unusually well documented and mature technologies. If you try to do, I don't know, FastAPI, right? FastAPI is, like, a really cool Python framework that's not that new, but it doesn't have the corpus of Stack Overflow answers that something else does. So in my experience, the responses aren't as good as they are for Django, just because it doesn't have the training data. Maybe it makes it harder for newer tools to be adopted if people can't use them agentically because there isn't all that documentation, but why write it if people don't use it?
[00:41:37] Tobias Macey:
It's also interesting too in terms of what technologies you select to implement something, because if it's something that is more easily introspectable, it makes it easier for a model to find and fix those issues. So even in just my own use, where I'm predominantly working in Python, I've thrown a problem at the Copilot agent. And in order to help identify what are the actual valid interfaces, it will just execute a Python command to import the module in question and then, you know, either use the inspect module or the help function to print out the documentation that just exists in the code. Right. And then it will say, oh, this is what I need to do, and then go on its merry way.
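What the agent is doing there is ordinary Python introspection, which you can also run by hand. A minimal sketch, using the standard library json module as a stand-in for whatever third-party package is actually in question:

```python
# Introspecting a module's real interface instead of guessing at it,
# roughly what an agent does when it shells out to Python for answers.
# The json module stands in for whatever library is actually in question.
import inspect
import json

# The exact call signature, straight from the code
print(inspect.signature(json.dumps))

# The docstring, without leaving the terminal
print(inspect.getdoc(json.dumps))

# Or the full rendered help text for the whole module
help(json)
```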
[00:42:20] Will Vincent:
Right. I know. You feel super lazy. I mean, there is this sense, I don't know if you have this, if it's brain rot, or, you know, these studies showing that people, you know, forget how to code, forget how to write. I don't know. There's something to that. Again, I don't wanna over romanticize the past, but I do think you need maybe a little more struggle with coding, writing, thinking than models give you. So maybe the superpower is to be our age, you know, or my age, forties, where I had experience in the old world, but I have experience in the new world, so I can get the best out of both. But a new person, yeah, I mean, outside of, like, a closed book class session, of course, they're, I mean, they're already using ChatGPT for everything, and code in particular. Like, are you really gonna struggle through algorithms? I don't know. Maybe you didn't need to struggle all the way, but, like, there's some, like, forced deep thought there. I think that maybe that's why I keep coming back to, like, deep thought. Like, when you're just waiting on the agent, I don't think that's deep thought. That's the agent thinking. And I don't like that feeling of being reactive to when the model is done with something. It could be fast. It could be slow. It just feels like completely wasted time. Like, I put out a post saying maybe it's like a flat spin for, like, a jet. You know? It's like in Top Gun, right, where they get into a flat spin, and you're just spinning. And, like, you might be able to get out of it, but you might not. But you don't know. You know? You're just sort of like, oh, here we are. Yeah. It feels like that for me. Like, I don't like that feeling. I don't like forced reactivity,
[00:43:40] Tobias Macey:
in a way. I don't know if you feel that or not. Well, yeah. There's definitely the risk of regression to the mean where because everything is just trained on things that have come before, it's not going to provide any novel solutions. But to be honest, particularly in certain areas of software engineering, we're not inventing anything new. We're just putting the blocks together in a way that suits our needs.
[00:44:02] Will Vincent:
Right. Well, no. And that's the thing, you know, AI by definition is kind of an abused term, and it's much more than LLMs. Right? There's all that stuff around new mathematical proofs and protein stuff. So, yeah, I'm curious for you with Copilot. Do you just use the default agent or model, or do you flip around between the models? Because I would love it if somebody did a guide of, like, these models seem good for this task, because I don't really know. I don't have a
[00:44:27] Tobias Macey:
I have experimented a little bit with the models. Lately, I've been having very good results with Claude Sonnet 4.5. I've also had good results with GPT-5, and so I've mostly just bounced between those. But, you know, as you said, it's a moving target. And so as new models come out, I'll try some of the newer ones, and if it works, then I'll go for it. I will say that in the context of both the Copilot CLI and other harnesses, even just the Gemini CLI directly, the Gemini models tend to kind of wander off in the middle of doing something and just stop. It'll say, oh, I'm going to do this thing, and then it just walks away.
[00:45:07] Will Vincent:
It's like, oh, squirrel. Come back. It needed a couple more parameters to focus. Yeah. I just had some long flights for travel, and I was using mainly Gemma. It's hard to know. There's too much change in these models. But, you know, maybe if they slow down their pace of change, we'll be able to, like, implement these things. Because I do think at the end of the day, people wanna just sort of talk through, here's what I want, and the agent or an interface automatically creates the spec, automatically creates the plan, you know, hand-holds in a way. Like, it's wild. We're just, like, raw interacting with the models right now. I don't think that will stick around. Right? It feels like doing, kinda, like, command-line DOS when we should just be dragging and dropping.
[00:45:46] Tobias Macey:
Absolutely. And as you have been working in the space, working with teams who are leaning on the JetBrains utilities, what are some of the gaps in capabilities, or just in terms of the places where these agents live, as far as being able to facilitate the overall life cycle of software engineering? I'm curious where you're seeing missing implementations or areas that aren't getting enough focus because of the fact that there is so much hype around the generation of the code in the first place.
[00:46:21] Will Vincent:
I mean, I think I find a lot of people just not even doing the very basics. And so I'd love to give, like, a more complicated answer, but, and I don't wanna get too specific, I still see even, like, accomplished teams just not using these tools. Or if they use these tools, they'll use one flavor of it, and they won't experiment with different models. They won't invest any time into trying to get the most out of them. They'll just take, you know, they've got a long queue of work problems, try to one-shot it into an agent, the agent will sort of solve one thing, get others wrong, and then they have to spend a day or two fixing all of that, you know, because it introduced a subtle bug somewhere else. And then they go, you know what? I'm just not using these tools. And I think that doesn't give a fair shake to the tools, but I think you can't just, like, drop them into a professional workflow. You have to sort of build up to that. So I would say start with lower-stakes things, start with things that aren't production code at a minimum, and, as you said, and we've said, like, have specs for it. So I think people just don't give it a fair shake, but I get it. They're busy. They've got stuff to do, or maybe they have restrictions around privacy. So separate from parallel agents and a fleet of stuff and firing everyone, I think just have a rules or guidelines file for something and really just ask it to think, not even write code. I think people will be really surprised by just asking it questions. But you have to kinda tell it not to write code. I think that's part of the thing. Right? Like, it really, really, really wants to write code. So that's what I would say. Sit back, or toss something in, you know, for fun. Like, if you're used to using Python, like, hop into a Java code base. You know? Like, just git clone something and just see how much you can learn from it. There's so much improvement to be had there. But, yeah, just taking fifteen minutes and thinking it can replace what you currently do, I think, is unfair. So I would say people need to invest in these tools one way or another, as much as we all wanna stick our heads in the sand. I'm not using it to write all my code, but I'm using it as a partner for almost all the code I write these days. But, again, I'm using Python, and a lot of it is web stuff, and it's so well documented that it's just really, really good.
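To make that concrete, here is one hypothetical shape such a rules or guidelines file might take; the file name, stack details, and commands are all illustrative, and different tools look for different file names and locations.

```markdown
# AGENTS.md — illustrative project guidelines (details are made up)

## Project context
- Django app on Python 3.12, PostgreSQL, pytest for tests.

## How to work
- Before writing any code, explain your plan and wait for approval.
- When asked a question, answer it; do not generate code unless explicitly asked.
- Prefer small, reviewable diffs; call out any change to migrations or settings.

## Commands you may run
- `pytest` for tests, `ruff check .` for linting.

## Out of bounds
- Deployment configuration and anything touching production credentials.
```

A short file along these lines is the kind of low-effort starting point being described here.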
[00:48:17] Tobias Macey:
Yeah. It's an example of what Ethan Mollick says in Co-Intelligence, that these AIs are a jagged frontier, where you think it might be able to do something well and it can't, and you don't think it could do something else well, but it can. And so the only way to actually understand when and how to apply these utilities is to experiment and figure out, does it do what you want it to do in the places that you want it to do it? And then building up that internal muscle memory of when to reach for LLMs and when to just do it yourself.
[00:48:48] Will Vincent:
Yeah. But I think, at the same time, the industry can't assume that everyone has hours and hours a day to just play around with models and prototype stuff. Like, most engineers have, like, an endless queue and a manager asking them to do stuff, and, I don't know, maybe find some fun stuff on the side. Like, as much as it's frustrating, there are just these moments of, you know, kind of joy for me. Like, especially on a weekend, or I'm feeling lazy, and normally I wouldn't do something, and I just fire up an agent, and I can make progress. And it's not 100% production code. You know, I'd wanna spend a lot more time fine-tuning it. But, like, if I wanna just prototype something, you can really fly with it. So I guess what I would say is, yeah, we need to bring that jagged frontier down a little bit. It shouldn't require hours and hours of playtime. And so, as a first step, like, bugs. If you have a whole bunch of bugs, pull in your entire code base and specifically say, I'm having this issue, and just don't write code, talk to the agent. I suspect that it will help you with some of those bugs. I would say that's a good, like, hey, I wanna use it in a production setting. I would start with the bugs personally.
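As one hypothetical illustration of that "talk first, no code" bug-triage style, a prompt might look something like this; the bug and the details are invented for the example.

```
Context: the whole repository is loaded.
I'm seeing duplicate confirmation emails when a user signs up twice quickly.
Do not write or edit any code yet.
Walk me through the signup flow as it exists today, list the places where a
race condition could cause a double send, and rank them by likelihood.
```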
[00:49:46] Tobias Macey:
And as you have been working in the space and working with people who are trying to address these questions of when and how and which tools to use for software engineering, what are some of the most interesting or innovative or unexpected ways that you have seen them approach that challenge?
[00:50:05] Will Vincent:
I guess the thing that really jumps out at me is, on the one hand, there's all this complexity and stuff that we're doing. But on the other hand, like, I did a blog post on Copilot CLI's secret prompts. So, you know, it's Copilot's command line interface. But if you go in, you can find the index.js file. It's just an npm module. It's about a thousand lines of code, and there are fewer than a hundred that are the "you are" part, and you can see kind of all that it's doing. Like, it says, you are the GitHub Copilot CLI. You know, you have strong problem-solving and coding skills. Assume you have Internet access. Assume locally installed tools. Like, none of it is rocket science. On the one hand, it feels like all the tools are just these minuscule thin layers skating on top of these models that get better and better every year, which makes me kinda wonder, like, what's the point of all this other stuff if the models are getting so good? It's just an AGENTS.md file even on these top-of-the-line tools. So I guess I changed the question a little bit, but that's what I think about. Like, for all this complexity and the apps built on top of it, maybe we should just be sitting back and have 20- to 30-line guidelines and just let the models improve and do what they will. That's kind of what I'm struck by. I don't know about for you. Like, I find it mind-blowing. I thought there'd be more there there. You know, you could do smart things. You could, like, scan it better or do stuff. But if the underlying models are just still improving incrementally, how much do we need to do other than just pay for tokens? You know? So, practically speaking, I see there's token cost, there's speed, and then there's quality, and you're sort of balancing all three of those. You know, if you've got the money, just, like, slam all the tokens. That's what I see people doing. So what is that middle level of, okay, $20 a month doesn't give me the performance I want, but I don't wanna spend $500 a month? I think for companies it comes back to the cost thing. Right? I mentioned, how do we make this affordable enough that everyone can use it, and maybe it's actually good on your phone rather than requiring, you know, a cloud server running inference for it?
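If you want to poke at this yourself, here is a rough Python sketch of that kind of spelunking; the directory you point it at and the "You are" marker string are assumptions that vary by tool and release, not a documented interface.

```python
"""Grep an installed npm CLI's JavaScript bundle for prompt-looking text."""
import sys
from pathlib import Path

# Point this at the CLI's package directory, for example the folder reported
# by `npm root -g` plus the package name. The default "." is just a fallback.
root = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")

MARKER = "You are"  # the phrasing quoted in the episode; it could change in any release

for js_file in root.rglob("*.js"):
    try:
        text = js_file.read_text(encoding="utf-8", errors="ignore")
    except OSError:
        continue
    for line_number, line in enumerate(text.splitlines(), start=1):
        if MARKER in line:
            # Truncate so minified lines don't flood the terminal.
            print(f"{js_file}:{line_number}: {line.strip()[:120]}")
```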
[00:52:00] Tobias Macey:
Yeah. No. It's definitely an interesting flood of decisions that we need to make as we're addressing these challenges. And, yeah, I think it's really just a matter of pick something, try it out, and see what you can get it to do, because otherwise, you're just going to get stuck in the paradox of choice.
[00:52:14] Will Vincent:
Right. But you can never trust it, and there are security nightmares. Don't forget that. You know, that's the whole other thing of, like, having it run your calendar and all these other things, or, like, there's the case of having it run a vending machine. It's this weird thing. Right? Like, we thought we were dealing with zeros and ones and deterministic code. And then at the tip of the frontier, like, these data pipelines, it's English prompts to models that change. Like, it feels sometimes like the most janky thing ever, and yet the results can be so powerful. So it's a weird, I don't know, confluence of things these days. It's ever less trustworthy but more powerful.
[00:52:49] Tobias Macey:
And as you have been working in this space and trying to help some of the JetBrains customers tackle these questions, what are some of the most interesting or unexpected or challenging lessons that you've learned personally?
[00:53:01] Will Vincent:
I think I just have to come back to the fact that most people still aren't using these tools at all, let alone properly. And so it's very rare that I have a discussion like the one we've just had. It's more like, oh, yeah, I've got PyCharm. I tried the thing. It didn't work. I gave up on it. And it's like, well, did you add a guideline or add rules? No. How did you talk with it? I think everyone needs to figure out some sort of fluency with interacting with these models, like we talked about earlier, treating it like a human, you know, figuring out how to prompt it. In the same way that five years ago I would joke to junior developers, the only difference between us is my Google-fu is better than yours. Like, I can ask better questions to get better results. I think there's already some version of that. So I try to help people think like the LLMs. Think about how to ask questions. You know, don't try to one-shot everything. Think about, is there a specific bug? Okay. Don't freak out because it gives you a wrong response and just assume everything is wrong. Like, just accept that that's an innate part of it. And, yeah, just give them a fair shake, I guess, is what I would say. I see too often people try it, it doesn't work, and they give up, but I think these tools are unavoidable. Are they gonna fully replace humans? No. Are they gonna make people way faster and change a lot? Like, for code? Absolutely.
[00:54:10] Tobias Macey:
And as you and JetBrains continue to move forward in this new and exciting and unpredictable world where AI is writing all of the code, what are some of the things you have planned for the near to medium term in terms of the JetBrains product suite or product vision and any predictions or hopes that you have for the future of AI for software engineering?
[00:54:32] Will Vincent:
So I actually had a meeting with my boss separately before this, and I asked her, what can I talk about? So there's a lot of discussion and a lot of things coming down the pipeline. I can't share anything until it's out, but I think it's all pointing in the direction of an easy-to-use interface for the user that pulls in chat, pulls in agents, gives you the choice. On my wish list, and I don't know when this would come, is that I would love to see defaults. So, like, be like, hey, yeah, let's try it, and the IDE analyzes your code base and says, oh, I see you're working with Python, we've internally ranked the models, and we think this is a good choice, and just holds your hand through a lot of this. You're starting to see this with some of the quick-edit or next-edit tools, but more hand-holding, I guess, is what I would say. JetBrains is adding some of that. I think you can expect more down the pipeline. And I think, ultimately, helping people get the most out of these tools is what an IDE, and JetBrains, should do. But, yeah, there's a lot happening, a lot being tried, a lot of things we try internally, and then some get killed and some don't. But I think it all points in the direction of just trying to hold someone's hand to use these tools and whatever model they want. Again, no one's locking you in, which I think is important to say. Like, you can pick whatever frontier model, whatever local model you want.
I just personally wish, and I think we are working on it in some respects, that we could just sort of default people to a better experience with these tools than having them explore the jagged frontier themselves.
[00:55:49] Tobias Macey:
Are there any other aspects of the use of AI and agentic utilities in the context of software engineering that we didn't discuss yet that you'd like to cover before we close out the show?
[00:56:01] Will Vincent:
Oh, I think the last thing I would say is I find that, like, your Git client is more important than ever as we're all just becoming sort of managers reviewing stuff. We didn't really touch upon that. I think everyone kinda has a different one that they really like, but it's interesting to see that it's, in some ways, another layer of abstraction. We're not even typing stuff, you're prompting it, but then you gotta be really good at reviewing it in Git. And I think you could do a whole podcast on how to get the most out of Git. So, yeah, that would be the area we didn't really touch upon, favorite Git tools and how we're using Git with this LLM-generated code. Maybe we all just need to switch to Mercurial. Yeah, I don't know. I guess anyone who has, like, strong opinions and says it with conviction in this space is to be considered suspicious. Right? I think you just have to say, like, it's changing, we're trying, this seems to work now. But, most probably, it should make people go faster, but how do we learn who gets left behind? I don't know. They're fun to use. We haven't looked at the ethical questions at all about training or building these models, which I think shouldn't be avoided. But putting that aside, like, don't just use it for production code. Use it for something fun. Like, you know, try to be playful. I think you can lose the playfulness as you get older and you get sort of burdened with stuff to do, but whatever wild ideas you had when you were first starting to code, you could probably do a prototype of them in a couple minutes with one of these tools. So I would just suggest people try to be playful with it and have fun with it. Like, it can be fun. Like, you know, if you have kids or young people around, sit down with them and see what they wanna do. Right? They'll ask, like, crazy questions, and then, I guess this would be testing out the jagged frontier, see how it responds. You know, like, you and I are probably locked into certain ways of thinking, but somebody who's not might ask crazy things and get crazy responses that could be good.
Tobias Macey: Absolutely. Well, for anybody who wants to get in touch with you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest gaps in the tooling, technology, or human training that's available for AI systems today.
Will Vincent: Oh, I think it's, when do you stop typing and when do you sit back and start agenting? I think people need to do more typing than they're doing, but I'm trying to figure this out. Like, I create content, I teach people, and I wish there was more of a canonical answer on that. I don't think you should just start coding by agenting. I've seen this: you will get completely lost, and it's not to be advised. Like, you need to type some stuff out, but I don't know exactly where that line is.
[00:58:20] Tobias Macey:
Well, thank you very much for taking the time today to join me and share your thoughts and experiences on the use of these AI capabilities for software engineering problems and some of the lessons that you're learning in the work that you're doing for JetBrains to help inform those ideas and questions. So I appreciate all of the time and energy that you and the rest of your team are putting into that, and I hope you enjoy the rest of your day. Alright. Thank you. Thanks for the opportunity. Thank you for listening. Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management, and Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email hosts@aiengineeringpodcast.com with your story.
Guest intro and career path into AI utilities
From autocomplete to agents: reliability and risks
Vibe coding vs. Vibe engineering and learning to code now
AI‑native engineering and negotiating specs with agents
Treating agents like collaborators: prompting tactics and limits
Beyond code gen: applying AI across the software lifecycle
Constraining agents: small diffs, stepwise plans, and review
Context, deployment realities, and maintaining continuity
Blast radius, IDE vs CLI, and the future developer workflow
Conway’s Law, microservices, and organizational impact of LLMs
Role of IDEs in an agentic world and JetBrains’ perspective
Local and open models: cost, privacy, and performance tradeoffs
Orchestrating multiple agents and managing parallel work
Speed, trust, and human oversight as models improve
Team adoption, shared context, and fine‑tuning possibilities
Spec‑driven workflows to onboard and empower juniors
Learning, documentation economics, and the shifting knowledge base
Tech selection, introspection, and model limitations
Raw model interaction today vs. future hand‑held experiences
Practical gaps: start small, add rules, and use agents for bugs first
Thin wrappers over models: costs, speed, and quality balancing
Security, unpredictability, and the paradox of choice
Adoption reality: fluency, prompting, and fair evaluations
JetBrains roadmap themes: integrated chat, agents, and sane defaults
Git proficiency in the age of LLM‑generated code
Playfulness, ethics note, and experimenting at the jagged frontier
Final reflections: when to type vs. when to agent