Is Antigravity with Gemini 3 Pro Really Better Than Claude Code? A Real-World Developer Test
November 19, 2025 · 48:14
I spent the last 12 hours stress-testing Google's Gemini 3 Pro and their new Antigravity IDE on actual projects.
Topics Covered
AI
**[00:00:00]**
Hello everyone, my name is Arando Preseno and welcome to the web talk show. Is Gemini 3 Pro the game changer that everyone's talking about? That's the topic for today. We were able to test it last night and we're running it today as well, running some tests on real-life situations and seeing if it's really living up to the hype. I think it's great. I'll start with that. It's a good tool. Is it ready for prime time? That's what I'm trying to uncover this morning. So, if you have any questions, please drop them in the chat, ideally in the YouTube chat because that way I'll see them right here. Otherwise, if you leave a comment on LinkedIn, Instagram, etc., I might be able to catch it live on the phone and get back to you with my reply. So, what is Gemini 3 Pro? If you haven't heard any of the hype around it and you don't know what I'm talking about, well, Gemini 3 Pro is Google's answer to Anthropic, OpenAI's ChatGPT, and many other competitors in the space for their state-of-the-art model. Now this model is very, very powerful and it's just destroying benchmarks. It's doing very well in all sorts of things, especially vibe coding games and things like that. There are people showing off its capabilities in tons of videos around the web and that's all great, but I know many of you, if you've been testing vibe coding or LLMs, etc., might be in that position where you're not sure, and you're hearing all the hype and you think to yourself, "Oh, should I just burn the boats and go over to Gemini?" I don't think you should. This is the same thing that happened
**[00:02:00]**
when OpenAI came out.
Everyone's using that. Then Anthropic comes out with a new model and some people jump over. It might be a good idea, might not. Then Claude comes, Claude Code comes along, but then GitHub has Copilot and then they have the CLI, and you're just jumping all over the place, and that doesn't really help you because you're relearning how everything works. Even though everything is pretty similar, there are still nuances between the different systems. So, it's better to stay with the one that you know works and is working for you.
Unless you have an issue already, maybe it's not doing what you want, etc., then yeah, for sure try other models. It's always good to experiment. However, with all the hype around Gemini 3 Pro, I decided to take a look last night and download the Antigravity software, which is their attempt at an IDE, or an agentic IDE experience. It's another fork, we could say, of VS Code, Cursor, etc. It has two parts to it. It has the regular code editor, but then it has a sort of agent manager experience, which I can show on the screen in a few moments. And it's similar enough in the code part that you can just feel at home. You can even import your settings and things from Cursor or VS Code. So you can just bring in your projects. You can even bring in the workspaces. I just opened a workspace that I already had. Pulled it open. Everything came through. I could just start working with it. So that part is good: you can just continue to work that way and you can have a little screen on the side, like a little sidebar,
**[00:04:00]**
same as you would with the Cursor AI chat, or within Cursor if you're using Claude Code with its extension. It works very similarly: you basically have a little sidebar that you can chat with. However, the main difference, I would say, is the agent manager screen. If you go into the editor, there's this little button at the top right that says "Open Agent Manager," and that opens a different experience. It's more of a chat-first experience where you're talking to it and then it spins off little subagents. It spins off things. It uses tools, etc. And it's not so much assisting you while you code. It's more that it's taking the wheel and then just prompting you if it needs clarifications or things like that. And you can keep both windows open, of course, and see what it's actually doing. So, it's a good experience. I'll say that initially I was able to open my project and that worked just fine. I was able to get it to run. So, basically I opened the code first, the editor, and then I opened the agent manager because it looked fun, and I told it, basically, look at this project. This is what we're doing. These are some of the things that we're working on. Please just run it and see if it's working as it should. And so it ran it. It has access to the terminal. So it went through, and I'll show you a preview of what it did. It goes through and it runs it and it was able to open a browser
**[00:06:00]**
with it, and that's one of the good things. That's one of the benefits, I would say, of this versus many of the other tools. Since it is tightly integrated with Chrome, because it's Google at the end of the day, it will automatically load a Google Chrome browser that is connected to the whole experience. There's an extension that it adds, and it guides you through adding it, etc. It's very easy, and then it can automatically take screenshots, run things, look at the console, etc., the same way you would if you took a few extra steps using VS Code or Cursor or whatever and installed the Playwright MCP, for example. That's what I was doing before. I had to set that up so then I could tell it to use Playwright to take screenshots of this page and browse it, etc. This just does it out of the box. So I'll definitely give it points for that. Amazing experience. I didn't have to set up anything. I didn't have to do any MCPs, nothing. It just started working. Looking at the screen, opened a browser. Very nice. So, that part's good.
Now, let's talk about some of the cons. The main one being it stops working. [laughter] So I would say that is going to be the biggest thing to stop me from using something like this right now for real life. Okay. So, it's good for just vibe coding, just having fun, but it's not ready yet, in my opinion. Today, everyone's talking about it, so it gets overloaded, etc. I understand, but today, if you're working on a project, you got a new project, you're like, "Oh, I'm just going to
**[00:08:00]**
throw Gemini at it." You could, of course, but just keep in mind there might be situations during the day where you're working and it works great and then it will suddenly just stop. And this reminds me of the experience with Manus when it first came out: we got an invite and we got in and yeah, it's working, it's working, but then it just stops. Errors. Too many people using it or whatever. And that's very inconvenient if you're using it for real work. I wouldn't expect this from a tool coming from Google at this point, really. For something like this, they have the infrastructure to support it, I would suspect. I mean, yes, the usage is huge, but is it, though? Is it so huge that Google couldn't support the load? Didn't they expect, with all the hype that they're building around it being the best model everywhere, the load to be big? Weren't they prepared? I don't know. That's one thing. And I'll talk about positioning. Google is obviously very well positioned to do this, but these sorts of growing pains are to be expected from one of these smaller companies that got in and scaled fast. I wouldn't necessarily expect them from Google. From past experience with Google, yeah, of course, they've killed off some of their products just because, and they have some interesting ways to handle things, but load-wise I wouldn't expect it to be so buggy at this point. Personal opinion. So, I'm going to show you the screen and show you the thought
**[00:10:00]**
process behind it, exploring the codebase and running through it, and my opinions on what it did right up to this point, what it didn't do as well, and where I saw some of the issues. And I'll talk about some very good points about it. I'm really enjoying it. I stayed up working with it last night and it is good. Like I said, there are some things where you smile and you're like, "Wow, okay, it did it better than Claude Code did." But there are other things where it's still running into the same type of junior dev behavior: it didn't think of the simplest solution and overcomplicated something, or didn't see an issue that was right in front of its face. That sort of thing where you're like, well, you're supposed to be the smartest thing out there. [laughter] So, just like any other, it's not going to just work like the demos. Some of these demos are great, especially the game ones. I'm stunned. And if you think about it, they have all the back-end services that they can provide with it.
That's one of the core benefits, I think, of Gemini: the positioning of it being a Google service and the way they can integrate with their services. Maybe they can integrate it with the whole Gemini platform such that you're building something and it obviously knows all their APIs and endpoints and how their LLMs work, etc., and their maps capability. They have so much around the web space that I think if you're building a product that's going to use Google infrastructure anyway, then it's going
**[00:12:00]**
to be fantastic because they have all that already, in theory, and so it can build something with a very nice UI and connect it to all of their services, and you could just use it with their services, and that'll be great. I didn't find that it tried to push any of their services, which is good. It felt neutral in that sense. At some point it realized I'd been using Apify to get some Google Maps data and it didn't push me away from it. It said, yeah, we could do it with the Google Maps API or Apify, but let's use Apify, and it actually went that route, which I'll show in a little bit. I found that interesting, instead of it just pushing me to use their paid API to handle the same sort of data or scraping or whatever you want to call it. So yeah, overall it's good, but like I said, the main thing right now is that it just stops out of the blue, regardless of whether it's paid or free or whatever. It's amazing that it's free right now. So you could go try it, by all means. You can just use it with a Google account, with a Gmail account. It's great. You don't need an API key. You don't need anything. You can just use it. It's free. That is crazy. And the free tier is good. It's very good. I was able to do a lot of things before I ran into the usage limit, the "please upgrade to a key" message or whatever. I worked on it quite a bit more than I would have been
**[00:14:00]**
able to on Claude Code with Opus, for example, even on the $100 a month or $200 a month package. So very interesting in that sense. Obviously they're subsidizing it in a way to get people on it, but it's good. I found it good in that sense as well. So I've mentioned it many times. [laughter] So, I'll start sharing so we can discuss it, and I don't know if it'll come through on the vertical platform. So, if you're on Instagram and you don't see what I'm sharing, sorry about that.
I'll just explain it. But you can jump on LinkedIn, Twitter, YouTube, and you should be able to view it there. So, let me see where we left off here. I'm going to go to the top and I'm going to share my screen here. And for those of you on the podcast as well, I'm going to be explaining everything I'm seeing, so it doesn't really matter if you're not watching the screen. So, I will do window and then I will open Antigravity. Okay. So, you should see my screen now. And I would like to be able to see myself as well. Layout. Okay, I can probably appear down here. All right. So, if we look at the screen right now, this is Antigravity. And Antigravity, like I said, has two views. The editor view just looks like Cursor or VS Code, so don't really worry about that. And then you have the agent manager, which is what I'm showing right now on the screen. The difference being that the agent manager, like I said, is more of a chat interface where you are just talking to it and it'll
**[00:16:00]**
build things and ask you if it needs additional context, etc. So if we look at this one, for example, I basically told it: this is the platform we're running. Please run it and check what it does, and if you can run it, that would be great. And so it goes through and it finds, obviously, package.json, etc. It finds out it has the Google Maps integration, web scraping, search, etc., and it goes down, reads the documentation, and then it proceeds to create an implementation plan. So this all looks like what you would expect with ChatGPT, Claude, etc.
So it writes an implementation plan down here, and then it analyzes the codebase and creates a plan to import Google Maps data for museums, parks, and playgrounds. So this app, for context, is an app that has places for moms to take their kids in San Antonio. If there are fun activities, parks, museums, this is a place to find those places, events, etc. Right? And so part of it is getting the data from different sources. At some point I was creating some data import pipelines. I was basically doing a search on Google myself and then going manually to Apify and saying, "Okay, so I want these results. Please give them to me." Apify is a service that allows you to basically create an API out of different things. So you can have it treat Google Maps as an API with certain parameters, and it'll give you back the information you need. Maybe it's many places with their images, with their reviews, things like that. And then you can parse the data, or it parses the data for you. Puts
**[00:18:00]**
it in a nice structure that you can then use. So it's a very nice tool for many things. I won't get into it too much, but that's what Apify helps you with. And so on its own, without me telling it anything about Apify, it goes in and finds these scripts that I had, which already had some information in JSON. It found that I had some JSON files with a list of museums or parks or things like that. And then it thought, mistakenly, and it didn't ask me, that I wanted to proceed and import those. So it goes in and tries to do the import and I'm like, no, no, no, that's not what we're going to do. Now if I expand this, you'll see that it actually goes in and creates the script to do the import and starts processing things, and some things fail. It starts to add a bunch, and then it fails to process some images, and it fails on some of these because they already exist.
So there's a lot that already exist and it's like, there's a duplicate key, etc. Keep in mind it knew from the way the app is configured where to find the credentials in the environment variables file, and it knew how to call the database, get the information, and see how the database was structured, and it knew how to run the scripts and things like that on its own, which is good. Again, I didn't tell it to install an MCP for Supabase or anything like that. It just went
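The "already exists" failures are the classic duplicate-key problem on a re-run import. A minimal sketch of one way to avoid them, assuming each place carries some unique key (the `placeId` field and the `PlaceRecord` shape here are hypothetical, not the app's real schema): split the batch into rows that are new and rows that would collide before touching the database.

```typescript
// Hypothetical shape of one imported place; field names are assumptions.
interface PlaceRecord {
  placeId: string; // assumed unique key for a place
  title: string;
}

interface ImportPlan {
  toInsert: PlaceRecord[]; // rows with keys not seen before
  skipped: PlaceRecord[];  // rows that would hit a duplicate key
}

// Partition incoming rows against keys already in the database.
// Repeats inside the same batch are also skipped.
function planImport(
  incoming: PlaceRecord[],
  existingIds: Iterable<string>
): ImportPlan {
  const seen = new Set<string>(existingIds);
  const plan: ImportPlan = { toInsert: [], skipped: [] };
  for (const place of incoming) {
    if (seen.has(place.placeId)) {
      plan.skipped.push(place);
    } else {
      seen.add(place.placeId);
      plan.toInsert.push(place);
    }
  }
  return plan;
}
```

Only `toInsert` would then be sent to the database, and `skipped` can be reported back instead of surfacing as errors. A database-level upsert is another option; this sketch just keeps the decision visible in application code.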
**[00:20:00]**
and did its thing.
So that's also very nice. Very well done. So, it's having some trouble with this and I tell it, "No, no, I'm sorry. I didn't want to import those files. Instead, if you have a better way of getting those places, Google Maps for example..." Here I'm telling it, by all means, use your own thing if you want. I'm all for it. Those scripts were just tests that I was doing manually through Apify. So basically those were just files that I was creating so that I could generate the structure in the application, and that's why they were there. But once it saw that, it switches, and it shows the thought process explaining everything: "I'm currently focused on transitioning from static Apify JSON files to a more dynamic retrieval process." So this was very nice. This is something I wanted to do, but I never told it that I wanted to do this. It assumed, but it assumed well. "Considering data integration methods. Now contemplating specific methods for integrating data, moving beyond the initial API-centric approach. The user's preference for multiple sources, including Google Maps, potentially via an existing Apify endpoint, and local websites, is key. Refining the plan based on the to-do notes, considering both API integration and scraping options to achieve a comprehensive..." They all just say "comprehensive" all the time, I don't know why. Anyway, "...data retrieval strategy for venues. Stopping and shifting focus. I've stopped the current import process and I'm shifting gears towards integrating Google Maps data. The goal is to build an import-from-Google-Maps feature in the admin UI, leveraging the API from Apify programmatically, which is a step towards automation
**[00:22:00]**
and data enrichment.
So it goes searching for credentials and then stops, because it doesn't find an API key. I didn't have it there. I never placed it in my environment variables. So I then go and place the API token in my variables, and with just that I tell it, there it is. And so it goes in and it plans to build a dynamic import feature in the admin panel. It actually does it and it does it quite well. So in the panel, it builds this "import places" thing that allows us to search Google Maps inside the app and then choose the items that we want to import. It just did it on its own. It opened the browser and everything and it went through and tested it, etc. It basically created a view for us where I could just search "coffee shops in San Antonio." It will do the scrape or the search or whatever via Apify, bring back the results, and place them in little cards on the screen with checkboxes. It shows me the reviews and everything. I could just check the ones I want and then click import, and it will automatically import them into the schema that I already have for the database. For those who are not technical, that's sort of the model, the way these are structured in the database. If you imagine a form, when you want to create something, you might have a title, description, images, reviews, opening times, all that. It already knows all that because it read the code. So, it didn't ask me for anything. It just created this dynamic import feature. Visually it looks good. Actually, it looks great. It looks just like the rest
**[00:24:00]**
of the UI in the system. I should actually just show you what that looks like. So, I'm going to stop here and then share another screen. Okay. So, this is the import places one. And so, if I did "park" or "parks," I don't know. I'm going to do "park." It goes and searches. Again, it just built this within the conversation without me telling it, something that makes sense for an admin in a platform like this. So it created this thing: it goes, does the search in Apify, parses the data, etc., brings it back, and automatically generates all the little cards with the details. I have no idea if it's going to work now. I don't even have credits on Apify. This was just an API key that was new, so they give you like $5 or something that you can use every day. There it is. So, it brought these in. It shows us the number of reviews, the address, a little image for them. And I can go in here and be like, "Oh, this looks good.
Maybe this one looks good. Maybe this park looks good. This one and this one." And so then I could do "import selected" and it will go ahead. And again, I have no idea if it will work again, but it did. Success. Successfully imported five items. And so now in the content section, it will have the new items that we just added, and it brought in all the information for them. So it brought all these details. I don't think it brought the images, but it brought the location, settings, etc. And so if we actually
**[00:26:00]**
look at the site itself, probably if we do this, we'll see Lady Bird Johnson Park, McAllister Park. They have their images, and if we're seeing these images here, it actually brought them in and stored them in the database so that we're not using Google URLs. Of course, this is not a very big image. I have to prompt it further, well, I'll get to that point in a second, but Apify has another parameter that allows you to get higher-resolution images. But look at this. We're getting all the accessibility, amenities, review highlights, what people search for, the location, all of these fields we already had in our database. So it understood that, and it was able to pull the data from Apify without us telling it anything, parse it, and put it in the right place, all within this fantastic little import places feature that it just created out of the blue. So let that sink in. If you've ever built an application or anything software related, that is powerful. So I'm giving it some points here for this achievement. Why? Because the demos that they show, a lot of times, are just starting from scratch. Starting from scratch is relatively easy, especially building landing pages or even a little game, etc. It becomes complex when it has to build a database, create users, handle authentication, manage sessions, etc. Most of the demos you'll see don't have any of that. Some do, but still, creating it from scratch is a lot easier because it can control the whole flow. It just builds whatever it thinks, and it might do a very good job at it. However, putting it in an existing project that may or may not be well done,
**[00:28:00]**
etc., and the fact that it can go in, create a feature, add it, and have the feature work with the database, with the authentication, and with an external API within an existing codebase is impressive. That was something that I don't think Claude Code, which is my tool of choice at the moment, would have achieved that quickly without excessive prompting. It would have taken a longer conversation. I would say, can it do it? Yes, of course. But it would probably need more context. Here I didn't give it any real context. [laughter] It just did it.
So that was good. I really like that, because this is a feature that I wanted to add anyway. It just did it for me. I didn't really have to tell it, and it did it very well. So, it gives me a little peace of mind knowing that. Okay. So, I think I could have it add additional features. So, points for that. If you want to use it in an existing codebase, then it might actually do a decent job of it. So, that's good. Now, let's go back to the other screen that we were looking at. So, I'm going to share the Antigravity window again. And if we look at it here, we see that it went in, it created the dynamic import feature, and then there was an error. There was a Next.js error with shadcn. It found some duplicated packages, etc., and it went and fixed it, refreshed the caches, etc. Everything worked. Fixed the alert, and that's good. Okay. Now, one of the issues that I was seeing as I was going through this is that it would stop.
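The core of an import feature like this is the mapping step: taking whatever the scraper returns and fitting it into the shape the database already expects. A rough sketch of that step follows. Every field name in it is an assumption made for illustration; the real Apify output and the app's actual schema are not shown in this episode and will differ.

```typescript
// Hypothetical scraper result; real scraper output fields may differ.
interface ScrapedPlace {
  title?: string;
  address?: string;
  categoryName?: string;
  totalScore?: number;
  reviewsCount?: number;
  imageUrl?: string;
}

// Hypothetical content row; column names are assumptions.
interface ContentRow {
  title: string;
  address: string | null;
  category: string | null;
  rating: number | null;
  reviewCount: number;
  sourceImageUrl: string | null;
}

// Map one scraped place to a row, or null when it has no usable title.
function toContentRow(place: ScrapedPlace): ContentRow | null {
  const title = place.title?.trim();
  if (!title) return null;
  return {
    title,
    address: place.address?.trim() || null,
    category: place.categoryName?.trim() || null,
    rating: typeof place.totalScore === "number" ? place.totalScore : null,
    reviewCount: place.reviewsCount ?? 0,
    sourceImageUrl: place.imageUrl ?? null,
  };
}

// Map the places the admin checked, dropping any that cannot be mapped.
function mapSelected(places: ScrapedPlace[]): ContentRow[] {
  return places
    .map(toContentRow)
    .filter((row): row is ContentRow => row !== null);
}
```

Keeping the mapping in one small function makes it easy to review what the agent wired up: missing fields become explicit `null` values rather than silently undefined columns.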
**[00:30:00]**
So, this looks like a nice conversation, but in many of these parts, it was working just fine and doing little steps and then it stopped and we got an error. And the error just said, "the agent stopped because of an error." That's it. Nothing else. So I had to tell it to continue and then it continued. And sometimes it didn't. Fortunately, if you wait a while and just prompt it again, it will continue. It doesn't just kill your session like some of the other competitors do. And man, has that happened a lot, where it would just die and you couldn't really do anything else. Here it'll sort of stop, telling you that you might want to start a new conversation or continue or whatever. I just tell it to continue, literally just write "continue," and it has worked quite well. Another thing that I think is impressive is that this conversation lasted probably two hours and worked really nicely, in the sense that I never saw a context window overload. There's no visibility of the context, actually, that I know of. I think it's important to have it if it's going to affect you, but here it doesn't seem to show it, and it doesn't seem to matter anyway because the context window is huge. I haven't run into any issues where it has to compress the conversation, or at least it hasn't told me that it's compressing the conversation. So in that sense it's good. It just feels like a user tool, not necessarily a developer tool, and if it's meant to be a user tool as well, then yeah, users would get very confused with
**[00:32:00]**
context windows and things like that. So here it's doing its thing. It verified the dynamic import feature. That all worked great. And then it verified the data import and it could see that it worked. So it says, go to admin import, try it. And then we noticed that it was adding a bunch of things that didn't make sense, that probably weren't parks and things like that. And I think that was just because of the initial import, where it tried to import a bunch of things that we had in the JSON file. So I told it, hey, there's a bunch of things like HVAC, car dealerships, etc. that we don't want here. They don't make sense. Can you help me clean it up? So it creates a little script and it starts removing some items. But, and this is something that it didn't do well, it removed I think nine items. And I said, "Yeah, yeah, but look, these are the categories that I'm interested in: park, city park, playground, art studio, museum, art gallery, etc. Find things like that. Keep those. Get rid of the rest." And it does it. And again, it just does like 10 or nine. It removed some, but there were still like 700. And that's the part where, again, it was acting like a junior dev. It was doing the least amount of work necessary, I would say. And that's where you get a little frustrated. It's performing like an expert and then suddenly the IQ drops down to nothing. So at that point I just said, you know what, just keep these five, get rid of everything else. And so it goes in and gets rid of everything else and just keeps I think five of
**[00:34:00]**
them or something. That it did well. Now, this part that it's doing here is another thing I want to point out. When I was doing it in Claude Code, I had to connect the Supabase MCP, for example, so that it knew how to interact with Supabase and do things. Or I had to tell it, look at the documentation so that you can use the API. Here, again, I didn't tell it anything. It just knew how to talk to it. So up to a point I think it does a good job of abstracting away whether it's using MCP or tools, or whether it just knows how to use the API endpoints and does it dynamically, directly from the terminal. I think that's okay. It's a personal preference. It just gets out of your way and it's just working, and it'll show you, and you can look at things and see what it's actually doing. Some of this it does by creating scripts, some it does by actually talking to the database. And it's good. I found that very interesting. Again, here is where I said there are missing images.
It didn't notice. There was a point here where it just said, "Okay, I'm done." I had given it a to-do file that has a lot of things, and I expected it, from what everyone's talking about, to just go in and go crazy and start building things and adding features. And no, for whatever reason it focused on the import part, which I never told it was a priority, and it finished. It told me, okay, I'm done. We're done. Like that's the
**[00:36:00]**
the end of the work. And it's not at all, not even close to the end of the work. So, I found that interesting. It didn't use the to-do file to understand and maybe make a detailed plan of further action, or even tell me, well, this is completed, now we should do this, what do you want to continue with? That's something that Claude Code does very well. So, points deducted in this case for planning ahead. I think that wasn't very well executed. Other than that, I noticed that some of the images weren't there. Most of them, actually. And I told it, because it didn't figure it out on its own. I asked it, why don't they have images? Only three have images. And so it goes in and queries the database and checks why. Look, if we expand this, it goes here. It inspects the images. It analyzes them. It creates a lot of scripts. Its way of doing things, a lot of times, is creating TypeScript files and running them. And then it does run some curl requests in the terminal to get the images, try to find them, etc. And it finds out that it's getting some permission errors for the images, mostly because these are Google images and they typically either have a short time to live or cannot be accessed directly, to avoid scraping and things like that. So it goes and finds the items and tells me, yes, well, these items have the forbidden error. So my recommendation, or the fastest fix, is to go to admin content, edit those items, and upload a cover image manually. What? Like, what does that mean
**[00:38:00]**
in the context of everything we're talking about? Why would it think to suggest that I should go in and edit the items and add a cover image manually if the whole point is to make it more efficient? It actually built the thing to import things. Why would I, in my right mind, ask it to do all that so that I then have to go into Google Maps or something, get an image manually, download it to my computer, and then upload it to the place? That makes no sense. So again, IQ lowered here. So I said, "No, I want to be able to fetch the images directly, either via Apify or Google Maps, whichever works best. I want to be able to get the images and download them to our app instead of loading them from Google." Now it becomes smart again and says, "Okay, so I updated the search route, etc. Let me check for the API keys, blah blah blah." And then it upgrades the import process so that it can actually bring the images in, which we saw earlier in the demo that it actually did. It works. But this is where I think it did something very good. It fixed it, and then it says, "I fixed it." Great. It celebrates itself. So, good for it. And then down here, it goes in and fixes, where is it? This one here. It finds the ones that already had the issue and runs a script to fix those images directly. So, I don't have to do anything about those. So it went in, reimported, and fixed the ones that had the issues. That was great. So again, points for that. I think that was a good execution and it was
**[00:40:00]**
very kind of it to go in and fix that as well. Then we had some issues with the featured content checkbox. So on the site itself, if we look at it, we can see that there is a window this is a work in progress, of course, but if you look at the homepage, it has this featured this week section. And when you're editing a property, you can mark it as featured content here. Supposedly, you click update and it should mark it as featured content. And so the issue and that's probably the issue that we're seeing here.
There's a hydration issue, etc., with Next.js. And so it's not really doing it. It's not saving it, because it should show here. So my next conversation with it, the one that I was going to do before I started the live stream, was telling it to fix that particular issue. So if we go back, we can see that here it says, "Oh, I refactored the handler to fix the featured field because there's something missing, blah blah blah." It goes, and it doesn't really work. [laughter] And so then it goes again and says, "Oh, I have to use the controller for the checkbox," and it's ensuring strict boolean type safety. And this just sounds smart, but it's not really smart. And it says, "I applied a stronger fix. I switched the checkbox to use a controller, which ensures the form state and the UI stay perfectly in sync. It should definitely work now." Of course, it doesn't, but like all LLMs, it's very proud of itself, and it says it did it. Now, here is again where I say don't just believe the hype around the model. Oh, this
**[00:42:00]**
is going to be amazing. It's the best thing out there. You don't need to know how to code. You don't need to know anything. It'll just work. Yes, you can build a lot of applications and things now by vibe coding without knowing anything. But you still have to look at what it's doing. Pay attention, because sometimes it'll just tell you that it did something. And because it tells you that it did something, and it's all excited about it and overly positive, you might say, "Well, it knows what it's doing. It probably works," and it will just silently fail and you might not even notice. This is a clear feature because I'm asking it for it. But in another case, if you told it to build a big application or a big game or whatever, and especially if it's important for business, you might think it already implemented what you asked it to because it ran all the checks or whatever, and it didn't really do it. It just told you it did it. And sometimes it'll fool itself and it'll fool you into thinking that it did what you asked it.
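One way to catch that kind of silent failure is to read the saved value back instead of trusting the success message. The sketch below is only an illustration under stated assumptions: the `Property` shape, the in-memory `db`, and the functions `parseCheckbox`, `brokenUpdate`, `fixedUpdate`, and `verifyFeatured` are all hypothetical names, not code from the app shown in the episode. It relies on one known behavior: an HTML checkbox submits the string "on" when checked and is omitted from the form data when unchecked.

```typescript
// Hypothetical record shape; the real app's schema is not shown in the episode.
type Property = { id: string; name: string; featured: boolean };

// In-memory stand-in for the database so the sketch runs on its own.
const db: Record<string, Property> = {
  p1: { id: "p1", name: "Sample place", featured: false },
};

// A checked HTML checkbox submits "on"; an unchecked one is omitted entirely,
// so the raw form value has to be coerced to a strict boolean before saving.
function parseCheckbox(raw: unknown): boolean {
  return raw === true || raw === "on" || raw === "true";
}

// Illustrative "silent failure": reports success but never writes `featured`.
function brokenUpdate(id: string, form: Record<string, unknown>): string {
  const row = db[id];
  if (!row) return "not found";
  const name = form["name"];
  if (typeof name === "string") row.name = name;
  return "ok"; // claims success even though `featured` was ignored
}

// A handler that does persist the coerced checkbox value.
function fixedUpdate(id: string, form: Record<string, unknown>): string {
  const row = db[id];
  if (!row) return "not found";
  const name = form["name"];
  if (typeof name === "string") row.name = name;
  row.featured = parseCheckbox(form["featured"]);
  return "ok";
}

// Read the record back and compare it with what we intended to save.
function verifyFeatured(id: string, expected: boolean): boolean {
  const row = db[id];
  return row !== undefined && row.featured === expected;
}
```

Both handlers return "ok" for the same input, so the return value alone tells you nothing. Only the read-back check, `verifyFeatured("p1", true)`, distinguishes the handler that saved the flag from the one that just said it did.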
So you still have to be very careful, even if it's the best thing out there, or so people say. It's always important to double-check and make sure that things work. And that's where tests come in. I'm sure this particular IDE is very good at tests. If you tell it in your prompt, "Make sure you create tests and run them against the code," etc., it's probably very good at doing that and will take away much of the effort on your part, but at least that way you could see it running through
**[00:44:00]**
taking the screenshots, checking the different things, running actual command-line tests to make sure that things pass and that they have the valid values and all that. That will also help, if you do it that way instead of just blindly assuming that what it's telling you is true. So with this one, again, its IQ dropped drastically. It started fixing it, and like Claude Code and like ChatGPT and like any of the others, it said, "Yes, I did it. This time it works." And it doesn't. It doesn't really do it. It might be a context window thing, I'm not sure. This is what starts to happen when you use one of these models and you get closer to the end of the context window: it starts being less good, less smart, because it has to keep track of a lot of things and it's just not good at it. So that's why a lot of the experts will tell you to use sub-agents. So instead of having a huge project, a huge conversation with everything, you should have one main orchestrator, one that keeps track of the main thread basically, and then throw out little sub-agents to do part of the UI or part of the authentication or part of this other thing, and then each one has its own context and things like that. Now, this is supposed to do that behind the scenes with its own thing. So that's why I don't understand why it gets less smart as it goes, but it's probably part of the same. In any case, here it just looks like when some of the other models I've seen are just having a bad day. [laughter] So it's just, yes, I
**[00:46:00]**
did it.
I found it. And it even puts a little detective icon on it. And of course, it doesn't work. But anyway, it keeps telling me, and this is where it died again, and it died again, and I just kept telling it to continue, and it just never fixed it. So this was a lot of back and forth before the call, just seeing if it would do it, and it didn't. So is it the amazing game changer that everyone was talking about? I don't know. Will it move the waters? Yes, of course.
It does some things really well, and I would say better than some of the others. I'll give it that. Very easy to use, I'll give it that too. The model itself, as proven by what people are testing and the benchmarks, etc., is very, very good. I won't take anything away from the model itself. And yeah, it's good. I think the real power will come two ways. One is as they grow their IDE, Antigravity, which is a cool name: if they keep at it and keep making it better and don't kill it like they've killed many of their products, then yeah, it's a good contender for Cursor, VS Code, and all the others. So that would be a good thing, I think. There's good competition, and since it's the same structure, a VS Code type interface, people can just jump in and use it. So that part is good. Now, where the real power I think stands is Google's positioning in the whole ecosystem, because while OpenAI is focused
**[00:48:00]**
on just, like, everyone using it, not necessarily businesses, not necessarily power users, not necessarily developers, just everyone, so they have their share of the market, right? And then Anthropic is focusing on enterprise. They're doing a lot with enterprise and they're mostly business-oriented in the sense of revenue, which is a great thing. I think they chose a good market. Google has the benefit of having a lot of the planet on their services and having a lot of additional services behind it. So they can very easily become the biggest just because of the installed base that they have.
Especially if the whole deal with Apple, Siri, whatever they're doing works out, then yeah, it's going to be very interesting to see what happens there. And also, obviously, they have Android, so they could just put this power into the hands of everyone that has Android. That's a big chunk of the planet as well. So yes, they have very good potential to make something great and also to keep the experience simple, because I think we're all moving towards this agentic kind of experience more than looking at the code itself. Because at the end of the day, like I've said in other conversations, the code was just a way for us humans to interact with the machine, which actually talks in bits, right? And so it's not that you have to learn code. We created the code to be able to talk to the computer. So if now the code means I can talk to it directly, by all means. So it's not whether you know how to code. It's just that if the thing knows how to code, just make sure that it knows what it's
**[00:50:00]**
doing.
But now we're at that point. So if you're able to engineer, then it doesn't matter if you're talking to it on your phone or on the screen or by voice, it'll do its job. And so having Android on a lot of things, then on iPhones or whatever, that would be very interesting, when they all have a way to just interact via voice and you just tell it what you want and it'll show you what it's doing, and it could just spin up nodes and computers and subprocesses and agents and do things for you without you having to be at a desktop computer actually building this stuff. I think that's sort of where we're headed, and it's going to be very interesting, and I think Google is properly placed to be able to achieve that. So again, we'll see. I don't know where their priorities lie, but everyone's talking about this. So I thought it was valid to just have a different point of view as to whether it's better than Claude Code or any of the others. Am I moving over to Gemini 3? No, I don't think so. I'm very happy working with Claude Code as it stands. It does what I need. Sonnet 4.5 is very good and it's a good tool to have, and I think I will continue to test Gemini and the Antigravity software. But while it maintains the error rate that it has, where it's just stopping the flow out of the blue, that's not okay for business. You cannot depend on a tool that's hiccuping that much. Once they fix that, then yeah, it might be worth a deeper look into whether it's doing
**[00:52:00]**
things better than Claude Code. But at this point, I think everyone is going to be looking at what everyone else is doing and just adding those features into their own systems. So, we'll see. It's a good tool. I think it's off to a very good start. And of course, I didn't talk about all the capabilities of the model, but with everything they have, they're very good at image generation, video generation, and with Nano Banana, etc., it's amazing. So you can imagine, if they tie everything together as they're doing, you will be able to very quickly generate a very advanced AI-first type of application. So if you need a very quick app for your business or a very quick game or a very quick experience, you can very quickly generate something that's going to be able to generate images on the fly, generate summaries on the fly, generate video within the experience of the app. So I think that's also where we're going, where these apps will no longer just be basic to-do task lists, etc. They will be more AI-enhanced applications, where even what we're talking about here is not going to just show you a list of places. It'll determine in real time, based on the places that there are and what it knows about you, what places you would like to go to because of the age of your kids, and the distance from where you are, and the traffic that is currently there, because it's tied to Google Maps, and how much time you have because of your calendar, because it's connected to Google Calendar, and also where friends that you know might be going, because it connects to
**[00:54:00]**
the social feed.
So, a ton of stuff [laughter] you can do, and it's going to be very exciting. But what do you think? Let me know in the comments what you think about Gemini 3 Pro and of course about Antigravity, which is very interesting. I just learned about it last night. So let me know, and please follow and like and subscribe if you like the show and if you like the videos. And I'll see you in