
On November 18, 2025, Google unveiled a combination of two products: the new flagship model Gemini 3 Pro and the agent-first IDE Google Antigravity. The classic concept of "LLM + editor plugin" has its limitations: large monorepos, complex CI/CD pipelines, multimodal artifacts (screenshots, videos, logs, PDFs), and traceability requirements don't fit well into a chat window next to the editor. What was needed was a combination: a model that can confidently "think" and understand different modalities, and an environment that plans, executes, and clearly visualizes the work of agents. Google answered our prayers and presented just such a combination: Gemini 3 Pro + Antigravity.
Headlines like "Cursor is dead," "a new era of development," and "agents will do everything for you" immediately flooded the feeds. In this article, we'll break down what exactly Google has launched, how Antigravity differs from Cursor, which development scenarios are already changing, and where it's still too early to abandon your familiar stack.
What Exactly Did Google Roll Out?
1) Gemini 3 Pro - a multimodal model with a long context, optimized for complex reasoning, agentic scenarios, and code. I recommend checking out the Gemini 3 Developer Guide.
2) Google Antigravity - a new agent-first IDE, designed from the ground up not as an editor with code suggestions, but as an environment for multiple AI agents to work on a real project.
Moreover, Antigravity is not a plugin for VS Code (like Cursor or Copilot extensions), but a separate application based on a fork of VS Code, tailored for multi-agent scenarios. Gemini 3 Pro and Antigravity are designed as a unified architecture for agentic development.
Gemini 3 Pro: A Model for IDEs and AI Agents

Officially, Gemini 3 Pro is "the smartest model from Google" and "the world's best multimodal model for agentic scenarios and vibe-coding", according to the Google blog. It's worth noting that this isn't just an upgraded Gemini 2.5 Pro; its architecture and settings are tailored for scenarios where the AI doesn't just respond in a chat but sequentially performs tasks within a codebase.
Well, they were a bit hasty about its superiority over Gemini 2.5 Pro. Artificial Analysis tested Gemini 3 Pro on its own benchmark, the AA-Omniscience Index: models are run through 6,000 questions on various topics, with instructions to answer only if the model is certain of the answer.

And it turned out that Gemini 3 Pro hallucinates just as much as Gemini 2.5 Pro: it significantly outperforms competitors in correct answers, but in 88% of its "misses" it hallucinated instead of staying silent. The model that hallucinates the least is Claude Haiku 4.5, which invents answers in only 26% of cases. But let's move on to the main benchmarks and technical capabilities of Gemini 3 Pro.
Key Technical Features

The official documentation states:
Sparse Mixture-of-Experts (SMoE) Architecture - The model consists of a set of specialized "experts." For each piece of text, it uses only a subset of them, not all at once. This allows for a significant increase in the model's capacity without making each request too expensive.
Long context of up to 1M tokens on input and up to 64k on output, with a knowledge cutoff of January 2025. This is enough to keep a large monorepo with documentation and logs in context, rather than just individual files.
Multimodality - Gemini 3 Pro can work with text, images, PDFs, interface screenshots, and video frames in a single request. This is closer to the real-world development scenario, where specifications, logs, and interfaces are presented in various formats.
Agentic benchmarks: 54.2% on Terminal-Bench 2.0 (terminal actions/tools) - an indicator of actual "computer use," not just chatting.
Built-in Tools: Gemini 3 Pro in Vertex AI / AI Studio comes with a set of built-in tools: Google Search, File Search, Code Execution, URL context, and standard function calling. This means the AI doesn't just write text, but can also run tests and commands: search files, execute code, browse the internet, and call APIs.
Pricing - in preview via the Gemini API: $2 per 1M input tokens and $12 per 1M output tokens (including thinking tokens) for prompts ≤200k, and $4 / $18 for longer ones.
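To get a feel for the tiered pricing, here's a minimal back-of-envelope sketch using the preview rates quoted above (the 200k-token threshold decides which tier applies; the helper function is my own illustration, not part of any SDK):

```python
def gemini3_cost(input_tokens: int, output_tokens: int) -> float:
    """Rough cost estimate (USD) for one Gemini 3 Pro preview request.

    Prompts up to 200k input tokens: $2 / 1M input, $12 / 1M output;
    longer prompts: $4 / 1M input, $18 / 1M output.
    Thinking tokens are billed as output tokens.
    """
    if input_tokens <= 200_000:
        in_rate, out_rate = 2.0, 12.0
    else:
        in_rate, out_rate = 4.0, 18.0
    return (input_tokens / 1_000_000) * in_rate + (output_tokens / 1_000_000) * out_rate

# A 50k-token prompt with a 4k-token answer:
print(round(gemini3_cost(50_000, 4_000), 4))   # 0.148
# A 300k-token prompt crosses the threshold and is billed at the higher tier:
print(round(gemini3_cost(300_000, 10_000), 4)) # 1.38
```

Note how the whole request is billed at the higher tier once the prompt exceeds 200k input tokens, so trimming context just under the threshold can cut costs noticeably.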
Controlled Reasoning - there are 3 parameters that influence the model's behavior:
1) thinking_level - you can set the reasoning depth level (low / high / dynamic), which simplifies the configuration of reasoning compared to a manual "thinking budget." This is useful when you need to solve a complex problem, not just complete a line of code. Instead of the old "please think step by step" in the prompt, you can now simply set the thinking config in the request config.
Example in Python (refactoring synchronous code for asyncio and asking the model to think thoroughly):
import os

from dotenv import load_dotenv
from google import genai
from google.genai import types

load_dotenv()

api_key = os.getenv("GEMINI_API_KEY")
if not api_key:
    raise RuntimeError("GEMINI_API_KEY not found in environment variables")

client = genai.Client(
    api_key=api_key,
    http_options=types.HttpOptions(api_version="v1alpha")  # thinking / media_resolution
)

model_name = "gemini-3-pro-preview"

prompt = """
I have legacy Python code that processes files synchronously.
Rewrite it using asyncio for better performance and explain
what race conditions are possible if the files are a shared resource.

Here is the code:

import time

def process_file(filename):
    time.sleep(1)  # simulate work
    return f"Processed {filename}"

def main(files):
    results = []
    for f in files:
        results.append(process_file(f))
    return results
"""

response = client.models.generate_content(
    model=model_name,
    contents=prompt,
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(
            thinking_level=types.ThinkingLevel.HIGH,
            include_thoughts=True,
        )
    )
)

# In a real project it's better to parse response.candidates;
# for simplicity, just print the text of the response
print(response.text)
This way, you're not trying to persuade the model to "think step by step," but rather giving it a conditional budget for thoughts via thinking_config. Thinking modes and thinking_config are now available through API version v1alpha for the Gemini Developer API. For more details, I recommend checking out the Gemini 3 Developer Guide.
2) media_resolution - how many tokens to spend on images, PDFs, or videos (fine text/small elements vs. context saving). In other words, you can configure how thoroughly the model analyzes images and PDFs to avoid wasting context.
Approximate costs:
Images: HIGH ~1120 tokens; MEDIUM ~560 tokens; LOW ~280 tokens.
Video frames: LOW / MEDIUM ~70 tokens per frame; HIGH ~280 tokens per frame.
Scenarios: LOW for bulk image classification or many video frames; MEDIUM for typical tasks with images/screenshots; HIGH for dense text, small fonts, and important details (PDFs, complex diagrams).
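Using the approximate per-item figures above, you can estimate in advance how much context the media in a request will consume. A rough sketch (the helper and its table are my own illustration; the real accounting is done by the API):

```python
# Approximate token cost per item at each media_resolution level
# (figures from the guide above; video cost is per frame)
IMAGE_TOKENS = {"low": 280, "medium": 560, "high": 1120}
VIDEO_FRAME_TOKENS = {"low": 70, "medium": 70, "high": 280}

def media_token_budget(images: int, video_frames: int, level: str) -> int:
    """Estimate how many context tokens the media in a request will take."""
    return images * IMAGE_TOKENS[level] + video_frames * VIDEO_FRAME_TOKENS[level]

# 3 screenshots at HIGH plus 100 video frames at LOW:
print(media_token_budget(3, 0, "high") + media_token_budget(0, 100, "low"))  # 10360
```

Even a crude estimate like this makes it clear why HIGH resolution on hundreds of video frames can eat a noticeable slice of the 1M-token context window.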
Example (Python):
from google import genai
from google.genai import types

client = genai.Client(
    http_options={"api_version": "v1alpha"}
)

with open("image.png", "rb") as f:
    image_bytes = f.read()

image_part = types.Part.from_bytes(
    data=image_bytes,
    mime_type="image/png",
    media_resolution=types.MediaResolution.MEDIA_RESOLUTION_HIGH,
)

response = client.models.generate_content(
    model="gemini-3-pro-preview",
    contents=[
        "What is shown in this image?",
        image_part,
    ],
)

print(response.text)
3) Thought Signatures - compact signatures of thought chains that allow preserving context (the line of reasoning) between requests and within agentic pipelines:
The model decides: "I need to call the function check_server_status" and, along with the functionCall, it returns a thoughtSignature.
You call the function on your end and get a result (e.g., the database status).
When you send the function's response back to the model, you must also return the same thoughtSignature.
Then the model continues thinking from the same point, instead of rebuilding everything from scratch.
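The echo step above can be sketched as plain data manipulation. This is a simplified model of the loop, not the SDK's real types: the dict-based history, the check_server_status function, and the field names are illustrative assumptions (the SDK carries the signature inside the response parts for you):

```python
def run_function_call_turn(history, model_turn, execute_fn):
    """One round of the function-calling loop with thought signatures.

    history    - list of turns sent to the model on each request
    model_turn - the model's reply containing a function call and its
                 thought signature (simplified here as a plain dict)
    execute_fn - your local implementation of the requested function
    """
    # 1) Keep the model's turn (with its signature) in the history
    history.append(model_turn)

    # 2) Run the requested function locally
    call = model_turn["function_call"]
    result = execute_fn(**call["args"])

    # 3) Send the result back WITH the same thought signature,
    #    so the model resumes reasoning from where it left off
    history.append({
        "function_response": {"name": call["name"], "result": result},
        "thought_signature": model_turn["thought_signature"],
    })
    return history

# Example: the model asked to call a hypothetical check_server_status
model_turn = {
    "function_call": {"name": "check_server_status", "args": {"host": "db-1"}},
    "thought_signature": "sig-abc123",
}
history = run_function_call_turn([], model_turn, lambda host: f"{host}: OK")
print(history[-1]["thought_signature"])  # sig-abc123
```

The key invariant is in step 3: the signature travels back unchanged with the function result, which is what lets the model pick up its chain of thought instead of re-deriving it.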
Why These Specific Properties Are Important for IDEs and Agents
For a regular chatbot, it's enough for the model to answer questions more or less coherently. In a development environment and in scenarios with AI agents, the tasks are different. Here, the model is required to:
Be able to hold a large codebase and related data in context.
A large codebase, logs, documentation, past test results: all of this needs to be seen at once, not in pieces. Without a long context, an agent simply has nothing to work with.
Be able to parse different data types, not just text.
Interface screenshots, PDF specifications, diagrams, API requests, monitoring graphs. If the model only sees plain text, it loses half the picture.
Work in a controlled and predictable manner.
Sometimes you need the model to "dig deeper" and analyze a complex case, and other times you need it to respond quickly without unnecessary philosophizing. That's why settings for reasoning levels and how it uses computational resources are important.
Be able to not just answer, but to act through tools.
An agent must be able to open a file, run tests, check logs, modify code, and use a browser. Without this, it remains a consultant, not an executor.
It is for such scenarios that the combination of a powerful model like Gemini 3 Pro and an agent-first IDE was conceived: not just to help write code, but to actually take on some of the routine work of developing and maintaining projects. And it is for this role of the model that the pairing with Antigravity is built.
What is Google Antigravity?

On the very first day, many media outlets wrote, "Google has released a Cursor killer." In reality, the picture is, of course, more complex. Let's first figure out what Antigravity is, and then compare it with Cursor.
Antigravity is a separate desktop IDE focused on agentic development, not a plugin for VS Code. Besides integration with Gemini 3 Pro, other models are also supported (for example, Claude Sonnet 4.5).
Antigravity is available on Windows, macOS, and Linux as a standalone application.
Antigravity has two modes of operation - Editor and Manager View:
1) Editor View - similar to a familiar editor (it feels like a fork or relative of VS Code): code editor, terminal, file tree. The difference is that agents have full access to:
read and modify files
run tests and any CLI scripts
update dependencies
open the built-in browser
save results as artifacts.
2) Manager View - a control panel for agents and tasks, allowing you to view the AI's work as a sequence of understandable actions. Here you can see:
what tasks agents are currently performing
what steps have already been taken
what artifacts have been generated (plans, diffs, logs, screenshots)
comments from team members.
In other words, the IDE is organized not by files and tabs, but by missions that the agents are performing.
Antigravity was designed from the start for a scenario where multiple agents can work on a project simultaneously: one rewrites code and fixes tests, another updates the infrastructure, a third writes documentation, and a fourth is busy with refactoring.
For transparency of actions, Antigravity creates Artifacts - plans, task lists, diffs, code changes, test outputs, screenshots, and browser recordings. This is necessary to verify what has been done and what will be done. You can leave comments directly in the artifacts.
Artifacts are not just a log, but a verifiable trail: they help you understand what exactly the agent did, what it based its actions on, and where it potentially made a mistake.
IDE navigation, the agent manager, and browser behavior are described in the Official Documentation.
Antigravity vs. Cursor: A Fair Comparison
Directly comparing Antigravity and Cursor isn't entirely fair, as they address different questions. Therefore, it's more accurate to compare which tasks each tool handles better, rather than who will "kill" whom.
Cursor is an IDE with a powerful AI assistant inside.
Antigravity is an environment where AI agents perform tasks, with the editor and infrastructure connected to them as tools.
Where Antigravity is stronger:
Agent-first design. Antigravity was created as an environment where agents are full-fledged participants in development, not just code suggestion providers in a side panel. The entire architecture is built around their plans, states, and artifacts. Hence the Manager View, artifacts, step tracing, and focus on task orchestration.
Native integration with the Google Cloud stack.
According to Google's materials, Antigravity is particularly logical where Cloud Run, GKE, Cloud Build, and other GCP services are already in use: agents can work with the same pipelines, environments, and infrastructure that people do.
Transparency for teams. Every step an agent takes is recorded: there are logs, diffs, and artifacts. This simplifies code review, change management, auditing, and internal checks.
Multimodal capabilities for work. When a single task requires looking at an interface, checking an API, and modifying code, it's more convenient when screenshots, a browser, and a code editor all live in the same environment, rather than in several disparate tools.
Now let's figure out where Cursor is a better fit.
Where Cursor Remains More Convenient Today
Familiar VS Code interface. Cursor is essentially VS Code with deep AI integration. For teams accustomed to VS Code and its plugin ecosystem, it's easier to continue working there.
For a single developer on a laptop.
Cursor is perfect for the "developer + local project" scenario: explaining code, refactoring, generating tests, and making quick changes in the current repository.
Product recognition and credibility.
Cursor has been in use for a long time, and there's a wealth of experience, guides, and best practices surrounding it. Antigravity is still in public preview: the product is actively developing, but it's not yet a fully established platform.
Focus on the individual developer.
Cursor primarily saves time for the individual developer. Antigravity is geared towards a broader level - teams, processes, and agentic orchestration.
What Tasks Antigravity Already Handles Best
Currently, we can identify three groups of tasks where Antigravity looks particularly compelling:
1. Large Monorepos and Infrastructure Tasks
Typical scenarios:
mass migrations (framework, API, code style);
enforcing a unified code style and configuration;
updating dependencies and templates across dozens of services;
preparing for infrastructure changes (cluster, CI/CD, services).
Here, both a long context and the transparency of agent action chains are important: you need to understand not only what changed in the code, but also why the agent decided to do it that way.
2. Small Teams (2–5 people) That Need to "Punch Above Their Weight"
For small teams, Antigravity provides the ability to delegate some routine work to agents:
one agent finishes writing tests and monitors their status;
a second one makes infrastructure changes;
a third maintains documentation and the change log.
Meanwhile, developers can concentrate on architecture, task definition, and reviews.
In other words, Antigravity allows such teams to take on tasks that were previously only accessible to large departments.
3. Vibe-Coding and Rapid Prototyping
Antigravity is well-suited for situations where the focus is not on the process of editing lines of code, but on the result in the form of a working prototype:
"Build an interface for this scenario and check that it opens."
"Spin up the API, make a few test requests, and show the result."
"Add a feature, run the tests, and prepare a description of the changes."
This is a convenient format for internal tools, pet projects, and experiments.
How to Try Gemini 3 Pro and Antigravity
Gemini 3 Pro:
Create a project in Google Cloud or go to Google AI Studio.
Connect to the Gemini API and get a key. I recommend the Gemini API Quickstart, as well as How API Keys Work (where to store them, environment variables, etc.)
In your code, use the model gemini-3-pro / gemini-3-pro-preview and experiment with the settings for reasoning depth and media handling. Here, I recommend checking out the separate Developer Guide for Gemini 3.
At this level, you can:
give the model large code snippets and documentation;
connect a tool for code execution and file search;
see how its behavior changes with different settings.
Antigravity:
Download Antigravity for your OS from the official Google website.
Authorize and connect a repository (local or remote).
In Editor View:
ask an agent to review a small service;
give it a task to write tests or do a small refactoring.
In Manager View:
set up a task for multiple agents;
track what artifacts and diffs they create;
review the changes and accept/reject them.
Even in a test project, this is enough to feel the difference between a regular AI assistant in an editor and an environment where agents carry out the work end to end.
Conclusion
Let's organize our thoughts and summarize:
Gemini 3 Pro is a model designed for complex scenarios: long context, multimodal data, controlled reasoning, and tool use. This makes it a natural foundation for agentic IDEs.
Google Antigravity is a standalone IDE with an agent-first design from the ground up. Its foundation is not autocompletion, but the management of tasks, agents, and artifacts within a real project.
Cursor and similar AI add-ons aren't going anywhere. For individual development and teams that prefer VS Code, they remain convenient tools.
The main change isn't about "who will replace whom," but about a new dimension in the market. Both classic IDEs with AI plugins and agent-first IDEs like Antigravity, as well as cloud platforms for running agentic scenarios at the infrastructure level, will coexist peacefully.
If you're thinking about how to integrate AI into your development processes, you should look not only at the model but also at the combination of "model + IDE + infrastructure". Google, with Gemini 3 Pro and Antigravity, shows one possible version of such an architecture. You can support me on my channel NeuroProfit - where I write about what I understand or am trying to understand myself, test useful AI services, and generally try to be helpful.