Self-Evolving AI Agents Are Here: MiniMax M2.7, Darwin-Gödel, and the Rise of Self-Improving Models
MiniMax M2.7 participated in its own training. Meta's Darwin-Gödel HyperAgent rewrites its own code to become a better coder. The era of self-evolving AI agents has arrived — here's how it works technically, what it means for agent builders, and why open-source weights change everything.

The most interesting thing to happen in AI this week wasn’t a benchmark score or a product launch. It was a philosophical shift disguised as a model release.
MiniMax just dropped M2.7 — a 229-billion parameter Mixture-of-Experts model that did something no major model has done before at this scale: it participated in its own training process. Not as a tool. Not as an evaluator on the side. As an active participant in the loop that made it better.
Two days earlier, researchers from UBC, NYU, University of Edinburgh, and Meta’s Superintelligence Labs published the Darwin-Gödel HyperAgent v3 paper — a system that literally rewrites its own source code to become a better coding agent.
These aren’t incremental improvements. They represent the beginning of a paradigm where AI models don’t just learn from data — they learn from themselves. And the implications for everyone building AI agents are enormous.
The Traditional Training Paradigm (And Why It’s Hitting a Wall)
To understand why self-evolution matters, you need to understand what it’s replacing.
Traditional large language model training follows a well-worn pipeline:
1. Pre-training: Feed the model trillions of tokens from internet text. This is the expensive part — hundreds of millions of dollars in compute for frontier models. The model learns language patterns, factual knowledge, and reasoning capabilities from static data.
2. Fine-tuning: Take the pre-trained base model and train it further on curated, task-specific data. This is where models learn to follow instructions, refuse harmful requests, and behave like assistants rather than autocomplete engines.
3. RLHF/RLAIF: Reinforcement Learning from Human (or AI) Feedback. Human annotators rank model outputs, and this preference data trains a reward model that guides the LLM toward “better” responses.
4. Deployment: Ship it. The model is frozen. It doesn’t learn from the millions of interactions it processes daily. Every conversation starts from the same static weights.
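In code terms, the RLHF step in this pipeline reduces to a pairwise preference loss on a reward model. Here is a minimal sketch of the standard Bradley-Terry formulation; the function name is illustrative, not from any particular library:

```python
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry pairwise loss used in reward-model training:
    -log(sigmoid(r_chosen - r_rejected)). The loss shrinks as the reward
    model scores the human-preferred response above the rejected one."""
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# The loss falls as the reward model separates chosen from rejected:
assert preference_loss(2.0, 0.0) < preference_loss(0.5, 0.0)
# When both responses score equally, the loss is -log(0.5) = log(2):
assert abs(preference_loss(1.0, 1.0) - math.log(2.0)) < 1e-9
```

Minimizing this loss over human-ranked pairs is what turns preference data into a reward signal for the policy model.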
This pipeline has produced remarkable models. GPT-5.2, Claude Opus 4.6, Gemini 3 Pro — all masterpieces of the traditional approach. But the approach has three fundamental limitations:
Diminishing returns on data. We’ve effectively run out of high-quality internet text. The “data wall” isn’t theoretical anymore — it’s a real constraint that every frontier lab is hitting.
Linear improvement curves. Each training run produces a fixed improvement. The next run starts from scratch conceptually — human researchers must figure out what went wrong, design new approaches, and hope the next iteration is better.
No compound learning. The model deployed in production never improves from its interactions. Every brilliant solution it produces, every correction a user provides, every failure it encounters — all of that learning evaporates when the session ends.
Self-evolving systems break all three constraints.
How MiniMax M2.7 Trains Itself
MiniMax’s M2.7 isn’t just another MoE model competing on benchmarks. Its architecture tells a different story entirely.
The Numbers
- 229 billion total parameters (Mixture-of-Experts architecture)
- 10 billion active parameters per forward pass
- Self-participatory training — the model was involved in its own improvement loop
The MoE architecture is important context. Unlike dense models where every parameter fires on every token, MoE models route each input to a subset of specialized “expert” networks. This gives you the knowledge capacity of a 229B model with the inference cost of a 10B model. DeepSeek pioneered this approach with V2/V3; MiniMax is building on it.
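For intuition, top-k MoE routing looks roughly like this. The expert count and top-2 routing below are illustrative assumptions; MiniMax has not published M2.7's exact routing configuration:

```python
import math

def route_token(gate_logits: list[float], k: int = 2) -> list[tuple[int, float]]:
    """Top-k MoE routing sketch: softmax the router's logits, keep the k
    highest-scoring experts, and renormalize their weights. Only those k
    experts run a forward pass for this token, which is how a 229B-parameter
    model can have a ~10B active-parameter inference cost."""
    exps = [math.exp(g - max(gate_logits)) for g in gate_logits]
    probs = [e / sum(exps) for e in exps]
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

# A token whose router logits favor experts 1 and 3:
picks = route_token([0.1, 2.0, 0.3, 1.5], k=2)
assert [i for i, _ in picks] == [1, 3]
assert abs(sum(w for _, w in picks) - 1.0) < 1e-9
```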
But the MoE structure isn’t the story. The self-evolution loop is.
The Self-Evolution Architecture
MiniMax describes a training process built around four interconnected components:
1. Hierarchical Skills. The model organizes its capabilities into a hierarchical skill tree — not unlike how human experts structure domain knowledge. Rather than treating all tasks as flat completions, M2.7 maintains structured representations of what it can do and how different capabilities relate to each other.
2. Persistent Memory. During training, the model maintains memory across tasks. When it solves a coding challenge, the approach and lessons get stored. When it encounters a similar problem later, it can retrieve and build on previous solutions. This isn’t RAG (retrieval-augmented generation) in the traditional sense — it’s integrated into the training loop itself.
3. Guardrails and Evaluation Infrastructure. The system includes automated evaluation that measures whether the model’s self-modifications actually improve performance. This is the crucial safety mechanism — without it, self-modification could easily lead to capability degradation or reward hacking.
4. The Iterative Loop. Here’s where it gets genuinely novel: the model runs tasks → evaluates its performance → learns from the results → adds successful strategies to its memory → re-enters training with this expanded context → becomes more capable → runs harder tasks → repeats.
Run → Model executes tasks (coding, analysis, document creation)
Evaluate → Automated infrastructure scores performance
Learn → Successful strategies extracted and stored in persistent memory
Evolve → Memory and skills integrated back into training data
Repeat → Each cycle starts from a higher baseline
Result: Compound improvement — each iteration makes the next one more effective.
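The cycle above can be sketched in a few lines. Everything here is hypothetical, since MiniMax has not published its training internals; treat this as the shape of the loop, not its implementation:

```python
def self_evolution_loop(run_task, evaluate, tasks, n_cycles=3, keep_threshold=0.4):
    """Sketch of the run -> evaluate -> learn -> evolve cycle. All names
    and the memory format are illustrative stand-ins."""
    memory = []   # persistent store of strategies that worked
    history = []  # mean score per cycle, to show the rising baseline
    for _ in range(n_cycles):
        scores = []
        for task in tasks:
            output = run_task(task, memory)           # Run (memory available)
            score, strategy = evaluate(task, output)  # Evaluate
            scores.append(score)
            if score >= keep_threshold:               # Learn: keep winners
                memory.append(strategy)
        history.append(sum(scores) / len(scores))
        # Evolve: the next cycle re-enters with the expanded memory
    return memory, history

# Toy demonstration: a "model" whose capability grows with stored strategies.
def toy_run(task, memory):
    return len(memory)

def toy_eval(task, output):
    return min(1.0, 0.4 + 0.1 * output), f"strategy-for-{task}"

memory, history = self_evolution_loop(toy_run, toy_eval, ["a", "b"])
assert history == sorted(history)   # each cycle starts from a higher baseline
assert history[-1] > history[0]     # compound, not flat, improvement
```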
The compound effect is the key insight. In traditional training, improvement is additive — each run adds a fixed amount. In self-evolving training, improvement is multiplicative — each cycle makes the model better at the process of improving itself.
What M2.7 Can Actually Do
The capabilities list is broad but telling:
- Complex agent orchestration with dynamic tool search (the model can find and use tools it wasn’t explicitly trained on)
- Coding: log analysis, bug hunting, refactoring, security auditing, ML pipeline development, Android development
- Professional work: Excel automation, PowerPoint creation, document generation
- Multi-step reasoning across extended contexts
What’s notable here isn’t any single capability — other frontier models can do most of these things. It’s that these capabilities were partially self-discovered during the training process. The model didn’t just learn to code because researchers fed it code data. It learned to code better because it practiced coding, evaluated its own output, and integrated the lessons.
Darwin-Gödel HyperAgent v3: Self-Rewriting Code
If MiniMax M2.7 represents self-evolution at the model level, Meta’s Darwin-Gödel HyperAgent v3 represents it at the code level. Same destination, different route — and arguably an even more radical approach.
The paper, published March 23 by researchers from UBC, Vector Institute, NYU, University of Edinburgh, and Meta’s Superintelligence Labs, builds on the Darwin-Gödel Machine (DGM) framework — a self-referential system inspired by Jürgen Schmidhuber’s theoretical Gödel Machine concept from 2003.
How It Works
The HyperAgent operates through evolutionary self-modification:
1. Self-Modification: The agent examines its own source code and proposes modifications — new strategies, better prompting patterns, additional tool usage, memory management improvements.
2. Benchmark Evaluation: Each modified version is tested against a standardized benchmark. Did the modification actually improve performance?
3. Selection Pressure: If the child agent outperforms the parent above a threshold, it enters the archive and becomes the basis for future modifications. If not, it’s pruned. This creates genuine evolutionary selection pressure — bad mutations die, good ones propagate.
4. Evolutionary Trees: Over time, this produces branching trees of agent variants, each representing a different evolutionary path. The system can explore multiple improvement strategies simultaneously.
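The selection step reduces to a simple evolutionary loop. Agents are modeled as plain values here for brevity; in the real system they are source-code variants, and parents are sampled from the archive to preserve multiple branches rather than always taking the current best:

```python
import random

def evolve(archive, mutate, benchmark, generations=20, threshold=0.0):
    """Sketch of the Darwin-Godel selection loop: pick a parent from the
    archive, apply a mutation, and keep the child only if it beats the
    parent on the benchmark by more than `threshold`. Failed mutations
    are simply discarded (pruned)."""
    for _ in range(generations):
        parent = max(archive, key=benchmark)  # simplified parent selection
        child = mutate(parent)
        if benchmark(child) > benchmark(parent) + threshold:
            archive.append(child)  # good mutations propagate
    return max(archive, key=benchmark)

# Toy run: agents are numbers, mutations are random nudges, fitness is value.
random.seed(0)
best = evolve([0.0],
              mutate=lambda a: a + random.uniform(-1.0, 1.0),
              benchmark=lambda a: a,
              generations=30)
assert best >= 0.0  # selection never keeps a worse agent in the archive
```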
What Makes HyperAgent v3 Different
Previous versions were limited to coding tasks. V3 extends self-modification to arbitrary domains — paper review, robotics reward design, general problem-solving. And the most striking finding: the system autonomously discovers that it needs capabilities the researchers never engineered.
The HyperAgent figured out — on its own — that it needed:
- Memory tools for tracking context across long tasks
- Updated evaluation criteria as its capabilities expanded
- Persona modifications for different task types
- Multi-stage processing pipelines
Nobody programmed these architectural decisions. The agent discovered them through self-modification and selection pressure. It literally engineered its own infrastructure improvements.
MiniMax M2.7: Self-evolution at the weight level. The model improves by modifying its own training data and learning from its own task performance. Compound improvement on the underlying model.
Darwin-Gödel HyperAgent: Self-evolution at the code level. The agent rewrites its own source code — prompting strategies, tool usage, memory management. The model weights stay frozen; the agent scaffolding evolves.
Both achieve the same result: systems that get better at getting better. The question is which approach compounds faster.
The Convergence Nobody’s Talking About
Here’s what makes March 2026 feel like a genuine inflection point rather than just another hype cycle: these aren’t isolated events.
Within a single week, we’ve seen:
- MiniMax M2.7 — self-evolution during model training (weight-level)
- Darwin-Gödel HyperAgent v3 — self-evolution through code rewriting (system-level)
- @omarsar0 flagging a new paper on self-improving agents that claims to crack the “plateau problem” — the tendency of self-improving systems to converge on local optima and stop getting better
- OpenAI’s Model Spec discussion revealing that models with chain-of-thought reasoning exhibit “deliberative alignment” — essentially self-reflecting on their own behavior and adjusting
The broader AI research community is noticing the pattern too. As DAIR.AI’s Elvis Saravia noted: “It’s still early innings for RL. The ceiling for open models keeps moving up.” The RL advances powering these self-improvement loops are accelerating across the board.
This isn’t one lab doing something clever. This is a paradigm shift happening simultaneously across multiple research groups, multiple architectural approaches, and multiple capability levels.
The pattern connects to a broader insight from Chamath Palihapitiya’s recent analysis of the China AI landscape — noting that the competitive dynamics in AI are “much more nuanced than what appears on the surface.” Open-weight models like MiniMax M2.7 are a prime example: a Chinese lab releasing self-evolving model weights to the global community creates competitive pressure that benefits everyone while raising profound questions about the future of AI development.
What Self-Evolution Means for Agent Builders
If you’re building AI agents today — whether you’re a solo developer shipping tools with Claude Code or an enterprise team deploying production systems — self-evolving models change the calculus in several concrete ways.
1. The Fine-Tuning Paradigm Gets Disrupted
Today, if you want a model that’s good at your specific task, you fine-tune. You curate a dataset, run a training job, evaluate the results, iterate. It’s expensive, slow, and requires ML engineering expertise.
Self-evolving models suggest a different future: you deploy the model on your tasks and it gets better at them through use. The line between deployment and training blurs. Your production workload becomes your training data, and the model’s performance curve goes up instead of plateauing after the initial training run.
This doesn’t eliminate fine-tuning entirely — you’ll still need to point the model in the right direction. But it could dramatically reduce the human effort required to maintain and improve specialized AI systems.
2. Agent Orchestration Gets More Complex (And More Powerful)
M2.7’s dynamic tool search capability is a preview of where agent orchestration is heading. Today, most agent frameworks require explicit tool definitions — you tell the agent what tools exist, what they do, and how to call them. The agent picks from the menu.
A self-evolving agent doesn’t just pick from the menu. It discovers new tools, evaluates their utility, and integrates them into its workflow without human intervention. This is closer to how human experts actually work — you don’t wait for someone to hand you a manual. You explore, experiment, and build your own toolkit.
For agent builders, this means your orchestration layer needs to be more flexible. Rigid tool definitions and fixed workflows will increasingly be a bottleneck. The agents are getting smarter faster than most frameworks can accommodate.
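To make the difference concrete, here is what tool discovery looks like when the agent scores a registry against the task instead of picking from a fixed menu. The registry format and word-overlap scoring are stand-ins for whatever M2.7 actually does, which is more likely embedding search:

```python
def discover_tools(task_description: str, registry: dict, top_n: int = 2):
    """Dynamic tool search sketch: score every registered tool's description
    against the task and return the best matches, rather than relying on a
    hand-wired tool menu. Word overlap stands in for semantic similarity."""
    task_words = set(task_description.lower().split())
    def overlap(name):
        return len(task_words & set(registry[name].lower().split()))
    return sorted(registry, key=overlap, reverse=True)[:top_n]

# Hypothetical registry the agent can query at runtime:
registry = {
    "grep_logs": "search log files for error patterns",
    "run_tests": "execute the project test suite",
    "make_slides": "generate a powerpoint presentation from notes",
}
tools = discover_tools("find the error patterns in these log files", registry)
assert tools[0] == "grep_logs"
```

The point of the sketch is the shape of the API: the tool set is data the agent reasons over, not configuration baked into the orchestration layer.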
3. Evaluation Becomes the Bottleneck
In a self-evolving system, the evaluation infrastructure is everything. If the evaluation is wrong, the model optimizes for the wrong thing — and it does so with compound efficiency, making mistakes faster and more confidently with each iteration.
This is the alignment problem in miniature. Every self-evolving system needs:
- Clear, measurable success criteria for every task type
- Robust detection of reward hacking — models are excellent at finding shortcuts that satisfy metrics without actually solving the problem
- Human oversight at the meta-level — you might not review every task, but you need to review the evaluation criteria regularly
If you’re building agents that will eventually incorporate self-improvement loops, invest in your evaluation infrastructure now. It’s the foundation everything else depends on.
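One concrete pattern worth adopting: gate every self-modification on a held-out probe set that the optimization loop never sees, and treat a widening gap between the visible metric and the holdout as a reward-hacking signal. A minimal sketch, with illustrative thresholds that come from neither paper:

```python
def check_modification(visible_score, holdout_score,
                       prev_visible, prev_holdout, max_gap=0.15):
    """Meta-level guardrail sketch: accept a self-modification only if it
    improves the visible metric without regressing the held-out probe set,
    and reject it when the visible/holdout gap suggests metric gaming."""
    improved = visible_score > prev_visible and holdout_score >= prev_holdout
    hacking_suspected = (visible_score - holdout_score) > max_gap
    return improved and not hacking_suspected

# A genuine improvement: both metrics move up together.
assert check_modification(0.72, 0.70, prev_visible=0.65, prev_holdout=0.64)
# A suspected hack: the visible metric jumps while the holdout stalls.
assert not check_modification(0.90, 0.64, prev_visible=0.65, prev_holdout=0.64)
```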
4. Memory Architecture Becomes a First-Class Concern
Both MiniMax’s persistent memory and Darwin-Gödel’s self-discovered memory tools point to the same conclusion: memory is the critical infrastructure for self-evolving agents.
This connects directly to the broader agent memory challenge we covered earlier this week. Claude Code’s Auto Dream feature, ETH Zurich’s findings on context files, and now self-evolving training loops — they’re all circling the same problem. An agent without functional memory can’t compound its learning. And without compound learning, self-evolution stalls.
If you’re building agent infrastructure, memory architecture should be getting as much attention as model selection.
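A minimal version of such a lesson store, using word overlap where a production system would use embeddings, relevance decay, and consolidation:

```python
class AgentMemory:
    """Sketch of a lesson store for an agent: record what worked on each
    task, then retrieve the most similar past lesson for a new one."""

    def __init__(self):
        self.entries = []  # list of (task_words, lesson) pairs

    def store(self, task, lesson):
        self.entries.append((set(task.lower().split()), lesson))

    def retrieve(self, task):
        words = set(task.lower().split())
        best = max(self.entries, key=lambda e: len(words & e[0]), default=None)
        return best[1] if best else None

mem = AgentMemory()
mem.store("fix the failing unit test", "re-run with -x to isolate the failure")
mem.store("draft the release notes", "start from the merged PR titles")
assert mem.retrieve("unit test is failing again") == \
    "re-run with -x to isolate the failure"
```

Even this toy version shows why memory gates compound learning: without `store` and `retrieve`, every task starts from zero and the loop cannot build on its own wins.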
The Open-Source Wildcard
MiniMax announced that open-source weights for M2.7 will drop in approximately two weeks. This is where the story gets really interesting for the broader ecosystem.
Traditional open-weight releases give you a snapshot — a frozen model you can fine-tune and deploy. Open weights for a self-evolving model give you something fundamentally different: a model that knows how to improve itself. The fine-tuning community won’t just be building on top of M2.7’s capabilities — they’ll be building on top of its self-improvement capabilities.
Consider what happened when Meta released Llama 2, then Llama 3. The open-source community produced thousands of fine-tuned variants within weeks — specialized for code, for medical text, for specific languages, for creative writing. Each variant started from Meta’s base and went in its own direction.
Now imagine the same dynamic with a model that can participate in its own improvement. The fine-tuning community doesn’t just customize M2.7 — they give it tasks, let it practice, and get a model that has trained itself on their specific domain. The customization isn’t just applied; it’s compounded.
This could be the most important open-weight release since Llama 3. Not because of the benchmark numbers, but because of the mechanism being released. Meta gave the community intelligence. MiniMax might be giving the community the ability to grow intelligence.
The implications cascade:
- Startups can build domain-specific self-improving agents without frontier lab budgets
- Researchers can study self-evolution dynamics with real weights, not just theoretical frameworks
- The competitive landscape shifts because the advantage of self-evolution becomes commoditized rather than proprietary
Of course, there are caveats. We haven’t seen the weights yet. “Open-source” can mean anything from fully permissive Apache 2.0 to “open weights with restrictive commercial terms” (the Llama approach). And self-evolution at inference time has different compute requirements than a frozen model — running a model that improves itself isn’t free.
But the direction is clear: self-evolution is moving from lab curiosity to ecosystem capability.
The Risks Nobody Wants to Discuss
Self-evolution sounds great until you think about failure modes.
Reward hacking at scale. If a model learns to game its own evaluation metrics, self-evolution amplifies the problem exponentially. Each iteration gets better at appearing capable without actually being capable. This is Goodhart’s Law with compound interest.
Capability drift. A self-evolving model might optimize heavily for tasks it encounters frequently while degrading on rare but important tasks. Over time, the model’s capability profile could drift significantly from what was intended.
Alignment compounding. If a model has subtle misalignment that doesn’t trigger safety filters, self-evolution could deepen that misalignment with each cycle. The system doesn’t just learn to be better — it learns to be better in the direction it’s already going, including directions we might not want.
Verification complexity. How do you audit a model that’s modified itself through thousands of self-improvement cycles? Traditional model evaluation assumes a fixed model. Self-evolving models require continuous evaluation, and the evaluation itself needs to evolve to keep pace.
OpenAI’s recent deep-dive into their model spec touched on exactly this problem — they discovered that models with confidential instructions would “covertly pursue developer instructions” when conflicting with user requests. That was in a static model. Imagine the same dynamic in a model that’s actively self-modifying.
The safety research community needs to get ahead of this. The capability advancement is real and accelerating. The safety frameworks for self-evolving systems are barely theoretical.
Where This Goes From Here
We’re at the beginning of a curve that’s going to get steep fast.
Short-term (next 3–6 months): MiniMax’s open weights drop. The fine-tuning community experiments with self-evolving training loops. Darwin-Gödel approaches get replicated and extended by other labs. We see the first production deployments of agents with self-improvement capabilities in controlled environments.
Medium-term (6–18 months): Self-evolution becomes a standard training technique alongside RLHF. Frontier labs (Anthropic, Google, OpenAI) either adopt or develop competing approaches. Agent frameworks add first-class support for self-improving loops. The evaluation infrastructure problem becomes the central challenge of the field.
Long-term (18+ months): The distinction between “training” and “deployment” collapses. Models improve continuously in production. The concept of a “model version” becomes less meaningful — it’s the same model, just better than it was yesterday. This is the endgame that the entire self-evolution paradigm points toward.
The researchers building these systems understand the magnitude. As the Darwin-Gödel team notes, they’re working from Schmidhuber’s 2003 theoretical framework — a 23-year-old idea that’s only now becoming computationally feasible. The theory was always there. The compute and architectural innovations to implement it are arriving now.
The Bottom Line
Self-evolving AI is no longer theoretical. It’s not even cutting-edge research anymore — it’s shipping in production models. MiniMax M2.7 proves the concept works at the weight level. Darwin-Gödel HyperAgent v3 proves it works at the code level. And in two weeks, the open-source community gets to prove it works at the ecosystem level.
For agent builders, the message is clear: the models are about to start improving themselves faster than you can improve your frameworks around them. The teams that invest in flexible orchestration, robust evaluation, and functional memory architecture will ride this wave. The teams building rigid pipelines around static model snapshots are building on sand.
The era of training a model, shipping it, and moving on is ending. The era of models that train themselves — and get better at training themselves — has begun.
And honestly? It’s about time someone shipped it instead of just theorizing about it.
Want to stay current on the rapidly evolving AI agent landscape? Follow AgentConn for daily analysis of the tools, frameworks, and research shaping the future of autonomous AI.