Manifesto

19 November 2025

The Hard Part of Robotics is Robotics

Written by

the Flexion Team

A. Introduction

Every week, new products emerge that automate the creative, cognitive work once thought untouchable. In seconds, AI can generate an essay, compose a symphony, or create a film. But when it comes to physical work, progress has come far slower. Warehouses employ over a million robots, yet human workers still walk more than twelve miles per shift, lifting thousands of pounds a day. We’ve built digital abundance but left physical labor behind.

Meanwhile, the demographic tide is turning. Populations are aging, the workforce is shrinking, and productivity growth has stalled. Against this backdrop, attention has turned to general-purpose automation: machines that step beyond a single workflow and operate across the breadth of the human world. The idea is not new. Honda unveiled ASIMO in 2000. Boston Dynamics has been stunning audiences with acrobatic humanoids for over a decade. And yet, outside the demo reel, where are they? Why does the physical world still look so much like the 20th century, when the digital world has leapt into science fiction?

The truth is that unlike software, robots can’t live in clean abstractions. They must act in the messy physical world, where every task is different, every environment brings surprises, and failure isn’t an option. The constraint isn’t motion but generalization. Progress won’t come from tweaking scripts or perfecting current workflows, but from rethinking how robots think.

B. Why Humanoids

Automating the existing world

The civilized world was built by us, for us. Every door handle, staircase, and tool assumes human reach, human force, human dexterity. For automation to scale, it must meet the world where it is, not demand the world be rebuilt around it.

Across industries, robotics projects often collapse under hidden costs. Major industrial transformations can cost billions per site, not because robots are expensive, but because factories require redesign, retooling, and reprogramming. Purpose-built robots often appear cost-effective until you add the engineering bill. Humanoids invert the equation. They shift capital from facilities to capabilities. The building stays the same, while upgrades ship as models, not metal.

Imagine opening a box, powering on a humanoid, and putting it straight to work in any environment with no redesign required. That is the promise: automation that arrives ready-made for the world we already built.

What “humanoid” really means

By “humanoid” we don’t mean cosmetic mimicry. We mean a capability envelope: locomotion, dexterity, perception, and reasoning in one system resilient enough for the complexity of the physical world. Our fingers can thread a needle. Our feet can balance on a ladder. Our bodies plan the next motion without conscious thought. Replicating the capability of humans, not their appearance, is what makes humanoids so ambitious.

This combination is not arbitrary. It is the minimum necessary to cover the remaining frontier of work, which is mobile, bimanual, and variable. It begins with human compatibility but does not stop there. A humanoid might have an extra limb for stabilization, hybrid locomotion for long flat runs, or sensors that see beyond human eyes. The point is not to imitate biology, but to inherit compatibility by default and extend capability where useful.

If you can make humanoids work, you can simplify to other form factors. The reverse is not true. Starting with humanoids forces us to solve the hardest problems first, the foundations every other platform will rely on.

C. Why a Horizontal Software Platform

From body to brain

For decades, a horizontal software layer in robotics was impossible. The foundations simply weren't there, and in many ways still aren't. Hardware is fragmented, and without a common abstraction layer each robot is its own ecosystem, stitched together with custom integrations that can't scale. Task diversity makes the problem worse. Every deployment faces a different environment, forcing engineers to rebuild the stack each time. Hardware binds itself to software and software to workflow. With low production volumes and high task variability, vertical integration became the default.

But vertical integration was a response to limitation, not a law of nature. As standards emerge around onboard compute and simulation, and as deep learning makes generalizable software a reality, the equation begins to change. What once required rewriting now becomes retraining. New skills can be fine-tuned instead of hand-coded. Skills learned in one context — grasping, balancing, navigating — become part of a common library shared across industries.

That’s why humanoids are not just a hardware bet. Their value lies in what they unlock: a body for a common intelligence layer that can generalize across morphologies, tools, and environments.

Integration, not reinvention

As humanoids scale, the economics of robotics begin to change. Volumes rise and integration costs fall, but training embodied intelligence remains costly in data, compute, and time. Building a foundation model from scratch for every robot is as uneconomic as every startup training its own GPT. The tax isn't only money, it's time-to-usefulness. Teams burn months wiring the same perception modules, motion planners, skill libraries, simulators, and telemetry before they can attempt the actual job.

A shared platform flips that equation. Each deployment strengthens the next, and the cost of the Nth application approaches integration, not invention. For the first time, heterogeneity is no longer a flaw to suppress but the reason to build a platform.

D. Why Reinforcement Learning & Simulation

Farming humans for data

While large language models had decades of digitized text, images, and video to learn from, humanoids are the first of their kind, and data of humanoids in motion simply doesn't exist. To fill the gap, engineers have turned to human-led teaching, where robots learn to mimic the motions of human demonstrators.

These methods have value in the short tail of cases, where imitating a trajectory is the point. However, every new task, robot, or setting demands another round of human effort. Shift the shelf, change the box, swap the gripper, and the training must begin again.

Progress remains gated by how much data people can collect, one motion at a time. The result is brittle skills locked inside narrow contexts, learned at enormous cost. We haven’t escaped the labor trap, we’ve just moved it from the factory floor to the robotics lab.

Reinforcement learning as the engine

To build a true platform, robots can’t be hard-coded for every body, tool, or site. They need a method that doesn’t replay instructions but adapts in real time. Reinforcement learning (RL) provides exactly that: a framework where robots teach themselves by acting, sensing outcomes, and adjusting until they succeed.

The key output of RL is not a script but a policy: a mapping from what the robot perceives to what it should do next. A hand doesn’t just execute a fixed grasp, it redistributes force when an object shifts. A leg doesn’t just follow a path, it adapts when friction changes underfoot.
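To make the idea of a policy concrete, here is a minimal, toy-scale sketch (not Flexion's actual stack) of reinforcement learning in Python. A one-parameter Gaussian policy maps an observation (an object's weight) to an action (a grip force), and a REINFORCE-style update adjusts the parameter from reward alone. The task, the `TRUE_GAIN` constant, and all numbers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 0.0      # policy parameter: mean force = theta * weight
sigma = 0.2      # exploration noise (std of the Gaussian policy)
lr = 0.02        # learning rate
baseline = 0.0   # running average of reward, to reduce variance
TRUE_GAIN = 2.0  # hidden in the environment: needed force = 2 * weight

for _ in range(5000):
    weight = rng.uniform(0.5, 1.5)                 # observation
    mean = theta * weight                          # policy's mean action
    force = mean + sigma * rng.standard_normal()   # sampled (explored) action
    reward = -(force - TRUE_GAIN * weight) ** 2    # outcome of acting
    baseline += 0.05 * (reward - baseline)         # track average reward
    # REINFORCE: make actions that beat the baseline more likely.
    grad_logp = (force - mean) / sigma**2 * weight
    theta += lr * (reward - baseline) * grad_logp

print(f"learned gain: {theta:.2f}")  # should approach TRUE_GAIN
```

Note that nothing in the loop scripts the motion: the mapping from observation to action emerges purely from acting, sensing outcomes, and adjusting, which is the property that lets a policy redistribute force when an object shifts.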

If the case for a platform is diversity, RL must be the engine. But there is a catch. RL doesn’t learn from a handful of examples; it demands vast amounts of data, more than any real robot could gather without breaking itself or bankrupting its operator.

The simulation unlock

The way out is training in simulation. In rich virtual worlds, robots can act, fail, and learn at scale, compressing millions of lifetimes into weeks. A stumble on a simulated stair does not break hardware; it becomes data shared across every robot using the same brain. Grippers can practice on thousands of objects, while locomotion policies traverse endless terrains, exploring edge cases no lab could stage. Where the internet gave language models a vast corpus of text, simulation gives embodied AI a corpus of motion, and a rehearsal space for the real world.

From sim to real

However, naïve sim-to-real transfer does not work. The common workaround is domain randomization, where simulators perturb every parameter in the hope that robustness will emerge. But this expands the space of possible worlds far beyond those observable in reality, forcing policies to act too cautiously everywhere, resulting in inefficient behaviors.

A more effective path is to reduce uncertainty, through real-to-sim pipelines that calibrate dynamics, contacts, actuation, and sensing so that simulation reflects reality where accuracy matters most. Domain randomization remains, but is applied selectively, only to parameters that genuinely vary in real-world deployments, such as ground friction or load distribution. The policy thus learns to handle real variations, not imaginary extremes.
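The split between calibrated and randomized parameters can be sketched as follows. This is an illustrative fragment, not a real simulator interface: the parameter names, values, and ranges are hypothetical stand-ins for quantities a real-to-sim pipeline would measure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Calibrated once via a real-to-sim pipeline, then held fixed:
# simulation stays accurate where accuracy matters most.
CALIBRATED = {"motor_torque_constant": 0.068, "link_mass_kg": 1.42}

# Genuinely variable across deployments: randomized per episode,
# but only within ranges actually observed in the real world.
VARIABLE_RANGES = {
    "ground_friction": (0.6, 1.1),  # measured floor conditions
    "payload_kg": (0.0, 5.0),       # expected load distribution
}

def sample_episode_params():
    """Physics parameters for one training episode."""
    params = dict(CALIBRATED)
    for name, (lo, hi) in VARIABLE_RANGES.items():
        params[name] = rng.uniform(lo, hi)
    return params
```

Narrowing randomization to measured ranges is what keeps the trained policy efficient: it never has to hedge against worlds that cannot occur.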

Yet even the most finely calibrated simulators cannot capture the full diversity of our world. To bridge this gap, we can leverage generative models to extend the training distribution in a grounded way, producing plausible variations drawn from the real world rather than arbitrary noise. In doing so, simulation becomes a source of meaningful diversity. These models can also evaluate and shape policy behavior, revealing failure modes and guiding improvement over time.

E. Conclusion

Robotics has long been defined by dazzling prototypes, where each breakthrough has followed the same pattern: a vertical stack, engineered for one narrow task, rebuilt from scratch each time. The future lies in a different path, a horizontal software layer that lets robots, starting with humanoids, generalize. Other computing revolutions accelerated when shared platforms emerged: iOS and Android for smartphones; cloud platforms for computing. The absence of such a layer in robotics is why humanoids still appear as science projects rather than everyday tools.

Teleoperation and scripted choreography may look smoother today, but they are brittle, spectacular until they must scale. Soon enough, our approach, leveraging reinforcement learning in simulation to build true autonomy, will outpace every human-led teaching method. Ultimately, that is how we will move from impressive demos to a platform robust enough to power an entire ecosystem of humanoids and beyond.
