Run the VLA on the robot, not just in the demo.
A vision-language-action model folds laundry in the lab. Put it on the actual robot and it stalls, and not because it can't do the task. It can't do it fast enough. The capability is there. What's left is engineering: get it inside a latency budget, then keep it safe once it's running.
The speed gap is 10–30×. This repo closes it with the levers that actually pay off at batch size 1, then adds a supervisor so the fast policy is also one you can leave running.
Deploy-compiler
Set a budget. The compiler picks the best config from the frontier measured on a real L4, live.
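To make the idea concrete, here is a minimal sketch of budget-driven selection. It is not the repo's implementation: the `Config` fields, the `pick_config` helper, and every number in `FRONTIER` are hypothetical placeholders, not measurements. It assumes each frontier point carries a measured latency and a quality score, and that "best" means the highest quality that still fits the budget.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Config:
    name: str          # hypothetical label for a deployment config
    latency_ms: float  # latency per sampler call (placeholder value)
    quality: float     # task score, higher is better (placeholder value)


def pick_config(frontier: List[Config], budget_ms: float) -> Optional[Config]:
    """Return the highest-quality config whose latency fits the budget."""
    feasible = [c for c in frontier if c.latency_ms <= budget_ms]
    if not feasible:
        return None  # nothing on the frontier meets this budget
    # Prefer quality first; break ties in favour of the faster config.
    return max(feasible, key=lambda c: (c.quality, -c.latency_ms))


# Illustrative placeholder frontier, NOT measured data.
FRONTIER = [
    Config("baseline", 300.0, 0.90),
    Config("mid", 60.0, 0.88),
    Config("fast", 20.0, 0.80),
]

choice = pick_config(FRONTIER, budget_ms=100.0)
print(choice.name if choice else "no config fits")
```

Returning `None` when nothing fits keeps the failure explicit, so a caller can loosen the budget rather than silently run a config that misses it.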
Deployment budget
Action-chunking runs many actions per sampler call: cheaper per action, but the last one is more stale. This knob sets how stale you'll allow.
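The trade-off can be sketched with a little arithmetic. This is an illustration under stated assumptions, not the repo's code: it assumes one action is executed per control period, measures staleness from the observation the chunk was computed from, and ignores inference latency. All function names and the example numbers are hypothetical.

```python
def per_action_cost_ms(sampler_ms: float, chunk: int) -> float:
    """Amortised sampler cost per action when one call yields `chunk` actions."""
    return sampler_ms / chunk


def last_action_staleness_ms(chunk: int, control_period_ms: float) -> float:
    """Age of the observation when the final action in the chunk executes.

    The first action runs immediately (0 ms stale under this assumption);
    each later action runs one control period after the previous one.
    """
    return (chunk - 1) * control_period_ms


def largest_chunk_within(max_stale_ms: float,
                         control_period_ms: float,
                         max_chunk: int) -> int:
    """Largest chunk size whose last action stays within the staleness limit."""
    best = 1  # a chunk of one action is never stale under this model
    for chunk in range(1, max_chunk + 1):
        if last_action_staleness_ms(chunk, control_period_ms) <= max_stale_ms:
            best = chunk
    return best


# Example with placeholder numbers: 20 ms control period, 100 ms allowed staleness.
chunk = largest_chunk_within(max_stale_ms=100.0, control_period_ms=20.0, max_chunk=16)
print(chunk, per_action_cost_ms(120.0, chunk))
```

Raising the allowed staleness admits a larger chunk and a lower per-action cost; lowering it pushes the chunk size back toward 1.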
Safety supervisor
The runtime trust layer. It vets every action before it reaches a motor, live, on this server.
Send it an action to vet
The policy is calibrated on a normal posture. Pick what to throw at it.
Pick an action and hit "Vet this action".
The verdict, the action actually sent, and the running governance log show up here.
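As a rough picture of what vetting an action could look like, here is a minimal sketch. It is an assumption-laden illustration, not the supervisor in this repo: the `Supervisor` class, its thresholds, the three verdict names, and the fall-back-to-nominal behaviour are all hypothetical. It assumes an action is a list of joint targets and that risk is judged by deviation from the normal posture.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Supervisor:
    nominal: List[float]  # the "normal posture" (hypothetical representation)
    soft_dev: float       # deviation allowed without modification
    hard_dev: float       # deviation beyond which the action is blocked
    log: List[Tuple[str, List[float]]] = field(default_factory=list)

    def vet(self, action: List[float]) -> Tuple[str, List[float]]:
        """Return (verdict, action actually sent) and append both to the log."""
        worst = max(abs(a - n) for a, n in zip(action, self.nominal))
        if worst <= self.soft_dev:
            verdict, sent = "allow", list(action)
        elif worst <= self.hard_dev:
            # Pull each joint back inside the soft band around the nominal posture.
            verdict = "clamp"
            sent = [min(max(a, n - self.soft_dev), n + self.soft_dev)
                    for a, n in zip(action, self.nominal)]
        else:
            # Assumption: a blocked action is replaced by the nominal posture.
            verdict, sent = "block", list(self.nominal)
        self.log.append((verdict, sent))
        return verdict, sent


sup = Supervisor(nominal=[0.0, 0.0], soft_dev=0.5, hard_dev=1.0)
print(sup.vet([0.1, -0.2]))  # small deviation
print(sup.vet([0.8, 0.0]))   # moderate deviation
print(sup.vet([2.0, 0.0]))   # large deviation
```

Logging the action actually sent, not just the one requested, is what lets the log double as an audit trail.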
Efficiency gets it onto the robot. The supervisor lets it stay.
Everything here is measured and reproducible, on the hardware robots actually carry. The code, the four-experiment low-bit study, and the full write-up are public.