TL;DR: I needed a code → review → test pipeline with autonomous AI agents, where the orchestration is deterministic (no LLM deciding the flow). After two months exploring Copilot agent sessions, building my own wrapper (Protoagent), evaluating Ralph Orchestrator, and diving deep into OpenClaw's internals, I found that Lobster (OpenClaw's workflow engine) was the right foundation — except it lacked loops. So I contributed sub-workflow steps with loop support to Lobster, enabling fully deterministic multi-agent pipelines where LLMs do creative work and YAML workflows handle the plumbing. GitHub Copilot coding agent wrote 100% of the implementation.


The Backstory: Two Months of Chasing Autonomous Dev Agents

This didn't start last weekend. It started two months ago when GitHub shipped the Copilot coding agent — the ability to assign a GitHub issue to @copilot and have it work autonomously in a GitHub Actions environment, pushing commits to a draft PR. The Agent Sessions view in VS Code became mission control for all your agents, local or cloud.

That planted the seed: if a cloud agent can work on one issue autonomously, what if you could chain multiple specialized agents into a pipeline? Programmer → reviewer → tester, all running in the background, all pushing to PRs.

Building Protoagent

The first thing I built was Protoagent — a multi-channel AI agent wrapper in TypeScript/Bun that bridges Claude SDK and GitHub Copilot CLI to Telegram and REST API. The idea was to control AI agents from my phone, using my own subscriptions, with no vendor lock-in. It supported multi-provider switching, voice messages via Whisper, session management, crash recovery, and a REST API for Siri/Apple Watch integration.
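To make the multi-provider idea concrete, here is a minimal sketch of how that kind of switching can be wired up. The `AgentProvider` and `ProviderRegistry` names are hypothetical illustrations, not Protoagent's actual API:

```typescript
// Hypothetical sketch of multi-provider switching, in the spirit of Protoagent.
// These names are illustrative only — not Protoagent's real interfaces.

interface AgentProvider {
  name: string;
  send(prompt: string): Promise<string>;
}

class ProviderRegistry {
  private providers = new Map<string, AgentProvider>();
  private active?: AgentProvider;

  register(p: AgentProvider): void {
    this.providers.set(p.name, p);
  }

  // Switch the active backend without touching channel/session state.
  use(name: string): void {
    const p = this.providers.get(name);
    if (!p) throw new Error(`unknown provider: ${name}`);
    this.active = p;
  }

  send(prompt: string): Promise<string> {
    if (!this.active) throw new Error("no provider selected");
    return this.active.send(prompt);
  }
}

const registry = new ProviderRegistry();
registry.register({ name: "claude", send: async (p) => `claude: ${p}` });
registry.register({ name: "copilot", send: async (p) => `copilot: ${p}` });
registry.use("copilot");
```

The point of the registry layer is that Telegram, the REST API, and voice input all talk to one object, and switching subscriptions is a one-line `use()` call rather than a session restart.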

Protoagent solved the "talk to an agent from anywhere" problem, but not the orchestration problem. It was still one agent, one session, one task at a time. I needed the pipeline.

Discovering Ralph and OpenClaw

Around the same time, I found Ralph Orchestrator — an elegant pattern for autonomous agent loops with hard context resets. And then OpenClaw — which turned out to be a much more complete version of what I was trying to build with Protoagent: multi-channel, multi-agent, with a full tool ecosystem, skills marketplace, and a Gateway architecture.

OpenClaw made Protoagent redundant. But none of these tools solved the specific problem I was after.


The Problem

I wanted autonomous AI agents working as a dev team: a programmer, a reviewer, and a tester, running in parallel across multiple projects. The pipeline: code → review (max 3 iterations) → test → done. No human in the loop unless something breaks.
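That "no LLM deciding the flow" requirement can be sketched in plain code: the step order and the loop bound live in ordinary control flow, and the LLM agents are just async functions plugged into fixed slots. Everything here is an illustrative stub, not the eventual Lobster implementation:

```typescript
// Deterministic orchestration sketch: code → review (max 3 iterations) → test.
// The agents are hypothetical stubs; the control flow is the point.

type StepResult = { output: string; approved?: boolean; passed?: boolean };
type Agent = (input: string) => Promise<StepResult>;

async function pipeline(
  task: string,
  programmer: Agent,
  reviewer: Agent,
  tester: Agent,
): Promise<{ code: string; passed: boolean }> {
  let code = (await programmer(task)).output;

  // Review loop: hard cap of 3 iterations, enforced by the for-loop,
  // not by an LLM's judgment. After 3 rounds we proceed regardless.
  for (let i = 0; i < 3; i++) {
    const review = await reviewer(code);
    if (review.approved) break;
    code = (await programmer(`revise per review: ${review.output}`)).output;
  }

  const tests = await tester(code);
  return { code, passed: tests.passed === true };
}
```

The agents do the creative work inside each call; the pipeline itself is boring, inspectable, and terminates by construction — which is exactly what makes it safe to run unattended across multiple projects.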

The requirements were clear: