 PRIVATE BETA  ·  v0.2.4  ·  70+ PHASES SHIPPED

One app. Every AI tool you need.

Local LLMs. Full IDE. Autonomous agents. Image generation. Model training. Cloud deploy. All offline. All on your machine. Nothing sent anywhere.

10–12 t/s — 30B on 4 GB VRAM
1,733 backend commands
0 cloud deps required
256K context window
workbench — mbsd daemon — AI Runtime
The Problem

The AI toolchain is
broken by design.

Inference client. Code editor. Image tool. Deploy pipeline. Training platform. Five apps, five subscriptions, five surfaces for your IP to leak. Nobody built them to work together — because nobody had to, until now.

01 / 03

Your workflow is 4 apps.
It should be 1.

Context switch between inference, IDE, image generation, and deployment. Every handoff is friction, lost state, and engineer time. The tools were never designed to share context with each other.

Workbench unifies all six categories in one offline-first window. No integrations to wire up. No context lost between steps.
02 / 03

With most AI tools, your code
isn’t actually private.

AI editors route completions through remote APIs. Image tools process your work on third-party servers. Your source code, prompts, and IP flow through infrastructure you didn't choose.

Every inference call runs on your hardware. No remote processing by default. No telemetry on work product.
03 / 03

Local dev and production
are different worlds.

You build locally, then hand off to entirely separate tooling for containers and cloud. The AI work you did has no direct path to what ships. You rebuild the pipeline every project.

Docker, Kubernetes, and 10 cloud providers in the same window where the agent wrote the code.

Workbench doesn’t fill a gap. It closes all three. One app, running offline, combining six tool categories the industry kept separate for years — because combining them required building all six from scratch.

What’s Inside

Six pillars.
One app.
No compromises.

Each of these would ship as a standalone product. In Workbench, they share state, context, and models, making each one far more useful than it would be alone.

inference-logs — CUDA — 10.4 t/s — Qwen2.5-32B
Local LLM Inference

Local AI Engine — CUDA-accelerated. Zero cloud.

GGUF inference with Flash Attention v3 and speculative decoding. Any model, any size. Auto-configured on first launch — no CUDA setup, no quantization guesswork. Mix local and cloud in the same session.

10–12 t/s on 30B models, 4 GB VRAM minimum
Flash Attention v3 + spec decoding + 256K ctx (8-bit KV)
Auto-quantization Q2_K → Q8_0 based on available VRAM
Multi-modal vision + real embeddings + cosine similarity
10 cloud providers (Anthropic, OpenAI, Groq, Mistral…)
mbsd daemon (JSON-RPC 2.0, TCP:3031) + TS & Python SDKs
GGUF / CUDA / FLASH ATTN v3 / 256K CTX
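The daemon's JSON-RPC 2.0 surface on TCP:3031 can be exercised without either SDK. A minimal Python sketch, assuming newline-delimited framing; the `model.generate` method name is a placeholder, not a documented method — consult the SDK docs for the real surface:

```python
import json
import socket

def make_request(method: str, params: dict, req_id: int = 1) -> bytes:
    """Build a JSON-RPC 2.0 request envelope, newline-terminated."""
    envelope = {"jsonrpc": "2.0", "id": req_id, "method": method, "params": params}
    return (json.dumps(envelope) + "\n").encode("utf-8")

def call_mbsd(method: str, params: dict,
              host: str = "127.0.0.1", port: int = 3031) -> dict:
    """Send one request to the mbsd daemon and read one response line."""
    with socket.create_connection((host, port), timeout=30) as sock:
        sock.sendall(make_request(method, params))
        raw = sock.makefile("r", encoding="utf-8").readline()
    return json.loads(raw)

# Usage (requires a running daemon; method name is hypothetical):
#   call_mbsd("model.generate", {"prompt": "Hello"})
```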
dev-panel — Monaco LSP — Python (pylance) — DAP active
Code Editor

Professional Code Editor — real IDE, not a text field.

Monaco editor with full LSP, real PTY terminal, DAP debugger, and native Git. Every AI completion uses the local model in Pillar 1 — no additional API call, no latency, no context switch.

LSP: Python, TypeScript, Rust, Go, Lua, C/C++ — auto-installs servers
Real ConPTY/openpty terminal with OSC 633 shell integration
DAP debugger: 9 adapters (Node, Python, Rust, Go, C++…)
Native Git: blame, partial stage, interactive rebase, reflog, graph
150+ settings, tab groups, keybinding editor, extension API
Node.js extension runtime + plugin marketplace + snippet library
MONACO / LSP / DAP / NATIVE GIT
mcp-workflow — agent — step 4/11 — GSD-2 active
Agent Workflow

Autonomous Agents — goal to production, unattended.

Describe a goal. The agent decomposes it, edits files across your codebase, runs terminal commands, calls APIs, tests results, iterates — live audit trail every step. Safety gates prevent anything destructive without your approval.

23 built-in tools: code edits, files, terminal, Git, web search
24 MCP servers: PostgreSQL, Docker, browsers, blockchains, mobile
GSD-2: milestone/slice/task hierarchy, crash recovery, cost ledger
Three-tier safety gates — full action audit log per step
BSON episodic memory + MemoryPalace cross-session context
Stuck detection, complexity scoring, automatic spec replay
REACT / GSD-2 / 24 MCP SERVERS / BSON MEMORY
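The loop underneath is the ReAct pattern: the model proposes an action, a safety gate vets it, the tool runs, and the observation feeds the next step. A toy sketch of that shape — the tool names, gate list, and `propose_action` signature are illustrative, not Workbench internals:

```python
from typing import Callable

# Illustrative: actions that must clear the safety gate before running
DESTRUCTIVE = {"terminal.rm", "git.force_push"}

def react_loop(goal: str,
               propose_action: Callable,
               tools: dict,
               approve: Callable[[str], bool],
               max_steps: int = 10) -> list:
    """Minimal ReAct loop: propose -> gate -> act -> observe, with an audit trail."""
    trail = []  # every step is recorded, including blocked ones
    for _ in range(max_steps):
        name, args = propose_action(goal, trail)
        if name == "done":
            break
        if name in DESTRUCTIVE and not approve(name):
            trail.append((name, args, "BLOCKED by safety gate"))
            continue
        observation = tools[name](*args)  # run the tool, capture the result
        trail.append((name, args, observation))
    return trail
```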
image-gen — SD 1.5 — 512×512 — step 18/30
Image Generation

Creative Studio — images and voice, fully offline.

Complete Stable Diffusion pipeline and a professional Voice Studio — on your GPU, with no per-image cost, no API keys, no prompts shared anywhere. Generate until you’re satisfied.

Text-to-image, img2img, inpainting, ControlNet, LoRA stacking
Batch generation with real-time preview + hardware auto-config
Voice Studio TTS: 22 voices (Kokoro ONNX + SAPI fallback)
4 STT modes incl. whisper.cpp — voice-to-code + dictation
P2P network for session sharing across devices
Theme Editor — full UI customization, 10+ built-in themes
STABLE DIFFUSION / 22 TTS VOICES / 4 STT MODES
training-dashboard — LoRA — step 240/500 — loss 0.423
Training Dashboard

Model Training — fine-tune and use, without leaving the IDE.

Fine-tune any model on your own data. Watch it train. Export and load it directly into the inference engine. Cloud GPU rental is one click away when local hardware is not enough.

LoRA & QLoRA fine-tuning from 6 GB VRAM
GRPO training via real trl/transformers pipeline
Live loss curves, gradient norm & step timing dashboard
Cloud GPU Hub: Vast.ai, RunPod, Lambda Labs (real APIs)
7 export formats: GGUF, ONNX, CoreML, TensorRT, AWQ, GPTQ, TFLite
Hardware-aware preset wizard — auto-tunes batch size & LR
LORA / QLORA / GRPO / CLOUD GPU HUB
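The preset wizard's core decision, choosing a quantization level and batch size from free VRAM, reduces to a lookup table. A sketch with illustrative thresholds (not the wizard's actual tuning values):

```python
def auto_preset(free_vram_gb: float) -> dict:
    """Pick a quant level and batch size from available VRAM (illustrative thresholds)."""
    # (min VRAM in GB, quant level, training batch size), checked highest-first
    table = [
        (24.0, "Q8_0",   8),
        (12.0, "Q6_K",   4),
        (8.0,  "Q4_K_M", 2),
        (4.0,  "Q3_K_M", 1),
        (0.0,  "Q2_K",   1),
    ]
    for min_vram, quant, batch in table:
        if free_vram_gb >= min_vram:
            return {"quant": quant, "batch_size": batch}
```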
deployment-center — Vercel — prod — deploying…
Deployment Center

Deploy Anywhere — from the same window where you built it.

Every project built in Workbench ships to production without switching tools. Docker, Kubernetes, and 10 cloud providers in one panel. Cost estimates shown before you commit.

Docker: build, tag, push, run, logs, Compose up/down lifecycle
Kubernetes: pod/svc/deploy explorer, port-forward, namespaces
Azure, GCP, AWS, Vercel, Netlify + more — real APIs, cost estimates
Remote SSH + Dev Containers (docker build/run, SCP, Remote LSP)
mbsd + mbs CLI + TS & Python SDKs for pipeline automation
Encrypted secrets vault + environment profile management
DOCKER / K8S / 10 CLOUD PROVIDERS / SSH
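Under any UI, the Docker release path reduces to a fixed command sequence. A Python sketch of build-tag-push using the standard `docker` CLI (image and registry names are placeholders):

```python
import subprocess

def docker_release(image: str, tag: str, registry: str, context: str = ".") -> list:
    """Return the build/tag/push command sequence for one image."""
    local = f"{image}:{tag}"
    remote = f"{registry}/{local}"
    return [
        ["docker", "build", "-t", local, context],
        ["docker", "tag", local, remote],
        ["docker", "push", remote],
    ]

def run_release(image: str, tag: str, registry: str, context: str = ".") -> None:
    """Execute the sequence, stopping on the first failing step."""
    for cmd in docker_release(image, tag, registry, context):
        subprocess.run(cmd, check=True)
```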
Use Cases

Build anything.
Run everything.

Because all six pillars share state, the outcomes aren’t just the sum of the parts. The agent knows your codebase. Your fine-tuned model answers the agent’s questions. Deployment follows directly from the build.

full-stack development

Describe an app. Ship it to production.

Give the agent a goal. It scaffolds the project, writes the code, runs tests, fixes errors, and deploys to the cloud — while you watch every step in the audit log. No context switches. No different tools.

“Build a task manager with React, FastAPI, SQLite. Add auth, write tests, deploy to Vercel.”
AI Engine · Code Editor · Agent · Deploy
ai-augmented development

A coding assistant that knows your entire codebase.

Local inference means completions in milliseconds. The model sees your whole codebase through semantic injection — not just the open file. Episodic memory carries project history across every session.

“Review this PR for security issues, suggest fixes, and add missing tests. Match the patterns in the rest of the codebase.”
AI Engine · Code Editor · Agent
custom model training

Fine-tune on your data. Use it immediately.

Upload a dataset, pick a base model, run LoRA fine-tuning on your GPU. Watch loss curves live. When training finishes, load it directly into the inference engine — no export pipeline, no cloud bill.

“Fine-tune Llama 3 on my company’s API docs so it answers internal developer questions accurately.”
AI Engine · Training · Agent
background automation

Schedule agents to run while you sleep.

GSD-2 runs multi-phase agentic tasks autonomously. Checkpoint and resume on crash. Safety gates prevent destructive actions without your approval. Define the workflow once — Workbench runs it on schedule.

“Every night, check my repos for failing tests, open a PR with fixes, and Slack me a summary.”
Agent · AI Engine · Code Editor
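Checkpoint-and-resume, the pattern behind crash recovery in scheduled runs, fits in a few lines. A minimal sketch (the JSON checkpoint file and named-step list are illustrative, not GSD-2's actual format):

```python
import json
from pathlib import Path

def run_with_checkpoints(steps, checkpoint: Path) -> list:
    """Run (name, fn) steps in order, skipping ones already recorded as complete."""
    done = json.loads(checkpoint.read_text()) if checkpoint.exists() else []
    for name, fn in steps:
        if name in done:
            continue  # completed in a previous (possibly crashed) run
        fn()
        done.append(name)
        checkpoint.write_text(json.dumps(done))  # persist after every step
    return done
```

A rerun after a crash picks up at the first unfinished step instead of starting over.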
creative production

Generate images and voice, completely offline.

Stable Diffusion on your GPU with no per-image cost. Combine ControlNet conditioning with LoRA style adapters. Voice Studio’s 22-voice TTS for narrated content — zero data leaves your machine.

“Generate a hero banner and feature illustrations, then produce voiceover audio for my product launch.”
Creative Studio · Agent · Deploy
multi-model research

Run multiple models against the same problem.

Multi-model chat mixes local and cloud providers in one session. Run adversarial debates, chain outputs between models, or benchmark your fine-tuned model against a frontier model — without swapping tools.

“Have local Qwen-32B and GPT-4o debate the best architecture for a real-time multiplayer backend, then scaffold the winner.”
AI Engine · Cloud APIs · Agent
Technical Depth

70+ phases.
Real. Shipped.

Not a concept. Not a prototype. Built phase by phase — each shipped, audited, and functional. Here are the numbers that matter.

70+ Phases shipped & functional
1,733 Backend Tauri commands
0+ Rust modules
3,600+ Community members in beta

AI Runtime

  • GGUF + Flash Attn v3 + speculative decoding + auto-quant
  • Real embeddings, cosine similarity, semantic codebase search
  • Multi-modal vision (LLaVA, Qwen-VL) — routes through loaded model
  • 10 cloud providers, SSE streaming, BYOK cost tracking per call
  • mbsd daemon (JSON-RPC 2.0, TCP:3031) + TS & Python SDKs
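Semantic codebase search is nearest-neighbor lookup in embedding space, and the cosine-similarity core is a few lines. A sketch with toy vectors standing in for real model embeddings:

```python
import math

def cosine(a: list, b: list) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vec: list, index: list) -> list:
    """Rank (path, vector) pairs by similarity to the query embedding."""
    return sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
```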

Agent Intelligence

  • ReAct loop: 23 tools + 24 MCP server integrations
  • GSD-2: spec-driven workflow engine, milestone hierarchy
  • Crash recovery, cost ledger, stuck detection & auto-spec replay
  • BSON episodic memory + MemoryPalace cross-session context
  • Three-tier safety + complete action audit log every step

IDE Core

  • Monaco + LSP (7 langs), call hierarchy, semantic tokens
  • Real PTY + OSC 633 shell integration, buffer search
  • DAP: 9 adapters, inline values, conditional breakpoints, source maps
  • Native Git: blame, partial stage, interactive rebase, reflog, graph
  • Node.js extension runtime, marketplace, snippet library

Build & Deploy

  • Docker lifecycle, Compose, K8s pod/svc/deploy explorer
  • Azure, GCP, AWS, Vercel, Netlify — real APIs, cost estimates
  • LoRA/QLoRA/GRPO + Cloud GPU Hub (Vast.ai, RunPod, Lambda)
  • 7 model export formats incl. TensorRT & CoreML
  • Remote SSH + Dev Containers with docker build/SCP/Remote LSP
[ ✓ ] STATUS
70+ phases shipped

Every feature on this page is implemented and functional. No vaporware, no stubbed commands.

[ ✓ ] PRIVACY
0 cloud deps required

Every inference, image gen, and agent action runs on your hardware by default. Nothing leaves unless you choose.

[ ✓ ] QUALITY
0 critical stubs remain

cargo clippy: 0 warnings, 0 errors. All former stubs replaced with real implementations.

[ ✓ ] COMMUNITY
3,600+ beta developers

Active community since early alpha. Every release shaped by real-world developer feedback.

Get Started

Start
building
today.

Private beta. All six pillars included, all features unlocked. Request access and we’ll send download instructions as soon as your request is reviewed.

MBS Workbench v0.2.4
workbench-setup-0.2.4.exe  ·  ~583 MB  ·  Windows 10/11
Signed installer · VirusTotal verified · GPU auto-detected on launch
Request Access →
system_requirements
OS       Windows 10/11 · Linux (Ubuntu 20.04+)
RAM      8 GB min · 16 GB recommended
GPU      NVIDIA 4 GB+ VRAM · CPU fallback available
Storage  10 GB free (models stored separately)
CUDA     12.x recommended · 11.x supported

./join_waitlist

Invite-only. Fill in your details — we’ll send access instructions as soon as your request is reviewed.

// no spam  ·  no data sold  ·  access instructions only

Free during private beta. All six pillars with no feature gates. Premium tier ($10/mo) planned for cloud API credit bundles and team features — local inference, agents, training, and image generation will remain free forever.