Gestalt Workframe
Platform brief
AI orchestration that keeps spend predictable, answers grounded in your own knowledge, and your brand on the front. The backend picks the right model for each turn. Routine work stays on small local models. Premium cloud is reserved for hard judgment. A guided terminal, persona routing, connectors, and per-deployment bundles ship with it.
Operating layer
A visitor sees a terminal and a library. Underneath, five layers do the work: pull in your sources, build a grounded knowledge base, route each turn to the right model, enforce privacy and spend policy, and publish only what operators approve.
Filesystem, S3, ITGlue, Hudu, Microsoft 365, GitHub, RSS, and curated web feeds.
One Document schema. Scoped credentials. Redaction pipeline. Provenance attached.
Vector index, browsable library pages, source-controlled corpus, citations, and reusable training material.
Intent frame, retrieved context, model profile, local or cloud choice, cost, risk, and policy.
Grounded answers, service intake, contact handoff, newsletter, library, lessons, and labs.
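As a rough illustration of the ingest layer, the sketch below normalizes a source into one record shape, redacts it, and keeps provenance attached. The field names, the email-only redaction rule, and the ingest helper are assumptions made for this example, not the shipped Document schema.

```python
import re
from dataclasses import dataclass

# Hypothetical redaction rule: anything shaped like an email address.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


@dataclass(frozen=True)
class Document:
    """Illustrative stand-in for the shared Document schema."""
    source: str                       # connector name, e.g. "filesystem"
    uri: str                          # provenance: where the content came from
    text: str                         # body after redaction
    cloud_llm_eligible: bool = False  # assumed conservative default


def redact(text: str) -> str:
    """Replace email-shaped strings before anything is indexed."""
    return EMAIL.sub("[redacted-email]", text)


def ingest(source: str, uri: str, raw_text: str) -> Document:
    """Normalize one item from any connector into the single record shape."""
    return Document(source=source, uri=uri, text=redact(raw_text))


doc = ingest("filesystem", "file:///runbooks/vpn.md",
             "Ask alice@example.com for access.")
print(doc.text)  # Ask [redacted-email] for access.
```

Because every connector returns the same shape, the indexing, citation, and policy layers only ever need to understand one record type.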
Public experience
Not an open chatbot. Visitors start with a guided intake and land in one of three personas: Service Inquiry, Practitioner, or Educator. The router shifts modes mid-conversation when intent changes.
Off-scope requests redirect to your configured paths instead of burning model budget.
Your knowledge stays browsable, cited, and indexable. Patterns, schemas, runbooks, and examples live as library pages, not buried in chat transcripts. Search engines and AI crawlers see them too.
Every grounded answer is graded before it streams. Unsupported claims get replaced, not warning-stamped.
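One way to picture grading before streaming: hold each sentence back, check it against the retrieved chunks, and swap in a fallback when nothing supports it. The word-overlap grader, the 0.5 threshold, and the fallback wording below are placeholders chosen for the sketch. The brief does not say how the actual grader works.

```python
from typing import Iterable, Iterator, List

# Hypothetical replacement text for unsupported claims.
FALLBACK = "No source in the library supports an answer to that."


def is_supported(claim: str, chunks: List[str], threshold: float = 0.5) -> bool:
    """Naive grader: a claim counts as supported when enough of its
    longer words appear in a single retrieved chunk."""
    claim_words = {w for w in claim.lower().split() if len(w) > 3}
    if not claim_words:
        return True  # nothing substantive to check
    for chunk in chunks:
        chunk_words = set(chunk.lower().split())
        if len(claim_words & chunk_words) / len(claim_words) >= threshold:
            return True
    return False


def stream_graded(sentences: Iterable[str], chunks: List[str]) -> Iterator[str]:
    """Yield each sentence only after grading; replace unsupported ones."""
    for sentence in sentences:
        yield sentence if is_supported(sentence, chunks) else FALLBACK
```

The point of the shape is ordering: the check runs before the yield, so an unsupported sentence never reaches the visitor with a warning attached. It is simply replaced.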
Admin control plane
One review queue, fed by every source you watch. Operators feature strong signals, queue newsletter items, add custom finds, and dismiss noise. GitHub, RSS, web diffs, subreddits, and saved searches all land in the same place, fetched server-side behind SSRF guards.
Provider health, route diagnostics, and budgets on one token-gated page. Flip routing strategy at runtime: best value, prefer local, prefer cloud quality, local only, or cloud only. Per-turn, per-session, daily, and monthly caps let cloud spend degrade gracefully instead of running away.
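The four caps can be read as one gate that every cloud call must pass. The sketch below assumes the caps are USD amounts and that a turn which would breach any cap falls back to a local route. The class and function names are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class SpendCaps:
    """USD ceilings. All zero by default, so cloud is off until raised."""
    per_turn: float = 0.0
    per_session: float = 0.0
    daily: float = 0.0
    monthly: float = 0.0


@dataclass
class SpendSoFar:
    """Running USD totals for the current session, day, and month."""
    session: float = 0.0
    day: float = 0.0
    month: float = 0.0


def cloud_allowed(cost: float, caps: SpendCaps, spent: SpendSoFar) -> bool:
    """A cloud call is allowed only if it fits under every cap at once."""
    return (
        cost <= caps.per_turn
        and spent.session + cost <= caps.per_session
        and spent.day + cost <= caps.daily
        and spent.month + cost <= caps.monthly
    )


def pick_route(wants_cloud: bool, cost: float,
               caps: SpendCaps, spent: SpendSoFar) -> str:
    """Degrade to local instead of exceeding a cap."""
    if wants_cloud and cloud_allowed(cost, caps, spent):
        return "cloud"
    return "local"
```

With the default caps, any call with a nonzero estimated cost stays local. Raising a cap is the only thing that opens the cloud route.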
Security model
Every input that crosses the public surface is treated as untrusted. The backend owns routing, tool access, credentials, retrieval, and spend so each boundary that matters can be audited.
Browser input, intake answers, KB chunks, tool results, and model output are all untrusted. None of them can pick a route, a tool, or a provider. The backend decides.
No exec, no shell, no filesystem for the model. Provider tools are whitelisted per mode, Pydantic-validated, and bounded, and their results are quarantined when reinjected into the prompt.
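A minimal sketch of a per-mode tool gate, assuming three modes named after the personas and two hypothetical tool names. Argument validation, which the platform does with Pydantic, is left out so the example stays in the standard library.

```python
from typing import Any, Callable, Dict

# Hypothetical allowlist: mode name -> tools the model may request.
ALLOWED_TOOLS = {
    "service_inquiry": {"contact_handoff"},
    "practitioner": {"search_library"},
    "educator": {"search_library"},
}

MAX_RESULT_CHARS = 2000  # assumed bound on what a tool may return


def call_tool(mode: str, name: str, args: Dict[str, Any],
              registry: Dict[str, Callable[..., Any]]) -> str:
    """Run a tool only if the current mode allows it, then bound the
    result and wrap it so the prompt treats it as data, not instructions."""
    if name not in ALLOWED_TOOLS.get(mode, set()):
        raise PermissionError(f"tool {name!r} is not allowed in mode {mode!r}")
    result = str(registry[name](**args))[:MAX_RESULT_CHARS]
    return (
        f"[untrusted tool output: {name}]\n"
        f"{result}\n"
        "[end untrusted tool output]"
    )
```

An unknown mode gets an empty allowlist, so the gate denies by default.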
Connectors run on scoped, short-lived server-side credentials. No long-lived tokens in prompts, KB chunks, logs, or browser state.
Context flagged cloud_llm_eligible=false blocks cloud providers. If local inference is unavailable, the router fails closed with a readable error instead of leaking to a cloud route.
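The fail-closed rule fits in a few lines. In this sketch only the cloud_llm_eligible flag comes from the brief. The health inputs, the exception type, and the error wording are assumptions.

```python
class RouteUnavailable(Exception):
    """Raised with a message that is safe to show to the visitor."""


def select_provider(cloud_llm_eligible: bool,
                    local_healthy: bool,
                    cloud_healthy: bool) -> str:
    """Prefer local. Use cloud only when the context explicitly allows it."""
    if local_healthy:
        return "local"
    if not cloud_llm_eligible:
        # Fail closed: restricted context never falls through to cloud.
        raise RouteUnavailable(
            "Local inference is unavailable and this request cannot be "
            "sent to a cloud provider. Please try again shortly."
        )
    if cloud_healthy:
        return "cloud"
    raise RouteUnavailable("No provider is available right now.")
```

The restricted branch raises before the cloud branch is even considered, so no later change to cloud health can open a leak for ineligible context.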
Same-origin checks, route body limits, IP and session abuse budgets, SSRF guards, redacted public health, generic stream errors, and text-only terminal rendering.
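Of the guards listed above, the SSRF check is the easiest to sketch with the standard library alone. The version below resolves the host and refuses any URL that lands on a private, loopback, link-local, or otherwise non-public address. It is an illustration under those assumptions, not the platform's guard, and the function names are invented here.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_public_address(ip: str) -> bool:
    """True only for addresses outside private and special-use ranges."""
    addr = ipaddress.ip_address(ip)
    return not (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local
        or addr.is_reserved
        or addr.is_multicast
        or addr.is_unspecified
    )


def url_is_fetchable(url: str, resolve=socket.getaddrinfo) -> bool:
    """Allow http(s) URLs whose host resolves only to public addresses.
    The resolver is injectable so the check can be exercised offline."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        infos = resolve(parts.hostname, port)
    except OSError:
        return False  # an unresolvable host is never fetched
    # Every resolved address must be public. An empty answer is a refusal.
    return bool(infos) and all(is_public_address(info[4][0]) for info in infos)
```

A production guard would also connect to the exact address it validated and re-check every redirect hop, since a hostname can resolve differently between the check and the fetch.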
Architecture
FastAPI, a static Next.js frontend, and an nginx proxy behind one origin. docker compose up brings the stack online on port 8080 with persistent storage.
A bundled generic deployment lets evaluators walk the whole product before configuring anything.
One Document schema, scoped credentials, and a redaction pipeline behind a single connector contract. Reference connectors ship for filesystem, S3, ITGlue, Hudu, and Microsoft 365 files. Each is its own installable Python package.
Connectors are Apache-2.0 so integrators can build and ship their own without license friction.
Brand, copy, intake, connectors, redaction, newsletter, discovery, and curriculum settings live in one folder per deployment. Set DEPLOYMENT_ID to switch. Re-purposing the same engine for a new audience is a configuration change, not a fork.
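Switching deployments by environment variable might look like the sketch below. Only DEPLOYMENT_ID comes from the brief. The deployments root, the "generic" default, and the character check are assumptions for illustration.

```python
import os
from pathlib import Path
from typing import Mapping


def deployment_dir(root: Path, env: Mapping[str, str] = os.environ) -> Path:
    """Resolve the active deployment folder from DEPLOYMENT_ID."""
    deployment_id = env.get("DEPLOYMENT_ID", "generic")  # assumed default
    # Accept only letters, digits, "-" and "_" so an ID cannot escape root.
    if not deployment_id.replace("-", "").replace("_", "").isalnum():
        raise ValueError(f"invalid DEPLOYMENT_ID: {deployment_id!r}")
    return root / deployment_id


print(deployment_dir(Path("deployments"), {"DEPLOYMENT_ID": "acme-msp"}))
```

Everything audience-specific hangs off the returned folder, which is what makes a new audience a matter of configuration.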
Provider abstraction with best-value routing.
Local LLM by default on any OpenAI-compatible endpoint. Operator-controlled cloud spillover for the rest. Model profiles capture task fit, status, and limits so adding or swapping a route is a config change, not a code change.
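To make "a config change, not a code change" concrete, here is a small sketch in which routes are plain data and best-value selection is a filter plus a minimum. The profile fields, model names, and prices are invented for the example.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class ModelProfile:
    """Illustrative profile: what a route is good for and what it costs."""
    name: str
    location: str              # "local" or "cloud"
    tasks: FrozenSet[str]      # task types this route fits
    usd_per_1k_tokens: float
    healthy: bool = True       # status flag


def best_value(task: str, profiles: List[ModelProfile]) -> Optional[ModelProfile]:
    """Cheapest healthy profile that fits the task. Local wins a price tie."""
    candidates = [p for p in profiles if p.healthy and task in p.tasks]
    if not candidates:
        return None
    return min(candidates,
               key=lambda p: (p.usd_per_1k_tokens, p.location != "local"))


PROFILES = [
    ModelProfile("local-small", "local", frozenset({"faq", "summarize"}), 0.0),
    ModelProfile("cloud-premium", "cloud",
                 frozenset({"faq", "summarize", "judgment"}), 0.01),
]
```

Adding or swapping a route here means editing the PROFILES list. The selection function does not change.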
≥ 70%: target local-LLM serve rate under normal load.
0 → cap: cloud spend defaults to zero. Caps are per-turn, per-session, per-day, and per-month, with USD ceilings.
Source, license, start
Gestalt Workframe is dual-licensed and shipped on GitHub. Run the bundled generic deployment offline and walk the whole product before committing to anything.
Everything outside packages/ is FSL-1.1. Read, run, modify, and redistribute for your own use, internal services, and customer engagements. Each version auto-converts to Apache-2.0 two years after release.
Everything under packages/ is Apache-2.0. Build your own connectors against the schema and ship them without inheriting framework license terms.
Hosting Gestalt Workframe as a managed or SaaS offering, reselling it, or competing with it requires a commercial license. See COMMERCIAL.md for terms and our implementation-services path.
Start here. Clone the repo, run docker compose up, and open localhost:8080. Then copy the bundled deployment into a folder of your own and edit it.
github.com/GestaltWorks/GestaltWorkframe