Flywheel Platform

Run a niche AI employee in one line.

Open-weight models, fine-tuned per vertical. Self-host them on your own box for free, or call the OpenAI-compatible hosted API — change one line and the rest of your stack never notices.

curl https://gyld.dev/api/v1/chat/completions \
  -H "Authorization: Bearer $FLYWHEEL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fitness",
    "messages": [
      { "role": "user", "content": "Beginner full-body workout?" }
    ]
  }'

Same OpenAI Chat Completions shape — point any OpenAI SDK at https://gyld.dev/api/v1.
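
For stacks that do not use an OpenAI SDK, the same call can be sketched with the Python standard library alone. This is a minimal illustration, not an official client: the helper names build_chat_request and extract_reply are hypothetical, and the response parsing assumes the endpoint returns the standard OpenAI Chat Completions response body.

```python
import json
import urllib.request

# Hosted endpoint taken from the curl example above.
BASE_URL = "https://gyld.dev/api/v1"


def build_chat_request(api_key, prompt, model="fitness", base_url=BASE_URL):
    """Build (but do not send) a Chat Completions request.

    Hypothetical helper: mirrors the curl example field for field.
    """
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_reply(response_json):
    """Return the assistant text.

    Assumes the standard OpenAI response shape:
    choices[0].message.content
    """
    return response_json["choices"][0]["message"]["content"]
```

To send it, pass the request to urllib.request.urlopen, decode the body with json.load, and hand the result to extract_reply. Keeping the base URL as a parameter is what makes the "change one line" switch possible.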

Get started

Choose how you run it

Pick the surface that fits your stack. Both speak the OpenAI API, so switching between them means changing the base URL, not rewriting your code.

Self-host

Apache-2.0 open weights on your own hardware. Run them with llama.cpp or vLLM in one command — free forever, and your data never leaves the machine.

Hosted API

Managed inference — no hardware to run. OpenAI-compatible, with keys minted in seconds. Plans scale with the number of AI employees you run.
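
One way to keep that switch to a single setting is to resolve the base URL from the environment. The sketch below rests on assumptions: the FLYWHEEL_SURFACE variable and the resolve_base_url helper are hypothetical, and the local URLs are the usual defaults for vLLM's OpenAI-compatible server (port 8000) and llama.cpp's llama-server (port 8080), which your own setup may override. The model name a self-hosted server expects may also differ from the hosted one.

```python
import os

# Hosted endpoint from the curl example.
HOSTED_BASE_URL = "https://gyld.dev/api/v1"

# Assumed local defaults; adjust to however you launched the server.
LOCAL_BASE_URLS = {
    "vllm": "http://localhost:8000/v1",
    "llama.cpp": "http://localhost:8080/v1",
}


def resolve_base_url(env=None):
    """Pick the API base URL from a (hypothetical) FLYWHEEL_SURFACE setting.

    Defaults to the hosted API when the variable is unset.
    """
    env = os.environ if env is None else env
    surface = env.get("FLYWHEEL_SURFACE", "hosted")
    if surface == "hosted":
        return HOSTED_BASE_URL
    if surface in LOCAL_BASE_URLS:
        return LOCAL_BASE_URLS[surface]
    raise ValueError(f"unknown surface: {surface!r}")
```

The rest of the client code takes whatever resolve_base_url returns and never needs to know which surface is behind it.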

Guide

From install to production

Follow the path that matches how you run it. Every step is a link.

Models

The Flywheel model family

Every model is a fine-tune of the same base (Qwen3.6-35B-A3B, Apache-2.0) with a point-of-use guardrail baked into the weights. Each model improves over time from real, consented usage.

Read the models guide →