Flywheel Platform

Run a niche AI employee in one line.

Open-weight models, fine-tuned per vertical. Self-host them on your own box for free, or call the OpenAI-compatible hosted API — change one line and the rest of your stack never notices.

Quickstart | Get an API key | API reference

curl https://gyld.dev/api/v1/chat/completions \
  -H "Authorization: Bearer $FLYWHEEL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fitness",
    "messages": [
      { "role": "user", "content": "Beginner full-body workout?" }
    ]
  }'

from openai import OpenAI

client = OpenAI(base_url="https://gyld.dev/api/v1", api_key="fw_live_…")

resp = client.chat.completions.create(
    model="fitness",
    messages=[{"role": "user", "content": "Beginner full-body workout?"}],
)
print(resp.choices[0].message.content)

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://gyld.dev/api/v1",
  apiKey: process.env.FLYWHEEL_API_KEY,
});

const resp = await client.chat.completions.create({
  model: "fitness",
  messages: [{ role: "user", content: "Beginner full-body workout?" }],
});
console.log(resp.choices[0].message.content);

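// Assumes the github.com/sashabaranov/go-openai client, imported as openai,
// plus "context", "fmt", and "log" from the standard library.
ctx := context.Background()

cfg := openai.DefaultConfig("fw_live_…")
cfg.BaseURL = "https://gyld.dev/api/v1"
client := openai.NewClientWithConfig(cfg)

resp, err := client.CreateChatCompletion(ctx, openai.ChatCompletionRequest{
    Model:    "fitness",
    Messages: []openai.ChatCompletionMessage{{Role: "user", Content: "Beginner full-body workout?"}},
})
if err != nil {
    log.Fatal(err) // surface auth or network failures instead of discarding them
}
fmt.Println(resp.Choices[0].Message.Content)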

Same OpenAI Chat Completions shape — point any OpenAI SDK at https://gyld.dev/api/v1.
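
For readers who would rather not pull in an SDK, the request and response shape used in the snippets above can be sketched with the Python standard library alone. This is a minimal illustration, not an official client: the helper names build_chat_request, extract_reply, and send are hypothetical, and the response layout is assumed to follow the usual Chat Completions form (choices[0].message.content), as the SDK examples imply.

```python
import json
import urllib.request

# Hosted endpoint taken from the examples above.
BASE_URL = "https://gyld.dev/api/v1"


def build_chat_request(api_key: str, model: str, user_text: str,
                       base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a Chat Completions POST request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_reply(response_body: dict) -> str:
    """Return the assistant text from a Chat Completions response body."""
    return response_body["choices"][0]["message"]["content"]


def send(request: urllib.request.Request) -> str:
    """Send the request over the network and return the reply text."""
    with urllib.request.urlopen(request) as resp:
        return extract_reply(json.load(resp))
```

Calling send(build_chat_request(key, "fitness", "Beginner full-body workout?")) would perform the same call as the curl example. Keeping the builder separate from the network call makes the payload easy to inspect or log before anything is sent.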

Get started

Choose how you run it

Pick the surface that fits your stack. Both expose the same OpenAI-compatible API, so moving between them means changing the base URL and nothing else in your code.
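
One way to keep that switch out of application code is to resolve the endpoint from configuration. The sketch below is illustrative only: the FLYWHEEL_MODE and FLYWHEEL_BASE_URL variable names and the resolve_endpoint helper are hypothetical, and the self-hosted default assumes a local OpenAI-compatible server on port 8000 (the vLLM default), which may differ in your setup.

```python
from dataclasses import dataclass
from typing import Mapping

# Hosted endpoint from the quickstart examples.
HOSTED_BASE_URL = "https://gyld.dev/api/v1"
# Assumption: a local OpenAI-compatible server, e.g. vLLM on its default port.
SELF_HOSTED_BASE_URL = "http://localhost:8000/v1"


@dataclass(frozen=True)
class Endpoint:
    base_url: str
    api_key: str


def resolve_endpoint(env: Mapping[str, str]) -> Endpoint:
    """Choose hosted or self-hosted settings from environment-style config.

    FLYWHEEL_MODE and FLYWHEEL_BASE_URL are hypothetical names used only
    for this illustration.
    """
    mode = env.get("FLYWHEEL_MODE", "hosted")
    if mode == "hosted":
        # The hosted API needs a real key, so a missing one raises KeyError.
        return Endpoint(HOSTED_BASE_URL, env["FLYWHEEL_API_KEY"])
    if mode == "self-hosted":
        base_url = env.get("FLYWHEEL_BASE_URL", SELF_HOSTED_BASE_URL)
        # A local server may not check the key, but some SDKs insist on a
        # non-empty value, so fall back to a placeholder.
        return Endpoint(base_url, env.get("FLYWHEEL_API_KEY", "not-needed"))
    raise ValueError(f"unknown FLYWHEEL_MODE: {mode!r}")
```

With this in place, the client setup stays the same on either surface: call resolve_endpoint(os.environ) and pass base_url and api_key to whichever OpenAI SDK you already use.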

Self-host

Apache-2.0 open weights on your own hardware. Run them with llama.cpp or vLLM in one command — free forever, and your data never leaves the machine.

Self-hosting guide | Pick a model | Weights on Hugging Face

Hosted API

Managed inference — no hardware to run. OpenAI-compatible, with keys minted in seconds. Plans scale with the number of AI employees you run.

Quickstart | Chat completions | Authentication

Guide

From install to production

Follow the path that matches how you run it. Every step is a link.

Models

The Flywheel model family

Every model is a fine-tune of the same base (Qwen3.6-35B-A3B, Apache-2.0) with a point-of-use guardrail baked into the weights. Each model is refined over time using real usage collected with consent.

Live