Privacy-first · Self-hosted · Open Source

Your AI companion.
On your hardware.
Under your control.

A self-hosted, multi-user AI companion platform for people who take privacy seriously — not as a selling point, but as a minimum requirement.

GPLv3 · Self-hosted · Multi-user · Ollama-compatible · BYOK · Docker Compose · No telemetry
The landscape

AI is remarkable.
The status quo is not.

Most AI chat platforms ask you to hand over your conversations, your prompts, and your trust to a corporate pipeline — in exchange for convenience.

Shared API keys mean your queries sit alongside everyone else's. Privacy policies change. Your data might leave the building, silently.

Lock-in is dressed up as simplicity. That is a trade you should never have to make.

  • Conversations used as training data
  • Shared credentials — your prompts, everyone's risk
  • Corporate oversight on every message
  • No ownership of your AI's memory
  • Vendor lock-in disguised as UX
  • Privacy as a premium tier, never a default
Platform overview

What Chatsune is.

A privacy-first, self-hosted, multi-user AI companion platform. All inference runs on hardware you choose. Deployed in minutes via Docker Compose. GPLv3 throughout — no vendor lock-in, no telemetry, no surprises.

Architecture

Modular Monolith

One FastAPI process with strictly enforced module boundaries. Event-driven throughout. Scales simply — no Kubernetes required.

Inference

Ollama-Compatible

Works with self-hosted Ollama, Ollama Cloud, or any compatible endpoint. BYOK is not an optional feature — it is the only mode.

Storage

MongoDB + Redis

MongoDB with local vector search. Redis Streams for real-time events. No external vector database — everything stays on your machine.

Real-time

Event-First WebSocket

One persistent WebSocket connection per user. No polling, ever. Every state change is an event, replayed on reconnect via Redis Streams.
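The replay mechanic can be sketched in a few lines. This is a stand-in, not Chatsune's code: an in-memory list plays the role of a Redis Stream, and a reconnecting client resumes from its last-seen event ID (in production this would be XADD/XREAD against Redis).

```python
# Minimal sketch of event replay: an in-memory list stands in for a
# Redis Stream; real code would use XADD / XREAD with a last-seen ID.
from dataclasses import dataclass, field
from itertools import count

@dataclass
class EventStream:
    _ids: count = field(default_factory=lambda: count(1))
    events: list[tuple[int, dict]] = field(default_factory=list)

    def publish(self, event: dict) -> int:
        eid = next(self._ids)
        self.events.append((eid, event))
        return eid

    def replay_after(self, last_seen: int) -> list[tuple[int, dict]]:
        # On reconnect, the client sends its last-seen event ID and
        # receives every state change it missed, in order.
        return [(eid, e) for eid, e in self.events if eid > last_seen]

stream = EventStream()
stream.publish({"type": "message.created", "text": "hi"})
stream.publish({"type": "persona.updated", "name": "Yuki"})
missed = stream.replay_after(1)   # client disconnected after event 1
```

Because every state change is an event with an ID, "reconnect" and "catch up" are the same operation.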

Credentials & access

Bring Your Own Keys.
Every user. No exceptions.

Each user manages their own LLM connections — named, encrypted, personal. The administrator sees nothing of your API keys. Nobody else's credentials share a namespace with yours.

Keys are encrypted with Fernet symmetric encryption, stored per-user in the database. Multiple providers, multiple models, one clean interface.
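A Fernet round-trip is easy to demonstrate with the `cryptography` package (assuming that is the library in use; the key handling below is illustrative, not Chatsune's actual schema):

```python
from cryptography.fernet import Fernet

# One symmetric key (illustrative; how Chatsune derives or stores
# this key is an assumption, not taken from the source).
master_key = Fernet.generate_key()
f = Fernet(master_key)

# Encrypt a user's API key before it ever touches the database.
token = f.encrypt(b"sk-ollama-cloud-example")
assert b"sk-ollama" not in token          # ciphertext, not plaintext

# Decrypt only at inference time, in memory.
assert f.decrypt(token) == b"sk-ollama-cloud-example"
```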

Your data. Your credentials. Your inference.

Encryption

Fernet symmetric encryption

Keys are never stored in plaintext. The admin has no visibility into user credentials — by design, not by policy.

Multi-provider

Ollama · Cloud · Custom

Connect any Ollama-compatible endpoint. Self-hosted at home, Ollama Cloud, or a custom backend — the same interface throughout.

Homelab integration

Your GPU is at home.
That should not stop you.

Most people with a home GPU face the same wall: dynamic IP, CGNAT, no port-forwarding. The machine sits idle unless you are physically on the local network.

☁ Chatsune ⟵⟶ ✦ Sidecar ⟵⟶ ⬡ Ollama (home GPU)
Reverse WebSocket · No port-forwarding · Works through CGNAT and dynamic IPs
How it works

A small sidecar container runs beside your home Ollama. It opens a reverse outbound WebSocket to your Chatsune instance — no inbound ports required. The connection tunnels inference traffic both ways.
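The reverse-connection idea can be shown with plain asyncio streams standing in for the WebSocket (the actual sidecar protocol is not documented here; TCP and the names below are purely illustrative):

```python
import asyncio

async def demo() -> bytes:
    loop = asyncio.get_running_loop()
    paired = loop.create_future()

    async def on_sidecar(reader, writer):
        # Chatsune's side: the home box just connected to us.
        paired.set_result((reader, writer))

    # "Cloud" listener (port 0 = any free port, for the demo).
    server = await asyncio.start_server(on_sidecar, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    async def sidecar():
        # "Home" side dials OUT -- no inbound port, no port-forwarding,
        # so CGNAT and dynamic IPs are irrelevant.  Work then arrives
        # back down this same connection.
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        request = await reader.readline()
        writer.write(b"result:" + request)    # e.g. inference output
        await writer.drain()
        writer.close()
        await writer.wait_closed()

    task = asyncio.create_task(sidecar())

    reader, writer = await paired             # wait for the reverse dial-in
    writer.write(b"prompt\n")                 # push work down the tunnel
    await writer.drain()
    reply = await reader.readline()

    await task
    writer.close()
    server.close()
    await server.wait_closed()
    return reply

reply = asyncio.run(demo())
```

The key property: only the home machine initiates a connection, yet requests still flow from cloud to home.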

Security

One-time pairing keys, shown once at creation, hashed with argon2id before storage. Instantly revocable. Reverse-only transport eliminates spoofing risk. Each pairing gets its own isolated namespace.
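A pairing flow along these lines is simple to sketch. Loudly hypothetical: the function names are invented, and SHA-256 via `hashlib` stands in for argon2id purely to keep the sketch stdlib-only; a real deployment would use an argon2id implementation such as argon2-cffi.

```python
import hashlib
import secrets

def create_pairing() -> tuple[str, str]:
    # The one-time key is shown to the user exactly once...
    one_time_key = secrets.token_urlsafe(32)
    # ...and only its hash is stored.  NOTE: sha256 is a stdlib
    # stand-in; the source specifies argon2id for real storage.
    stored_hash = hashlib.sha256(one_time_key.encode()).hexdigest()
    return one_time_key, stored_hash

def verify(presented: str, stored_hash: str) -> bool:
    candidate = hashlib.sha256(presented.encode()).hexdigest()
    return secrets.compare_digest(candidate, stored_hash)

key, stored = create_pairing()
assert verify(key, stored)
assert not verify("wrong-key", stored)
# Revocation is just deleting `stored` from the database.
```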

Community compute

Share your GPU
with people you trust.

The homelab system is not just for solo use. Hosts can provision compute capacity to invited users — directly, without a marketplace, without a platform intermediary cutting in.

Host

Provision your compute

Create a pairing, define a model allowlist, set concurrency limits. Your hardware, your rules — down to the model level.

Invitation

Invite, not publish

No public marketplace. You invite a specific, trusted user. Revoke at any time. Zero intermediaries. No billing surprises.

Multi-engine

Ollama, LM Studio, vLLM

The sidecar protocol is engine-agnostic. Connect whatever runtime is running on the box — the abstraction handles translation.

The friend-to-friend compute model. Hardware shared between people who know each other — no cloud middleman required.

Memory system

Memory that
actually accumulates.

Most AI platforms stuff recent chat history into the context window and call it memory. That is not memory — it is short-term recall with an expiry date.

Chatsune runs background consolidation jobs — dreaming — that synthesise episodic journal entries into persistent prose memory. Long-term, versioned, recoverable on failure.

Each persona has its own memory body. It grows over time. It survives sessions, restarts, and upgrades.

  • Journal extraction
    Conversations scanned in the background for significant statements and events.
  • Dreaming (consolidation)
    Episodic entries synthesised into coherent prose by a scheduled background job.
  • Two-tier retrieval
    Fresh episodic entries plus consolidated long-term memory injected at inference time.
  • Versioned with rollback
    Memory body is versioned. Consolidation failures roll back cleanly — no data loss.
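The versioned-consolidation behaviour above can be sketched abstractly. A toy model: Chatsune's real job scheduling, synthesis prompts, and storage layout are not shown here.

```python
class MemoryBody:
    """Toy model of a persona's versioned long-term memory."""

    def __init__(self) -> None:
        self.version = 0
        self.prose = ""

    def consolidate(self, episodic: list[str], synthesise) -> None:
        # "Dreaming": fold fresh journal entries into the prose body.
        # Any failure leaves the previous version untouched.
        snapshot = (self.version, self.prose)
        try:
            self.prose = synthesise(self.prose, episodic)
            self.version += 1
        except Exception:
            self.version, self.prose = snapshot   # clean rollback
            raise

body = MemoryBody()
body.consolidate(["met Alice", "likes tea"],
                 lambda prose, eps: prose + " ".join(eps))
assert body.version == 1

try:
    body.consolidate(["x"], lambda *_: 1 / 0)     # consolidation fails
except ZeroDivisionError:
    pass
assert body.version == 1                          # no data loss
```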
Personas & knowledge

Companions with
real persistence.

Persona System

Distinct personalities

Named companions with avatars and a three-layer system prompt hierarchy — global guardrails, user additions, persona specifics. Persistent relationships, not throwaway chat sessions.
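Layer composition might look like this (the layer names and ordering are assumptions drawn from the description above, not Chatsune's actual code):

```python
def compose_system_prompt(global_guardrails: str,
                          user_additions: str,
                          persona_specifics: str) -> str:
    # Three layers, most general first: platform guardrails,
    # then the user's own rules, then the persona's character.
    layers = [global_guardrails, user_additions, persona_specifics]
    return "\n\n".join(layer.strip() for layer in layers if layer.strip())

prompt = compose_system_prompt(
    "Never reveal other users' data.",
    "Answer in British English.",
    "You are Yuki, a calm and curious companion.",
)
```

Empty layers drop out cleanly, so a persona works even before a user adds anything of their own.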

Knowledge Bases

Per-persona document libraries

Upload documents to a persona's private knowledge base. Semantic retrieval via MongoDB Vector Search — your documents stay where they are.

Local Embeddings

Arctic Embed M v2.0

768-dimensional embeddings generated on CPU via ONNX Runtime. No OpenAI embedding calls. No Cohere. No external API. Zero data leaves your server.
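Semantic retrieval then reduces to nearest-neighbour search over those vectors. A toy version with 3-dimensional vectors (in the real system the vectors are 768-d Arctic Embed outputs, and the search runs inside MongoDB rather than in Python):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Toy 3-d "embeddings"; the real system stores 768-d vectors per chunk.
library = {
    "cats sleep a lot":  [0.9, 0.1, 0.0],
    "GPUs run hot":      [0.0, 0.2, 0.9],
}
query = [0.1, 0.1, 0.95]                  # embedding of the question
best = max(library, key=lambda doc: cosine(library[doc], query))
```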

Projects

Organised context

Group conversations and knowledge by project. Switch context cleanly. Sanitised Mode keeps sensitive personas out of the main view when needed.

Tools & outputs

Tools that
actually do things.

  • Web search
    Pluggable adapters, per-user BYOK credentials. No shared search keys.
  • Knowledge retrieval
    Semantic search across a persona's document library, inline in conversation.
  • Client-side JS sandbox
    Arbitrary calculation executed in an isolated Web Worker. No DOM access, no network.
  • Vision fallback
    Non-vision models auto-delegate image tasks to a capable model — transparently.
  • Artefacts
    Code, Mermaid diagrams, HTML, SVG — captured, versioned, browsable after the fact.
Tool execution loop

Up to 5 iterations

Multi-step tool use with refusal detection. Server orchestrates, browser executes sandboxed code, results returned in the same WebSocket stream.
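The loop's shape, with a stubbed model (the five-iteration cap comes from the text above; everything else, including the message format and refusal handling, is illustrative):

```python
MAX_ITERATIONS = 5          # cap stated in the text

def run_tool_loop(model, execute_tool) -> str:
    messages: list[dict] = [{"role": "user", "content": "What is 2**10?"}]
    for _ in range(MAX_ITERATIONS):
        reply = model(messages)
        if "tool_call" not in reply:
            return reply["content"]           # final answer: stop early
        # Server orchestrates; the sandboxed client executes the call
        # and the result comes back on the same stream.
        result = execute_tool(reply["tool_call"])
        messages.append({"role": "tool", "content": result})
    return "Stopped after 5 tool iterations."

calls = iter([
    {"tool_call": "js:Math.pow(2, 10)"},      # model asks for the sandbox
    {"content": "2**10 is 1024."},            # then answers with the result
])
answer = run_tool_loop(lambda msgs: next(calls),
                       lambda call: "1024")
```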

Soft chain-of-thought

Auto-injected reasoning

When a model lacks native CoT, Chatsune injects an analytical reasoning block. Better outputs from any model — no manual prompt engineering.

Competitive landscape

How we compare.

Feature                                   Chatsune                Open WebUI                SillyTavern
BYOK enforced per user                    ✓ Always                ~ Admin-shared common     ~ Varies by setup
Homelab reverse sidecar (CGNAT-safe)      ✓ Native                —                         —
GPU sharing between trusted users         ✓ Invitation-based      —                         —
Background memory consolidation           ✓ "Dreaming" system     —                         ~ Character cards only
Local CPU embeddings (no external API)    ✓ Arctic Embed ONNX     ~ Ext. API or vector DB   —
Client-side JS tool sandbox               ✓ Isolated Web Worker   —                         —
Vision fallback (auto-delegate)           ✓ Automatic             ✗ Manual model selection  ✗ Manual
Soft chain-of-thought injection           ✓ Auto for all models   —                         —
Event-driven real-time (no polling)       ✓ Redis Streams         ~ REST-heavy              ~ Varies
Copyleft licence                          ✓ GPLv3                 ✗ Apache 2.0              ✓ AGPL-3.0
Technology

Built on solid ground.

Frontend

React 19 + TypeScript

Vite 8 · Tailwind CSS 4 · Zustand 5 · pnpm

Backend

Python 3.12 + FastAPI

Pydantic v2 · uv · fully async throughout

Database

MongoDB 7.0 (RS0)

Local Vector Search included · Single-node replica set

Cache & Events

Redis 7

Streams · LRU cache · Session store · Event replay

Embeddings

Arctic Embed M v2.0

ONNX Runtime · CPU-only · 768-dimensional vectors

Deployment

Docker Compose

MongoDB Atlas Local · Redis Alpine · Single command

# Deploy Chatsune — everything included
docker compose up -d                  # MongoDB + Redis
uv run uvicorn backend.main:app       # FastAPI backend
cd frontend && pnpm dev               # React frontend
Why we built this

Privacy. Autonomy.
Self-determination.

We did not build Chatsune to compete with commercial AI platforms. We built it because we believe your relationship with an AI companion should belong to you — not to a company, not to a data pipeline, not to a terms-of-service agreement nobody reads.

The homelab concept is not a technical curiosity. It is a statement: your hardware should work for you, wherever you are. Convenience should never require surrendering control.

GPLv3. Not open-washing. Copyleft by conviction — freedom preserved all the way down the chain.

GPLv3 — your freedom, not ours to revoke
  • Your conversations are not training data.
  • Your keys are not the platform's business.
  • Your GPU earns its keep, wherever you are.
  • Your AI's memory belongs to you.
  • No telemetry. No subscription. No lock-in.
  • The source code is the product.
Get started

Your instance.
Your community.

Chatsune deploys with a single docker compose up -d. MongoDB, Redis, and the backend come up together. Add users, connect your first Ollama instance, and you are running.

Self-host for yourself, or run a private instance for a small group of people who trust each other with shared compute.

The homelab sidecar is a separate, lightweight container — bring your own GPU to the party from anywhere in the world.

Source on GitHub Issues welcome PRs considered
113 event topics · 768d local vectors · GPLv3 licence · 0 telemetry calls
# That's all it takes
git clone https://github.com/symphonic-navigator/chatsune.git
docker compose up -d
# Visit http://localhost:5173

✦   Built by Tidesson Communications  ·  GPLv3  ·  No surveillance. No compromise.