
AI Space

Your space. Your AI. Your device.

A privacy-first personal AI that lives entirely on your phone. No cloud. No account. No app store.

Private · Local · Offline

Built different.

Private by default

Your data never leaves your device. Zero cloud. Zero tracking.

Runs locally

AI model runs directly on your phone using WebGPU. No internet needed.

Device connected

iOS Shortcuts relay gives your AI eyes and hands on your phone.

Three steps. That's it.

1

Open this link

That's the install. No app store needed.

2

Choose your model

Downloads once, cached forever.

3

It's yours

Fully offline. Encrypted memory. Complete privacy.

Your rules.

Zero Trust

Local

Everything on device. No network calls. Ever.

Smart

Hybrid

Quick tasks local, heavy reasoning via cloud. You approve each call.

Max Power

Cloud

Full cloud AI. Fastest, smartest. Your choice.

See everything. Control everything.

Every API call logged. Every byte of data accounted for. Full audit trail, client-side encryption, and total transparency. You don't have to trust us — you can verify.

AI Space

Let's set up your space

How do you prefer to interact?

Chat
Text-based, type your thoughts
Talk
Voice-first, speak naturally

What should I call you?

Just checking...

How should I talk to you?

Casual
Relaxed, friendly
Balanced
Warm but clear
Professional
Precise, efficient
Playful
Fun, creative

Pick a voice

Choose your AI model

Pick a model to download. Smaller = faster download. You can always switch later in Settings.

AI Space STD
Checking connection...
Your space is ready. Try a question below, or speak with the mic.
Settings
Theme

Choose a color palette for your AI Space.

AI Avatar

Choose your AI assistant's personality and appearance.

Mode
Local
Everything runs on your device. Maximum privacy.
Hybrid
Local first, cloud assist for complex tasks. You approve each call.
Cloud
Use cloud AI. Faster, but data leaves your device.
Relay Hub Beta

What is a Relay?
A Relay is a prompt-based instruction artifact you send to your AI. It tells the AI how to format a command for a specific channel — iOS Shortcuts, your browser, or your device — so you can automate tasks across your apps without any servers.

How to use (3 steps) ▸
  1. Choose a Relay (iOS Shortcuts, Browser, or Device).
  2. Choose an Action (e.g. Summarize, Draft Reply, Morning Briefing).
  3. Paste your content, tap Build Relay Artifact. The prompt is loaded in chat — review it, then Send.

Tip: You can also type in chat, e.g. "build a shortcuts relay to summarize this text: ..."

Build Artifact → loads the prompt in chat for review. Build + Send → builds and sends immediately.
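As a concrete illustration, a relay artifact might be assembled like the minimal sketch below. The function name, channel list, and prompt template are assumptions for illustration, not the app's actual format:

```javascript
// Hypothetical relay-artifact builder. Names and template are illustrative
// assumptions -- the app's real artifact format may differ.
function buildRelayArtifact(channel, action, content) {
  const channels = ["iOS Shortcuts", "Browser", "Device"];
  if (!channels.includes(channel)) {
    throw new Error(`Unknown relay channel: ${channel}`);
  }
  // A relay is just a structured prompt: the AI expands it into a
  // channel-specific command, with no server round-trip involved.
  return [
    `RELAY CHANNEL: ${channel}`,
    `ACTION: ${action}`,
    "CONTENT:",
    content,
    `Format the response as a command for ${channel}.`,
  ].join("\n");
}
```

Building the artifact first rather than sending immediately mirrors the Build Artifact flow: the prompt lands in chat for review before anything runs.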
Workflow Studio Imported Skill

What is Workflow Studio?
Inspired by the bundled-skill architecture of the imported source, Workflow Studio turns a complex routine into a reusable local skill with steps, approval checkpoints, and a relay-safe manifest.

Draft Skill → saves an encrypted local manifest and loads the full Workflow Studio prompt into chat.
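A skill manifest could plausibly look like the sketch below, written as a plain object. Every field name here is an assumption for illustration; the real manifest is encrypted and its schema is not shown:

```javascript
// Hypothetical Workflow Studio skill manifest. Field names are
// illustrative assumptions, not the app's actual schema.
const skillManifest = {
  name: "morning-briefing",
  version: 1,
  steps: [
    { id: "fetch-news", relay: "Browser", action: "Summarize" },
    // An approval checkpoint: the user confirms before this step runs.
    { id: "draft-digest", relay: "Device", action: "Draft Reply", approval: true },
  ],
  relaySafe: true, // no step may bypass the relay review flow
};
```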
Background Runtime Beta

What is the Runtime?
A sandboxed mini-terminal that runs scripts inside a Web Worker — off the UI thread, locally on your device. Use it to ping URLs, fetch data, navigate tabs, or chain commands without cloud.

DSL command reference ▸
LOG <text> · Print a message to output
WAIT <ms> · Pause execution (max 60,000 ms)
RUN fetch <url> -> v · HTTP GET, store result in v
RUN json <url> -> v · Fetch & parse JSON into v
RUN echo <text> · Echo text (useful for testing)
RUN now · Print current ISO timestamp
NAVIGATE <url> · Request browser navigation
RETURN <text> · Return a string result & exit
RETURNJSON <json> · Return a JSON object & exit

Use {{var.path}} to interpolate stored values.
Example: LOG Status: {{api.status}}
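A minimal sketch of how such a runtime could interpret a script, covering only LOG, RUN echo, RETURN, and {{var.path}} interpolation. The real Worker-based runtime, and its handling of WAIT and the network commands, will differ:

```javascript
// Resolve {{var.path}} placeholders by walking dot-separated keys in vars.
function interpolate(text, vars) {
  return text.replace(/\{\{([\w.]+)\}\}/g, (_, path) =>
    String(path.split(".").reduce((obj, key) => (obj == null ? obj : obj[key]), vars))
  );
}

// Interpret a subset of the DSL line by line (LOG, RUN echo, RETURN only).
function runScript(source, vars = {}) {
  const output = [];
  for (const raw of source.split("\n")) {
    const line = interpolate(raw.trim(), vars);
    if (!line) continue;
    if (line.startsWith("LOG ")) output.push(line.slice(4));
    else if (line.startsWith("RUN echo ")) output.push(line.slice(9));
    else if (line.startsWith("RETURN ")) return { output, result: line.slice(7) };
  }
  return { output, result: null };
}

const { output, result } = runScript(
  "LOG Status: {{api.status}}\nRETURN done",
  { api: { status: 200 } }
);
// output: ["Status: 200"], result: "done"
```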

Local Internet Assist
Adds lightweight live context (Wikipedia search) to local responses. Stays optional and local-first.
Cloud API
⚡ TurboKV — Context Optimizer
Optimization Strategy
Standard
Direct token trimming. No transformation.
Sliding Window
Attention-sink + recency window.
Semantic
Importance-scored context selection.
Turbo
Max efficiency — bullet synopsis.
Direct token trimming. Best for short conversations. Zero overhead.
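The Sliding Window strategy can be sketched as follows, approximating token cost by word count. The function and parameter names are illustrative, not the app's internals:

```javascript
// Attention-sink + recency window: keep a few messages from the start of the
// conversation (the "sink") plus the newest messages that fit the budget,
// dropping the middle. Token cost is approximated by word count here.
function slidingWindow(messages, budget, sinkCount = 2) {
  const cost = (m) => m.content.split(/\s+/).length;
  const sink = messages.slice(0, sinkCount);
  let used = sink.reduce((sum, m) => sum + cost(m), 0);
  const recent = [];
  // Walk backwards from the newest message until the budget is spent.
  for (let i = messages.length - 1; i >= sinkCount; i--) {
    const c = cost(messages[i]);
    if (used + c > budget) break;
    recent.unshift(messages[i]);
    used += c;
  }
  return [...sink, ...recent];
}
```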
Context Window (WebGPU KV Cache)
Takes effect on next model load · larger = more GPU memory
Live Metrics
Tokens In
Tokens Out
0
Compressions
Tok/s
0%
Context Fill
Custom KV Script
Write a JS function body. Receives (messages, budget), must return an array.
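A custom script body might look like the sketch below. It is wrapped in a function here so it runs standalone (in the app you would paste only the body), and it approximates token cost by word count, which is an assumption:

```javascript
// Example custom KV script: keep every system message, plus the newest
// messages until the (word-count-approximated) budget runs out.
function customKV(messages, budget) {
  const keep = [];
  let used = 0;
  // Walk newest-to-oldest; system messages are always kept.
  for (let i = messages.length - 1; i >= 0; i--) {
    const msg = messages[i];
    const cost = msg.content.split(/\s+/).length;
    if (msg.role !== "system" && used + cost > budget) continue;
    keep.unshift(msg);
    used += cost;
  }
  return keep; // must return an array
}
```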
Compression Log (0 events)
No compressions yet. Start a long conversation.
Chat History
No conversations yet
Local Model
⚡ Tiny — Ultra Fast
SmolLM2 360M · 200 MB
Fastest option for lightweight tasks.
fast · lightweight
TinyLlama 1.1B · 640 MB
Ultra-fast chat at 1.1B parameters.
fast · chat
◈ Small — Balanced
Qwen 2.5 0.5B · 350 MB
Ultra-fast balance for everyday use.
fast · multilingual
Llama 3.2 1B · 700 MB · Recommended
Best quality local reasoning. Recommended default.
reasoning · 8K ctx
Qwen 2.5 1.5B · 900 MB
Better quality than 0.5B, lighter than Llama 1B.
multilingual · balanced
Gemma 2 2B · 1.4 GB
Google Gemma 2 at 2B. Strong instruction following.
google · instruction
◉ Medium — High Quality
Llama 3.2 3B · 2.0 GB
Significantly smarter than 1B. Best quality/size tradeoff.
reasoning · quality
Phi 3.5 Mini 3.8B · 2.2 GB
Microsoft Phi-3.5. Excellent reasoning. 16K context.
microsoft · 16K ctx
🔥 Large — Needs 6 GB+ GPU RAM
Mistral 7B v0.3 · 4.1 GB
Classic Mistral 7B. Excellent code & instruction following.
code · 32K ctx
Llama 3.1 8B · 5.0 GB
Meta Llama 3.1 8B. Top-tier local quality.
quality · reasoning
DeepSeek-R1 7B · 4.4 GB
R1 reasoning distillation. Chain-of-thought, math, logic.
reasoning · math · chain-of-thought
Runs entirely on your device via WebGPU
Trust Dashboard
Data Location
On device only
Active Model
Loading...
Cloud API Calls
0
Conversations Stored
0
Voice
Data
Export All Data
Clear All Data
AI Space v0.2.0
Privacy-first personal AI
Your data never leaves your device in local mode.
Live Activity
Idle