Flare LLM

WASM-first LLM inference, running entirely in your browser.

Status

Initializing WASM module...

1. Load a model

Select a GGUF file. Try SmolLM2-135M Q8_0 (~138 MB).

Paste a direct URL to a GGUF file. Download progress shown in real time.

No GGUF file handy? Try SmolLM2-135M (~138 MB) — fills in model + tokenizer URLs from HuggingFace.

Connecting...
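As a sketch of what the URL path above involves, a loader can stream the download to report progress in real time, then check the file's 4-byte "GGUF" magic before handing bytes to the WASM module. The helper names here are hypothetical, not this app's actual API.

```typescript
// Stream a model file, reporting progress as chunks arrive.
// Hypothetical helpers for illustration; not this app's actual code.
async function downloadWithProgress(
  url: string,
  onProgress: (loaded: number, total: number) => void
): Promise<Uint8Array> {
  const res = await fetch(url);
  if (!res.ok || !res.body) throw new Error(`HTTP ${res.status}`);
  const total = Number(res.headers.get("Content-Length") ?? 0);
  const reader = res.body.getReader();
  const chunks: Uint8Array[] = [];
  let loaded = 0;
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value);
    loaded += value.length;
    onProgress(loaded, total); // drive the progress bar from here
  }
  const out = new Uint8Array(loaded);
  let offset = 0;
  for (const c of chunks) {
    out.set(c, offset);
    offset += c.length;
  }
  return out;
}

// GGUF files begin with the ASCII magic "GGUF" (0x47 0x47 0x55 0x46).
function isGguf(bytes: Uint8Array): boolean {
  return (
    bytes.length >= 4 &&
    bytes[0] === 0x47 && // 'G'
    bytes[1] === 0x47 && // 'G'
    bytes[2] === 0x55 && // 'U'
    bytes[3] === 0x46    // 'F'
  );
}
```

The magic check is cheap and catches the common failure mode of a URL that returns an HTML error page instead of the model file.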

2. Tokenizer not loaded

Load a HuggingFace tokenizer.json for decoded text output. Without it, raw token IDs are shown.
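To illustrate what the tokenizer buys you: decoding maps token IDs back to text via the vocabulary in tokenizer.json. A real HuggingFace tokenizer.json also carries merges, normalizers, and byte-level mappings; this toy vocabulary and the `buildDecoder` helper are simplified illustrations only.

```typescript
// Invert a tokenizer.json-style vocab (token -> id) into id -> token,
// falling back to a raw-ID placeholder for unknown IDs, as the UI does
// when no tokenizer is loaded. Toy sketch, not a full BPE decoder.
type Vocab = Record<string, number>;

function buildDecoder(vocab: Vocab): (ids: number[]) => string {
  const idToToken = new Map<number, string>();
  for (const [tok, id] of Object.entries(vocab)) idToToken.set(id, tok);
  return (ids) =>
    ids
      .map((id) => idToToken.get(id) ?? `<unk:${id}>`)
      .join("")
      // GPT-2-style byte-level tokenizers mark leading spaces with U+0120 "Ġ".
      .replace(/\u0120/g, " ");
}

// Hypothetical two-token vocab for demonstration.
const decode = buildDecoder({ "Hello": 0, "\u0120world": 1 });
```

With no vocabulary entry for an ID, the decoder falls back to showing the raw ID, mirroring the behavior described above.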

3. Chat

Wraps your prompt in the model's instruction format. Disable for raw completion.

System prompt (optional, applied on first turn)
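The wrapping described above could look like the following ChatML-style sketch, the instruction format used by many instruct-tuned models (including SmolLM2-Instruct). The exact template depends on the model, and `wrapChatML` is a hypothetical helper, not this app's actual code.

```typescript
// Wrap a user message (and optional first-turn system prompt) in a
// ChatML-style template, ending with an open assistant turn for the
// model to complete. Disable wrapping to send the prompt verbatim.
function wrapChatML(user: string, system?: string): string {
  let out = "";
  if (system) out += `<|im_start|>system\n${system}<|im_end|>\n`;
  out += `<|im_start|>user\n${user}<|im_end|>\n`;
  out += `<|im_start|>assistant\n`;
  return out;
}
```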
Sampling parameters
Temperature: 0.70
Top-p: 0.95
Top-k: 40
Repetition penalty: 1.10
Max new tokens: 128
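To show how the sampling values above interact (assuming they are the conventional set: temperature 0.70, top-p 0.95, top-k 40, repetition penalty 1.10, max tokens 128), here is a sketch of temperature scaling plus top-k filtering on raw logits. `topKProbs` is a hypothetical helper for illustration.

```typescript
// Scale logits by temperature (must be > 0), keep only the k largest,
// and softmax the survivors into a probability distribution to sample
// from. Ties at the k-th value are kept. Illustrative sketch only.
function topKProbs(logits: number[], temperature: number, k: number): number[] {
  const scaled = logits.map((l) => l / temperature);
  // Value of the k-th largest scaled logit; everything below is masked out.
  const kth = [...scaled].sort((a, b) => b - a)[Math.min(k, scaled.length) - 1];
  const masked = scaled.map((l) => (l >= kth ? l : -Infinity));
  // Numerically stable softmax over the surviving logits.
  const max = Math.max(...masked);
  const exps = masked.map((l) => Math.exp(l - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}
```

Lower temperature sharpens the distribution toward the top logit; a smaller k restricts sampling to fewer candidates.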
Your message