Your Private AI Assistant
On-Device. Network-Shared.

Run language models entirely on your device, with vision and document processing built in. Access your models from any device on your Wi-Fi network.

Powerful AI, Complete Privacy

Run large language models entirely on your device, with multimodal capabilities and local network sharing.

On-Device AI Processing

Run large language models completely on your device using llama.cpp. Zero data sent to cloud servers.

Local Network Server

Share your AI assistant across devices on your Wi-Fi network. Access it from tablets, laptops, or desktops via simple REST APIs, with QR code setup.
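As a sketch of what talking to a network-shared model server looks like, the snippet below builds a chat request against an assumed OpenAI-style endpoint. The host, port, and `/v1/chat/completions` path are illustrative assumptions, not InferrLM's documented API.

```python
# Hypothetical sketch: querying a local-network LLM server over REST.
# The address and endpoint path are assumptions for illustration only.
import json
from urllib import request


def build_chat_request(host: str, prompt: str) -> request.Request:
    """Build a POST chat-completion request (OpenAI-style path assumed)."""
    body = json.dumps({
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }).encode("utf-8")
    return request.Request(
        f"http://{host}/v1/chat/completions",  # assumed endpoint path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request("192.168.1.42:8080", "Summarize this page.")
# Sending is left to the caller, e.g.:
#   with request.urlopen(req) as resp:
#       print(json.load(resp))
```

Because the server lives on your Wi-Fi network, any device that can reach that address can send requests like this one.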

Vision & Multimodal AI

Analyze images with vision-capable models like SmolVLM2. Built-in camera support, OCR text extraction, and multimodal projector integration.

Document Processing & RAG

Extract text from PDFs and images with local OCR. Use Retrieval-Augmented Generation (RAG) for context-aware responses grounded in your documents.
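To make the RAG idea concrete, here is a minimal sketch of the retrieval step: score document chunks against a question and prepend the best match to the prompt. Production systems (presumably including InferrLM) retrieve with embeddings; this bag-of-words scoring is only an illustration, and the sample chunks are invented.

```python
# Minimal RAG retrieval sketch: rank chunks by word overlap with the
# question, then build a context-augmented prompt. Illustration only.
def retrieve(chunks: list[str], question: str, k: int = 1) -> list[str]:
    """Return the k chunks sharing the most words with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(
        chunks,
        key=lambda c: len(q_words & set(c.lower().split())),
        reverse=True,
    )
    return ranked[:k]


chunks = [
    "Invoices are due within 30 days of receipt.",
    "The warranty covers manufacturing defects for two years.",
]
context = retrieve(chunks, "When are invoices due?")[0]
prompt = f"Context: {context}\n\nQuestion: When are invoices due?"
```

The model then answers from the retrieved context rather than from memory alone, which is what makes the responses document-aware.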

Remote Model Providers

Support for local models plus remote providers: Gemini, ChatGPT, DeepSeek, and Claude. Switch seamlessly between them.

Optimized Performance

Background model downloads, auto-start server, persistent connections, and battery-efficient operation with wake lock support.

Experience InferrLM Today

Download InferrLM and run language models completely on your device. Share across your network, analyze images, process documents.

Get it on Google Play
Get it on App Store