Documentation

Install, configure, and run LocalPilot

LocalPilot is packaged as a VS Code extension and depends on a reachable Ollama host for local model inference.

Install the VSIX

Install the packaged extension with the VS Code CLI:

code --install-extension localpilot-0.0.1.vsix
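
To confirm the install, list the registered extensions. The exact extension identifier may differ from the VSIX filename, so the grep pattern below is an assumption:

# Verify the extension is registered (identifier assumed to contain "localpilot")
code --list-extensions | grep -i localpilot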

Local development

Use Node.js 20 or newer. Install dependencies and compile the extension, then press F5 in VS Code to launch an Extension Development Host.

npm install
npm run compile
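
If you need to rebuild the VSIX yourself, the standard packaging tool for VS Code extensions is vsce; its use here is an assumption, as the project may ship its own packaging script:

# Package the compiled extension into a .vsix (vsce assumed, not project-specified)
npx @vscode/vsce package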

Ollama setup

Install Ollama from ollama.com/download, start the local server, and keep the default host unless you intentionally use a different endpoint.

ollama serve

# Default LocalPilot host
http://localhost:11434
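
To verify the host is reachable before launching VS Code, query Ollama's model-listing endpoint:

# Should return JSON describing the locally installed models
curl http://localhost:11434/api/tags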

Models

Recommended local model profiles

Profile  | Model               | Best for
---------|---------------------|----------------------------------------------------
Balanced | qwen2.5-coder:1.5b  | Responsive local autocomplete and chat.
Quality  | qwen2.5-coder:7b    | Stronger coding help on machines with more memory.
Micro    | smollm2:360m        | Very low-resource machines.
Lite     | deepseek-coder:1.3b | Older low-end setups.
Compact  | codegemma:2b        | Small general coding model.
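
Each model must be pulled into Ollama before LocalPilot can select it. For example, to fetch the Balanced and Quality profiles:

ollama pull qwen2.5-coder:1.5b
ollama pull qwen2.5-coder:7b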

Commands

VS Code command palette entries

LocalPilot exposes editor commands, setup commands, model selection, health checks, and chat access; a keybinding sketch follows the list.

  • LocalPilot: Open Chat
  • LocalPilot: Explain Selected Code
  • LocalPilot: Explain Error
  • LocalPilot: Add Comments to Selection
  • LocalPilot: Fix Selected Code
  • LocalPilot: Generate Tests
  • LocalPilot: Solve Coding Problem
  • LocalPilot: Check Ollama Status
  • LocalPilot: Select Local Model
  • LocalPilot: Switch Inline Completion Mode
  • LocalPilot: Open Status Menu
  • LocalPilot: Run Setup
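
Any of these commands can be bound to a shortcut in keybindings.json. The command ID below is hypothetical; the extension's real IDs are not documented here, so look them up in VS Code's Keyboard Shortcuts editor:

// keybindings.json entry — "localpilot.openChat" is a hypothetical command ID
[
  { "key": "ctrl+alt+l", "command": "localpilot.openChat" }
]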

Settings

Configuration reference

Setting                               | Default                | Description
--------------------------------------|------------------------|-------------------------------------------------------------------
localpilot.ollamaHost                 | http://localhost:11434 | Local Ollama REST API host.
localpilot.inlineModel                | qwen2.5-coder:1.5b     | Model used for inline code suggestions.
localpilot.chatModel                  | qwen2.5-coder:7b       | Model used for chat and larger coding tasks.
localpilot.lowRamModel                | qwen2.5-coder:0.5b     | Smaller model preferred in low-resource mode.
localpilot.mode                       | auto                   | Resource profile: auto, micro, lite, standard, or custom.
localpilot.enableInlineSuggestions    | true                   | Enables Copilot-style inline suggestions.
localpilot.inlineCompletionMode       | full                   | Full multiline autocomplete or conservative single-line suggestions.
localpilot.maxContextLines            | 80                     | Maximum number of nearby source lines included in prompts.
localpilot.maxOutputTokens            | 160                    | Maximum number of tokens requested from Ollama.
localpilot.temperature                | 0.2                    | Generation temperature for Ollama requests.
localpilot.inlineDebounceMs           | 250                    | Delay, in milliseconds, before an inline completion request is sent.
localpilot.disableInlineForLargeFiles | true                   | Skips inline suggestions for files at or above the size limit.
localpilot.maxFileSizeKb              | 500                    | Maximum file size, in kilobytes, that LocalPilot will inspect.
localpilot.enableLowRamWarnings       | true                   | Shows a warning when work is skipped to keep the editor responsive.
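
As a sketch, a settings.json profile for a low-memory machine might combine these options. The values are illustrative only, drawn from the tables above rather than tuned recommendations:

// settings.json — illustrative values, not official recommendations
{
  "localpilot.ollamaHost": "http://localhost:11434",
  "localpilot.mode": "lite",
  "localpilot.inlineModel": "deepseek-coder:1.3b",
  "localpilot.maxContextLines": 40,
  "localpilot.maxOutputTokens": 120
}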