Documentation

Install, configure, and run LocalPilot

LocalPilot is packaged as a VS Code extension and depends on a reachable Ollama host for local model inference.

Install the VSIX

Install the packaged extension with the VS Code CLI:

code --install-extension localpilot-0.0.1.vsix
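
To confirm the install, list the registered extensions. The exact extension identifier may differ from the VSIX filename, so the grep pattern below is an assumption:

# Verify the extension is registered (identifier assumed to contain "localpilot")
code --list-extensions | grep -i localpilot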

Local development

Use Node.js 20 or newer. Install dependencies and compile the extension, then press F5 in VS Code to launch an Extension Development Host.

npm install
npm run compile
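
If you need to rebuild the VSIX yourself, the standard packaging tool for VS Code extensions is vsce; its use here is an assumption, as the project may ship its own packaging script:

# Package the compiled extension into a .vsix (vsce assumed, not project-specified)
npx @vscode/vsce package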

Ollama setup

Install Ollama from ollama.com/download, start the local server, and keep the default host unless you intentionally use a different endpoint.

ollama serve

# Default LocalPilot host
http://localhost:11434
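
To verify the host is reachable before launching VS Code, query Ollama's model-listing endpoint:

# Should return JSON describing the locally installed models
curl http://localhost:11434/api/tags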

Models

Recommended local model profiles

Profile  | Model               | Best for
---------|---------------------|----------------------------------------------------
Balanced | qwen2.5-coder:1.5b  | Responsive local autocomplete and chat.
Quality  | qwen2.5-coder:7b    | Stronger coding help on machines with more memory.
Micro    | smollm2:360m        | Very low-resource machines.
Lite     | deepseek-coder:1.3b | Older low-end setups.
Compact  | codegemma:2b        | Small general coding model.
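
Each model must be pulled into Ollama before LocalPilot can select it. For example, to fetch the Balanced and Quality profiles:

ollama pull qwen2.5-coder:1.5b
ollama pull qwen2.5-coder:7b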

Commands

VS Code command palette entries

LocalPilot exposes editor commands, setup commands, model selection, health checks, and chat access; a keybinding sketch follows the list.

  • LocalPilot: Open Chat
  • LocalPilot: Explain Selected Code
  • LocalPilot: Explain Error
  • LocalPilot: Add Comments to Selection
  • LocalPilot: Fix Selected Code
  • LocalPilot: Generate Tests
  • LocalPilot: Solve Coding Problem
  • LocalPilot: Check Ollama Status
  • LocalPilot: Select Local Model
  • LocalPilot: Switch Inline Completion Mode
  • LocalPilot: Open Status Menu
  • LocalPilot: Run Setup
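
Any of these commands can be bound to a shortcut in keybindings.json. The command ID below is hypothetical; the extension's real IDs are not documented here, so look them up in VS Code's Keyboard Shortcuts editor:

// keybindings.json entry — "localpilot.openChat" is a hypothetical command ID
[
  { "key": "ctrl+alt+l", "command": "localpilot.openChat" }
]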

Settings

Configuration reference

Setting                               | Default                | Description
--------------------------------------|------------------------|-------------------------------------------------------------------
localpilot.ollamaHost                 | http://localhost:11434 | Local Ollama REST API host.
localpilot.inlineModel                | qwen2.5-coder:1.5b     | Model used for inline code suggestions.
localpilot.chatModel                  | qwen2.5-coder:7b       | Model used for chat and larger coding tasks.
localpilot.lowRamModel                | qwen2.5-coder:0.5b     | Smaller model preferred in low-resource mode.
localpilot.mode                       | auto                   | Resource profile: auto, micro, lite, standard, or custom.
localpilot.enableInlineSuggestions    | true                   | Enables Copilot-style inline suggestions.
localpilot.inlineCompletionMode       | full                   | Full multiline autocomplete or conservative single-line suggestions.
localpilot.maxContextLines            | 80                     | Maximum number of nearby source lines included in prompts.
localpilot.maxOutputTokens            | 160                    | Maximum number of tokens requested from Ollama.
localpilot.temperature                | 0.2                    | Generation temperature for Ollama requests.
localpilot.inlineDebounceMs           | 250                    | Delay, in milliseconds, before an inline completion request is sent.
localpilot.disableInlineForLargeFiles | true                   | Skips inline suggestions for files at or above the size limit.
localpilot.maxFileSizeKb              | 500                    | Maximum file size, in kilobytes, that LocalPilot will inspect.
localpilot.enableLowRamWarnings       | true                   | Shows a warning when work is skipped to keep the editor responsive.
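
As a sketch, a settings.json profile for a low-memory machine might combine these options. The values are illustrative only, drawn from the tables above rather than tuned recommendations:

// settings.json — illustrative values, not official recommendations
{
  "localpilot.ollamaHost": "http://localhost:11434",
  "localpilot.mode": "lite",
  "localpilot.inlineModel": "deepseek-coder:1.3b",
  "localpilot.maxContextLines": 40,
  "localpilot.maxOutputTokens": 120
}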