Copilot Chat BYOK (Ollama local) fails with copilotLanguageModelWrapper 404 while Ollama OpenAI endpoint works #186374
That 404 from the wrapper is almost certainly Ollama telling Copilot it was asked for a route it doesn't serve. A few things to try:
• One of the weirdest things about these pre-release versions of Copilot Chat is how they handle URLs: some versions of the extension automatically append the API path themselves. Try setting your endpoint to just the base URL (http://127.0.0.1:11434, without the /v1 suffix) and see if the error changes.
• Local models through Ollama usually only work in "Ask" mode. If you're trying to use "Agent" mode (the one where it can search your files or run commands), it often breaks because those local models don't always support the specific tools and functions the Copilot agent expects. When the wrapper tries to call a tool-specific endpoint that doesn't exist on your local model, it kicks back that 404. Stick to standard chat and see if the error clears up.
• Since you're on the Insiders build, the internal language model provider can sometimes get stuck in a weird state where it thinks it needs to route through GitHub's proxy instead of your local machine. You can force a refresh by signing out of your GitHub account via the accounts icon at the bottom left, then signing back in.
• Make sure the model name in your VS Code settings is a 100% match for what you see when you run ollama list, tag included (e.g. llama3.2:latest, not just llama3.2).
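If you want to sanity-check those last two points from the terminal, something like this works (the model tag is just an example; substitute whatever you actually pulled):

```bash
# Confirm the exact model tags Ollama knows about — the name Copilot uses must match one of these, tag included
ollama list

# Hitting the bare base URL (no /v1) should answer "Ollama is running";
# if the extension appends the API path itself, this is the endpoint value to try
curl -sS http://127.0.0.1:11434/
```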
Select Topic Area
Bug
Copilot Feature Area
VS Code
Body
Summary
Copilot Chat fails with 404 page not found coming from copilotLanguageModelWrapper when using a local Ollama model. The same model works via Ollama’s OpenAI-compatible API (/v1/chat/completions) and also works from VS Code AI Toolkit Playground (“Local via Ollama”).
Environment
• OS: macOS (Apple Silicon)
• VS Code: 1.109.0-insider
• GitHub Copilot Chat extension: 0.37.2026020406 (pre-release)
• GitHub Copilot extension: (version from Extensions view)
• Ollama: 0.15.5-rc2 (installed from DMG, not brew)
• Ollama endpoint: http://127.0.0.1:11434/v1
• Models present in Ollama:
• llama3.2:latest
• qwen2.5-coder:latest
• kimi-k2.5:cloud
Steps to reproduce
1. Run Ollama locally and confirm the OpenAI-compatible endpoint is reachable:
• curl http://127.0.0.1:11434/v1/models returns the model list (200 OK).
2. In VS Code Insiders, configure a local model via Ollama (shows as “Local via Ollama”).
3. Select that model in Copilot Chat (Ask mode).
4. Send any message (e.g., “test” / “hello”).
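One way to see which URL the wrapper actually requests during step 4 is to tail Ollama's server log while sending the message (the path below is the usual location for a macOS app/DMG install; adjust if yours differs):

```bash
# Follow Ollama's request log, then send a message from Copilot Chat (step 4).
# Each incoming request is logged with its HTTP status, so the 404 entry shows
# the exact path the copilotLanguageModelWrapper tried to call.
tail -f ~/.ollama/logs/server.log
```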
Expected behavior
Copilot Chat should call the local model through the configured Ollama OpenAI-compatible endpoint and return a response.
Actual behavior
Copilot Chat fails immediately with:
• UI: “Sorry, your request failed. Please try again.”
• Logs show 404 page not found from copilotLanguageModelWrapper / _provideLanguageModelResponse. Example:
```
[error] Server error: 404 404 page not found
[info] ... | notFound | ... | [copilotLanguageModelWrapper]
```
The same request works outside Copilot Chat:
• Models list: curl -sS http://127.0.0.1:11434/v1/models → 200 OK, returns the model list.
• Chat completion: curl -sS http://127.0.0.1:11434/v1/chat/completions -H "Content-Type: application/json" -d '{"model":"llama3.2:latest","messages":[{"role":"user","content":"say hello"}]}' → returns a normal completion JSON (200 OK).
Also, AI Toolkit Playground can successfully chat with “Local via Ollama” using the same model, but Copilot Chat fails with the 404 wrapper error.
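For comparison, the exact error string from the Copilot logs can be reproduced by asking Ollama for any route it does not serve, which suggests the wrapper is requesting a path the local server doesn't implement (the path below is deliberately bogus, purely to trigger the unknown-route response):

```bash
# Request a path Ollama does not serve: the response body is the same
# "404 page not found" text that shows up in the Copilot Chat log.
curl -i -sS http://127.0.0.1:11434/v1/does-not-exist
```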
Additional notes
• This looks like a routing/compatibility regression in Copilot Chat BYOK / language model access, since Ollama responds correctly to OpenAI-compatible endpoints.
• Happy to provide full logs / screenshots if needed.
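If it helps narrow things down, a quick probe shows which chat-style routes this Ollama build serves. The path list below is only a guess at URLs an OpenAI-style client might call, not a claim about what Copilot Chat uses internally; any non-404 status means the route exists (a 400 just means the test payload didn't match that route's schema):

```bash
# Probe candidate POST routes on the local Ollama server and print their HTTP status.
# 404 = route not served; anything else (200/400/...) = route exists.
for p in /v1/chat/completions /v1/completions /v1/responses /api/chat /api/generate; do
  code=$(curl -s -o /dev/null -w '%{http_code}' \
    -X POST "http://127.0.0.1:11434$p" \
    -H "Content-Type: application/json" \
    -d '{"model":"llama3.2:latest","messages":[{"role":"user","content":"ping"}],"stream":false}')
  echo "$code  $p"
done
```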