newparad1gm/wsl_ollama

Running Ollama in Windows 11 with WSL2 on an NVIDIA GPU

BIOS/UEFI VM Features enabled

  • SVM Mode for AMD CPUs
  • Intel Virtualization Technology (VT-x; some firmware also lists VT-d) for Intel CPUs

Windows Features

  • Search for Turn Windows Features on or off in Windows Search
  • Enable Virtual Machine Platform and Windows Subsystem for Linux

NVIDIA Drivers

  • Install the current NVIDIA driver on Windows (WSL2 GPU support uses the Windows host driver)

Powershell and WSL2

  • Start PowerShell as administrator
  • Install Ubuntu on WSL2
    • wsl --install -d ubuntu
  • Verify it is running under WSL version 2
    • wsl -l -v

Ubuntu NVIDIA Setup

  • Verify the GPU is visible inside Ubuntu with nvidia-smi (do not install a Linux NVIDIA driver; WSL2 passes through the Windows one)

Ollama Setup and Run

  • Ollama setup
    • curl https://ollama.ai/install.sh | sh
  • Serve Ollama for network access
    • OLLAMA_HOST=0.0.0.0:11434 ollama serve
  • If port 11434 is already in use, stop the running Ollama instance first
    • sudo lsof -i :11434
    • kill -9 <PID>
    • If Ollama comes back up by itself, systemd is restarting it (see Turn off Ollama below)
  • Run a model
    • ollama run <model name>
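With the server listening on 0.0.0.0:11434, any machine on the network can call Ollama's HTTP API directly. A minimal sketch of the request body for the /api/generate endpoint — the model name here is a placeholder for whatever model you have pulled:

```python
import json

# Placeholder model name; substitute one you have pulled (e.g. via `ollama pull`)
payload = {
    "model": "qwen3:32b",
    "prompt": "Why is the sky blue?",
    "stream": False,  # ask for a single JSON response instead of a token stream
}

# POST this to http://<server ip>:11434/api/generate, e.g. with curl -d
print(json.dumps(payload))
```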

Network Communication with Ollama

  • In the Ubuntu terminal, get the IP of the WSL2 distribution
    • hostname -I
  • Run ipconfig in PowerShell or Terminal on Windows
    • Look for the IPv4 Address, something like 192.168.x.x: the host machine's IP address on the network
  • Set up a firewall rule; either of the two commands will work
    • New-NetFirewallRule -DisplayName "Ollama Port 11434" -Direction Inbound -Action Allow -Protocol TCP -LocalPort 11434
    • netsh advfirewall firewall add rule name="Ollama Port 11434" dir=in action=allow protocol=TCP localport=11434
  • Forward traffic from the host IP to the WSL2 IP
    • netsh interface portproxy add v4tov4 listenaddress=0.0.0.0 listenport=11434 connectaddress=<WSL2 IP from hostname -I> connectport=11434
  • On another machine, run curl http://192.168.x.x:11434/api/tags
    • It should list the models currently deployed in Ollama
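The /api/tags response is JSON with a models array. A small sketch of pulling the model names out of it — the sample payload below is illustrative, not taken from a live server:

```python
import json

# Illustrative /api/tags response; a live server returns more fields per model
sample = '{"models": [{"name": "qwen3:32b"}, {"name": "llama3:8b"}]}'

def model_names(tags_json: str) -> list[str]:
    """Extract the model names from an Ollama /api/tags response body."""
    return [m["name"] for m in json.loads(tags_json)["models"]]

print(model_names(sample))  # the models currently deployed in Ollama
```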

Open WebUI

  • A ChatGPT-like front end for the LLM: https://github.com/open-webui/open-webui
  • Install Docker on another machine that can reach the Ollama server
  • Pull the latest Open WebUI image
    • docker pull ghcr.io/open-webui/open-webui:main
  • Run the Open WebUI container
    • docker run -d -p 3000:8080 -e OLLAMA_BASE_URL=http://192.168.x.x:11434 -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
  • Browse to localhost:3000, or to this machine's IP on port 3000 from other machines on the network

Turn off Ollama

  • ollama serve: press CTRL+C to exit
  • ollama run: type /bye to the model, or press CTRL+C
  • Ollama may be running as a systemd service that restarts it whenever it is closed
    • sudo systemctl stop ollama
    • sudo systemctl disable ollama
  • Exiting the Ubuntu terminal should stop the VM if no other processes are running
  • Run wsl -l -v in PowerShell to make sure Ubuntu is in the Stopped state

MCP Testing

  • Install the MCP Python SDK with the required Python version: pip install mcp
  • python mcp_server.py to start the MCP server
  • python mcp_client.py to run the MCP client against the server
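Under the hood, the MCP client and server exchange JSON-RPC 2.0 messages; the first message the client sends is an initialize request. A sketch of that message — the protocol version and clientInfo values here are illustrative, and the SDK builds this for you:

```python
import json

# Illustrative MCP initialize request (JSON-RPC 2.0); values are placeholders
initialize = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2024-11-05",  # example spec revision
        "capabilities": {},
        "clientInfo": {"name": "mcp_client", "version": "0.1"},
    },
}

print(json.dumps(initialize))
```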

Continue.dev VSCode Integration

  • Install the Continue.dev plugin for VSCode
  • Open config.yaml in ~/.continue (or wherever the YAML file is; it can be opened with Ctrl+Shift+P -> Continue: Open Settings)
  • In the models section of the YAML
    • models:
        - name: Qwen3 32B
          provider: ollama
          model: qwen3:32b # name of the deployed model
          roles:
            - chat
            - edit
            - apply
          apiBase: http://192.168.x.x:11434
