Setting Up a Local Chat AI with Ollama and Open Web UI

As a software developer and architect, I’m always excited to explore new technologies that can revolutionize the way we interact with computers. AI is taking the technology world by storm, and for good reason, it can be a very powerful tool. Sometimes however, using a public service like ChatGPT or Microsoft’s Co-Pilot doesn’t work for a number of reasons (usually privacy related).

In this article, I’ll guide you through setting up a chat AI using Ollama and Open Web UI that you can quickly and easily run on your local, Windows based machine . We’ll use Docker Compose to spin up the environment, and I’ll walk you through the initial launch of the web UI, configuring models in settings, and generally getting things up and running.

Prerequisites

Before we dive into the setup process, make sure you have:

  • Docker installed on your machine (you can download it from the official Docker website)
  • A basic understanding of Docker Compose and its syntax. (Not necessarily required, but helpful if you run into issues or want to tweak things)
  • A compatible graphics card (GPU) to run Ollama efficiently. While this is not strictly required, your experience will not be very good without one. My example is configured to use an Nvidia graphics card.

Step 1: Create a Docker Compose File

Create a new file named docker-compose.yml in a directory of your choice. Copy the following content into this file:

services:
  openWebUI:
    image: ghcr.io/open-webui/open-webui:main
    restart: unless-stopped
    ports:
      - "3000:8080"
    environment:
      OLLAMA_BASE_URL: http://host.docker.internal:11434
    extra_hosts:
      - "host.docker.internal:host-gateway"
    volumes:
      - c:\tmp\owu:/app/backend/data

  ollama:
    image: ollama/ollama:latest    
    environment:
      NVIDIA_VISIBLE_DEVICES: all
    deploy:
      resources:
        reservations:
          devices:
            - capabilities: ["gpu"]
              driver: nvidia
              count: all    
    restart: unless-stopped
    ports:
      - "11434:11434"
    volumes:
      - c:\tmp\ollama:/root/.ollama

This Docker Compose file defines two services:

  • openWebUI: runs the Open Web UI container with the necessary environment variables and port mappings.
  • ollama: runs the Ollama container with NVIDIA GPU support, exposes a port for communication between the containers, and mounts a volume to store model data.
A couple things to note here: 
Lines 12 and 29 are mapping local directories into the containers as volumes. That allows the data from your sessions to persist, even if the docker images are restarted or your machine is rebooted. C:\tmp\ollama and C:\tmp\owu can be changed to any empty directory you choose, but remember that in the following steps if you choose to change them.

Lines 16-24 configure the container to take advantage of your Nvidia GPU. If you don't have one, or don't want to use it, you can remove these lines and everything should still work, albeit much slower. If you have another GPU, this is where you will want to look into making changes to use your GPU - In particular lines 17 & 23 will likely need to change.

Lines 4 & 25 configure the containers to automatically restart unless they were manually stopped. That means they should restart if you reboot your machine, they crash, or you update docker and it restarts.

Step 2: Create Data Directories

As mentioned above, the directories C:\tmp\ollama and C:\tmp\owu will be mapped into the running containers and used for data storage. You will want to create these directories ahead of launching the containers to avoid any potential issues.

Step 3: Launch the Environment

Open your terminal or command prompt and navigate to the directory where you saved the docker-compose.yml file. Run the following command:

docker-compose up -d

This will start both containers in detached mode (i.e., they’ll run in the background).

Step 4: Launch the Web UI

Once the environment is set up, navigate to http://localhost:3000 in your web browser. This should open the Open Web UI interface.

You should be presented with a screen to log in, but you won’t have an account yet. Just click the “Don’t have an account? Sign up” link under the “Sign in” button. Since this will be the first account, it will automatically become the administrator. Simply enter your name, email address and a password and create your account.

Once you account is created you will be logged in and should now she the Chat Interface, which should look pretty familiar if you have been using ChatGPT.

Step 5: Setting/Adding up Models

Before you can start chatting it up with your new application, you’ll need to install some models to use. To get started I would install the llama3.1 model. To do this click on your name in the lower left corner of the UI, select “Settings” and then in the dialog, select “Admin Settings” on the left.

Now select “Models” in the Admin Panel and enter llama3.1 in the “Pull a model from Ollama.com” field and click the download button on the right of the field. (You can see a list of available models on Ollama.com: https://ollama.com/library)

You should see a small progress bar appear. Wait for the progress bar to get to 100%, then it will verify the hash and eventually you should see a green pop-up notifying you that it has successfully been added. Now you can click “New Chat” on the far top-left and select llama3.1 from the “Select a model” dropdown.

Next Steps

At this point you should now have a functioning Chat AI interface!

Going forward, I’ll be playing with this configuration and attempting to add on more functionality and potentially convert the docker compose above into kubernetes manifests so that I can run my service on a local kind cluster.

Resources

Source Code

https://github.com/DotNet-Ninja/DotNetNinja.Docker.Ollama

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.