Ollama LLM Engine Docker Setup

June 16, 2024

This conversation covers setting up the Ollama LLM engine in a Docker container: creating a Docker Compose file, starting the containers, and accessing the application. It also notes that 7B models need approximately 8 GB of RAM.

Mike Nichols

Pair Programmer at Olympia

use realtime search to find out what LLM engine i can run in a docker container and instructions for doing so

Obie Fernandez about 1 year ago

To run an LLM engine in a Docker container, you can use Ollama. Here are the steps to set up Ollama in a Docker container:

Prerequisites

Ensure you have Docker installed on your system.
For NVIDIA GPU acceleration on Linux, install the NVIDIA Container Toolkit.
For NVIDIA GPU acceleration on Windows 10/11, install the latest NVIDIA driver and use the WSL2 backend.
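The prerequisites above can be checked with a small script before going further. This is a minimal sketch, not an official check: the helper names `missing_tools` and `report` are made up for illustration, and finding `nvidia-smi` on PATH only suggests that an NVIDIA driver is installed, not that the NVIDIA Container Toolkit is configured.

```python
import shutil


def missing_tools(tools):
    """Return the names from `tools` that are not found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


def report(required=("docker",), optional=("nvidia-smi",)):
    """Build human-readable status lines for required and optional tools."""
    lines = []
    for name in missing_tools(required):
        lines.append(f"missing required tool: {name}")
    for name in missing_tools(optional):
        # Assumption: nvidia-smi is used here only as a rough hint of GPU support.
        lines.append(f"optional tool not found (GPU support may be unavailable): {name}")
    return lines or ["all checked tools are on PATH"]
```

Printing each line of `report()` gives a quick overview; an empty result from `missing_tools(["docker"])` means the Docker CLI is at least installed.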

Create a Docker Compose File

Create a compose.yaml file with the following content:

services:
  server:
    build: .
    ports:
      - 8000:8000
    env_file:
      - .env
    depends_on:
      database:
        condition: service_healthy
  database:
    image: neo4j:5.11
    ports:
      - 7474:7474
      - 7687:7687
    environment:
      - NEO4J_AUTH=${NEO4J_USERNAME}/${NEO4J_PASSWORD}
    healthcheck:
      test: ["CMD-SHELL", "wget --no-verbose --tries=1 --spider localhost:7474 || exit 1"]
      interval: 5s
      timeout: 3s
      retries: 5
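  ollama:
    # Official Ollama image; 11434 is Ollama's default API port
    image: ollama/ollama
    ports:
      - 11434:11434
    volumes:
      # Persist downloaded models across container restarts
      - ollama_data:/root/.ollama
volumes:
  ollama_data: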

Run the Docker Container

Build the images and start all services with the following command:

docker compose up --build

Access the Application

Open a browser and navigate to http://localhost:8000 to access the application.

For more detailed instructions and additional setup options, refer to the Docker documentation on using containers for generative AI development.
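Once an Ollama container is running, a prompt can be sent to its HTTP API. The sketch below uses only the Python standard library and Ollama's `/api/generate` endpoint. It assumes the API is reachable at Ollama's default port 11434 on localhost, which may differ from the port mapping in your compose file, so the base URL is a parameter. The function names are illustrative.

```python
import json
import urllib.request

# Assumption: Ollama's default API port; adjust to match your port mapping.
OLLAMA_URL = "http://localhost:11434"


def build_generate_request(model, prompt, base_url=OLLAMA_URL):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model, prompt, base_url=OLLAMA_URL, timeout=120.0):
    """Send the prompt and return the generated text from the JSON reply."""
    request = build_generate_request(model, prompt, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        body = json.loads(reply.read().decode("utf-8"))
    return body["response"]
```

A call such as `generate("llama3", "Say hello in one sentence.")` should return the model's text, where `llama3` is only a placeholder model name. The model has to be downloaded first, for example with `docker compose exec ollama ollama pull llama3`, assuming the service is named `ollama`.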

Mike Nichols about 1 year ago

Function Calls

Used by the assistant while replying.

realtime_search(What LLM engine can I run in a Docker container and how do I set it up?)

To run an LLM engine in a Docker container, you can use Ollama or OpenAI. Here are the steps to set up Ollama in a Docker container:

1. **Prerequisites**:
   - Ensure you have Docker installed on your system.
   - For Linux, install the NVIDIA Container Toolkit.
   - For Windows 10/11, install the latest NVIDIA driver and use the WSL2 backend.

2. **Create a Docker Compose File**:
   - Create a `compose.yaml` file with the following content:

   ```yaml
   services:
     server:
       build: .
       ports:
         - 8000:8000
       env_file:
         - .env
       depends_on:
         database:
           condition: service_healthy
     database:
       image: neo4j:5.11
       ports:
         - 7474:7474
         - 7687:7687
       environment:
         - NEO4J_AUTH=${NEO4J_USERNAME}/${NEO4J_PASSWORD}
       healthcheck:
         test: ["CMD-SHELL", "wget --no-verbose --tries=1 --spider localhost:7474 || exit 1"]
         interval: 5s
         timeout: 3s
         retries: 5
   ```

3. **Add the Ollama Service**:
   - Update the `compose.yaml` file to include the Ollama service:

   ```yaml
   services:
     server:
       build: .
       ports:
         - 8000:8000
       env_file:
         - .env
       depends_on:
         database:
           condition: service_healthy
     database:
       image: neo4j:5.11
       ports:
         - 7474:7474
         - 7687:7687
       environment:
         - NEO4J_AUTH=${NEO4J_USERNAME}/${NEO4J_PASSWORD}
       healthcheck:
         test: ["CMD-SHELL", "wget --no-verbose --tries=1 --spider localhost:7474 || exit 1"]
         interval: 5s
         timeout: 3s
         retries: 5
     ollama:
       build: .
       ports:
         - 7860:7860
       env_file:
         - .env
       depends_on:
         server:
           condition: service_started
   ```

4. **Run the Docker Container**:
   - Run the Docker container using the following command:

   ```bash
   docker compose up --build
   ```

5. **Access the Application**:
   - Open a browser and navigate to `http://localhost:8000` to access the application.

For more detailed instructions and additional setup options, refer to the Docker documentation on using containers for generative AI development.

rtsearch: what is the smallest model that can be loaded into ollama (in terms of memory reqs)

Obie Fernandez about 1 year ago

Ollama does not have a 7B minimum. Its documentation recommends at least 8 GB of RAM for running 7B models, and the model library also includes smaller models (roughly 1B to 3B parameters) that run in less memory.
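The 8 GB figure for a 7B model can be put in context with a back-of-the-envelope estimate of weight size: parameter count times bits per weight, divided by 8 to get bytes. This is a rough sketch under stated assumptions, not Ollama's own sizing method. It counts only the weights, takes 1 GB as 1e9 bytes, and uses illustrative bit widths of 4, 8, and 16. Real memory use is higher because of the context cache and runtime overhead.

```python
def estimate_weight_gb(num_params, bits_per_weight):
    """Approximate size of the model weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Illustrative model sizes and quantization widths (assumptions, not Ollama defaults).
for label, params in [("1B", 1e9), ("3B", 3e9), ("7B", 7e9)]:
    sizes = ", ".join(
        f"{bits}-bit: {estimate_weight_gb(params, bits):.1f} GB" for bits in (4, 8, 16)
    )
    print(f"{label} -> {sizes}")
```

By this estimate, the weights of a 7B model at 4 bits per weight come to about 3.5 GB, which fits within 8 GB of RAM with headroom, while a 1B model at the same width needs about 0.5 GB for its weights.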

Mike Nichols about 1 year ago
