
With the recent (but not unexpected) discovery that OpenAI never deleted your chats, and with courts now demanding chat histories for upcoming legal cases… I figured this was a good time to write an article for people who want to explore large language models (LLMs) like LLaMA, Mistral, or Gemma without relying on cloud services or APIs 😬.
This guide is a quick-and-dirty walkthrough of how to get started running LLMs locally using Ollama and how to give them a sleek, user-friendly frontend using OpenWebUI. It's written for Mac users, but the Windows steps should be similar.
Prerequisites
Hardware
- Apple Silicon (M1/M2/M3) or Intel Mac with macOS 12+
- At least 8GB of RAM (16GB+ recommended)
- 10–20GB of free disk space (keep an eye on the size of the LLMs you download!)
Software
- Homebrew
- Docker Desktop or Podman
- Terminal (macOS Terminal or iTerm2)
- (Optional) Visual Studio Code
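If you like installing things through Homebrew, all of these are available as casks. Pick whichever container runtime you prefer; cask names can change over time, so run brew search docker (or similar) if one of these doesn't resolve:
brew install --cask docker          # Docker Desktop (may be named docker-desktop on newer Homebrew)
brew install --cask podman-desktop  # Podman Desktop, if you'd rather avoid Docker
brew install --cask iterm2
brew install --cask visual-studio-code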
Step 1: Install Ollama
Ollama makes it easy to run LLMs on your local machine. It's a CLI plus runtime environment that manages downloading, running, and interacting with LLMs locally, and it supports a large and growing list of open models.
Install via Homebrew
brew install ollama
Verify the installation:
ollama --version
Note: Ollama relies on a background server. If commands complain that they can't connect, start it with ollama serve (or brew services start ollama). You can use ollama list to see installed models.
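The server listens locally on port 11434. If you want a quick sanity check that it's up before pulling anything, hit it with curl; current versions reply with a short "Ollama is running" message (if not, start the server as described above):
curl http://localhost:11434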
Step 2: Run Your First LLM Locally
Example: Run Gemma 3 (1B)
ollama run gemma3:1b
This pulls the model (if needed) and opens a terminal-based chat session.
First-time downloads may take a few minutes depending on your connection speed.
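Inside the chat, type /bye to exit. You don't have to use the interactive session, either: you can pass a prompt directly on the command line, or script against the local HTTP API on port 11434 (the prompt text here is just an example):
ollama run gemma3:1b "Explain in one sentence what a local LLM is."
curl http://localhost:11434/api/generate \
  -d '{"model": "gemma3:1b", "prompt": "Explain in one sentence what a local LLM is.", "stream": false}'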
Check out all the models available in the library at https://ollama.com/library … but pay attention to the disk space needed! (I almost ran out of space because I got too excited with all the models 😅)
Step 3: Set Up OpenWebUI (Optional Web Interface)
If you (like me) prefer a user interface instead of chatting through the terminal, OpenWebUI is perfect.
Step 3.1: Install Docker Desktop or Podman Desktop
Follow the directions on the corresponding website to get your container environment running. (Note: you can have both installed without issue.)
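Before moving on, it's worth confirming that the runtime actually works from your terminal (swap docker for podman if that's what you installed; Podman may ask which registry to pull hello-world from):
docker --version
docker run --rm hello-world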
Step 3.2: Run OpenWebUI Using Docker
docker run -d \
  --name openwebui \
  -p 3000:8080 \
  -v open-webui:/app/backend/data \
  -e 'OLLAMA_BASE_URL=http://host.docker.internal:11434' \
  ghcr.io/open-webui/open-webui:main
Note: OpenWebUI listens on port 8080 inside the container, so 3000:8080 maps it to localhost:3000 on your Mac, and the -v volume keeps your chats and settings if the container is recreated.
Then open your browser and go to:
http://localhost:3000
You’ll now have a full-featured interface to interact with your local models.
If anything goes wrong, check logs with:
docker logs -f openwebui
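If you went with Podman instead, the same image works. This sketch assumes host.containers.internal resolves to your Mac from inside the Podman machine (newer Podman versions also accept host.docker.internal), and updating later is the usual stop/remove/pull dance, shown here for Docker:
podman run -d \
  --name openwebui \
  -p 3000:8080 \
  -v open-webui:/app/backend/data \
  -e 'OLLAMA_BASE_URL=http://host.containers.internal:11434' \
  ghcr.io/open-webui/open-webui:main

docker stop openwebui && docker rm openwebui
docker pull ghcr.io/open-webui/open-webui:main
Then re-run the docker run command from Step 3.2; your data will still be there thanks to the -v volume.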
Managing Models with Ollama
Want to manage your models? Here are a few commands:
- List installed models:
ollama list
- Delete a model:
ollama rm gemma3:1b
- Download a model without running it:
ollama pull mistral
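You can also bake your own variant of a model with a Modelfile, which sets things like the base model, sampling parameters, and a system prompt. A minimal sketch (the study-buddy name and the system prompt are just made-up examples):
Put this in a file called Modelfile:
FROM gemma3:1b
PARAMETER temperature 0.7
SYSTEM "You are a patient study buddy who explains concepts with short, concrete examples."
Then build and run it:
ollama create study-buddy -f Modelfile
ollama run study-buddy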
Congratulations! You're now running a local LLM to which you can divulge all of your dirty secrets without worrying that OpenAI will scrape them for nefarious purposes down the line!
I recommend playing with a bunch of different models to see which ones suit you. I've found that different models shine at different tasks, so experiment and have fun! Keep an eye on your resource consumption as well: the bigger the LLM, the more CPU, RAM, and disk it'll need.
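If you want to see what a loaded model is actually costing you in memory, and whether it's running on the GPU or the CPU, ask Ollama what's currently running:
ollama ps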
