
Ollama
Ollama is a local runtime for large language models (LLMs) that lets you run AI models directly on your own machine.
You can interact with Ollama either through its graphical interface or programmatically, via HTTP API calls.
Installation
https://ollama.com/download
What model to choose?
Here are some questions that you can ask when picking a model:
- What is my goal?
- What hardware do I have?
- Do I want more speed or more intelligence?
- Do I need multimodality (images)?
- Do I need strong programming capabilities?
- Do I want to operate 100% offline with full privacy?
- Do I want the option to fine-tune the model in the future?
Available Models
https://ollama.com/search
Installing and Running a Model
For example, to install the gemma3:4b model:
ollama pull gemma3:4b
Once the model is installed, you can run it with:
ollama run gemma3:4b
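Besides the interactive session, `ollama run` also accepts a one-shot prompt as an argument, printing the reply and exiting. A quick sketch:

```shell
# answer a single prompt non-interactively instead of opening a chat session
ollama run gemma3:4b "Why is the sky blue?"
```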
Other useful commands
List all installed models
ollama list
Remove a model
ollama rm gemma3:4b
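The list of installed models is also exposed over the HTTP API via the /api/tags endpoint. A minimal sketch, assuming the default port 11434 and using only the standard library (the model_names helper is introduced here for illustration):

```python
import json
import urllib.request

def model_names(tags_json):
    # pull just the model names out of a parsed /api/tags response
    return [m["name"] for m in tags_json.get("models", [])]

def list_models(url="http://localhost:11434/api/tags"):
    # ask the local Ollama server which models are installed
    with urllib.request.urlopen(url) as resp:
        return model_names(json.load(resp))
```

With gemma3:4b pulled, `list_models()` should return the same names that `ollama list` prints.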
Example
import requests

def generate(prompt):
    # send a non-streaming generation request to the local Ollama server
    response = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "gemma3:4b",
            "prompt": prompt,
            "stream": False,  # return the full reply in a single JSON object
        },
    )
    response.raise_for_status()
    return response.json()["response"]

response = generate("Write a short poem about the ocean.")
print(response)
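The example above sets "stream": False to get the whole reply at once. With streaming enabled, Ollama instead sends newline-delimited JSON, one object per line, each carrying a partial "response" and a "done" flag. A minimal streaming sketch, standard library only (the parse_chunk helper is introduced here for illustration):

```python
import json
import urllib.request

def parse_chunk(line):
    # one streamed line -> (partial text, whether the stream is finished)
    obj = json.loads(line)
    return obj.get("response", ""), obj.get("done", False)

def generate_stream(prompt, model="gemma3:4b",
                    url="http://localhost:11434/api/generate"):
    # yield the model's reply piece by piece as the server produces it
    body = json.dumps({"model": model, "prompt": prompt,
                       "stream": True}).encode()
    req = urllib.request.Request(
        body and url, data=body,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        for line in resp:
            text, done = parse_chunk(line)
            if text:
                yield text
            if done:
                break
```

Printing each yielded piece with `print(part, end="", flush=True)` shows the poem appearing as it is generated, instead of waiting for the full response.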
Output
13 Nov. 2025 | Last Updated: 03 Dec. 2025 | jaimedcsilva