# Ollama Setup Guide

## Ollama Installation Guide

Ollama is a tool for running large language models (LLMs) locally. Follow these steps to install and run Ollama on your system.

### Installation

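1. Install Ollama with the [official installation script](https://ollama.com/download). The script targets Linux (including WSL2); on macOS, download the installer from the same page instead.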

   ```bash
   curl -fsSL https://ollama.com/install.sh | sh
   ```
2. Pull each model you want to run. For example, to pull `llama3:8b`:

   `ollama pull llama3:8b`

{% hint style="info" %}
Other models are listed in the [Ollama Library](https://ollama.com/library).
{% endhint %}
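
To confirm that the installation and pull succeeded, a quick check along these lines may help. This is a minimal sketch: the helper name `have_cmd` is introduced here for illustration, and it assumes `ollama` was installed somewhere on your `PATH`.

```bash
# Return success if the named command is available on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd ollama; then
  ollama --version   # print the installed version
  ollama list        # show the models already pulled to this machine
else
  echo "ollama not found on PATH; re-run the install script" >&2
fi
```

If `llama3:8b` appears in the `ollama list` output, the pull completed.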

### Running an LLM

1. To run an instance of `llama3:8b`, use the following command:

   ```bash
   ollama run llama3:8b
   ```
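
The command above opens an interactive session. For scripted use, `ollama run` also accepts the prompt as a trailing argument and exits after printing the reply. The wrapper below is a small sketch of that pattern; the function name `ask_once` is hypothetical and not part of Ollama.

```bash
# Send one prompt to a model and print the reply, without opening the
# interactive session. Arguments: model name, prompt text.
ask_once() {
  ollama run "$1" "$2"
}

# Example invocation (requires the model to be pulled first):
#   ask_once llama3:8b "Explain what a context window is in one sentence."
```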

### Notes

* The same commands work for any other model listed in the `MODELS` environment variable: replace `llama3:8b` with that model's name.
* Inference speed depends on your hardware's capabilities.

### Customizing Your Model Selection

To run a different model, use the format:

```bash
ollama run [model_name]
```

Replace `[model_name]` with your chosen model from the `MODELS` list in your `.env` file.
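
If you want every model in that list available locally, the pulls can be scripted. The sketch below assumes the `.env` file holds a single line of the form `MODELS=llama3:8b,mistral:7b` (comma-separated, optionally double-quoted); that format is an assumption, so adjust the parsing if your file differs. The function names `list_models` and `pull_all` are illustrative.

```bash
# Print each model named on the MODELS= line of an env file, one per line.
# Assumes a comma-separated value, optionally wrapped in double quotes.
list_models() {
  sed -n 's/^MODELS=//p' "$1" | tr -d '"' | tr ',' '\n' \
    | sed 's/^ *//; s/ *$//' | grep -v '^$'
}

# Pull every model named in the given env file.
pull_all() {
  list_models "$1" | while IFS= read -r model; do
    # Redirect stdin so the pull cannot consume the rest of the list.
    ollama pull "$model" </dev/null
  done
}

# Example invocation:
#   pull_all .env
```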

## Rough Hardware Guidelines

1. General Requirements:
   * CPU: Modern multi-core processor (6+ cores recommended)
   * RAM: Minimum 16GB, 32GB or more recommended
   * Storage: SSD with at least 20GB free space
   * Operating System: Linux, macOS, or Windows with WSL2
2. Model-Specific Guidelines:

   a) StableLM-2-Zephyr-1.6B:

   * Minimum: 8GB RAM, 4-core CPU
   * Recommended: 16GB RAM, 6-core CPU
   * GPU: Not strictly necessary, but a mid-range GPU can improve performance

   b) Llama-3-8B:

   * Minimum: 16GB RAM, 6-core CPU
   * Recommended: 32GB RAM, 8-core CPU
   * GPU: Mid-range GPU (e.g., NVIDIA GTX 1660 or better) recommended

   c) Qwen-4B:

   * Minimum: 12GB RAM, 4-core CPU
   * Recommended: 24GB RAM, 6-core CPU
   * GPU: Entry-level GPU can help, but not essential

   d) Gemma2-9B:

   * Minimum: 32GB RAM, 6-core CPU
   * Recommended: 64GB RAM, 8-core CPU
   * GPU: Mid-range GPU recommended (e.g., NVIDIA RTX 3060 or better)

   e) Mistral-7B:

   * Minimum: 16GB RAM, 6-core CPU
   * Recommended: 32GB RAM, 8-core CPU
   * GPU: Mid-range to high-end GPU recommended (e.g., NVIDIA RTX 3070 or better)

   f) Phi-3-3.8B:

   * Minimum: 12GB RAM, 4-core CPU
   * Recommended: 24GB RAM, 6-core CPU
   * GPU: Entry-level to mid-range GPU can improve performance

Notes:

1. These are rough estimates, and actual performance may vary.
2. GPU acceleration can significantly improve inference speed for all models.
3. For optimal performance, especially with the larger models (7B to 9B parameters), a dedicated GPU with at least 8GB VRAM is recommended.
4. All of these models can run on a CPU alone, but inference is slower than on a GPU-accelerated setup.
5. SSD storage is strongly recommended for faster model loading times.
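
To compare a machine against the minimums above, a rough check like the following can be used on Linux. It is a sketch under stated assumptions: it reads `/proc/meminfo` and calls `nproc`, neither of which exists on macOS, and `nproc` counts logical CPUs rather than physical cores. The function name `meets_minimum` is illustrative.

```bash
# Succeed if the detected RAM (GB) and core count meet the given minimums.
# Arguments: ram_gb cores min_ram_gb min_cores
meets_minimum() {
  [ "$1" -ge "$3" ] && [ "$2" -ge "$4" ]
}

# Linux-only detection: /proc/meminfo reports MemTotal in kB.
if [ -r /proc/meminfo ] && command -v nproc >/dev/null 2>&1; then
  # Round to the nearest GB, since the kernel reports slightly less
  # than the installed amount.
  ram_gb=$(awk '/^MemTotal:/ { printf "%d", $2 / 1024 / 1024 + 0.5 }' /proc/meminfo)
  cores=$(nproc)
  # 16 GB and 6 cores are the Llama-3-8B minimums from the list above.
  if meets_minimum "$ram_gb" "$cores" 16 6; then
    echo "OK: ${ram_gb} GB RAM, ${cores} logical CPUs"
  else
    echo "Below minimum: ${ram_gb} GB RAM, ${cores} logical CPUs"
  fi
fi
```

Swap the `16 6` arguments for the minimums of whichever model you plan to run.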

Always check the [official Ollama documentation](https://github.com/ollama/ollama/tree/main/docs) and specific model requirements for the most up-to-date information. Performance can be optimized by adjusting model parameters and quantization levels.
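
As one way to adjust model parameters, Ollama can build a variant of a base model from a Modelfile. The sketch below writes a Modelfile that sets the context window and sampling temperature explicitly. The values shown are illustrative rather than recommendations, and the function name `write_modelfile` and the model name `llama3-small` are hypothetical.

```bash
# Write a Modelfile that derives a variant of a base model with an explicit
# context window and temperature. Arguments: base model, output path.
write_modelfile() {
  cat > "$2" <<EOF
FROM $1
PARAMETER num_ctx 2048
PARAMETER temperature 0.7
EOF
}

# Example invocation (the name "llama3-small" is hypothetical):
#   write_modelfile llama3:8b Modelfile
#   ollama create llama3-small -f Modelfile
#   ollama run llama3-small
```

A smaller `num_ctx` generally lowers memory use at the cost of a shorter context, which can matter on machines near the minimum requirements.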

