From af4ea6b7b85d595d11e0846fed08ce0b77c577d1 Mon Sep 17 00:00:00 2001
From: Amanda-dong <159391549+Amanda-dong@users.noreply.github.com>
Date: Tue, 6 Jan 2026 20:43:56 -0500
Subject: [PATCH 1/6] Add documentation for Ollama command line tool

Added comprehensive documentation for Ollama, a command line tool for running large language models, including installation instructions, environment variables, and usage examples.
---
 docs/hpc/08_ml_ai_hpc/07_ollama.md | 71 ++++++++++++++++++++++++++++++
 1 file changed, 71 insertions(+)
 create mode 100644 docs/hpc/08_ml_ai_hpc/07_ollama.md

diff --git a/docs/hpc/08_ml_ai_hpc/07_ollama.md b/docs/hpc/08_ml_ai_hpc/07_ollama.md
new file mode 100644
index 0000000000..91ac95a835
--- /dev/null
+++ b/docs/hpc/08_ml_ai_hpc/07_ollama.md
@@ -0,0 +1,71 @@
+# Ollama - A Command Line LLM Tool
+## What is Ollama?
+[Ollama](https://github.com/ollama/ollama) is a developing command line tool designed to run large language models.
+Ollama Installation Instructions
+Create an Ollama directory, such as in your /scratch or /vast directories, then download the ollama files:
+```
+curl -L https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz
+tar -vxzf ollama-linux-amd64.tgz
+```
+### Use VAST Storage for Best Performance
+There are several environment variables that can be changed:
+```
+ollama serve --help
+#Environment Variables:
+#OLLAMA_HOST The host:port to bind to (default "127.0.0.1:11434")
+#OLLAMA_ORIGINS A comma separated list of allowed origins.
+#OLLAMA_MODELS The path to the models directory (default is "~/.ollama/models")
+#OLLAMA_KEEP_ALIVE The duration that models stay loaded in memory (default is "5m")
+```
+LLMs require very fast storage. The fastest storage on the HPC clusters is the currently the all-flash VAST storage service. This storage is designed for AI workloads and can greatly speed up performance. You should change your model download directory accordingly:
+```
+export OLLAMA_MODELS=$VAST/ollama_models
+```
+You should run this to configure ollama to always use your VAST storage for consistent use:
+```
+echo "export OLLAMA_MODELS=$VAST/ollama_models" >> ~/.bashrc file
+```
+
+## Run Ollama
+### Batch Style Jobs
+You can run ollama on a random port:
+```
+export OLPORT=$(python3 -c "import socket; sock=socket.socket(); sock.bind(('',0)); print(sock.getsockname()[1])")
+OLLAMA_HOST=127.0.0.1:$OLPORT ./bin/ollama serve
+```
+You can use the above as part of a Slurm batch job like the example below:
+```
+#!/bin/bash
+#SBATCH --job-name=ollama
+#SBATCH --output=ollama_%j.log
+#SBATCH --ntasks=1
+#SBATCH --mem=8gb
+#SBATCH --gres=gpu:a100:1
+#SBATCH --time=01:00:00
+
+export OLPORT=$(python3 -c "import socket; sock=socket.socket(); sock.bind(('',0)); print(sock.getsockname()[1])")
+export OLLAMA_HOST=127.0.0.1:$OLPORT
+
+./bin/ollama serve > ollama-server.log 2>&1 &  # start the server in the background
+sleep 10  # give the server a few seconds to start before pulling a model
+./bin/ollama pull mistral
+python my_ollama_python_script.py >> my_ollama_output.txt
+```
+In the above example, your python script will be able to talk to the ollama server.
+### Interactive Ollama Sessions
+If you want to run Ollama and chat with it, open a Desktop session on a GPU node via Open Ondemand (https://ood.hpc.nyu.edu/) and launch two terminals, one to start the ollama server and the other to chat with LLMs.
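+Whether ollama runs as a batch job or in an interactive session, anything on the node can talk to the server through its REST API on the port you picked. The following is only a rough sketch (the model name and prompt are placeholders): it shows the kind of request that a script such as my_ollama_python_script.py from the batch example above could send, issued here with curl against the /api/generate endpoint:
+```
+# assumes the ollama server is running, OLPORT is exported in this shell,
+# and the model (mistral in the batch example) has already been pulled
+curl -s "http://127.0.0.1:${OLPORT}/api/generate" \
+  -d '{"model": "mistral", "prompt": "Explain what a Slurm batch job is in two sentences.", "stream": false}'
+```
+The JSON reply carries the generated text in its "response" field, so a Python script only needs to send this same POST request with any HTTP library.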
+**In Terminal 1:**
+Start ollama
+```
+export OLPORT=$(python3 -c "import socket; sock=socket.socket(); sock.bind(('',0)); print(sock.getsockname()[1])")
+echo $OLPORT #so you know what port Ollama is running on
+OLLAMA_HOST=127.0.0.1:$OLPORT ./bin/ollama serve
+```
+**In Terminal 2:**
+Pull a model and begin chatting
+```
+export OLLAMA_HOST=127.0.0.1:$OLPORT
+./bin/ollama pull llama3.2
+./bin/ollama run llama3.2
+```
+Note that you may have to redefine OLPORT in the second terminal; if you do, make sure you manually set it to the same port as in the first terminal window.

From 652a3c81c32120ad28b344bd130367438be2cb0d Mon Sep 17 00:00:00 2001
From: Amanda-dong <159391549+Amanda-dong@users.noreply.github.com>
Date: Tue, 6 Jan 2026 20:45:30 -0500
Subject: [PATCH 2/6] Add installation instructions for Ollama

---
 docs/hpc/08_ml_ai_hpc/07_ollama.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/hpc/08_ml_ai_hpc/07_ollama.md b/docs/hpc/08_ml_ai_hpc/07_ollama.md
index 91ac95a835..6ad3576798 100644
--- a/docs/hpc/08_ml_ai_hpc/07_ollama.md
+++ b/docs/hpc/08_ml_ai_hpc/07_ollama.md
@@ -1,7 +1,8 @@
 # Ollama - A Command Line LLM Tool
 ## What is Ollama?
 [Ollama](https://github.com/ollama/ollama) is a developing command line tool designed to run large language models.
-Ollama Installation Instructions
+
+## Ollama Installation Instructions
 Create an Ollama directory, such as in your /scratch or /vast directories, then download the ollama files:
 ```
 curl -L https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz

From d68bf3c9fe43eb362b6576ab1894e26a1e5b357e Mon Sep 17 00:00:00 2001
From: Amanda-dong <159391549+Amanda-dong@users.noreply.github.com>
Date: Tue, 6 Jan 2026 20:46:34 -0500
Subject: [PATCH 3/6] Document interactive Ollama sessions setup

Added instructions for starting interactive Ollama sessions on a GPU node.
---
 docs/hpc/08_ml_ai_hpc/07_ollama.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/hpc/08_ml_ai_hpc/07_ollama.md b/docs/hpc/08_ml_ai_hpc/07_ollama.md
index 6ad3576798..268cfc543b 100644
--- a/docs/hpc/08_ml_ai_hpc/07_ollama.md
+++ b/docs/hpc/08_ml_ai_hpc/07_ollama.md
@@ -55,6 +55,7 @@ python my_ollama_python_script.py >> my_ollama_output.txt
 In the above example, your python script will be able to talk to the ollama server.
 ### Interactive Ollama Sessions
 If you want to run Ollama and chat with it, open a Desktop session on a GPU node via Open Ondemand (https://ood.hpc.nyu.edu/) and launch two terminals, one to start the ollama server and the other to chat with LLMs.
+
 **In Terminal 1:**
 Start ollama
 ```

From 2928885feeaaac4e645e166c43cd74ca9c038e7f Mon Sep 17 00:00:00 2001
From: Amanda-dong <159391549+Amanda-dong@users.noreply.github.com>
Date: Tue, 6 Jan 2026 20:47:01 -0500
Subject: [PATCH 4/6] Update interactive Ollama session instructions

Added instructions for starting Ollama server and chatting with LLMs in interactive sessions.
---
 docs/hpc/08_ml_ai_hpc/07_ollama.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/hpc/08_ml_ai_hpc/07_ollama.md b/docs/hpc/08_ml_ai_hpc/07_ollama.md
index 268cfc543b..eed1dbddbd 100644
--- a/docs/hpc/08_ml_ai_hpc/07_ollama.md
+++ b/docs/hpc/08_ml_ai_hpc/07_ollama.md
@@ -57,6 +57,7 @@ In the above example, your python script will be able to talk to the ollama serv
 If you want to run Ollama and chat with it, open a Desktop session on a GPU node via Open Ondemand (https://ood.hpc.nyu.edu/) and launch two terminals, one to start the ollama server and the other to chat with LLMs.
 
**In Terminal 1:** + Start ollama ``` export OLPORT=$(python3 -c "import socket; sock=socket.socket(); sock.bind(('',0)); print(sock.getsockname()[1])") @@ -64,6 +65,7 @@ echo $OLPORT #so you know what port Ollama is running on OLLAMA_HOST=127.0.0.1:$OLPORT ./bin/ollama serve ``` **In Terminal 2:** + Pull a model and begin chatting ``` export OLLAMA_HOST=127.0.0.1:$OLPORT From e3ac668962450b67d2828f9d065894c77fae7d22 Mon Sep 17 00:00:00 2001 From: Amanda-dong <159391549+Amanda-dong@users.noreply.github.com> Date: Wed, 7 Jan 2026 17:57:04 -0500 Subject: [PATCH 5/6] Revise Ollama installation and storage guidance Updated installation instructions and storage recommendations for Ollama. --- docs/hpc/08_ml_ai_hpc/07_ollama.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/docs/hpc/08_ml_ai_hpc/07_ollama.md b/docs/hpc/08_ml_ai_hpc/07_ollama.md index eed1dbddbd..f0583b6653 100644 --- a/docs/hpc/08_ml_ai_hpc/07_ollama.md +++ b/docs/hpc/08_ml_ai_hpc/07_ollama.md @@ -3,12 +3,12 @@ [Ollama](https://github.com/ollama/ollama) is a developing command line tool designed to run large language models. ## Ollama Installation Instructions -Create an Ollama directory, such as in your /scratch or /vast directories, then download the ollama files: +Create an Ollama directory in your /scratch directories, then download the ollama files: ``` curl -L https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz tar -vxzf ollama-linux-amd64.tgz ``` -### Use VAST Storage for Best Performance +### Use High-Performance SCRATCH Storage There are several environment variables that can be changed: ``` ollama serve --help @@ -18,13 +18,13 @@ ollama serve --help #OLLAMA_MODELS The path to the models directory (default is "~/.ollama/models") #OLLAMA_KEEP_ALIVE The duration that models stay loaded in memory (default is "5m") ``` -LLMs require very fast storage. The fastest storage on the HPC clusters is the currently the all-flash VAST storage service. This storage is designed for AI workloads and can greatly speed up performance. You should change your model download directory accordingly: +LLMs require very fast storage. On Torch, the SCRATCH filesystem is an all-flash system designed for AI workloads, providing excellent performance. You should change your model download directory to your scratch space: ``` -export OLLAMA_MODELS=$VAST/ollama_models +export OLLAMA_MODELS=/scratch/$USER/ollama_models ``` -You should run this to configure ollama to always use your VAST storage for consistent use: +You should run this to configure ollama to always use your SCRATCH storage for consistent use: ``` -echo "export OLLAMA_MODELS=$VAST/ollama_models" >> ~/.bashrc file +echo "export OLLAMA_MODELS=/scratch/$USER/ollama_models" >> ~/.bashrc ``` ## Run Ollama From 787122de3f3ae9368fc91c39faa3830d8e3fc133 Mon Sep 17 00:00:00 2001 From: Amanda-dong <159391549+Amanda-dong@users.noreply.github.com> Date: Wed, 7 Jan 2026 18:23:55 -0500 Subject: [PATCH 6/6] Fix typo in Ollama installation instructions --- docs/hpc/08_ml_ai_hpc/07_ollama.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/hpc/08_ml_ai_hpc/07_ollama.md b/docs/hpc/08_ml_ai_hpc/07_ollama.md index f0583b6653..2fd4596fec 100644 --- a/docs/hpc/08_ml_ai_hpc/07_ollama.md +++ b/docs/hpc/08_ml_ai_hpc/07_ollama.md @@ -3,7 +3,7 @@ [Ollama](https://github.com/ollama/ollama) is a developing command line tool designed to run large language models. 
## Ollama Installation Instructions -Create an Ollama directory in your /scratch directories, then download the ollama files: +Create an Ollama directory in your /scratch directory, then download the ollama files: ``` curl -L https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz tar -vxzf ollama-linux-amd64.tgz
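
Taken together, the installation steps in the final version of this page amount to roughly the following sketch. The directory name ollama and the ./bin/ollama --version sanity check are illustrative assumptions rather than part of the patches above:
```
mkdir -p /scratch/$USER/ollama && cd /scratch/$USER/ollama   # any directory on /scratch works
curl -L https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz
tar -vxzf ollama-linux-amd64.tgz                             # unpacks the ollama binary under ./bin
export OLLAMA_MODELS=/scratch/$USER/ollama_models            # keep model downloads on fast scratch storage
./bin/ollama --version                                       # quick check that the binary runs
```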