Warning
This is a Lab Notebook, which describes how to solve a specific problem at a specific time. Please keep this in mind as you read and use the content, and pay close attention to the date, version information, and other details.
How to use LLMs on the HPCC
Large language models (LLMs) are powerful tools that we want to make available to the ICER community. This notebook outlines the different ways to run LLMs locally on the HPCC, so you can use them without needing to upload your data to a third party.
OnDemand Apps
Both 'LM Studio' and 'Ollama' have OnDemand apps; this is the easiest way to use LLMs on the HPCC. You can access them by going to HPCC OnDemand and clicking on the 'Interactive Apps' tab. When requesting a session, we advise requesting a GPU so the models run faster. Compatible nodes are listed below.
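If you prefer the command line to OnDemand, a comparable interactive GPU session can be requested through SLURM. The exact resource values below are illustrative assumptions, not ICER recommendations; check the HPCC documentation for current limits and GPU names.

```bash
# Request an interactive session with one V100 GPU (example values only)
salloc --nodes=1 --cpus-per-task=8 --mem=32G --gres=gpu:v100:1 --time=02:00:00
```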
Modules
There are currently two modules for LLMs on the HPCC: `LM-Studio` and `Ollama`. Both can be loaded using the `module load` command (module names are case-sensitive).

```bash
module load LM-Studio
```

```bash
module load Ollama
```
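If you are unsure which versions of these modules are installed, the standard Lmod lookup commands should list them (assuming the HPCC's module system is Lmod, as the `module load` syntax suggests):

```bash
# List available versions of each module
module spider LM-Studio
module spider Ollama
```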
Note: When running these modules for the first time, you may not see any models available to you. You can use the `lm_studio_link_models` and `ollama_link_models` powertools to symbolically link the models to your home directory. This means the models will not take up any of your storage space, but you can still access them.
Models
There are many models on the HPCC in the `/mnt/research/common-data/llms` directory. The best way to use them is with the `lm_studio_link_models` and `ollama_link_models` powertools. Learn more about the powertools below.
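To see what is currently available, you can also browse the shared directory itself; the per-app subdirectories below are the ones the powertools link from (see the Powertools section):

```bash
# Browse the shared model collection
ls /mnt/research/common-data/llms
# Per-app model stores used by the link powertools
ls /mnt/research/common-data/llms/lm_studio/models
ls /mnt/research/common-data/llms/ollama/models
```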
Pre-installed Models
| Maker | Model | Availability |
| --- | --- | --- |
| Meta | llama2 | Ollama |
| Meta | llama3 | Both |
| Meta | llama3.1 | Both |
| Meta | llama3.2 | Both |
| Meta | llama3.3 | Both |
| Meta | codellama | Both |
| StabilityAI | stable-code | Both |
| Mistral | mistral-7B | Both |
| Mistral | codestral | Both |
| Mistral | mathstral | Both |
| Mistral | mistral-Nemo | Both |
| Mistral | mistral-small | Ollama |
| DeepseekAI | deepseek-coder | Ollama |
| DeepseekAI | deepseek-coder-v2 | Both |
| DeepseekAI | deepseek-math | LM Studio |
| Google | gemma2 | Both |
| Google | gemma | Ollama |
| Google | codegemma | Ollama |
| Microsoft | phi | Ollama |
| Microsoft | phi3 | Ollama |
| Microsoft | phi3.1 | LM Studio |
| Microsoft | phi3.5 | Ollama |
| Nous Research | hermes3 | Both |
| Alibaba Cloud | Qwen | Both |
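As a quick start, a session with one of the pre-installed models might look like the sketch below. The `ollama serve &` step is an assumption; depending on how the module is set up, a server may already be running for you.

```bash
module load powertools
ollama_link_models   # link the shared models into your home directory (see Powertools)
module load Ollama
ollama serve &       # start the Ollama server if one is not already running (assumption)
ollama list          # confirm the linked models are visible
ollama run llama3    # chat with a pre-installed model from the table above
```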
Downloading Models
You can also install your own models through LM Studio or Ollama. You can browse which models are available on their websites: the Ollama Library and the LM Studio Catalog.
With Ollama, you can use the `ollama pull` command to download models from the Ollama Library, then use the `ollama run` command to run the model. You can also just use `ollama run`; if you don't have the model, it will be downloaded for you. If you need extra space and want to use your `$SCRATCH` directory, set the environment variable `OLLAMA_MODELS` to a path of your choice, and all models will be downloaded there. Note that you will then no longer see models stored in the default location, `$HOME/.ollama/models`. To see installed models, use the `ollama list` command. Example:
```bash
export OLLAMA_MODELS=$SCRATCH/ollama_models
ollama pull llama3
ollama list
```
You will now only be able to see and run the models in the `$SCRATCH/ollama_models` directory. If you want to go back to the default location, unset the environment variable:
```bash
unset OLLAMA_MODELS
```
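If you want the alternate location to persist across sessions, one convenience (not an ICER requirement) is to add the export to your shell startup file:

```bash
# Single quotes keep $SCRATCH unexpanded so it is resolved at login time
echo 'export OLLAMA_MODELS=$SCRATCH/ollama_models' >> ~/.bashrc
```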
For LM Studio, navigate to the `Discover` tab (purple magnifying glass icon) on the left side of the app. There, you can browse the LM Studio Catalog, search, and install models. To view and manage installed models, navigate to the `My Models` tab (red folder icon) on the left side of the app. Here, you can delete models or change the path where they are stored. If you need more space, you can change the path to your `$SCRATCH` directory. The default path is `$HOME/.cache/lm-studio/models`.
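If you would rather relocate the store from the shell than through the app, a minimal sketch (assuming the default path above; changing the path inside LM Studio is the supported route) is:

```bash
# Move LM Studio's model store to scratch and leave a symlink behind
mkdir -p "$SCRATCH/lm-studio"
mv "$HOME/.cache/lm-studio/models" "$SCRATCH/lm-studio/models"
ln -s "$SCRATCH/lm-studio/models" "$HOME/.cache/lm-studio/models"
```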
Note: In the `My Models` tab of LM Studio, you will see an estimate of how much disk space the models are taking up. This will be inaccurate if you have run the powertool, so use the `quota` command to know if you are running out of space.
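As a quick sanity check from the shell (assuming the default model paths above): `du` does not follow symlinks, so linked models show up as nearly free, while `quota` reports your real allocation.

```bash
quota                                                # actual storage usage and limits
du -sh ~/.ollama/models ~/.cache/lm-studio/models    # symlinked models count as ~0
```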
Nodes and VRAM requirements
Every model requires a different amount of VRAM (Video Random Access Memory). This varies based on the size of the model and what it is doing. The HPCC currently has these GPUs:
| Node | GPU Type | GPU VRAM |
| --- | --- | --- |
| intel18 | Nvidia V100 | 32 GB |
| amd20 | Nvidia V100 | 32 GB |
| intel21 | Nvidia A100 | 40 GB |
| amd21 | Nvidia A100-SXM | 80 GB |
| intel16 | CUDA GPU too old | - |
| amd22 | No GPU | - |
More info about ICER nodes here: The HPCC GPU Resources
To see how much VRAM a model needs, a few have been tested below. Actual numbers may vary, and the longer the conversation, the more VRAM is needed. If a model needs more VRAM than the GPU has, it will offload some of its layers to the CPU, causing it to slow down; if there are multiple GPUs, it will offload to the other GPU instead.
| Model | Parameters | VRAM Required |
| --- | --- | --- |
| Phi 3.1 | 3.8B | 6,676 MiB |
| Code Gemma | 9B | 9,670 MiB |
| Llama 3.1 | 8B | 6,822 MiB |
| Llama 3 | 8B | 6,786 MiB |
| Gemma 2 | 9.2B | 9,490 MiB |
| Code Llama | 6.7B | 9,062 MiB |
| Code Llama | 34B | 21,474 MiB |
| Llama 3.1 | 70B | 42,018 MiB |
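If you want to measure VRAM usage for a model yourself, one simple approach is to watch `nvidia-smi` on your session's GPU while the model is loaded:

```bash
# Refresh GPU memory usage every second while the model runs
watch -n 1 nvidia-smi
```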
Note: I evaluated Retrieval-Augmented Generation (RAG), which uses a vector database to let the model interact with stored documents. For 6.3 megabytes of Markdown documents (approximately 45,000 lines), only about 100 megabytes of extra VRAM was needed.
Powertools
There are currently two powertools for the LLMs on the HPCC: `lm_studio_link_models` and `ollama_link_models`. These create symbolic links from the models in `/mnt/research/common-data/llms` to your home directory, at `/mnt/home/$USER/.cache/lm-studio/models` and `/mnt/home/$USER/.ollama/models` respectively. This means the models will not take up any of your storage space, but you can still access them. You should only need to run these once, and you will have access to all the models.
- If you remove the symbolic links, all you have to do is run the command again to get the models back.
- If you downloaded your own copy of a model that we later add to the common folder, you can run the command again to remove your local version, saving you space.
- If you add a model that we don't have in the common folder, it will not be touched by the powertool scripts.
- To remove a symbolic link, you can use the `rm` command (see the sketch after this list).
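For example, to see which entries in a model directory are symlinks into the common folder and to remove one (the model name below is hypothetical):

```bash
# Symlinked entries point back to /mnt/research/common-data/llms
ls -l ~/.cache/lm-studio/models
# Remove one linked model (hypothetical name); rerun the powertool to restore it
rm ~/.cache/lm-studio/models/some-model
```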
Links the models in `/mnt/research/common-data/llms/lm_studio/models` to `/mnt/home/$USER/.cache/lm-studio/models`:

```bash
lm_studio_link_models
```
Links the models in `/mnt/research/common-data/llms/ollama/models` to `/mnt/home/$USER/.ollama/models`:

```bash
ollama_link_models
```
If you try to run these commands and receive an error such as "command not found", make sure you have the `powertools` module loaded:

```bash
module load powertools
```
Then try again. If you still have issues, make sure you have the latest version of the powertools module; these tools are available in powertools version 1.3.5 and later.
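For instance (assuming the standard module commands; `module load powertools` without a version should pick up the default):

```bash
module avail powertools   # list installed powertools versions
module load powertools    # load the default version
```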