(2025-05-13) Lab Notebook: AI Models Available on the HPCC

Models on the HPCC

ICER offers two primary platforms for working with large language models on the HPCC: LM Studio and OpenWebUI. Both platforms can be accessed via the command line interface or through OnDemand.

OpenWebUI: ICER offers 30 preloaded models, sourced from the Ollama Library. OpenWebUI also lets you run multiple models simultaneously, allowing you to compare their outputs in real time and choose the best-performing model. You can explore the Ollama Library on their website.
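
Beyond the web interface, Ollama (which backs OpenWebUI) exposes a simple HTTP API that you can script against from the command line. Below is a minimal sketch; the host, port, and model names are illustrative assumptions, not ICER-specific settings.

```python
import requests

# Minimal sketch: query models served through Ollama (which backs
# OpenWebUI) over its HTTP API. The host, port, and model names below
# are illustrative assumptions, not ICER-specific settings.
OLLAMA_URL = "http://localhost:11434/api/generate"

def ask(model: str, prompt: str) -> str:
    """Send one prompt to one model and return the full response text."""
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]

# Send the same prompt to two models, mirroring OpenWebUI's
# side-by-side comparison feature.
for model in ["llama3.1:8b", "gemma2:9b"]:
    print(f"--- {model} ---")
    print(ask(model, "Explain binary search in two sentences."))
```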

LM Studio: ICER offers 25 models. In LM Studio, it is easy to search the model library and find the models that best fit your Generative AI needs.

You can explore the LM Studio model library on their website.
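
LM Studio can likewise serve its loaded models through a local, OpenAI-compatible server. The sketch below assumes the server is running on LM Studio's default port (1234); the endpoint and model identifier are assumptions to adapt to your own session.

```python
import requests

# Minimal sketch: chat with a model loaded in LM Studio through its
# local, OpenAI-compatible server (default port 1234). The endpoint
# and model identifier are assumptions; adapt them to your session.
LMSTUDIO_URL = "http://localhost:1234/v1/chat/completions"

payload = {
    "model": "Meta-Llama-3.1-8B",  # assumed name; use whichever model you loaded
    "messages": [
        {"role": "user", "content": "Write a Python one-liner that reverses a string."}
    ],
    "temperature": 0.2,
}

resp = requests.post(LMSTUDIO_URL, json=payload, timeout=300)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```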

Both platforms support a wide range of use cases, as demonstrated by the performance benchmarking across key categories in the remainder of this page.

Testing Criteria

Disclaimer

All Large Language Models (LLMs) referenced in this evaluation were tested and assessed by a human evaluator using qualitative methods only. No rigorous or formal benchmarking procedures were used during testing. Consequently, the findings and observations presented are based solely on subjective, qualitative data.

All models available on the HPCC have been benchmarked in the following five categories:

  • Coding:
    Evaluating the models' ability to generate and debug code in a variety of programming languages.

  • Language Translation:
    Assessing proficiency in converting text between different languages.

  • Math:
    Testing the models’ capability in handling mathematical problems and computations.

  • Reasoning:
    Measuring logical reasoning and problem-solving skills across various contexts.

  • Current / Historical Events:
    Analyzing the models’ understanding and contextual interpretation of both current and historical events.

When testing the models, these prompts were used. Based on the testing results and findings, the models across both LM Studio and OpenWebUI were ranked in each category as Optimal, Average, or Poor.
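
To illustrate how such a qualitative benchmark can be scripted, here is a rough sketch that sends one stand-in prompt per category to a few models through the Ollama HTTP API. The prompts and model names are hypothetical examples, not the actual prompt set linked above.

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"

# Hypothetical stand-in prompts, one per benchmark category; the actual
# prompt set used in this evaluation is the one linked above.
CATEGORY_PROMPTS = {
    "Coding": "Write a Python function that checks whether a string is a palindrome.",
    "Language Translation": "Translate 'Good morning, how are you?' into French and German.",
    "Math": "What is the derivative of x^3 * sin(x)?",
    "Reasoning": "If all Bloops are Razzies and all Razzies are Lazzies, are all Bloops Lazzies?",
    "Current / Historical Events": "Summarize the causes of World War I in three sentences.",
}

MODELS = ["llama3.1:8b", "gemma2:9b"]  # example model names

for model in MODELS:
    for category, prompt in CATEGORY_PROMPTS.items():
        resp = requests.post(
            OLLAMA_URL,
            json={"model": model, "prompt": prompt, "stream": False},
            timeout=300,
        )
        resp.raise_for_status()
        # A human evaluator reads each response and assigns Optimal,
        # Average, or Poor for that model/category pair.
        print(f"[{model}] {category}:\n{resp.json()['response']}\n")
```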

This spreadsheet was used to evaluate the LLMs available on ICER's HPCC. It includes detailed information for each tested model, such as token information, parameter sizes, and configuration settings.

Model Rankings by Category

LM Studio

| Model Name | Coding | Reasoning | Languages | Math | Current & Historical Events |
| --- | --- | --- | --- | --- | --- |
| Meta-Llama-3.1-70B | Optimal | Average | Optimal | Optimal | Optimal |
| Meta-Llama-3.1-8B | Optimal | Average | Average | Optimal | Optimal |
| Llama-3.2-1B | Optimal | Poor | Average | Optimal | Average |
| Llama-3.2-3B | Optimal | Average | Average | Optimal | Optimal |
| Llama-3.3-70B | Optimal | Average | Optimal | Optimal | Optimal |
| stable-code | Optimal | Poor | Average | Optimal | Average |
| mathstral-7B-v0.1 | Optimal | Average | Average | Optimal | Average |
| Codestral-22B-v0.1 | Optimal | Average | Average | Optimal | Poor |
| DeepSeek-Coder-V2-Lite | Optimal | Poor | Average | Optimal | Average |
| deepseek-math-7B | Optimal | Poor | Poor | Optimal | Average |
| gemma-2-27B | Average | Average | Poor | Average | Optimal |
| gemma-2-2B | Optimal | Poor | Poor | Average | Poor |
| gemma-2-9B | Average | Poor | Poor | Average | Poor |
| Mistral-7B | Average | Poor | Poor | Average | Poor |
| Mistral-Nemo | Average | Average | Poor | Average | Average |
| Phi-3.1-mini-128k | Optimal | Average | Average | Optimal | Optimal |
| Yi-Coder-9B-Chat | Average | Poor | Average | Average | Poor |
| Hermes-3-Llama-3.1-8B | Optimal | Optimal | Average | Optimal | Optimal |
| Hermes-3-Llama-3.2-3B | Optimal | Optimal | Average | Optimal | Optimal |
| Qwen2.5-Coder-14B | Optimal | Average | Average | Optimal | Optimal |
| Qwen2.5-Coder-32B | Optimal | Poor | Average | Optimal | Optimal |
| Qwen2.5-Coder-3B | Optimal | Average | Average | Optimal | Optimal |
| CodeLlama-13B | Optimal | Average | Average | Optimal | Average |
| CodeLlama-70B | Optimal | Average | Average | Optimal | Average |
| Deepseek-R1-8B | Optimal | Optimal | Optimal | Optimal | Optimal |

OpenWebUI

*Note: The deepseek-coder models are limited to programming, computer science, and math-related queries.

| Model Name | Coding | Reasoning | Languages | Math | Current & Historical Events |
| --- | --- | --- | --- | --- | --- |
| codegemma:2b | Optimal | Poor | Poor | Optimal | Poor |
| codegemma:7b | Optimal | Poor | Poor | Optimal | Poor |
| codegemma:13b | Optimal | Poor | Poor | Optimal | Poor |
| codegemma:34b | Optimal | Poor | Average | Optimal | Average |
| codegemma:70b | Optimal | Poor | Average | Average | Average |
| codellama:7b | Optimal | Poor | Average | Average | Poor |
| * deepseek-coder-v2:16b | Optimal | Poor | Poor | Optimal | Poor |
| * deepseek-coder:1.3b | Optimal | Poor | Poor | Optimal | Poor |
| * deepseek-coder:6.7b | Optimal | Poor | Poor | Optimal | Poor |
| gemma:2b | Optimal | Optimal | Average | Optimal | Average |
| gemma:7b | Optimal | Optimal | Optimal | Average | Optimal |
| gemma2:27b | Optimal | Optimal | Average | Optimal | Average |
| gemma2:2b | Average | Average | Average | Average | Poor |
| gemma2:9b | Optimal | Average | Average | Optimal | Average |
| hermes3:70b | Optimal | Average | Average | Optimal | Optimal |
| hermes3:8b | Optimal | Optimal | Average | Optimal | Optimal |
| llama2:13b | Optimal | Poor | Average | Optimal | Optimal |
| llama2:70b | Optimal | Optimal | Optimal | Optimal | Optimal |
| llama2:7b | Optimal | Average | Optimal | Average | Optimal |
| llama3.1:70b | Optimal | Optimal | Average | Optimal | Optimal |
| llama3.1:8b | Optimal | Average | Average | Optimal | Optimal |
| llama3:70b | Optimal | Optimal | Optimal | Optimal | Optimal |
| llama3:8b | Optimal | Average | Average | Optimal | Optimal |
| mistral-nemo:12b | Average | Average | Poor | Average | Optimal |
| mistral-small:22b | Optimal | Average | Average | Optimal | Optimal |
| phi3.5:3.8b | Optimal | Average | Optimal | Average | Optimal |
| phi3:14b | Optimal | Average | Average | Optimal | Average |
| phi:2.7b | Optimal | Average | Average | Optimal | Poor |
| stable-code:3b | Optimal | Average | Average | Optimal | Optimal |