(2025-05-13) Lab Notebook: AI Models Available on the HPCC

Models on the HPCC

ICER offers two primary platforms for working with large language models on the HPCC: LM Studio and OpenWebUI. Both platforms can be accessed via the command line interface or through OnDemand.

OpenWebUI: ICER offers 30 preloaded models, sourced from the Ollama Library. OpenWebUI also lets you run multiple models simultaneously, allowing you to compare their outputs in real time and choose the best-performing model. You can explore the Ollama Library on their website.
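
Beyond the web interface, Ollama (which backs OpenWebUI) exposes a simple HTTP API that you can script against from the command line. Below is a minimal sketch; the host, port, and model names are illustrative assumptions, not ICER-specific settings.

```python
import requests

# Minimal sketch: query models served through Ollama (which backs
# OpenWebUI) over its HTTP API. The host, port, and model names below
# are illustrative assumptions, not ICER-specific settings.
OLLAMA_URL = "http://localhost:11434/api/generate"

def ask(model: str, prompt: str) -> str:
    """Send one prompt to one model and return the full response text."""
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]

# Send the same prompt to two models, mirroring OpenWebUI's
# side-by-side comparison feature.
for model in ["llama3.1:8b", "gemma2:9b"]:
    print(f"--- {model} ---")
    print(ask(model, "Explain binary search in two sentences."))
```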

LM Studio: ICER offers 25 models. In LM Studio, it is easy to search the model library and find the models that best fit your Generative AI needs.

You can explore the LM Studio model library on their website.
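
LM Studio can likewise serve its loaded models through a local, OpenAI-compatible server. The sketch below assumes the server is running on LM Studio's default port (1234); the endpoint and model identifier are assumptions to adapt to your own session.

```python
import requests

# Minimal sketch: chat with a model loaded in LM Studio through its
# local, OpenAI-compatible server (default port 1234). The endpoint
# and model identifier are assumptions; adapt them to your session.
LMSTUDIO_URL = "http://localhost:1234/v1/chat/completions"

payload = {
    "model": "Meta-Llama-3.1-8B",  # assumed name; use whichever model you loaded
    "messages": [
        {"role": "user", "content": "Write a Python one-liner that reverses a string."}
    ],
    "temperature": 0.2,
}

resp = requests.post(LMSTUDIO_URL, json=payload, timeout=300)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```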

Both platforms support a wide range of use cases, as demonstrated by the performance benchmarking across key categories in the remainder of this page.

Testing Criteria

Disclaimer

All Large Language Models (LLMs) referenced in this evaluation were tested and assessed by a human evaluator using qualitative methods only. No rigorous or formal benchmarking procedures were used during testing. Consequently, the findings and observations presented are based solely on subjective, qualitative data.

All models available on the HPCC have been benchmarked in the following five categories:

  • Coding:
    Evaluating the models' ability to generate and debug code in a variety of programming languages.

  • Language Translation:
    Assessing proficiency in converting text between different languages.

  • Math:
    Testing the models’ capability in handling mathematical problems and computations.

  • Reasoning:
    Measuring logical reasoning and problem-solving skills across various contexts.

  • Current / Historical Events:
    Analyzing the models’ understanding and contextual interpretation of both current and historical events.

When testing the models, these prompts were used. Based on the testing results and findings, the models across both LM Studio and OpenWebUI were ranked in each category as Optimal, Average, or Poor.
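
To illustrate how such a qualitative benchmark can be scripted, here is a rough sketch that sends one stand-in prompt per category to a few models through the Ollama HTTP API. The prompts and model names are hypothetical examples, not the actual prompt set linked above.

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"

# Hypothetical stand-in prompts, one per benchmark category; the actual
# prompt set used in this evaluation is the one linked above.
CATEGORY_PROMPTS = {
    "Coding": "Write a Python function that checks whether a string is a palindrome.",
    "Language Translation": "Translate 'Good morning, how are you?' into French and German.",
    "Math": "What is the derivative of x^3 * sin(x)?",
    "Reasoning": "If all Bloops are Razzies and all Razzies are Lazzies, are all Bloops Lazzies?",
    "Current / Historical Events": "Summarize the causes of World War I in three sentences.",
}

MODELS = ["llama3.1:8b", "gemma2:9b"]  # example model names

for model in MODELS:
    for category, prompt in CATEGORY_PROMPTS.items():
        resp = requests.post(
            OLLAMA_URL,
            json={"model": model, "prompt": prompt, "stream": False},
            timeout=300,
        )
        resp.raise_for_status()
        # A human evaluator reads each response and assigns Optimal,
        # Average, or Poor for that model/category pair.
        print(f"[{model}] {category}:\n{resp.json()['response']}\n")
```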

This spreadsheet was used to evaluate the LLMs available on ICER's HPCC. It includes detailed information for each tested model, such as token information, parameter sizes, and configuration settings.

Model Rankings by Category

LM Studio

| Model Name | Coding | Reasoning | Languages | Math | Current & Historical Events |
| --- | --- | --- | --- | --- | --- |
| Meta-Llama-3.1-70B | Optimal | Average | Optimal | Optimal | Optimal |
| Meta-Llama-3.1-8B | Optimal | Average | Average | Optimal | Optimal |
| Llama-3.2-1B | Optimal | Poor | Average | Optimal | Average |
| Llama-3.2-3B | Optimal | Average | Average | Optimal | Optimal |
| Llama-3.3-70B | Optimal | Average | Optimal | Optimal | Optimal |
| stable-code | Optimal | Poor | Average | Optimal | Average |
| mathstral-7B-v0.1 | Optimal | Average | Average | Optimal | Average |
| Codestral-22B-v0.1 | Optimal | Average | Average | Optimal | Poor |
| DeepSeek-Coder-V2-Lite | Optimal | Poor | Average | Optimal | Average |
| deepseek-math-7B | Optimal | Poor | Poor | Optimal | Average |
| gemma-2-27B | Average | Average | Poor | Average | Optimal |
| gemma-2-2B | Optimal | Poor | Poor | Average | Poor |
| gemma-2-9B | Average | Poor | Poor | Average | Poor |
| Mistral-7B | Average | Poor | Poor | Average | Poor |
| Mistral-Nemo | Average | Average | Poor | Average | Average |
| Phi-3.1-mini-128k | Optimal | Average | Average | Optimal | Optimal |
| Yi-Coder-9B-Chat | Average | Poor | Average | Average | Poor |
| Hermes-3-Llama-3.1-8B | Optimal | Optimal | Average | Optimal | Optimal |
| Hermes-3-Llama-3.2-3B | Optimal | Optimal | Average | Optimal | Optimal |
| Qwen2.5-Coder-14B | Optimal | Average | Average | Optimal | Optimal |
| Qwen2.5-Coder-32B | Optimal | Poor | Average | Optimal | Optimal |
| Qwen2.5-Coder-3B | Optimal | Average | Average | Optimal | Optimal |
| CodeLlama-13B | Optimal | Average | Average | Optimal | Average |
| CodeLlama-70B | Optimal | Average | Average | Optimal | Average |
| Deepseek-R1-8B | Optimal | Optimal | Optimal | Optimal | Optimal |

OpenWebUI

*Note: The deepseek-coder models are limited to programming, computer science, and math-related queries.

| Model Name | Coding | Reasoning | Languages | Math | Current & Historical Events |
| --- | --- | --- | --- | --- | --- |
| codegemma:2b | Optimal | Poor | Poor | Optimal | Poor |
| codegemma:7b | Optimal | Poor | Poor | Optimal | Poor |
| codegemma:13b | Optimal | Poor | Poor | Optimal | Poor |
| codegemma:34b | Optimal | Poor | Average | Optimal | Average |
| codegemma:70b | Optimal | Poor | Average | Average | Average |
| codellama:7b | Optimal | Poor | Average | Average | Poor |
| * deepseek-coder-v2:16b | Optimal | Poor | Poor | Optimal | Poor |
| * deepseek-coder:1.3b | Optimal | Poor | Poor | Optimal | Poor |
| * deepseek-coder:6.7b | Optimal | Poor | Poor | Optimal | Poor |
| gemma:2b | Optimal | Optimal | Average | Optimal | Average |
| gemma:7b | Optimal | Optimal | Optimal | Average | Optimal |
| gemma2:27b | Optimal | Optimal | Average | Optimal | Average |
| gemma2:2b | Average | Average | Average | Average | Poor |
| gemma2:9b | Optimal | Average | Average | Optimal | Average |
| hermes3:70b | Optimal | Average | Average | Optimal | Optimal |
| hermes3:8b | Optimal | Optimal | Average | Optimal | Optimal |
| llama2:13b | Optimal | Poor | Average | Optimal | Optimal |
| llama2:70b | Optimal | Optimal | Optimal | Optimal | Optimal |
| llama2:7b | Optimal | Average | Optimal | Average | Optimal |
| llama3.1:70b | Optimal | Optimal | Average | Optimal | Optimal |
| llama3.1:8b | Optimal | Average | Average | Optimal | Optimal |
| llama3:70b | Optimal | Optimal | Optimal | Optimal | Optimal |
| llama3:8b | Optimal | Average | Average | Optimal | Optimal |
| mistral-nemo:12b | Average | Average | Poor | Average | Optimal |
| mistral-small:22b | Optimal | Average | Average | Optimal | Optimal |
| phi3.5:3.8b | Optimal | Average | Optimal | Average | Optimal |
| phi3:14b | Optimal | Average | Average | Optimal | Average |
| phi:2.7b | Optimal | Average | Average | Optimal | Poor |
| stable-code:3b | Optimal | Average | Average | Optimal | Optimal |