Embedding Models and Knowledge Base Benchmarking

Embedding Models

Embedding models convert text into numerical vectors that capture meaning and relationships. In the larger LLM ecosystem, they power fast semantic search and serve as the bridge between raw text and the generator by pulling in the most relevant information. They are a major component of Retrieval-Augmented Generation (RAG) systems.
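The core idea can be sketched with toy vectors. The 3-dimensional "embeddings" below are made up for illustration (a real model such as sentence-transformers/all-MiniLM-L6-v2 produces 384-dimensional vectors), but the similarity computation is the same one semantic search relies on:

```python
from math import sqrt

# Toy 3-dimensional "embeddings" -- hypothetical values for illustration only.
# A real embedding model maps each text to a much longer vector (e.g. 384 dims).
vectors = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.85, 0.2, 0.05],
    "car": [0.1, 0.05, 0.95],
}

def cosine(a, b):
    """Cosine similarity: near 1.0 for vectors pointing the same way, near 0.0 for unrelated ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm

# Semantically related texts land close together in the vector space,
# so "cat" vs. "kitten" scores higher than "cat" vs. "car".
print(cosine(vectors["cat"], vectors["kitten"]))
print(cosine(vectors["cat"], vectors["car"]))
```

Semantic search is just this comparison repeated: embed the query, embed every document, and return the documents whose vectors score highest.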

How to Load Embedding Models in OpenWebUI (RAG Setup)

Embedding models are core components of RAG techniques, which supplement LLMs with information beyond the scope of their training data. The store from which this supplemental information is retrieved is often called a knowledge base. In OpenWebUI, you can choose the embedding model used for your knowledge base.
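A minimal sketch of the retrieval step looks like the following. The word-count "embedding" here is a deliberately crude stand-in for a real embedding model, and the documents are hypothetical examples of knowledge-base content:

```python
from collections import Counter
from math import sqrt

# Hypothetical knowledge base; in OpenWebUI these would be your uploaded documents.
knowledge_base = [
    "The HPCC login nodes are for light interactive work only.",
    "Submit batch jobs to the cluster with the sbatch command.",
    "OpenWebUI stores uploaded documents in a knowledge base.",
]

def embed(text):
    """Toy word-count 'embedding' -- a stand-in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, k=1):
    """Rank knowledge-base documents by similarity to the query; return the top k."""
    q = embed(query)
    ranked = sorted(knowledge_base, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

# RAG then prepends the retrieved text to the question before calling the LLM.
question = "how do I submit a batch job?"
context = retrieve(question)[0]
prompt = f"Context: {context}\n\nQuestion: {question}"
print(prompt)
```

A production system replaces `embed` with a real model (such as the ones listed later on this page) and stores the precomputed document vectors in a vector index rather than re-embedding on every query.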

Follow the steps below to configure your embedding model:

  1. Load Your Knowledge Base

    • Launch OpenWebUI via OnDemand.
    • In the left menu, select Workspace.
    • Then, click the "Knowledge" header.
    • In the upper-right corner, click the ➕ “Create” icon.
    • Fill out the information to set up a new Knowledge Base.
  2. Access the Admin Panel

    • Return to the homepage, then click the person icon in the top right.
    • From the dropdown, select "Admin Panel."
  3. Set Your Embedding Model

    • In the admin panel, in the Settings tab, select Documents in the left menu.
    • Scroll to the “Embedding Model” section.
    • The default preloaded model is sentence-transformers/all-MiniLM-L6-v2.
    • To use a different model:
      • Browse Hugging Face’s embedding model library (see the table of supported models towards the bottom of this page).
      • Copy the full model name (e.g., sentence-transformers/paraphrase-MiniLM-L6-v2).
      • Paste it into the embedding model field.
      • Click the download icon at the right end of the model input line.

Once downloaded, the new embedding model will be active and ready to use with your knowledge base.

Embedding Models and Datasets Available on the HPCC

These models convert text into vectors that RAG uses to retrieve relevant information.

Note: HPCC currently supports embedding models with a maximum embedding dimension of 384, or models that have been explicitly verified to work despite having higher documented limits.

| Model Name | Architecture | Embedding Dimensions | Language Support | Notes |
|---|---|---|---|---|
| sentence-transformers/all-MiniLM-L6-v2 | MiniLM | 384 | English | Default in OpenWebUI; very fast & light |
| sentence-transformers/paraphrase-MiniLM-L6-v2 | MiniLM | 384 | English | Optimized for high-quality sentence similarity |
| sentence-transformers/multi-qa-MiniLM-L6-cos-v1 | MiniLM | 384 | Multilingual | Tuned specifically for multilingual QA |
| sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | MiniLM | 384 | Multilingual | Embedding Dimensions of 512; confirmed to run reliably |
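The support rule from the note above can be expressed as a small check: a model qualifies if its embedding dimension is at most 384, or if it has been explicitly verified to work despite a higher documented limit. This is a sketch of that rule, not an official HPCC tool:

```python
# Models from the table above, with their listed embedding dimensions.
models = {
    "sentence-transformers/all-MiniLM-L6-v2": 384,
    "sentence-transformers/paraphrase-MiniLM-L6-v2": 384,
    "sentence-transformers/multi-qa-MiniLM-L6-cos-v1": 384,
    "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2": 384,
}

HPCC_MAX_DIM = 384  # current HPCC support limit, per the note above

def is_supported(dim, verified=False):
    """True if the dimension fits the HPCC limit, or the model has
    been explicitly verified to work despite a higher documented limit."""
    return dim <= HPCC_MAX_DIM or verified

supported = [name for name, dim in models.items() if is_supported(dim)]
print(supported)
```

Running this confirms that every model in the table passes the dimension check; a model with a higher dimension would need the `verified` flag (i.e., explicit confirmation from the HPCC) before use.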