New AI Models for the University AI Chat will be launched on August 25

Starting August 25, the Data Center will make two new models available for the AI chat.

GPT-OSS 120B and Qwen3 235B replace Nemotron Ultra 253B

The GPT-OSS 120B model is characterized by its particularly efficient use of computing resources and delivers results with comparatively low energy consumption. It responds quickly and reliably without placing unnecessary strain on servers.
GPT-OSS 120B provides solutions for various tasks, such as summarizing texts, answering knowledge questions, or generating creative content.

With reasoning enabled, Qwen3 235B delivers even better results for applications in science, technology, and software development. It impresses with advanced capabilities in solving complex problems, technical analysis, and code architecture. The variant without reasoning can be a valuable alternative when working with text or implementing code.

Overall, the models perform very well in key comparison categories and are among the best freely available models. A core selection of eight important benchmarks, which evaluate skills such as mathematical problem solving and programming tasks, can be found at https://artificialanalysis.ai/.

The following models are also available:
- Gemma3 27B for processing text and images
- Qwen3 Coder 30B for fast and accurate coding with low complexity

Standardized Limits for Processing Large Amounts of Text

The amount of text that the models can process at one time has been standardized. All models have around 64,000 tokens available, which is approximately 50,000 words. Tokens are small units into which the model breaks down text in order to understand it. The more tokens, the more information the chat can process and take into account at the same time.
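As an illustration, a token budget can be estimated from the word count. A minimal sketch, assuming a rule-of-thumb ratio of about 1.3 tokens per English word (the exact count depends on the model's tokenizer and the language):

```python
def estimate_tokens(text: str, tokens_per_word: float = 1.3) -> int:
    """Roughly estimate the token count of a text from its word count.

    The 1.3 tokens-per-word ratio is a common rule of thumb for English;
    the true count depends on the model's tokenizer.
    """
    return round(len(text.split()) * tokens_per_word)

def fits_context(text: str, limit: int = 64_000) -> bool:
    """Check whether a text is likely to fit within the 64,000-token limit."""
    return estimate_tokens(text) <= limit
```

By this estimate, roughly 50,000 English words land close to the 64,000-token limit, matching the figure quoted above.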

When working with language models, there is an upper limit to how much text can be processed at one time. When you upload a document, the system automatically selects only the text passages that are most relevant to your question. The selected passages are kept short enough to always remain within the permissible upper limit of 64,000 tokens.

The upper limit becomes relevant if, after uploading, you select the option to send the entire text of the document to the model at once. If the entire text exceeds the upper limit, the request is rejected and you receive an error message. The limit is also relevant for those who use our API interface for their own applications, since the maximum context size usually has to be configured there (output tokens do not count toward this limit).
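For API users, this check can also be done client-side before a request is sent. A minimal sketch, assuming an OpenAI-style chat completions endpoint and the rough ~1.3 tokens-per-word estimate (the URL, key, and model name are placeholders, not the actual service values):

```python
import json
import urllib.request

CONTEXT_LIMIT = 64_000  # maximum input context in tokens; output tokens not included

def build_chat_request(prompt: str, api_url: str, api_key: str,
                       model: str) -> urllib.request.Request:
    """Build a chat request, rejecting prompts that likely exceed the limit."""
    estimated = round(len(prompt.split()) * 1.3)  # rough token estimate
    if estimated > CONTEXT_LIMIT:
        raise ValueError(f"Prompt is ~{estimated} tokens, over the "
                         f"{CONTEXT_LIMIT}-token context limit.")
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(api_url, data=payload, headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    })
```

Rejecting an oversized prompt locally gives a clearer error than waiting for the server to refuse the request.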

New Features and Adjustments in Recent Months

We are continuously updating and optimizing the system to offer you the best possible AI chat platform. The following summary shows the most important user-facing changes of recent months.

Optimized web search mechanism: Our search function has been redesigned to deliver relevant results even faster and avoid technical issues.

Advanced Features

  • Embedding API: Developers can use our embedding model bge-m3 to integrate their applications even better with our system.
  • OCR function: Text from scanned documents is also extracted after uploading and made available to the language model (may take 1-2 minutes).
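A minimal sketch of calling the embedding model, assuming an OpenAI-style /v1/embeddings endpoint and response shape (both are assumptions here; consult the service documentation for the actual interface):

```python
import json
import urllib.request

def build_embedding_request(texts: list[str], api_url: str, api_key: str,
                            model: str = "bge-m3") -> urllib.request.Request:
    """Build a request for an OpenAI-style embeddings endpoint."""
    payload = json.dumps({"model": model, "input": texts}).encode()
    return urllib.request.Request(api_url, data=payload, headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    })

def embed(texts: list[str], api_url: str, api_key: str) -> list[list[float]]:
    """Send the request and return one embedding vector per input text."""
    req = build_embedding_request(texts, api_url, api_key)
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return [item["embedding"] for item in data["data"]]
```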

Advanced Language Features

  • Speech-to-text/transcription with Whisper: Create transcripts from audio and video files with high accuracy, or speak prompts directly into a microphone.
  • Text-to-speech (English) with Kokoro: Have English texts read aloud.
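For developers, a transcription request to a Whisper endpoint is typically a multipart/form-data upload. A minimal sketch of building such a body with the standard library (the field names "model" and "file" follow the common OpenAI-style API and are an assumption here):

```python
import io
import uuid

def build_transcription_body(file_name: str, audio_bytes: bytes,
                             model: str = "whisper-1") -> tuple[bytes, str]:
    """Build a multipart/form-data body for an audio transcription upload.

    Returns the encoded body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()
    # Plain "model" form field.
    buf.write(f'--{boundary}\r\nContent-Disposition: form-data; '
              f'name="model"\r\n\r\n{model}\r\n'.encode())
    # Audio file field, including the original filename.
    buf.write(f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
              f'filename="{file_name}"\r\n'
              f'Content-Type: application/octet-stream\r\n\r\n'.encode())
    buf.write(audio_bytes)
    buf.write(f"\r\n--{boundary}--\r\n".encode())
    return buf.getvalue(), f"multipart/form-data; boundary={boundary}"
```

The returned Content-Type value must be sent as the request's Content-Type header so the server can locate the boundary.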

User Interface & Login

  • The user interface has been improved and the buttons are easier to find.
  • Login now takes place via login.rlp.net.

For more information about the models and other important information, please visit our website: https://www.en-zdv.uni-mainz.de/ai-at-jgu/



More news from the Data Center may be found here.
