Big Techday 25: Talk on "How to run your LLM and how to benchmark it"
March 6th, 2026
In this talk, our colleagues share how TNG operates its in-house GPU cluster of more than 50 GPUs and how we evaluate Large Language Models (LLMs).
Key aspects covered:
• Hardware requirements and common pitfalls of an in‑house AI data center
• Performance optimizations for hundreds of concurrent users
• Practical methods to assess LLMs
• Balancing latency, cost, and output quality when selecting models
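The latency-versus-throughput tradeoff mentioned above can be measured with a simple timing harness. The sketch below is a minimal illustration, not the benchmarking setup used in the talk; `generate` is a hypothetical stand-in for a real LLM endpoint call.

```python
import time

def benchmark_generation(generate, prompt, runs=3):
    """Time a token-generating callable and report average latency
    and throughput. `generate` is any function that takes a prompt
    and returns a list of tokens (a hypothetical LLM call)."""
    latencies = []
    total_tokens = 0
    for _ in range(runs):
        start = time.perf_counter()
        tokens = generate(prompt)
        latencies.append(time.perf_counter() - start)
        total_tokens += len(tokens)
    total_time = sum(latencies)
    return {
        "avg_latency_s": total_time / runs,
        "tokens_per_s": total_tokens / total_time,
    }

# Dummy model standing in for a real inference server.
def dummy_generate(prompt):
    time.sleep(0.01)  # simulate generation time
    return prompt.split() * 10

stats = benchmark_generation(dummy_generate, "hello world")
```

Under high concurrency, per-request latency typically rises while aggregate tokens per second grows, which is exactly the balance a serving setup has to tune.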
This talk was recorded at our Big Techday 25. You can find the full video on YouTube.
On May 22nd, Big Techday 26 will take place with talks on AI research and many other topics. For more information, see.