Tensorrt LLM Out of Memory

NVIDIA's new Hopper H200 AI GPU tested: 3x faster GenAI with TensorRT-LLM in MLPerf 4.0 results

Using these new TensorRT-LLM optimizations, NVIDIA has pulled out a huge 2.4x performance leap with its current H100 AI GPU in MLPerf Inference 3.1 to 4.0 with GPT-J tests using an offline scenario.

Some results have been hidden because they may be inaccessible to you

Show inaccessible results

NVIDIA's new Hopper H200 AI GPU tested: 3x faster GenAI with TensorRT-LLM in MLPerf 4.0 results

Trending now