Invest In Crypto News

NVIDIA Surpasses 1,000 TPS/User with Llama 4 Maverick and Blackwell GPUs

by CryptoExpert
May 24, 2025
in Blockchain News
Lawrence Jengar
May 23, 2025 02:10

NVIDIA achieves a world-record inference speed of over 1,000 TPS/user using Blackwell GPUs and Llama 4 Maverick, setting a new standard for AI model performance.





NVIDIA has broken the 1,000 tokens per second (TPS) per user barrier for large language model (LLM) inference, running the Llama 4 Maverick model on Blackwell GPUs. The result was independently verified by the AI benchmarking service Artificial Analysis.

Technological Advancements

The record was set on a single NVIDIA DGX B200 node with eight NVIDIA Blackwell GPUs, which delivered over 1,000 TPS per user on Llama 4 Maverick, a 400-billion-parameter model. According to NVIDIA, Blackwell suits Llama 4 deployments tuned for either goal: minimum latency, as in this record, or maximum throughput, where it reaches up to 72,000 TPS per server.
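To put the figures above in perspective, here is a small back-of-the-envelope sketch. The helper names are hypothetical, and the two headline numbers come from different configurations (low latency versus high throughput), so they should not be read as holding at the same time.

```python
def per_token_latency_ms(tps_per_user: float) -> float:
    """Average time between tokens seen by one user, in milliseconds."""
    return 1000.0 / tps_per_user


def per_gpu_throughput(tps_per_server: float, gpus_per_server: int) -> float:
    """Server-wide token throughput divided evenly across its GPUs."""
    return tps_per_server / gpus_per_server


def response_time_s(num_tokens: int, tps_per_user: float) -> float:
    """Time to stream a response of num_tokens at a steady per-user rate."""
    return num_tokens / tps_per_user


# Low-latency configuration: 1,000 TPS/user is the reported lower bound.
latency_ms = per_token_latency_ms(1000.0)

# High-throughput configuration: 72,000 TPS/server on an eight-GPU node.
tps_per_gpu = per_gpu_throughput(72_000.0, 8)

# Illustrative assumption: a 500-token answer streamed at 1,000 TPS/user.
answer_time = response_time_s(500, 1000.0)

print(latency_ms, tps_per_gpu, answer_time)
```

At 1,000 TPS per user, each token arrives about one millisecond after the previous one, which is why the per-user figure is treated as a latency metric rather than a capacity metric.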

Optimization Techniques

NVIDIA applied extensive software optimizations in TensorRT-LLM to make full use of the Blackwell GPUs, and trained a speculative decoding draft model using EAGLE-3 techniques. Together, these yield a fourfold speed-up over previous baselines. The optimizations preserve response accuracy: FP8 data types are used for operations such as GEMMs and Mixture of Experts (MoE), with accuracy comparable to BF16.

Importance of Low Latency

In generative AI applications, throughput and latency must be balanced against each other. For applications that depend on rapid decision-making, latency matters most, and the TPS/user record shows Blackwell GPUs performing well in that regime. Because the same hardware also supports high-throughput configurations, it suits a wide range of AI tasks.


CUDA Kernels and Speculative Decoding

NVIDIA optimized CUDA kernels for GEMM, MoE, and Attention operations, using spatial partitioning and efficient memory data loading to maximize performance. Speculative decoding was used to accelerate LLM inference: a smaller, faster draft model proposes several tokens ahead, and the larger target LLM verifies them. The more often the draft model's predictions are accepted, the larger the speed-up.
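The draft-and-verify loop described above can be illustrated with a minimal greedy sketch. This is not NVIDIA's EAGLE-3 or TensorRT-LLM implementation: `target_next` and `draft_next` are hypothetical toy functions standing in for real models, and the toy draft is defined in terms of the target purely to control how often it is right.

```python
from typing import Callable, List, Tuple

Token = int
Model = Callable[[List[Token]], Token]


def target_next(ctx: List[Token]) -> Token:
    """Toy stand-in for the large target model (deterministic, greedy)."""
    return (sum(ctx) * 31 + len(ctx)) % 50


def draft_next(ctx: List[Token]) -> Token:
    """Toy draft model: wrong whenever the context length is a multiple of 4."""
    guess = target_next(ctx)
    return guess if len(ctx) % 4 else (guess + 1) % 50


def greedy_decode(model: Model, prompt: List[Token], n_new: int) -> List[Token]:
    """Baseline: one model call per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(model(out))
    return out


def speculative_decode(target: Model, draft: Model, prompt: List[Token],
                       n_new: int, k: int = 3) -> Tuple[List[Token], int]:
    """Greedy speculative decoding; returns the tokens and the number of
    verification passes (each would be one batched target forward pass)."""
    out = list(prompt)
    goal = len(prompt) + n_new
    target_passes = 0
    while len(out) < goal:
        # 1. The draft model proposes k tokens autoregressively.
        proposal: List[Token] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))
        # 2. The target checks the proposal; counted here as a single pass.
        target_passes += 1
        accepted: List[Token] = []
        for tok in proposal:
            expected = target(out + accepted)
            if tok != expected:
                accepted.append(expected)  # replace the first mismatch
                break
            accepted.append(tok)
        else:
            accepted.append(target(out + accepted))  # bonus token
        out.extend(accepted)
    return out[:goal], target_passes


prompt = [1, 2, 3]
baseline = greedy_decode(target_next, prompt, 20)
spec, passes = speculative_decode(target_next, draft_next, prompt, 20)
print(spec == baseline, passes)
```

Because every emitted token is either confirmed or supplied by the target, the output matches plain greedy decoding with the target alone; the gain is that each verification pass can emit several tokens instead of one.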

Programmatic Dependent Launch

To further improve performance, NVIDIA used Programmatic Dependent Launch (PDL) to reduce GPU idle time between consecutive CUDA kernels. PDL lets a kernel begin executing before the preceding kernel has finished, so the two overlap. This raises GPU utilization and closes the idle gaps between kernels.
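The effect of overlapping consecutive kernels can be sketched with a toy timeline model. This is not CUDA code and does not use the PDL API; the kernel durations and launch gap are made-up numbers, and the model assumes that a kernel's launch overhead and prologue (setup that does not depend on the previous kernel's output) can be hidden behind the previous kernel's main body.

```python
from typing import List, Tuple

# Each kernel is (prologue_us, body_us); all values are illustrative.
Kernel = Tuple[float, float]


def serial_time(kernels: List[Kernel], launch_gap_us: float) -> float:
    """Every kernel waits for the previous one to finish completely."""
    return sum(launch_gap_us + pro + body for pro, body in kernels)


def overlapped_time(kernels: List[Kernel], launch_gap_us: float) -> float:
    """Launch gap and prologue run concurrently with the previous body;
    only the part that does not fit behind that body stays exposed."""
    total = 0.0
    prev_body = 0.0
    for pro, body in kernels:
        hideable = launch_gap_us + pro
        exposed = max(0.0, hideable - prev_body)
        total += exposed + body
        prev_body = body
    return total


kernels = [(2.0, 10.0), (3.0, 8.0), (1.0, 1.0), (2.0, 5.0)]
gap_us = 4.0

t_serial = serial_time(kernels, gap_us)
t_overlap = overlapped_time(kernels, gap_us)
print(t_serial, t_overlap)
```

In this model the saving is largest when kernels are short relative to their launch overhead, which is typical of low-latency decoding where each step runs many small kernels back to back.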

NVIDIA’s achievements underscore its leadership in AI infrastructure and data center technology, setting new standards for speed and efficiency in AI model deployment. The innovations in Blackwell architecture and software optimization continue to push the boundaries of what’s possible in AI performance, ensuring responsive, real-time user experiences and robust AI applications.

For more detailed information, visit the NVIDIA official blog.

Image source: Shutterstock



© Copyright 2024 InvestInCryptoNews.com
