Google’s Gemini 1.5 Pro dethrones GPT-4o

Google’s experimental Gemini 1.5 Pro model has surpassed OpenAI’s GPT-4o on the LMSYS Chatbot Arena, a widely followed generative AI leaderboard.

Until now, OpenAI’s GPT-4o and Anthropic’s Claude-3 have dominated the landscape. However, the latest version of Gemini 1.5 Pro appears to have taken the lead.

One of the most widely recognised benchmarks in the AI community is the LMSYS Chatbot Arena, which ranks models using crowdsourced head-to-head comparisons and assigns each an overall score. On this leaderboard, GPT-4o achieved a score of 1,286, while Claude-3 secured 1,271. A previous iteration of Gemini 1.5 Pro had scored 1,261.

The experimental version of Gemini 1.5 Pro (designated as Gemini 1.5 Pro 0801) surpassed its closest rivals with an impressive score of 1,300. This significant improvement suggests that Google’s latest model may possess greater overall capabilities than its competitors.
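
To put these gaps in perspective, the short sketch below converts the reported scores into an approximate head-to-head preference rate. It assumes the leaderboard scores behave like Elo ratings on the conventional 400-point logistic scale, which the figures above do not themselves confirm; the function name `expected_win_rate` is hypothetical, and ties are ignored.

```python
def expected_win_rate(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Elo-style probability that model A is preferred over model B.

    Assumption: scores follow the standard Elo logistic curve with a
    400-point scale. This is an illustration, not the leaderboard's own code.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Scores as reported in the article.
scores = {
    "GPT-4o": 1286,
    "Claude-3": 1271,
    "Gemini 1.5 Pro (previous)": 1261,
}
leader_name, leader_score = "Gemini 1.5 Pro 0801", 1300

for rival, rival_score in scores.items():
    rate = expected_win_rate(leader_score, rival_score)
    print(f"{leader_name} vs {rival}: {rate:.1%} expected preference rate")
```

Under that assumption, a 14-point lead over GPT-4o corresponds to being preferred in roughly 52% of direct comparisons, so the lead is real but narrow.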

It’s worth noting that while benchmarks provide valuable insights into an AI model’s performance, they may not always accurately represent the full spectrum of its abilities or limitations in real-world applications.

Although Gemini 1.5 Pro 0801 is currently available, its experimental label suggests that Google may still make adjustments or even withdraw the model for safety or alignment reasons.

This development marks a significant milestone in the ongoing race for AI supremacy among tech giants. Google’s ability to surpass OpenAI and Anthropic in benchmark scores demonstrates the rapid pace of innovation in the field and the intense competition driving these advancements.

As the AI landscape continues to evolve, it will be interesting to see how OpenAI and Anthropic respond to this challenge from Google. Will they be able to reclaim their positions at the top of the leaderboard, or has Google established a new standard for generative AI performance?

(Photo by Yuliya Strizhkina)
