Google unveils Gemini 2.0 Flash Thinking to rival OpenAI o1


In its latest push to redefine the AI landscape, Google has announced Gemini 2.0 Flash Thinking, a multimodal reasoning model capable of tackling complex problems with both speed and transparency.

In a post on the social network X, Google CEO Sundar Pichai described it as “Our most thoughtful model yet:)”

In its developer documentation, Google explains, “Thinking Mode is capable of stronger reasoning capabilities in its responses than the base Gemini 2.0 Flash model.” That base model was previously Google’s latest and greatest, released only eight days ago.

The new model supports just 32,000 tokens of input (about 50-60 pages’ worth of text) and can produce up to 8,000 tokens per response. In a side panel on Google AI Studio, the company claims it is best for “multimodal understanding, reasoning” and “coding.”
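For a rough sense of how a token limit maps to pages, here is a minimal sketch in Python. The conversion factors (about 0.75 English words per token and about 450 words per page) are common rules of thumb assumed for illustration, not figures published by Google, and `estimate_pages` is a hypothetical helper.

```python
def estimate_pages(tokens: int,
                   words_per_token: float = 0.75,  # assumed rule of thumb
                   words_per_page: int = 450) -> float:  # assumed page density
    """Convert a token budget into an approximate page count."""
    words = tokens * words_per_token
    return words / words_per_page


input_pages = estimate_pages(32_000)   # input limit cited in the article
output_pages = estimate_pages(8_000)   # output limit cited in the article
print(f"Input: ~{input_pages:.0f} pages, output: ~{output_pages:.0f} pages")
```

Under these assumptions the 32,000-token input limit lands inside the 50-60 page range cited above; actual token counts depend on the tokenizer and the text.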

Full details of the model’s training process, architecture, licensing, and costs have yet to be released. Right now, it shows zero cost per token in Google AI Studio.

Accessible and more transparent reasoning

Unlike competitor reasoning models o1 and o1 mini from OpenAI, Gemini 2.0 Flash Thinking lets users access its step-by-step reasoning through a dropdown menu, offering clearer, more transparent insight into how the model arrives at its conclusions.

By letting users see how decisions are made, Gemini 2.0 Flash Thinking addresses longstanding concerns about AI functioning as a “black box.” Although its licensing terms are still unclear, this visibility brings the model closer to parity with the open-source models fielded by competitors.

My early simple tests of the model showed it correctly and speedily (within one to three seconds) answered some questions that have been notoriously tricky for other AI models, such as counting the number of Rs in the word “Strawberry.” (See screenshot above).
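The correct answer to the letter-counting question is easy to verify deterministically. The snippet below is only a reference check written for this article, not anything the model runs; `count_letter` is a hypothetical helper name.

```python
def count_letter(word: str, letter: str) -> int:
    """Count occurrences of a single letter in a word, ignoring case."""
    return word.lower().count(letter.lower())


print(count_letter("Strawberry", "r"))  # prints 3
```

Language models often stumble here because they operate on tokens rather than individual characters, which is why the question has become a popular informal test.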

In another test, when comparing two decimal numbers (9.9 and 9.11), the model systematically broke the problem into smaller steps, from analyzing whole numbers to comparing decimal places.
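The stepwise comparison described above can be sketched as follows. This is an illustration of the general approach (whole-number part first, then fractional digits aligned to the same width), not a reproduction of the model's actual reasoning trace; `larger_decimal` is a hypothetical helper and assumes non-negative numbers written in plain decimal notation.

```python
def larger_decimal(a: str, b: str) -> str:
    """Return the larger of two non-negative decimal strings, or 'equal'."""
    a_whole, _, a_frac = a.partition(".")
    b_whole, _, b_frac = b.partition(".")

    # Step 1: compare the whole-number parts.
    if int(a_whole) != int(b_whole):
        return a if int(a_whole) > int(b_whole) else b

    # Step 2: pad the fractional parts to the same width (0.9 -> 0.90).
    width = max(len(a_frac), len(b_frac))
    a_frac = a_frac.ljust(width, "0")
    b_frac = b_frac.ljust(width, "0")

    # Step 3: compare digit by digit; equal-width digit strings
    # compare correctly as plain strings.
    if a_frac == b_frac:
        return "equal"
    return a if a_frac > b_frac else b


print(larger_decimal("9.9", "9.11"))  # prints 9.9
```

The padding step is the one models tend to miss: treating ".11" as "eleven" and ".9" as "nine" leads to the wrong answer, while aligning them as ".90" and ".11" makes the comparison unambiguous.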

These results are backed up by independent third-party analysis from LM Arena, which named Gemini 2.0 Flash Thinking the number one performing model across all LLM categories.

Native support for image uploads and analysis

In a further improvement over the rival OpenAI o1 family, Gemini 2.0 Flash Thinking is designed to process images from the jump.

o1 launched as a text-only model, but has since expanded to include image and file upload analysis. Both models can currently return only text.

Gemini 2.0 Flash Thinking also does not currently support grounding with Google Search, or integration with other Google apps and external third-party tools, according to the developer documentation.

Gemini 2.0 Flash Thinking’s multimodal capability expands its potential use cases, enabling it to tackle scenarios that combine different types of data.

For example, in one test, the model solved a puzzle that required analyzing textual and visual elements, demonstrating its versatility in integrating and reasoning across formats.

Developers can leverage these features via Google AI Studio and Vertex AI, where the model is available for experimentation.

As the AI landscape grows increasingly competitive, Gemini 2.0 Flash Thinking could mark the beginning of a new era for problem-solving models. Its ability to handle diverse data types, offer visible reasoning, and perform at scale positions it as a serious contender in the reasoning AI market, rivaling OpenAI’s o1 family and beyond.


