Google Launches Gemini

11 Dec 2023

Sophie Robson

Head of Innovation

Last week Google launched Gemini, its latest AI model, which it hopes will take down GPT-4. CEO Sundar Pichai has termed this the ‘Gemini era’, with Gemini being Google’s latest large language model, describing it as a huge leap forward in an AI model that will ultimately affect practically all of Google’s products. “One of the powerful things about this moment,” Pichai says, “is you can work on one underlying technology and make it better and it immediately flows across our products.”

OpenAI launched ChatGPT over a year ago, and how powerful it became caught Google off guard. Now Google is ready to fight back.

Gemini started to roll out last week within Google’s Bard, in its English-language setting. There are three versions of Gemini: Ultra, the biggest and most powerful; Pro; and Nano, which is significantly smaller and more efficient. Google plans to license Gemini to customers via Google Cloud, allowing them to integrate the model into their applications. It’s launching in more than 170 countries and will be made available to developers through Google Cloud’s API from December 13. Gemini will be introduced into other Google products, including generative search, ads, and Chrome, in the coming months. Google stated that the most powerful Gemini version of all will debut in 2024, pending “extensive trust and safety checks”.

Debarghya Das, a former Google Search engineer, compared Gemini Ultra and GPT-4V, the most capable versions of Google's Gemini and OpenAI's ChatGPT respectively, across various benchmarks:

General Understanding (MMLU):

Gemini Ultra: Achieves a remarkable 90.0% in Massive Multitask Language Understanding (MMLU), showcasing its ability to comprehend 57 subjects, including STEM, humanities, and more.

GPT-4V: Reports an 86.4% 5-shot capability in the same benchmark.

Reasoning Abilities:

Gemini Ultra: Scores 83.6% in the Big-Bench Hard benchmark, demonstrating proficiency in diverse, multi-step reasoning tasks.

GPT-4V: Shows comparable performance with an 83.1% 3-shot capability in a similar context.

Reading Comprehension (DROP):

Gemini Ultra: Excels with an 82.4 F1 Score in the DROP reading comprehension benchmark.

GPT-4V: Achieves 80.9 3-shot capability in a similar scenario.

Commonsense Reasoning (HellaSwag):

Gemini Ultra: Impresses with an 87.8% 10-shot capability in the HellaSwag benchmark, showcasing adept common sense reasoning.

GPT-4V: Demonstrates a notably higher 95.3% 10-shot capability in the same benchmark.

Mathematical Proficiency (GSM8K):

Gemini Ultra: Excels in basic arithmetic manipulations with a 94.4% maj1@32 score.

GPT-4V: Maintains 92.0% 5-shot capability in Grade School math problems.

Challenging Math Problems (MATH):

Gemini Ultra: Tackles complex math problems with a 53.2% 4-shot capability, showcasing versatility.

GPT-4V: Maintains a competitive 52.9% 4-shot capability in a similar context.

Code Generation (HumanEval):

Gemini Ultra: Efficiently generates Python code with a commendable 74.4% 0-shot capability (IT, i.e. instruction-tuned).

GPT-4V: Performs well with a 67.0% 0-shot capability.

Natural Language to Code (Natural2Code):

Gemini Ultra: Showcases proficiency in generating Python code with a 74.9% 0-shot capability.

GPT-4V: Maintains a 73.9% 0-shot capability in a similar benchmark.
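
For readers who want to see the gaps side by side, here is a minimal Python sketch that tabulates the figures quoted above. The names `SCORES`, `score_gaps` and `summarize` are our own illustrative choices and not part of the original comparison. The prompting setups also differ on some rows (for example maj1@32 versus 5-shot), so the differences should be read as indicative rather than strictly like-for-like.

```python
# Scores exactly as quoted in the comparison above: (Gemini Ultra, GPT-4V).
# Assumption: treating each pair as directly comparable, even though the
# prompting setups differ on some benchmarks.
SCORES = {
    "MMLU": (90.0, 86.4),
    "Big-Bench Hard": (83.6, 83.1),
    "DROP": (82.4, 80.9),
    "HellaSwag": (87.8, 95.3),
    "GSM8K": (94.4, 92.0),
    "MATH": (53.2, 52.9),
    "HumanEval": (74.4, 67.0),
    "Natural2Code": (74.9, 73.9),
}


def score_gaps(scores):
    """Return {benchmark: Gemini Ultra score minus GPT-4V score}, to 1 decimal place."""
    return {
        name: round(gemini - gpt4v, 1)
        for name, (gemini, gpt4v) in scores.items()
    }


def summarize(scores):
    """Return the gaps plus the lists of benchmarks on which each model leads."""
    gaps = score_gaps(scores)
    gemini_leads = [name for name, gap in gaps.items() if gap > 0]
    gpt4v_leads = [name for name, gap in gaps.items() if gap < 0]
    return gaps, gemini_leads, gpt4v_leads


if __name__ == "__main__":
    gaps, gemini_leads, gpt4v_leads = summarize(SCORES)
    for name, gap in gaps.items():
        # A positive gap means Gemini Ultra scored higher on that benchmark.
        print(f"{name:<15} {gap:+.1f}")
    print("Gemini Ultra leads on:", ", ".join(gemini_leads))
    print("GPT-4V leads on:", ", ".join(gpt4v_leads))
```

On these figures Gemini Ultra comes out ahead on seven of the eight benchmarks, while GPT-4V leads only on HellaSwag, where its margin of 7.5 points is the largest gap in its favour.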

The launch of Gemini demonstrates Google's strategy to dominate the AI world and, in particular, to position itself as a front runner on an equal footing with ChatGPT.

The overall capabilities of Gemini are somewhat frightening, which raises the question: why do big tech companies really want to create a future where AI models potentially have greater intelligence than humans? This ultimately cannot be a good thing.

Gemini Ultra, the most advanced version of Gemini, scored 90.0% on a test called MMLU (massive multitask language understanding). This test uses 57 subjects like math, physics, history, law, medicine, and ethics to check both knowledge and problem-solving abilities. This is the first time an AI model has done better than human experts on this test.

While Gemini's cost is estimated to be in the hundreds of millions, the potential revenue for the company that dominates AI services through the cloud could be monumental. Oren Etzioni, former CEO of the Allen Institute for AI, notes, "This is a take-no-prisoners, must-win war." As AI becomes increasingly integral to various industries, Google's strategic move with Gemini marks a significant step forward, unlocking new possibilities.

We welcome your thoughts on the great AI debate. Reach out to us at Crowd: we would love to hear how you are currently using AI in your business, or whether you need help getting started.

