Under growing pressure to explain how it plans to monetize artificial intelligence (AI), Google has unveiled Gemini, its latest and most powerful AI model. Introduced on Wednesday, Gemini is a multimodal model that Google believes will redefine what AI can do and open up new opportunities for users and developers.
Gemini is not a single model but a family of three sizes, each aimed at different needs and use cases. Gemini Ultra is the largest and most capable; Gemini Pro is built for versatility across a wide range of tasks; and Gemini Nano is designed for specific tasks on mobile devices. This tiered approach aims to make AI more accessible and applicable to various scenarios.
As of December 13, developers and enterprise customers can access Gemini Pro through the Gemini API in Google AI Studio or Google Cloud Vertex AI. Android developers will also be able to leverage Gemini Nano for mobile applications. Google plans to license Gemini to customers through Google Cloud, allowing them to integrate the model into their own applications.
Gemini is positioned to power various Google products, including the Bard chatbot and Search Generative Experience (SGE). Bard will now use Gemini Pro for advanced reasoning and planning, and it is set to receive a major update early next year: “Bard Advanced,” which will draw on the capabilities of Gemini Ultra.
According to Google, Gemini Ultra has already reached a significant milestone by outperforming human experts on Massive Multitask Language Understanding (MMLU), a benchmark that tests both world knowledge and complex problem-solving. The benchmark spans a diverse range of subjects, including math, physics, history, law, medicine, and ethics.
Gemini’s multimodal design enables it to understand and seamlessly operate across different types of information, including text, code, audio, image, and video. Sundar Pichai, Google’s CEO, emphasized the collaborative effort that went into building Gemini, highlighting its ability to generalize and reason across diverse information sources.
Despite being the largest model in the family, Gemini Ultra is also cost-efficient, according to Eli Collins, Vice President of Product at Google DeepMind. Collins said that this efficiency, combined with comprehensive safety evaluations, sets Gemini apart as a highly tested and advanced AI model.
However, Google faced questions about the delayed launch of Gemini, reminiscent of earlier challenges in rolling out AI tools. Collins explained that testing more advanced models takes time, and Gemini is the most extensively tested AI model Google has developed.
Accompanying the Gemini announcement, Google introduced its next-generation tensor processing unit, TPU v5p, for training AI models. This chip, offering improved performance for the price, competes with custom silicon from rivals Amazon and Microsoft, reflecting the intensifying competition in the cloud AI space.
The launch of Gemini raises questions about Google’s AI monetization strategy. When asked about plans to charge for access to “Bard Advanced,” Google’s general manager for Bard, Sissie Hsiao, emphasized the focus on creating a positive user experience and deferred discussions on monetization details.
Gemini marks a significant step forward in Google’s AI capabilities, with a suite of models aimed at diverse needs. As Google works to position itself as a leader in AI, the challenge remains to monetize these advances effectively, especially as competitors unveil their own custom AI solutions. The planned integration of Gemini into products like SGE signals Google’s commitment to evolving its AI experiences, although concrete plans for monetization have yet to be disclosed. As Sundar Pichai noted, “This new era of models represents one of the biggest science and engineering efforts we’ve undertaken as a company.” The coming year will show how much Gemini affects AI applications and Google’s bottom line.
(Source: Axios | CNBC | Wired)