This is an archive article published on February 21, 2024

Gemini 1.5 Pro: 5 capabilities that Google hopes will outperform ChatGPT

Here are 5 Gemini 1.5 Pro features that Google hopes will beat out its rival OpenAI's ChatGPT.

Gemini 1.5 Pro could be a life-saver for many professionals. (Google)

It has been less than a week since Google launched its powerful AI model, Gemini 1.5 Pro. The model, which is currently accessible to a select set of users, is already making some noise on the internet. Gemini 1.5 Pro is a mid-sized multimodal AI model built to scale across a wide range of tasks. It comes with a standard 128,000-token context window; however, Google is allowing a limited number of developers and enterprise customers to try it with a context window of up to 1 million tokens.

While Gemini 1.5 Pro is still some way from being made available to the public, the internet is already bustling with use cases shared by those who have accessed the model. Below are some of the use cases that are making Gemini 1.5 Pro a head-turner in an ocean of AI models and chatbots.

Analysing videos

We are in the age of AI, where texts, images, and videos that look too good to be true can make us suspicious. While there are signs that indicate whether an image or video is AI-generated, there hasn’t been any AI tool so far that could comment on the origin or accuracy of a video. Upload the recently released Sora video of a cat, ask Gemini 1.5 Pro whether it was generated by AI, and the response could help clear your doubts. Gemini 1.5 Pro promptly said that the uploaded video could be AI-generated, although this was difficult to confirm. The model said that the cat’s movement and the realistic lighting and shadows could suggest the video is real. At the same time, however, the cat’s eyes appear unnaturally large and the fur looks too perfect. The response did not give a definitive verdict, but Gemini 1.5 Pro went into great detail, leaving the user to decide for themselves.

Understanding long videos

Google demonstrated this at the launch of Gemini 1.5 Pro, where developers used a 44-minute silent film as a prompt and then used multimodal prompts to test the model's accuracy. Similarly, when a user uploaded a long video of the entire NBA dunk contest and asked which dunk had the highest score, Gemini 1.5 Pro accurately found the perfect-50 dunk and its details, drawing on its ability to understand long-context video.

Analysing transcripts and helping you decide

Imagine you cannot decide which of two masterpieces to watch. The natural instinct is to go online, check the ratings, and decide. With Gemini 1.5 Pro, you could get more personalised information based on an analysis of the movie transcripts. Users can upload the transcripts of two movies and ask Gemini to compare and contrast them. The Google AI model is capable of giving you a precise comparison of the two movies based on their transcripts.

Translation

This could be a game changer, as Gemini 1.5 Pro could translate text between languages in minutes. It can even translate entire newsletters from English into a language like Saterlandic, which is spoken by fewer than 2,000 people. While the free versions of ChatGPT and the Gemini chatbot have had only moderate success with such tasks, Gemini 1.5 Pro could be a great tool for translation.

Decoding complex tables in documents

Gemini 1.5 Pro could be a life-saver for many professionals. The model is capable of deciphering even the most complex tables and statistics in long reports in PDF files. For example, upload a 150-page report as a PDF and ask the model to explain the table on the 77th page. Within seconds, the AI model comes up with the most logical explanation.


Gemini 1.5 Pro comes with a standard 128,000-token context window. However, Google is allowing a select set of developers and enterprise customers to try it with a context window of up to 1 million tokens. Gemini 1.5 Pro is at present in preview mode, and developers can test the model using Google’s AI Studio and Vertex AI.

The above list has been compiled based on a thread shared by AI enthusiast Rowan Cheung on X (formerly Twitter).

 
