During the kickoff keynote for Google I/O 2024, the general tone seemed to be, "Can we have an extension?" Google's promised AI improvements are definitely taking center stage here, but with a few exceptions, most are still in the oven.
That's not too surprising; this is a developer conference, after all. But it seems like consumers will have to wait a while longer for their promised "Her" moment. Here's what you can expect once Google's new features start to arrive.
AI in Google Search
Credit: Google/YouTube
Maybe the most impactful addition for most people will be expanded Gemini integration in Google Search. While Google already has a "generative search" feature that can jot out a quick paragraph or two, it'll soon be joined by "AI Overviews."
AI Overviews will optionally extend generative search into an entire page, with answers to your questions as well as suggestions based on the context of your search.
For instance, if you live in a sunny area with good weather and ask for "restaurants near you," Overviews might give you a few basic suggestions, but also a separate, unprompted subheading with restaurants that have good patio seating.
In the more traditional search results page, you'll instead be able to use "AI organized search results," which eschew traditional SEO to intelligently recommend web pages based on highly specific prompts.
For instance, you can ask Google to "create a gluten-free three-day meal plan with lots of veggies and at least two desserts," and the search page will create several subheadings with links to appropriate recipes under each.
Google is also bringing AI to how you search, with an emphasis on multimodality, meaning you can use it with more than text. Specifically, an "Ask with Video" feature is in the works that will let you simply point your phone camera at an object, ask for identification or repair help, and get answers via generative search.
Google didn't directly address how it's handling criticism that AI search results essentially steal content from sources around the web without users needing to click through to the original source. That said, demonstrators highlighted multiple times that these features bring you to useful links you can check out yourself, perhaps covering their bases in the face of these critiques.
AI Overviews are already rolling out to Google's experimental Search Labs, with AI Organized Search Results and Ask with Video set for "the coming weeks."
Search your photos with AI
Credit: Google/YouTube
Another of the more concrete features in the works is "Ask Photos," which plays with multimodality to help you sort through the hundreds of gigabytes of images on your phone.
Say your daughter took swimming lessons last year and you've lost track of your first photos of her in the water. Ask Photos will let you simply ask, "When did my daughter learn to swim?" Your phone will automatically know who you mean by "your daughter" and surface images from her first swimming lesson.
That's similar to searching your photo library for pictures of your cat by just typing "cat," sure, but the idea is that the multimodal AI can support more detailed questions and understand what you're asking with greater context, powered by Gemini and the data already stored on your phone.
Other details are light, with Ask Photos set to debut "in the coming months."
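Google hasn't said how Ask Photos works under the hood, but the gap it's meant to close can be sketched in a few lines: plain photo search matches a literal tag, while "When did my daughter learn to swim?" also requires resolving "my daughter" to a person and picking out her earliest water photo. Everything below (the photo metadata, the relationship table, the water-related tag set) is invented for illustration.

```python
from datetime import date

# Toy photo library: in reality this metadata would come from on-device
# face grouping and image recognition, not hand-written dictionaries.
photos = [
    {"person": "Maya", "tags": {"pool", "swimming"}, "taken": date(2023, 6, 14)},
    {"person": "Maya", "tags": {"birthday", "cake"}, "taken": date(2023, 2, 2)},
    {"person": "Maya", "tags": {"pool", "floaties"}, "taken": date(2023, 5, 30)},
]

# The contextual step a keyword search can't do: who "my daughter" refers to.
RELATIONSHIPS = {"my daughter": "Maya"}

WATER_TAGS = {"pool", "swimming", "floaties"}

def first_swim(owner_phrase: str) -> date:
    """Resolve the relationship, then find the earliest water-related photo."""
    person = RELATIONSHIPS[owner_phrase]
    swims = [p["taken"] for p in photos
             if p["person"] == person and WATER_TAGS & p["tags"]]
    return min(swims)

print(first_swim("my daughter"))
```

The point of the sketch is that two lookups happen before any image is ranked; the pitch for Ask Photos is that a multimodal model does both steps from a single natural-language question.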
Project Astra: an AI agent in your pocket
Credit: Google/YouTube
Here's where we get into more pie-in-the-sky stuff. Project Astra is the most C-3PO we've seen AI get yet. The idea is that you'll be able to load up the Gemini app on your phone, open your camera, point it around, and ask questions and get help based on what your phone sees.
For instance, point at a speaker, and Astra will be able to tell you what parts are in the hardware and how they're used. Point at a drawing of a cat with dubious vitality, and Astra will answer your riddle with "Schrödinger's cat." Ask it where your glasses are, and if Astra was looking at them earlier in your shot, it will be able to tell you.
This is maybe the classical dream when it comes to AI, and quite similar to OpenAI's recently announced GPT-4o, so it makes sense that it's not ready yet. Astra is set to arrive "later this year," but curiously, it's also supposed to work on AR glasses as well as phones. Perhaps we'll be learning of a new Google wearable soon.
Make a custom podcast hosted by robots
Credit: Google/YouTube
It's unclear when this feature will be ready, since it seems to be more of a showcase for Google's improved AI models than a headliner, but one of the more impressive (and possibly unsettling) demos Google showed off during I/O involved creating a custom podcast hosted by AI voices.
Say your son is studying physics in school, but is more of an audio learner than a text-oriented one. Supposedly, Gemini will soon let you dump written PDFs into Google's NotebookLM app and ask Gemini to make an audio program discussing them. The app will generate what feels like a podcast, hosted by AI voices talking naturally about the topics from the PDFs.
Your son will then be able to interrupt the hosts at any time to ask for clarification.
Hallucination is obviously a major concern here, and the naturalistic language might be a little "cringe," for lack of a better word. But there's no doubt it's an impressive showcase… if only we knew when we'll be able to recreate it.
Paid features
Credit: Google/YouTube
There are a few other tools in the works that seem purpose-built for your typical consumer, but for now, they're going to be limited to Google's paid Workspace plans.
The most promising of these is Gmail integration, which takes a three-pronged approach. The first prong is summaries, which can read through a Gmail thread and break down the key points for you. That's not too novel, nor is the second prong, which lets AI suggest contextual replies based on information in your other emails.
But Gemini Q&A seems genuinely transformative. Imagine you're looking to get some roofing work done and you've already emailed three different construction firms for quotes. Now you want to make a spreadsheet of each firm, their quoted price, and their availability. Instead of sifting through each of your email threads, you can ask a Gemini box at the bottom of Gmail to make that spreadsheet for you. It will search your inbox and generate the spreadsheet within minutes, saving you time and perhaps surfacing emails you'd missed.
This sort of contextual spreadsheet building will also be coming to apps outside of Gmail, but Google was also proud to show off its new "Virtual Gemini Powered Teammate." Still in the early stages, this upcoming Workspace feature is something like a mix between a typical Gemini chat box and Astra. The idea is that organizations will be able to add AI agents to their Slack equivalents that will be on call 24/7 to answer questions and create documents.
Gmail's Gemini features will be rolling out this month to Workspace Labs users.
Gems
Credit: Google/YouTube
Earlier this year, OpenAI replaced ChatGPT plugins with "GPTs," allowing users to create custom versions of its ChatGPT chatbot built to handle specific questions. Gems are Google's answer to this, and they work similarly. You'll be able to create a number of Gems that each have their own page within your Gemini interface, and each answer to a specific set of instructions. In Google's demo, suggested Gems included examples like "Yoga Bestie," which offers exercise advice.
Gems are another feature that won't see the light of day until a few months from now, so for the time being, you'll have to stick with GPTs.
Agents
Credit: Google/YouTube
Fresh off the muted reception to the Humane AI Pin and Rabbit R1, AI aficionados were hoping Google I/O would show Gemini's answer to the promises behind those devices, i.e. the ability to go beyond simply collating information and actually interact with websites for you. What we got was a light tease with no set release date.
In a pitch from Google CEO Sundar Pichai, we saw the company's intention to make AI agents that can "think multiple steps ahead." For example, Pichai talked about the possibility of a future Google AI agent helping you return shoes. It could go from "searching your inbox for the receipt" all the way to "filling out a return form" and "scheduling a pickup," all under your supervision.
All of this came with a huge caveat: it wasn't a demo, just an example of something Google wants to work on. "Imagine if Gemini could" did a lot of heavy lifting during this part of the event.
New Google AI Models
Credit: Google/YouTube
In addition to highlighting specific features, Google also touted the release of new AI models and updates to its existing ones. From generative models like Imagen 3 to larger and more contextually capable builds of Gemini, these parts of the presentation were aimed more at developers than end users, but there are still a few interesting points to pull out.
The key standouts are the introduction of Veo and Music AI Sandbox, which generate AI video and sound, respectively. There aren't many details on how they work yet, but Google brought out big stars like Donald Glover and Wyclef Jean for promising quotes like, "Everybody's gonna become a director" and "We digging through the infinite crates."
For now, the best demos we have for these generative models are examples posted to celebrity YouTube channels. Here's one below:
Google also wouldn't stop talking about Gemini 1.5 Pro and 1.5 Flash during its presentation: new versions of its LLM, primarily meant for developers, that support larger token counts and thus more context. These probably won't matter much to you, but pay attention to Gemini Advanced.
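Token counts are what cap how much material you can hand a model at once, so "larger token counts" translates directly into "bigger documents per question." A back-of-the-envelope sketch: the ~4-characters-per-token figure is a common rule of thumb for English text, not Google's actual tokenizer, and the window sizes below are the publicly cited figures for Gemini 1.5 at the time, so treat both as assumptions and check current documentation.

```python
# Rough context-window budgeting. The sizes are the roughly one-million-token
# windows Google cited for Gemini 1.5 at I/O 2024 (assumptions, not an API),
# and 4 chars/token is only a heuristic for English prose.
CONTEXT_WINDOWS = {
    "gemini-1.5-pro": 1_000_000,
    "gemini-1.5-flash": 1_000_000,
}

def rough_token_count(text: str) -> int:
    """Estimate tokens at ~4 characters per token (lower bound of 1)."""
    return max(1, len(text) // 4)

def fits(model: str, text: str) -> bool:
    """Would this text plausibly fit in the model's context window?"""
    return rough_token_count(text) <= CONTEXT_WINDOWS[model]

# A 500-page document at ~2,000 characters per page is only ~250k tokens,
# comfortably inside a million-token window by this estimate.
doc = "x" * (500 * 2_000)
print(fits("gemini-1.5-pro", doc))
```

That headroom is the practical meaning of the larger windows: whole books or long PDF dumps as a single prompt, rather than the chopped-up excerpts earlier models required.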
Gemini Advanced is already on the market as Google's paid Gemini plan, and it allows a larger number of questions, some light interaction with Gemini 1.5, integration with various apps such as Docs (separate from Workspace-exclusive features), and uploads of files like PDFs.
Some of Google's promised features sound like they'll require a Gemini Advanced subscription, specifically those that want you to upload documents so the chatbot can answer questions about them or riff off them with its own content. We don't know for sure yet what will be free and what won't, but it's yet another caveat to keep in mind for Google's "keep your eye on us" promises this I/O.
That's a wrap on Google's general announcements for Gemini. That said, the company also announced new AI features for Android, including a new Circle to Search ability and Gemini-powered scam detection. (No Android 15 news, however: that comes tomorrow.)
Full story here: