Google said on Wednesday that it is “aware that Gemini is offering inaccuracies in some historical image generation depictions” and that it is “working to improve these kinds of depictions immediately.” (Image: Google)

Google received a lot of flak for its Gemini chatbot’s AI image generation feature, which was launched three weeks ago. Users accused it of overdoing diversity and inclusion when generating images of people. For example, one user pointed out that Gemini produced images of people of various ethnicities when asked to depict the founding fathers of the United States, who were in fact all white men, and this “historical inaccuracy” was deemed a problem.
The search giant on Friday acknowledged the issue and said that it will work to fix it while the feature is temporarily paused.
“Three weeks ago, we launched a new image generation feature for the Gemini conversational app (formerly known as Bard), which included the ability to create images of people. It’s clear that this feature missed the mark. Some of the images generated are inaccurate or even offensive. We’re grateful for users’ feedback and are sorry the feature didn’t work well. We’ve acknowledged the mistake and temporarily paused image generation of people in Gemini while we work on an improved version,” wrote Prabhakar Raghavan, Google’s senior vice president, in a company blog post.
The image generation feature in Gemini is built on an AI model called Imagen 2. Google tuned the feature to avoid some of the problems the company says it has seen in other image generation products, such as their use to create violent or sexually explicit images, or depictions of real people.
The company also tried to build standards of diversity, equity and inclusion into the product, but it seems to have overshot the mark. Raghavan wrote that if a user prompts for something generic, such as a group of football players or someone walking a dog, it is desirable for the results to depict people of more than one ethnicity.
“However, if you prompt Gemini for images of a specific type of person — such as “a Black teacher in a classroom,” or “a white veterinarian with a dog” — or people in particular cultural or historical contexts, you should absolutely get a response that accurately reflects what you ask for,” admitted Raghavan.
Google seems to have gotten two things wrong here. While the company tuned the model to ensure that a range of people is depicted, it did not account for cases where it clearly should not show such a range. The company also says the model became far more cautious than intended, refusing to answer some prompts entirely because it wrongly interpreted innocuous prompts as sensitive.
The company has temporarily paused Gemini’s ability to generate images of people and will bring it back only after extensive work, including rigorous testing.