OpenAI today announced new voice and image capabilities for ChatGPT, its popular conversational agent. The features mark a major expansion of the product, letting users converse more naturally by speaking to the AI assistant and showing it images.
"We are beginning to roll out new voice and image capabilities in ChatGPT. They offer a new, more intuitive type of interface by allowing you to have a voice conversation or show ChatGPT what you’re talking about," said OpenAI in the press release.
The new voice feature lets users hold back-and-forth conversations with ChatGPT by speaking aloud. Users can choose from five AI-generated voices and ask questions or give instructions.
"Speak with ChatGPT and have it talk back. Speak with it on the go, request a bedtime story for your family, or settle a dinner table debate," OpenAI said.
The image capabilities let users show ChatGPT photos to get information or ask questions about visual content. For example, users could show ChatGPT a photo of their fridge and pantry and ask for recipe ideas. The mobile app also includes a drawing tool to focus the AI on specific image areas.
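OpenAI said image understanding is powered by multimodal versions of GPT-3.5 and GPT-4, which apply their language reasoning skills to images. The voice feature relies on a new text-to-speech model, with OpenAI's Whisper speech recognition system transcribing spoken input into text.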
The company plans a gradual rollout of the voice and image features, starting with Plus and Enterprise users over the next two weeks. OpenAI said this approach will allow it to refine safety measures and prepare users for more advanced AI.
"OpenAI’s goal is to build AGI that is safe and beneficial. We believe in making our tools available gradually, which allows us to make improvements and refine risk mitigations over time while also preparing everyone for more powerful systems in the future," the company stated.