2023-11-06

OpenAI launched a slew of new APIs during its first-ever developer day.

DALL-E 3, OpenAI’s text-to-image model, is now available via an API after first coming to ChatGPT and Bing Chat. As with the previous version, DALL-E 2, the API incorporates built-in moderation to help protect against misuse, OpenAI says.

The DALL-E 3 API offers different format and quality options and resolutions ranging from 1024×1024 to 1792×1024, with prices starting at $0.04 per image generated. But at least at present, it’s somewhat limited compared to the DALL-E 2 API.

Unlike the DALL-E 2 API, the DALL-E 3 API can’t be used to edit images by having the model replace areas of a pre-existing image, nor can it create variations of an existing image. And when a generation request is sent to DALL-E 3, OpenAI says that it’ll automatically rewrite the prompt “for safety reasons” and “to add more detail,” which could lead to less precise results depending on the prompt.
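For a rough sense of what a DALL-E 3 generation request might look like, here is a minimal sketch in Python that only builds and validates a request body without sending it. The helper name `build_dalle3_request` is hypothetical, and the parameter names and allowed values (`size`, `quality`, `response_format`, including the portrait 1024x1792 size) are assumptions drawn from OpenAI's public API documentation rather than from this article.

```python
import json

# Assumed option values, taken from OpenAI's public docs and not from this article.
ALLOWED_SIZES = {"1024x1024", "1792x1024", "1024x1792"}
ALLOWED_QUALITIES = {"standard", "hd"}
ALLOWED_FORMATS = {"url", "b64_json"}


def build_dalle3_request(prompt, size="1024x1024", quality="standard",
                         response_format="url"):
    """Return a JSON-serializable body for an image generation request.

    Hypothetical helper: it validates options locally and sends nothing.
    """
    if size not in ALLOWED_SIZES:
        raise ValueError(f"unsupported size: {size}")
    if quality not in ALLOWED_QUALITIES:
        raise ValueError(f"unsupported quality: {quality}")
    if response_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported response format: {response_format}")
    return {
        "model": "dall-e-3",
        "prompt": prompt,
        "n": 1,  # assumption: DALL-E 3 accepts one image per request
        "size": size,
        "quality": quality,
        "response_format": response_format,
    }


body = build_dalle3_request("a lighthouse at dusk, watercolor", size="1792x1024")
print(json.dumps(body, indent=2))
```

Because the prompt is rewritten on OpenAI's side, an application that cares about precision would likely want to log whatever rewritten prompt the response reports and compare it with the one it sent.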

Elsewhere, OpenAI’s now providing a text-to-speech API that offers six preset voices to choose from and two generative AI model variants. It’s available starting today, with pricing starting at $0.015 per 1,000 input characters.
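To put that starting price in perspective, here is a back-of-the-envelope estimate in Python. It assumes billing scales linearly with character count, with no rounding, minimums, or higher-priced model variant; the function name `estimate_tts_cost` is hypothetical.

```python
from decimal import Decimal

# Starting price quoted above: $0.015 per 1,000 input characters.
PRICE_PER_1K_CHARS = Decimal("0.015")


def estimate_tts_cost(text):
    """Estimate the cost in dollars of synthesizing `text`.

    Assumes strictly linear per-character billing, which is a simplification.
    """
    return Decimal(len(text)) / Decimal(1000) * PRICE_PER_1K_CHARS


script = "a" * 2500  # stand-in for a 2,500-character script
print(estimate_tts_cost(script))  # 0.0375
```

At that rate, a 2,500-character script would cost just under four cents to voice.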

“This is much more natural than anything else we’ve heard out there, which can make apps more natural to interact with and more accessible,” OpenAI CEO Sam Altman said on stage. “It also unlocks a lot of use cases like language learning and voice assistance.”

In a related announcement, OpenAI launched the next version of its open source automatic speech recognition model, Whisper large-v3, which the company claims boasts improved performance across languages.