OpenAI said that third-party developers can now add ChatGPT to their own apps and services by using an API. This is a lot cheaper than using the language models it already has. Whisper, the company’s AI-powered speech-to-text model, will be available through an API, and the company’s terms of service for developers will change in important ways.
OpenAI says that its ChatGPT API can be used for more than just making chat interfaces that use AI. But it also talks about a few companies that have been doing this, like Snap, which announced its My AI feature at the end of February. The company says that its new family of gpt-3.5-turbo models is the best for many things besides chat.
Table of Contents
OpenAI Launches ChatGPT and Whisper API
After a short trial, OpenAI is now letting developers use its ChatGPT and Whisper models. Developers can now use API calls to make their apps interact with chatbots and turn speech into text. The ChatGPT language model is made to answer questions. It’s been getting a lot of attention since it came out on November 30, 2022.
If the answer is in its training data, it has a good chance of coming up with a good answer when given a text prompt. Or, if asked about jailbreaking, it might answer in a way that goes against its rules for safety. Whisper is a system that can understand what you say.
It came out in September of last year. It can turn spoken English into text, which can then be put into ChatGPT or used for any other speech-to-text application, such as interview transcription. The ChatGPT gpt-3.5 turbo model family came out on Wednesday.
OpenAI says that it is 10 times cheaper than the GPT-3.5 models that came before it because it costs $0.002 per 1,000 tokens (about 750 words). The large-V2 model of Whisper costs $0.006 per minute. OpenAI admits that it can be hard to run the open source version of the code. Data scientist Max Woolf says in an online post that the prices for the API are very low.
He said, “I don’t see how OpenAI can make money from this.” “This has to be a loss-leader so that competitors can’t even start.” There is no way to know for sure that these prices won’t go up in the future. OpenAI has said that it will make changes based on what early adopters say so that it is ready for a flood of paying customers.
This includes not using data sent through API to improve services, like training models, unless customers agree to it, changing its Terms of Service and Usage Policies to make it clear that users own the data that goes into and comes out of the models, and making other changes. Also, the company now knows that its services need to be stable, which wasn’t clear before.
In a blog post, the company said, “Over the last two months, our uptime has not met our own or our users’ expectations.” “Our engineering team’s top priority right now is making sure that production use cases are stable. We know that for AI to help everyone, we have to be a reliable service provider.”
That isn’t quite a Service Level Agreement, but it could be the start of something similar. Still, it’s not clear what OpenAI will do to make sure that “AI helps all of humanity.” But surely the first step to getting mechanical chatter to the poorest people in the world is to make them available.
What is Whisper API?
Whisper is a system that can automatically understand what you say, and it costs $0.006 per minute. OpenAI says that Whisper can do “robust” transcription in multiple languages and translate from those languages to English.
It can read files in many different formats, such as M4A, MP3, MP4, MPEG, MPGA, WAV, and WEBM. Speech recognition systems have been made by many companies, and many of them are very good.
Software and services from tech giants like Google, Amazon, and Meta are based on these systems. Greg Brockman, the president and chairman of OpenAI, says that what makes Whisper unique is that it was trained on 680,000 hours of multilingual and multitask data from the web. This helped it understand accents, background noise, and technical terms better.
In a video conference, Brockman told, “We released a model, but that wasn’t enough for the whole developer ecosystem to build around it.” “The Whisper API is the same big model you can get for free, but it works better than ever. It’s a lot faster and very simple to use.”
To add to what Brockman said, there are many things that make it hard for businesses to use voice transcription technology. According to a 2020 Statista survey, the main reasons companies haven’t adopted tech like tech-to-speech are problems with accuracy, accent or dialect recognition, and cost.
Uses of ChatGPT
In addition to responding to human questions, ChatGPT can also be used to:
Develop Content
ChatGPT can be used to generate content since it can generate text quickly in response to a prompt. For instance, the AI tool can produce a song based on a user’s request. In addition, this artificial intelligence application can aid users in accomplishing their literary goals and improving their writing style.
Make AI Art
Since the introduction of DALLE-2, Mid journey, and other artistic AI technologies, AI art generators have been at the forefront of the advancement of artistic imagery. OpenAI’s ChatGPT has the ability to generate Augmented Reality (AR) scenarios with a high level of detail in the future when told to do so.
Write Code and Debug
ChatGPT can parse code, produce code, and aid in debugging code for developers. It can be used, for instance, to generate SQL queries. Owing to the importance of SQL knowledge for data scientists, utilising ChatGPT to enhance SQL skills might catapult you to the next level of your career.
Handle and Manipulate Data
It is challenging to filter, manage, and organise unstructured data, rendering it unnecessary. ChatGPT is able to transform unstructured content into a structured format by modifying data. The tool can be used to add data to a database, generate indexes, and comprehend JSON queries, among other things.
Tutor and Guide
ChatGPT’s ability to explain language, coding, and even physics is astonishing. As the AI teaching capabilities of ChatGPT improve and become more refined in the next years, it will have a tremendous effect on how students interact with the outside world.
Hence, ChatGPT will have a substantial effect on the education technology industry. Numerous ed-tech companies may now teach the fundamentals of a subject using ChatGPT as a venue for students to ask questions and explain misconceptions.
Limitations of ChatGPT
Wrong Answers
ChatGPT is a large language model that is always getting better at answering questions. But since the technology is brand new, the model hasn’t been trained enough yet. So, the chatbot could give the wrong answer. Because of this, StackOverflow has banned ChatGPT, saying, “Overall, because the average rate of getting correct answers from ChatGPT is too low, the posting of answers by ChatGPT is very bad for our site and for users who are asking for or looking for correct answers.”
Sustainability
On Twitter, people are talking about how many Graphics Processing Units (GPUs) ChatGPT needs to work. The lesson is that it costs a lot of money to run ChatGPT. There are a lot of questions about how long ChatGPT will last since it is free.
Training Data and Biases
Like a lot of AI models, the training data for ChatGPT has its limits. Both the limits on the training data and any bias in the data can hurt the results of the model. In fact, this AI tool has shown bias when it comes to training data groups from minorities. Because of this, it’s important to make the data in the models clearer to make this technology less biased.
Limitations of Whisper API
Whisper does have some problems, though, especially when it comes to guessing the “next word.” Because the system was trained on a lot of noisy data, OpenAI says that Whisper’s transcriptions might include words that weren’t actually spoken.
This could be because Whisper is trying to both guess the next word in an audio recording and write down the recording itself. Whisper also doesn’t work the same way in every language. Whisper makes more mistakes when used by people who speak languages that aren’t well represented in the training data.
Still, OpenAI thinks that Whisper’s ability to transcribe will be used to improve other apps, services, products, and tools. The AI-powered language learning app Speak is already using the Whisper API to power a new virtual speaking companion inside the app.
If OpenAI can break into the speech-to-text market in a big way, the company backed by Microsoft could make a lot of money. One report says that by 2026, the segment could be worth $5.4 billion, which is more than its value of $2.2 billion in 2021.
Conclusion
Despite the fact that a number of developers have discovered workarounds to incorporate chat services into their apps, such as utilizing OpenAI’s standard GPT API, which has been available for some time, the introduction of an official ChatGPT API feels like the moment when the floodgates open. While many organizations are designing their own AI chatbot models, most engineers are unable of doing so. They can now utilize the technologies of OpenAI.