GPT family review and Azure endpoints

Gen AI
Announcement and details around Generative Pretrained Transformer (GPT) model family
Author

Rafa Sanchez

Published

March 14, 2023

gpt-4-32k

On 2023 pi-day, OpenAI release GPT-4, a new model mainly optimized for chat but works well for traditional completions tasks. Basically adds these major improvements:

  • Multi_modality: inputs text and images, and outputs text
  • Improvement on complex tasks like advanced exams. e.g. on a simulated bar exam.
  • The basic GPT-4 model has 8192 tokens (equivalent to 13 pages of text) and has been trained up to September 2021. It provides also a larger 32k version (equivalent to around 52 pages)
  • OpenAI has issued an evaluation framework, for developers to submit improvements (which in turn can get prioritized for API access).

The price of GPT-4 is 30x times more expensive thatn GPT3.5:

  • 8k-model (gpt-4 and gpt-4-0613): $0.03/1k prompt tokens ad $0.06/1k sampled tokens.
  • 32k-model (gpt-4-32k and gpt-4-32k-0613): double as 8k-model.
Algorithm Parameters Token window Checkpoins URL
GPT-3 175B 2049 davinci
GPT-3.5 turbo 20B 4096/16384 gpt-3.5-turbo-16k
GPT-4 Not public Up to 32k gpt-4-32k

Applications and demos built with GPT-4:

  1. Recreating ping-pong game in less than 60 seconds
  2. Put data in JSON format
  3. Turn a hand-drawn sketch into a functional website
  4. Discover vulnerabilites in an Ethereum solidity contract

gpt-3.5-turbo

gpt-3.5-turbo family is a set of models for text generation, chat and code completion. The original one is text-davinci-003 (recognized as GPT 3.5) but have been superseeded by the new turbo family, which are cheaper and supports more input tokens:

  • gpt-3.5-turbo: 1/10th the cost of text-davinci-003. 4096 tokens
  • gpt-3.5-turbo-16k: same but 16384 tokens

ChatGPT is a chatbot built on top of GPT3.5 with Reinforcement Learning. The endpoint is the same as GPT-3.5

davinci

davinci is the priginal GPT-3 model. GPT-3 legacy models include text-curie-001, davinci or text-ada-001. They are not offered in the Azure OpenAI service.

Azure OpenAI endpoints

Microsoft has announced a paid version of ChatGPT, through an API, and has rumoured to prepare an integration with Bing. Azure offers a separate endpoint for GPT models, using the same openai python package library but with differente API keys and parameters.

This is the list of models supported by Azure OpenAI services.

  • gpt-4-0613: 8k tokens, note 0314 version will be deprecated
  • gpt-4-32k-0613: 32k tokens, note 0314 version will be deprecated
  • gpt-35-turbo-16k
  • text-embedding-ada-002: embeddings version 2
  • dalle2: (preview)

Dedicated hosting: Foundry projecy

On Feb 21, OpenAI announced a new developer platform called Foundry, where customers can run AI models in dedicated hosting capacity (economical for usage of >450K tokens per day).

Azure OpenAI-Google Cloud model comparison

The following is a public comparison between Azure and Google Cloud GenAI models:

Type Azure endpoint Google Cloud enspoint
Text text-davinci-003 text-bison@001
Chat text-curie-001 chat-bison@001
Embedding text-ada-001 textembedding-gecko@001

Azure OpenAI-Google Cloud price comparison

The following is a comparison based on publicly available prices:

References

[1] OpenAI: GPT-4 announcemen
[2] OpenAI: GPT-4 paper, 98 pages
[3] OpenAI: GPT-4 System Card
[4] OpenAI: Morgan Stanley using GPT-4
[5] YouTube video: GPT-4 Developer Livestream