
feat: add support for the latest OpenAI chat models (e.g., o1, gpt-4o) for completions. #3628

Open — wants to merge 2 commits into base: `main`
Conversation

54corbin

Add Support for Latest OpenAI Chat Models (e.g., o1, gpt-4o) for Completions

Overview

The current implementation in `crates/http-api-bindings/src/completion/openai.rs` exclusively supports the legacy completion model `gpt-3.5-turbo-instruct`, which OpenAI's API documentation marks as legacy:

[Screenshot: OpenAI documentation listing `gpt-3.5-turbo-instruct` as a legacy model]

This pull request introduces support for the latest OpenAI chat models (e.g., o1, gpt-4o) for completions while retaining compatibility with the legacy gpt-3.5-turbo-instruct model.

Changes

  • Updated Supported Models: Added support for new OpenAI chat models such as o1 and gpt-4o.
  • Backward Compatibility: Maintained support for the legacy gpt-3.5-turbo-instruct completion model.
  • Configuration Enhancements: Updated configuration examples to demonstrate usage with both new and legacy models.
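The routing between the legacy and chat endpoints described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `is_legacy_model` and `completion_url` are hypothetical names, and the endpoint-joining logic is an assumption for demonstration.

```rust
// Hypothetical sketch: route a completion request to the legacy `/completions`
// API or the newer `/chat/completions` API based on the configured model name.
const LEGACY_MODEL_NAME: &str = "gpt-3.5-turbo-instruct";

fn is_legacy_model(model_name: &str) -> bool {
    model_name == LEGACY_MODEL_NAME
}

fn completion_url(api_endpoint: &str, model_name: &str) -> String {
    if is_legacy_model(model_name) {
        format!("{api_endpoint}/completions") // legacy completion API
    } else {
        format!("{api_endpoint}/chat/completions") // chat completion API
    }
}

fn main() {
    // Legacy model keeps the old route; chat models get the chat route.
    println!("{}", completion_url("https://api.openai.com/v1", "gpt-3.5-turbo-instruct"));
    println!("{}", completion_url("https://api.openai.com/v1", "gpt-4o"));
}
```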

Configuration

Below is an example of the config.toml used during testing:

```toml
# Chat Model Configuration
[model.chat.http]
kind = "openai/chat"
model_name = "gpt-3.5-turbo"  # be sure to use a chat model, such as gpt-4o
api_endpoint = "https://api.openai.com/v1"  # DO NOT append the `/chat/completions` suffix
api_key = "<your_api_key>"

# Completion Model Configuration
[model.completion.http]
kind = "openai/completion"
# Uncomment and adjust the following lines to use a chat model for completion instead:
# model_name = "gpt-4o-mini"  # a chat model, such as gpt-4o-mini
# api_endpoint = "https://api.openai.com/v1/chat"  # DO NOT append the `/completions` suffix

model_name = "gpt-3.5-turbo-instruct"  # defaults to the legacy completion model
api_endpoint = "https://api.openai.com/v1"  # DO NOT append the `/completions` suffix
api_key = "<your_api_key>"

# Embedding Model Configuration
[model.embedding.http]
kind = "openai/embedding"
model_name = "text-embedding-3-small"  # an embedding model
api_endpoint = "https://api.openai.com/v1"  # DO NOT append the `/embeddings` suffix
api_key = "<your_api_key>"
```

```rust
struct CompletionRequest {
    model: String,
    prompt: String,

    #[serde(skip_serializing_if = "Option::is_none")]
```

**@54corbin** commented (Dec 28, 2024): The `prompt` and `messages` fields are not allowed at the same time, even with an empty value.

```rust
    max_tokens: i32,
    temperature: f32,
    stream: bool,
    presence_penalty: f32,

    #[serde(skip_serializing_if = "Vec::is_empty")]
    messages: Vec<Message>,
```

**@54corbin:** Ditto.

```rust
struct CompletionResponseChoice {
    text: Option<String>,  // was `text: String`
    delta: Option<CompletionResponseDelta>,
```
```rust
    } else {
        (prompt, None)
    };

const LEGACY_MODEL_NAME: &str = "gpt-3.5-turbo-instruct";
```

**@54corbin:** At the moment, this is the only one I know of.

```rust
    stream: true,
    presence_penalty: options.presence_penalty,
    ..Default::default()
```

**@54corbin:** The other fields will be filled in later according to `self.model_name`.

```rust
    (prompt, None)
};
request.prompt = Some(prompt.into());
request.suffix = suffix.map(Into::into);
```

**@54corbin** commented (Dec 28, 2024): Other fields marked with `#[serde(skip_serializing_if = "Option::is_none")]` are omitted during JSON serialization; otherwise the OpenAI server would complain about the extra fields.
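The skip-on-serialize behavior described in this comment can be illustrated with a std-only sketch (no serde, so it stays self-contained): fields that are `None` are left out of the serialized JSON entirely, which is how the request avoids carrying both `prompt` and `messages` at once. The struct and `to_json` below are simplified stand-ins, not the PR's actual types.

```rust
// Simplified illustration: `None` fields are omitted from the output JSON,
// mirroring what `#[serde(skip_serializing_if = "Option::is_none")]` does.
struct CompletionRequest {
    model: String,
    prompt: Option<String>,
    messages: Option<String>, // simplified; the real code uses Vec<Message>
}

fn to_json(req: &CompletionRequest) -> String {
    let mut fields = vec![format!("\"model\":\"{}\"", req.model)];
    if let Some(p) = &req.prompt {
        fields.push(format!("\"prompt\":\"{}\"", p)); // skipped when None
    }
    if let Some(m) = &req.messages {
        fields.push(format!("\"messages\":{}", m)); // skipped when None
    }
    format!("{{{}}}", fields.join(","))
}

fn main() {
    let legacy = CompletionRequest {
        model: "gpt-3.5-turbo-instruct".into(),
        prompt: Some("fn add".into()),
        messages: None,
    };
    // The serialized request contains a `prompt` key but no `messages` key.
    println!("{}", to_json(&legacy));
}
```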


```rust
request.messages = vec![Message {
    role: "user".to_string(),
    content: format!("{SYS_PMT}\n{prompt}"),
```

**@54corbin:** Ditto.

@54corbin 54corbin changed the title Feat: Add support for the latest OpenAI chat models (e.g., o1, gpt-4o) for completions. feat: Add support for the latest OpenAI chat models (e.g., o1, gpt-4o) for completions. Dec 28, 2024
@54corbin 54corbin changed the title feat: Add support for the latest OpenAI chat models (e.g., o1, gpt-4o) for completions. feat: add support for the latest OpenAI chat models (e.g., o1, gpt-4o) for completions. Dec 28, 2024
@54corbin 54corbin marked this pull request as draft December 28, 2024 07:50
@54corbin 54corbin marked this pull request as ready for review December 28, 2024 07:51
**@54corbin:** @zwpaper @wsxiaoys PTAL, thanks!
