
feat: add support for the latest OpenAI chat models (e.g., o1, gpt-4o) for completions. #3628

Open — wants to merge 2 commits into base: `main`
Conversation

54corbin

Add Support for Latest OpenAI Chat Models (e.g., o1, gpt-4o) for Completions

Overview

The current implementation in `crates/http-api-bindings/src/completion/openai.rs` exclusively supports the legacy completion model `gpt-3.5-turbo-instruct`, which OpenAI's API documentation marks as legacy:

[Screenshot: OpenAI documentation listing `gpt-3.5-turbo-instruct` as a legacy model]

This pull request introduces support for the latest OpenAI chat models (e.g., o1, gpt-4o) for completions while retaining compatibility with the legacy gpt-3.5-turbo-instruct model.

Changes

  • Updated Supported Models: Added support for new OpenAI chat models such as o1 and gpt-4o.
  • Backward Compatibility: Maintained support for the legacy gpt-3.5-turbo-instruct completion model.
  • Configuration Enhancements: Updated configuration examples to demonstrate usage with both new and legacy models.
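The routing between the legacy and chat endpoints described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `is_legacy_model` and `completion_url` are hypothetical names, and the endpoint-joining logic is an assumption for demonstration.

```rust
// Hypothetical sketch: route a completion request to the legacy `/completions`
// API or the newer `/chat/completions` API based on the configured model name.
const LEGACY_MODEL_NAME: &str = "gpt-3.5-turbo-instruct";

fn is_legacy_model(model_name: &str) -> bool {
    model_name == LEGACY_MODEL_NAME
}

fn completion_url(api_endpoint: &str, model_name: &str) -> String {
    if is_legacy_model(model_name) {
        format!("{api_endpoint}/completions") // legacy completion API
    } else {
        format!("{api_endpoint}/chat/completions") // chat completion API
    }
}

fn main() {
    // Legacy model keeps the old route; chat models get the chat route.
    println!("{}", completion_url("https://api.openai.com/v1", "gpt-3.5-turbo-instruct"));
    println!("{}", completion_url("https://api.openai.com/v1", "gpt-4o"));
}
```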

Configuration

Below is an example of the config.toml used during testing:

```toml
# Chat Model Configuration
[model.chat.http]
kind = "openai/chat"
model_name = "gpt-3.5-turbo"  # be sure to use a chat model, such as gpt-4o
api_endpoint = "https://api.openai.com/v1"  # DO NOT append the `/chat/completions` suffix
api_key = "<your_api_key>"

# Completion Model Configuration
[model.completion.http]
kind = "openai/completion"
# Uncomment and adjust the following lines to use a chat model for completion instead:
# model_name = "gpt-4o-mini"  # a chat model, such as gpt-4o-mini
# api_endpoint = "https://api.openai.com/v1/chat"  # DO NOT append the `/completions` suffix

model_name = "gpt-3.5-turbo-instruct"  # defaults to the legacy completion model
api_endpoint = "https://api.openai.com/v1"  # DO NOT append the `/completions` suffix
api_key = "<your_api_key>"

# Embedding Model Configuration
[model.embedding.http]
kind = "openai/embedding"
model_name = "text-embedding-3-small"  # an embedding model
api_endpoint = "https://api.openai.com/v1"  # DO NOT append the `/embeddings` suffix
api_key = "<your_api_key>"
```

```rust
struct CompletionRequest {
    model: String,
    prompt: String,

    #[serde(skip_serializing_if = "Option::is_none")]
```

**@54corbin** commented (Dec 28, 2024): The `prompt` and `messages` fields are not allowed at the same time, even with an empty value.

```rust
    max_tokens: i32,
    temperature: f32,
    stream: bool,
    presence_penalty: f32,

    #[serde(skip_serializing_if = "Vec::is_empty")]
    messages: Vec<Message>,
```

**@54corbin:** Ditto.

```rust
struct CompletionResponseChoice {
    text: Option<String>,  // was `text: String`
    delta: Option<CompletionResponseDelta>,
```
```rust
    } else {
        (prompt, None)
    };

const LEGACY_MODEL_NAME: &str = "gpt-3.5-turbo-instruct";
```

**@54corbin:** At the moment, this is the only one I know of.

```rust
    stream: true,
    presence_penalty: options.presence_penalty,
    ..Default::default()
```

**@54corbin:** The other fields will be filled in later according to `self.model_name`.

```rust
    (prompt, None)
};
request.prompt = Some(prompt.into());
request.suffix = suffix.map(Into::into);
```

**@54corbin** commented (Dec 28, 2024): Other fields marked with `#[serde(skip_serializing_if = "Option::is_none")]` are omitted during JSON serialization; otherwise the OpenAI server would complain about the extra fields.
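The skip-on-serialize behavior described in this comment can be illustrated with a std-only sketch (no serde, so it stays self-contained): fields that are `None` are left out of the serialized JSON entirely, which is how the request avoids carrying both `prompt` and `messages` at once. The struct and `to_json` below are simplified stand-ins, not the PR's actual types.

```rust
// Simplified illustration: `None` fields are omitted from the output JSON,
// mirroring what `#[serde(skip_serializing_if = "Option::is_none")]` does.
struct CompletionRequest {
    model: String,
    prompt: Option<String>,
    messages: Option<String>, // simplified; the real code uses Vec<Message>
}

fn to_json(req: &CompletionRequest) -> String {
    let mut fields = vec![format!("\"model\":\"{}\"", req.model)];
    if let Some(p) = &req.prompt {
        fields.push(format!("\"prompt\":\"{}\"", p)); // skipped when None
    }
    if let Some(m) = &req.messages {
        fields.push(format!("\"messages\":{}", m)); // skipped when None
    }
    format!("{{{}}}", fields.join(","))
}

fn main() {
    let legacy = CompletionRequest {
        model: "gpt-3.5-turbo-instruct".into(),
        prompt: Some("fn add".into()),
        messages: None,
    };
    // The serialized request contains a `prompt` key but no `messages` key.
    println!("{}", to_json(&legacy));
}
```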


```rust
request.messages = vec![Message {
    role: "user".to_string(),
    content: format!("{SYS_PMT}\n{prompt}"),
```

**@54corbin:** Ditto.

@54corbin 54corbin changed the title Feat: Add support for the latest OpenAI chat models (e.g., o1, gpt-4o) for completions. feat: Add support for the latest OpenAI chat models (e.g., o1, gpt-4o) for completions. Dec 28, 2024
@54corbin 54corbin changed the title feat: Add support for the latest OpenAI chat models (e.g., o1, gpt-4o) for completions. feat: add support for the latest OpenAI chat models (e.g., o1, gpt-4o) for completions. Dec 28, 2024
@54corbin 54corbin marked this pull request as draft December 28, 2024 07:50
@54corbin 54corbin marked this pull request as ready for review December 28, 2024 07:51
**@54corbin:** @zwpaper @wsxiaoys PTAL, thanks!
