OpenAI compatibility

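Note: OpenAI compatibility is experimental and subject to major adjustments, including breaking changes. For full access to the Ollama API, see the Ollama Python library, the JavaScript library, and the REST API.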

Ollama provides experimental compatibility with parts of the OpenAI API to help connect existing applications to Ollama.

Usage

OpenAI Python library

python
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1/',

    # required but ignored
    api_key='ollama',
)

# Chat completion with a plain-text message
chat_completion = client.chat.completions.create(
    messages=[
        {
            'role': 'user',
            'content': 'Say this is a test',
        }
    ],
    model='llama3.2',
)

# Chat completion with an image (vision)
response = client.chat.completions.create(
    model="llava",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    # Base64-encoded image data; image URLs are not supported
                    "image_url": "",
                },
            ],
        }
    ],
    max_tokens=300,
)

# Text completion from a single prompt string
completion = client.completions.create(
    model="llama3.2",
    prompt="Say this is a test",
)

# List the locally available models
list_completion = client.models.list()

# Retrieve a single model by name
model = client.models.retrieve("llama3.2")

# Embed one or more strings
embeddings = client.embeddings.create(
    model="all-minilm",
    input=["why is the sky blue?", "why is the grass green?"],
)

OpenAI JavaScript library

javascript
import OpenAI from 'openai'

const openai = new OpenAI({
  baseURL: 'http://localhost:11434/v1/',

  // required but ignored
  apiKey: 'ollama',
})

// Chat completion with a plain-text message
const chatCompletion = await openai.chat.completions.create({
  messages: [{ role: 'user', content: 'Say this is a test' }],
  model: 'llama3.2',
})

// Chat completion with an image (vision)
const response = await openai.chat.completions.create({
  model: "llava",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What's in this image?" },
        {
          type: "image_url",
          // Base64-encoded image data; image URLs are not supported
          image_url: "",
        },
      ],
    },
  ],
})

// Text completion from a single prompt string
const completion = await openai.completions.create({
  model: "llama3.2",
  prompt: "Say this is a test.",
})

// List the locally available models
const listCompletion = await openai.models.list()

// Retrieve a single model by name
const model = await openai.models.retrieve("llama3.2")

// Embed one or more strings
const embedding = await openai.embeddings.create({
  model: "all-minilm",
  input: ["why is the sky blue?", "why is the grass green?"],
})

curl

shell
# Chat completion with a system and a user message
curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "llama3.2",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'

# Chat completion with an image (vision); "url" takes Base64-encoded image data
curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "llava",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "What'\''s in this image?"
                    },
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": ""
                        }
                    }
                ]
            }
        ],
        "max_tokens": 300
    }'

# Text completion from a single prompt string
curl http://localhost:11434/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "llama3.2",
        "prompt": "Say this is a test"
    }'

# List the locally available models
curl http://localhost:11434/v1/models

# Retrieve a single model by name
curl http://localhost:11434/v1/models/llama3.2

# Embed one or more strings
curl http://localhost:11434/v1/embeddings \
    -H "Content-Type: application/json" \
    -d '{
        "model": "all-minilm",
        "input": ["why is the sky blue?", "why is the grass green?"]
    }'

Endpoints

/v1/chat/completions

Supported features

  • [x] Chat completions
  • [x] Streaming
  • [x] JSON mode
  • [x] Reproducible outputs
  • [x] Vision
  • [x] Tools (streaming support coming soon)
  • [ ] Logprobs
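
As a rough illustration of the streaming, JSON mode, and reproducible-output features, the sketch below builds a chat request body and extracts text from streamed chunks. It assumes the stream uses the same server-sent-event framing as the OpenAI API (`data: {...}` lines ending with `data: [DONE]`). The helper names `build_chat_payload` and `iter_stream_text` are hypothetical and not part of any library.

```python
import json
from typing import Iterable, Iterator, Optional


def build_chat_payload(model: str, prompt: str, *, stream: bool = False,
                       json_mode: bool = False,
                       seed: Optional[int] = None) -> dict:
    """Build a /v1/chat/completions request body (hypothetical helper)."""
    payload: dict = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }
    if json_mode:
        # JSON mode is requested through the response_format field.
        payload["response_format"] = {"type": "json_object"}
    if seed is not None:
        # A fixed seed is what reproducible outputs rely on.
        payload["seed"] = seed
    return payload


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from streamed lines (assumes OpenAI-style SSE framing)."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything unexpected
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

The payload can be serialized with `json.dumps` and posted to `/v1/chat/completions`; the lines of the response body can then be fed to `iter_stream_text` as they arrive.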

Supported request fields

  • [x] model
  • [x] messages
    • [x] Text content
    • [x] Image content
      • [x] Base64-encoded image
      • [ ] Image URL
    • [x] Array of content parts
  • [x] frequency_penalty
  • [x] presence_penalty
  • [x] response_format
  • [x] seed
  • [x] stop
  • [x] stream
  • [x] temperature
  • [x] top_p
  • [x] max_tokens
  • [x] tools
  • [ ] tool_choice
  • [ ] logit_bias
  • [ ] user
  • [ ] n
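
Because the list above marks Base64-encoded images as supported and image URLs as unsupported, an image has to be encoded client-side before it is sent. A minimal sketch, assuming the `data:<mime>;base64,...` data-URL form used by the OpenAI API; `image_part` and `vision_message` are hypothetical helper names:

```python
import base64


def image_part(image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Return an image_url content part holding a Base64 data URL.

    The data-URL prefix is an assumption borrowed from the OpenAI API.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


def vision_message(question: str, image_bytes: bytes) -> dict:
    """Combine a text part and an image part into one user message."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            image_part(image_bytes),
        ],
    }
```

In practice the bytes would come from a local file, for example `Path("photo.png").read_bytes()`, where the file name is only a placeholder.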

/v1/completions

Supported features

  • [x] Completions
  • [x] Streaming
  • [x] JSON mode
  • [x] Reproducible outputs
  • [ ] Logprobs

Supported request fields

  • [x] model
  • [x] prompt
  • [x] frequency_penalty
  • [x] presence_penalty
  • [x] seed
  • [x] stop
  • [x] stream
  • [x] temperature
  • [x] top_p
  • [x] max_tokens
  • [x] suffix
  • [ ] best_of
  • [ ] echo
  • [ ] logit_bias
  • [ ] user
  • [ ] n

Notes

  • prompt currently only accepts a string
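
One way to respect the string-only `prompt` restriction is to validate the request on the client before sending it. The sketch below also drops the fields marked unsupported in the list above; how the server treats those fields is not stated here, so filtering them is a cautious assumption. `build_completion_payload` is a hypothetical helper.

```python
# Fields marked as unsupported for /v1/completions in the list above.
UNSUPPORTED_COMPLETION_FIELDS = {"best_of", "echo", "logit_bias", "user", "n"}


def build_completion_payload(model: str, prompt, **options) -> dict:
    """Build a /v1/completions request body (hypothetical helper)."""
    if not isinstance(prompt, str):
        # The endpoint currently accepts a single string only.
        raise TypeError("prompt must be a single string for /v1/completions")
    supported = {
        key: value
        for key, value in options.items()
        if key not in UNSUPPORTED_COMPLETION_FIELDS
    }
    return {"model": model, "prompt": prompt, **supported}
```

A caller holding several prompts would loop and send one request per string rather than passing a list.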

/v1/models

Notes

  • created corresponds to when the model was last modified
  • owned_by corresponds to the ollama username, defaulting to "library"
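
To make the `created` and `owned_by` notes concrete, the sketch below summarizes a model-list response. It assumes the response follows the OpenAI list shape (a `data` array of model objects) and that `created` is a Unix timestamp in seconds; `summarize_models` is a hypothetical helper.

```python
from datetime import datetime, timezone
from typing import List, Tuple


def summarize_models(response: dict) -> List[Tuple[str, str, datetime]]:
    """Return (id, owner, last-modified time) for each listed model."""
    rows = []
    for model in response.get("data", []):
        # created is read as the model's last-modified time, in UTC.
        modified = datetime.fromtimestamp(model["created"], tz=timezone.utc)
        rows.append((model["id"], model.get("owned_by", "library"), modified))
    return rows
```

The input would typically be the parsed JSON body of `GET /v1/models`.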

/v1/models/{model}

Notes

  • created corresponds to when the model was last modified
  • owned_by corresponds to the ollama username, defaulting to "library"

/v1/embeddings

Supported request fields

  • [x] model
  • [x] input
    • [x] string
    • [x] array of strings
    • [ ] array of tokens
    • [ ] array of token arrays
  • [ ] encoding_format
  • [ ] dimensions
  • [ ] user
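
Since `input` accepts a string or an array of strings but not token arrays, a small client-side check can catch unsupported shapes early. A minimal sketch; `build_embeddings_payload` is a hypothetical helper, and rejecting an empty list is this sketch's own choice rather than documented behavior.

```python
from typing import List, Union


def build_embeddings_payload(model: str, texts: Union[str, List[str]]) -> dict:
    """Build a /v1/embeddings request body (hypothetical helper)."""
    if isinstance(texts, str):
        return {"model": model, "input": texts}
    if isinstance(texts, list) and texts and all(isinstance(t, str) for t in texts):
        return {"model": model, "input": list(texts)}
    # Token arrays (lists of ints, or lists of lists) are not supported.
    raise TypeError("input must be a string or a non-empty list of strings")
```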

Models

Before using a model, pull it locally with ollama pull:

shell
ollama pull llama3.2

Default model names

For tooling that relies on default OpenAI model names such as gpt-3.5-turbo, use ollama cp to copy an existing model name to a temporary name:

ollama cp llama3.2 gpt-3.5-turbo

Afterwards, this new model name can be specified in the model field:

shell
curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "gpt-3.5-turbo",
        "messages": [
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'
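
If the alias needs to be created from a setup script rather than by hand, the `ollama cp` call can be wrapped as sketched below. This assumes the `ollama` CLI is on `PATH`; `alias_command` and `create_alias` are hypothetical helper names.

```python
import subprocess
from typing import List


def alias_command(source: str, alias: str) -> List[str]:
    """Return the argument list for copying a model to an alias name."""
    return ["ollama", "cp", source, alias]


def create_alias(source: str, alias: str) -> None:
    """Run `ollama cp`; raises CalledProcessError if the command fails."""
    subprocess.run(alias_command(source, alias), check=True)
```

Calling `create_alias("llama3.2", "gpt-3.5-turbo")` would mirror the command shown earlier in this section.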

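Setting the context size

The OpenAI API does not provide a way to set a model's context size. If you need to change the context size, create a Modelfile with the following contents: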

modelfile
FROM <some model>
PARAMETER num_ctx <context size>

Use the ollama create mymodel command to create a new model with the updated context size. Call the API with the updated model name:

shell
curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "mymodel",
        "messages": [
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'
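
The Modelfile steps in this section can also be scripted. The sketch below renders the two-line Modelfile and builds the matching `ollama create` command. It assumes the `ollama` CLI is on `PATH` and that the Modelfile is passed with the `-f` flag. The helper names are hypothetical, and any context size you pass is your own choice, not a recommended value.

```python
import subprocess
from pathlib import Path
from typing import List


def render_modelfile(base_model: str, num_ctx: int) -> str:
    """Render a Modelfile that overrides the context size of base_model."""
    if num_ctx <= 0:
        raise ValueError("num_ctx must be a positive integer")
    return f"FROM {base_model}\nPARAMETER num_ctx {num_ctx}\n"


def create_command(name: str, modelfile: Path) -> List[str]:
    """Return the argument list for `ollama create NAME -f MODELFILE`."""
    return ["ollama", "create", name, "-f", str(modelfile)]


def create_model(name: str, base_model: str, num_ctx: int,
                 directory: Path = Path(".")) -> None:
    """Write the Modelfile to disk and create the model from it."""
    modelfile = directory / "Modelfile"
    modelfile.write_text(render_modelfile(base_model, num_ctx))
    subprocess.run(create_command(name, modelfile), check=True)
```

For example, `create_model("mymodel", "llama3.2", 8192)` would produce a model that can then be used as `"model": "mymodel"` in the request above.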