Ollama AI

A Ruby gem for interacting with Ollama's API that allows you to run open source AI Large Language Models (LLMs) locally.

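The image shows a llama's head merged with a red ruby gemstone against a light beige background. The red facets form both the ruby and the contours of the llama, creating a clever visual fusion.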

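This gem is designed to provide low-level access to Ollama, enabling people to build abstractions on top of it. If you are interested in more high-level abstractions or more user-friendly tools, you may want to consider Nano Bots 💎 🤖.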

Introduction and Quick Start

gem 'ollama-ai', '~> 1.3.0'

require 'ollama-ai'

client = Ollama.new(
  credentials: { address: 'http://localhost:11434' },
  options: { server_sent_events: true }
)

result = client.generate(
  { model: 'llama2',
    prompt: 'Hi!' }
)

Result:

[{ 'model' => 'llama2',
   'created_at' => '2024-01-07T01:34:02.088810408Z',
   'response' => 'Hello',
   'done' => false },
 { 'model' => 'llama2',
   'created_at' => '2024-01-07T01:34:02.419045606Z',
   'response' => '!',
   'done' => false },
 # ...
 { 'model' => 'llama2',
   'created_at' => '2024-01-07T01:34:07.680049831Z',
   'response' => '?',
   'done' => false },
 { 'model' => 'llama2',
   'created_at' => '2024-01-07T01:34:07.872170352Z',
   'response' => '',
   'done' => true,
   'context' =>
     [518, 25_580,
      # ...
      13_563, 29_973],
   'total_duration' => 11_653_781_127,
   'load_duration' => 1_186_200_439,
   'prompt_eval_count' => 22,
   'prompt_eval_duration' => 5_006_751_000,
   'eval_count' => 25,
   'eval_duration' => 5_453_058_000 }]

Setup

Installing

gem install ollama-ai -v 1.3.0

gem 'ollama-ai', '~> 1.3.0'

Usage

Client

Create a new client:

require 'ollama-ai'

client = Ollama.new(
  credentials: { address: 'http://localhost:11434' },
  options: { server_sent_events: true }
)

Bearer Authentication

require 'ollama-ai'

client = Ollama.new(
  credentials: {
    address: 'http://localhost:11434',
    bearer_token: 'eyJhbG...Qssw5c'
  },
  options: { server_sent_events: true }
)

Remember that hardcoding your credentials in code is unsafe. It is preferable to use environment variables:

require 'ollama-ai'
client = Ollama.new(
  credentials: {
    address: 'http://localhost:11434',
    bearer_token: ENV['OLLAMA_BEARER_TOKEN']
  },
  options: { server_sent_events: true }
)
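
One way to fail fast when the token is missing is `ENV.fetch`, which raises `KeyError` for an unset variable. The sketch below only builds the credentials hash; `OLLAMA_ADDRESS` and the `ollama_credentials` helper are hypothetical names introduced for illustration and are not part of the gem:

```ruby
# Hypothetical helper: builds the credentials hash for Ollama.new from
# environment variables. OLLAMA_ADDRESS is an assumed variable name.
def ollama_credentials(env = ENV)
  {
    # Falls back to the local default address used throughout this README.
    address: env.fetch('OLLAMA_ADDRESS', 'http://localhost:11434'),
    # Raises KeyError if the token is not set, instead of passing nil along.
    bearer_token: env.fetch('OLLAMA_BEARER_TOKEN')
  }
end
```

The returned hash can then be passed as the `credentials:` argument when creating the client.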

Methods

client.generate
client.chat
client.embeddings

client.create
client.tags
client.show
client.copy
client.delete
client.pull
client.push

generate: Generate a completion

API Documentation: https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-completion

Without Streaming Events

API Documentation: https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-completion

result = client.generate(
  { model: 'llama2',
    prompt: 'Hi!',
    stream: false }
)

Result:

[{ 'model' => 'llama2',
   'created_at' => '2024-01-07T01:35:41.951371247Z',
   'response' => 'Hi! Nice to meet you. How are you today?',
   'done' => true,
   'context' =>
     [518, 25_580,
      # ...
      9826, 29_973],
   'total_duration' => 6_981_097_576,
   'load_duration' => 625_053,
   'prompt_eval_count' => 22,
   'prompt_eval_duration' => 4_075_171_000,
   'eval_count' => 16,
   'eval_duration' => 2_900_325_000 }]
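
The duration fields in the final event are reported by Ollama in nanoseconds, so generation speed can be derived from `eval_count` and `eval_duration`. A minimal sketch, where `tokens_per_second` is a hypothetical helper name and the sample values are copied from the result above:

```ruby
# Ollama reports durations in nanoseconds.
NANOSECONDS_PER_SECOND = 1_000_000_000.0

# Hypothetical helper: generated tokens per second for a final
# ('done' => true) event.
def tokens_per_second(final_event)
  seconds = final_event['eval_duration'] / NANOSECONDS_PER_SECOND
  final_event['eval_count'] / seconds
end

final_event = { 'done' => true,
                'eval_count' => 16,
                'eval_duration' => 2_900_325_000 }

puts format('%.2f tokens/s', tokens_per_second(final_event))
```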

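Receiving Stream Events

API Documentation: https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-completion

Ensure that Server-Sent Events are enabled before using a block for streaming. Setting stream: true is not necessary, as true is the default: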

client.generate(
  { model: 'llama2',
    prompt: 'Hi!' }
) do |event, raw|
  puts event
end

Event:

{ 'model' => 'llama2',
  'created_at' => '2024-01-07T01:36:30.665245712Z',
  'response' => 'Hello',
  'done' => false }

You can also get all the received events at once as an array:

result = client.generate(
  { model: 'llama2',
    prompt: 'Hi!' }
)

Result:

[{ 'model' => 'llama2',
   'created_at' => '2024-01-07T01:36:30.665245712Z',
   'response' => 'Hello',
   'done' => false },
 { 'model' => 'llama2',
   'created_at' => '2024-01-07T01:36:30.927337136Z',
   'response' => '!',
   'done' => false },
 # ...
 { 'model' => 'llama2',
   'created_at' => '2024-01-07T01:36:37.249416767Z',
   'response' => '?',
   'done' => false },
   'done' => false },
 { 'model' => 'llama2',
   'created_at' => '2024-01-07T01:36:37.44041283Z',
   'response' => '',
   'done' => true,
   'context' =>
     [518, 25_580,
      # ...
      13_563, 29_973],
   'total_duration' => 10_551_395_645,
   'load_duration' => 966_631,
   'prompt_eval_count' => 22,
   'prompt_eval_duration' => 4_034_990_000,
   'eval_count' => 25,
   'eval_duration' => 6_512_954_000 }]
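
Because each event carries only a fragment of the text, the complete answer has to be reassembled from the array. A minimal sketch, assuming events shaped like the ones above; `full_response` is a hypothetical helper name, and the sample events are abbreviated:

```ruby
# Hypothetical helper: concatenates the 'response' fragment of every event.
# Events without a 'response' key contribute an empty string.
def full_response(events)
  events.map { |event| event['response'].to_s }.join
end

events = [
  { 'model' => 'llama2', 'response' => 'Hello', 'done' => false },
  { 'model' => 'llama2', 'response' => '!', 'done' => false },
  { 'model' => 'llama2', 'response' => '', 'done' => true }
]

puts full_response(events) # prints "Hello!"
```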

You can mix both as well:

result = client.generate(
  { model: 'llama2',
    prompt: 'Hi!' }
) do |event, raw|
  puts event
end

chat: Generate a chat completion

API Documentation: https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-chat-completion

result = client.chat(
  { model: 'llama2',
    messages: [
      { role: 'user', content: 'Hi! My name is Purple.' }
    ] }
) do |event, raw|
  puts event
end

Event:

{ 'model' => 'llama2',
  'created_at' => '2024-01-07T01:38:01.729897311Z',
  'message' => { 'role' => 'assistant', 'content' => "\n" },
  'done' => false }

Result:

[{ 'model' => 'llama2',
   'created_at' => '2024-01-07T01:38:01.729897311Z',
   'message' => { 'role' => 'assistant', 'content' => "\n" },
   'done' => false },
 { 'model' => 'llama2',
   'created_at' => '2024-01-07T01:38:02.081494506Z',
   'message' => { 'role' => 'assistant', 'content' => '*' },
   'done' => false },
 # ...
 { 'model' => 'llama2',
   'created_at' => '2024-01-07T01:38:17.855905499Z',
   'message' => { 'role' => 'assistant', 'content' => '?' },
   'done' => false },
 { 'model' => 'llama2',
   'created_at' => '2024-01-07T01:38:18.07331245Z',
   'message' => { 'role' => 'assistant', 'content' => '' },
   'done' => true,
   'total_duration' => 22_494_544_502,
   'load_duration' => 4_224_600,
   'prompt_eval_count' => 28,
   'prompt_eval_duration' => 6_496_583_000,
   'eval_count' => 61,
   'eval_duration' => 15_991_728_000 }]

Back-and-Forth Conversations

API Documentation: https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-chat-completion

To maintain a back-and-forth conversation, you need to append the received responses and build a history for your requests:

result = client.chat(
  { model: 'llama2',
    messages: [
      { role: 'user', content: 'Hi! My name is Purple.' },
      { role: 'assistant',
        content: 'Hi, Purple!' },
      { role: 'user', content: "What's my name?" }
    ] }
) do |event, raw|
  puts event
end

Event:

{ 'model' => 'llama2',
  'created_at' => '2024-01-07T01:40:07.352998498Z',
  'message' => { 'role' => 'assistant', 'content' => ' Pur' },
  'done' => false }

Result:

[{ 'model' => 'llama2',
   'created_at' => '2024-01-07T01:40:06.562939469Z',
   'message' => { 'role' => 'assistant', 'content' => 'Your' },
   'done' => false },
 # ...
 { 'model' => 'llama2',
   'created_at' => '2024-01-07T01:40:07.352998498Z',
   'message' => { 'role' => 'assistant', 'content' => ' Pur' },
   'done' => false },
 { 'model' => 'llama2',
   'created_at' => '2024-01-07T01:40:07.545323584Z',
   'message' => { 'role' => 'assistant', 'content' => 'ple' },
   'done' => false },
 { 'model' => 'llama2',
   'created_at' => '2024-01-07T01:40:07.77769408Z',
   'message' => { 'role' => 'assistant', 'content' => '!' },
   'done' => false },
 { 'model' => 'llama2',
   'created_at' => '2024-01-07T01:40:07.974165849Z',
   'message' => { 'role' => 'assistant', 'content' => '' },
   'done' => true,
   'total_duration' => 11_482_012_681,
   'load_duration' => 4_246_882,
   'prompt_eval_count' => 57,
   'prompt_eval_duration' => 10_387_150_000,
   'eval_count' => 6,
   'eval_duration' => 1_089_249_000 }]
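
Building the history by hand can be sketched as follows: join the streamed fragments into one assistant message, then append it together with the next user message. `assistant_reply` and `extend_history` are hypothetical helper names, and the events are assumed to have the shape shown above:

```ruby
# Hypothetical helper: joins the streamed 'message' fragments into a single
# assistant message suitable for the messages array.
def assistant_reply(events)
  content = events.map { |event| event.dig('message', 'content').to_s }.join
  { role: 'assistant', content: content }
end

# Hypothetical helper: returns a new history with the assistant reply and
# the next user message appended. The original array is not modified.
def extend_history(messages, events, next_user_content)
  messages + [assistant_reply(events),
              { role: 'user', content: next_user_content }]
end

history = [{ role: 'user', content: 'Hi! My name is Purple.' }]
events = [
  { 'message' => { 'role' => 'assistant', 'content' => 'Hi, ' }, 'done' => false },
  { 'message' => { 'role' => 'assistant', 'content' => 'Purple!' }, 'done' => false },
  { 'message' => { 'role' => 'assistant', 'content' => '' }, 'done' => true }
]

history = extend_history(history, events, "What's my name?")
puts history.inspect
```

The resulting array can be sent as `messages:` in the next `client.chat` call.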

embeddings: Generate Embeddings

API Documentation: https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-embeddings

result = client.embeddings(
  { model: 'llama2',
    prompt: 'Hi!' }
)

Result:

[{ 'embedding' =>
   [0.6970467567443848, -2.248202085494995,
    # ...
    -1.5994540452957153, -0.3464218080043793] }]
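
A common use of embeddings is comparing two texts by the cosine similarity of their vectors. This is a general technique rather than something the gem provides; the sketch below is plain Ruby, `cosine_similarity` is a hypothetical helper name, and the short vectors are made-up stand-ins for real embeddings:

```ruby
# Hypothetical helper: cosine similarity between two equal-length vectors.
# Returns a Float in [-1.0, 1.0]; the result is NaN if either vector is all zeros.
def cosine_similarity(a, b)
  raise ArgumentError, 'vectors must have the same length' unless a.length == b.length

  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  dot / (norm_a * norm_b)
end

# Made-up vectors standing in for result.first['embedding'] of two requests.
first = [0.5, -1.0, 2.0]
second = [1.0, -2.0, 4.0]

puts cosine_similarity(first, second)
```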

Models

create: Create a Model

API Documentation: https://github.com/jmorganca/ollama/blob/main/docs/api.md#create-a-model

result = client.create(
  { name: 'mario',
    modelfile: "FROM llama2\nSYSTEM You are Mario from Super Mario Bros." }
) do |event, raw|
  puts event
end

Event:

{ 'status' => 'reading model metadata' }

Result:

[{ 'status' => 'reading model metadata' },
 { 'status' => 'creating system layer' },
 { 'status' =>
   'using already created layer sha256:4eca7304a07a42c48887f159ef5ad82ed5a5bd30fe52db4aadae1dd938e26f70' },
 { 'status' =>
   'using already created layer sha256:876a8d805b60882d53fed3ded3123aede6a996bdde4a253de422cacd236e33d3' },
 { 'status' =>
   'using already created layer sha256:a47b02e00552cd7022ea700b1abf8c572bb26c9bc8c1a37e01b566f2344df5dc' },
 { 'status' =>
   'using already created layer sha256:f02dd72bb2423204352eabc5637b44d79d17f109fdb510a7c51455892aa2d216' },
 { 'status' =>
   'writing layer sha256:1741cf59ce26ff01ac614d31efc700e21e44dd96aed60a7c91ab3f47e440ef94' },
 { 'status' =>
   'writing layer sha256:e8bcbb2eebad88c2fa64bc32939162c064be96e70ff36aff566718fc9186b427' },
 { 'status' => 'writing manifest' },
 { 'status' => 'success' }]

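After creation, you can use it:

client.generate(
  { model: 'mario',
    prompt: 'Hi! Who are you?' }
) do |event, raw|
  print event['response']
end

Woah! *adjusts sunglasses* It's-a me, Mario! *winks* You must be a new friend I've met in the Mushroom Kingdom. *tips top hat* What brings you to this neck of the woods? Maybe you are looking for help with an adventure? *nods* Just let me know, and I'll do my best to help you! 😃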

tags: List Local Models

API Documentation: https://github.com/jmorganca/ollama/blob/main/docs/api.md#list-local-models

result = client.tags

Result:

[{ 'models' =>
   [{ 'name' => 'llama2:latest',
      'modified_at' => '2024-01-06T15:06:23.6349195-03:00',
      'size' => 3_826_793_677,
      'digest' =>
      '78e26419b4469263f75331927a00a0284ef6544c1975b826b15abdaef17bb962',
      'details' =>
      { 'format' => 'gguf',
        'family' => 'llama',
        'families' => ['llama'],
        'parameter_size' => '7B',
        'quantization_level' => 'Q4_0' } },
    { 'name' => 'mario:latest',
      'modified_at' => '2024-01-06T22:41:59.495298101-03:00',
      'size' => 3_826_793_787,
      'digest' =>
      '291f46d2fa687dfaff45de96a8cb6e32707bc16ec1e1dfe8d65e9634c34c660c',
      'details' =>
      { 'format' => 'gguf',
        'family' => 'llama',
        'families' => ['llama'],
        'parameter_size' => '7B',
        'quantization_level' => 'Q4_0' } }] }]
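
The nested result can be flattened into a short summary per model. A minimal sketch, assuming the structure shown above; `model_summaries` is a hypothetical helper name, sizes are treated as bytes, and the sample data is abbreviated:

```ruby
# Hypothetical helper: one "name (size)" line per local model.
# Sizes are converted from bytes to decimal gigabytes.
def model_summaries(tags_result)
  tags_result.flat_map { |entry| entry.fetch('models', []) }.map do |model|
    format('%s (%.2f GB)', model['name'], model['size'] / 1e9)
  end
end

tags_result = [
  { 'models' => [
    { 'name' => 'llama2:latest', 'size' => 3_826_793_677 },
    { 'name' => 'mario:latest', 'size' => 3_826_793_787 }
  ] }
]

puts model_summaries(tags_result)
```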

show: Show Model Information

API Documentation: https://github.com/jmorganca/ollama/blob/main/docs/api.md#show-model-information

result = client.show(
  { name: 'llama2' }
)

Result:

[{ 'license' =>
     "LLAMA 2 COMMUNITY LICENSE AGREEMENT\t\n" \
     # ...
     "* Reporting violations of the Acceptable Use Policy or unlicensed uses of Llama..." \
     "\n",
   'modelfile' =>
     "# Modelfile generated by \"ollama show\"\n" \
     # ...
     'PARAMETER stop "<</SYS>>"',
   'parameters' =>
     "stop                           [INST]\n" \
     "stop                           [/INST]\n" \
     "stop                           <<SYS>>\n" \
     'stop                           <</SYS>>',
   'template' =>
     "[INST] <<SYS>>{{ .System }}<</SYS>>\n\n{{ .Prompt }} [/INST]\n",
   'details' =>
     { 'format' => 'gguf',
       'family' => 'llama',
       'families' => ['llama'],
       'parameter_size' => '7B',
       'quantization_level' => 'Q4_0' } }]

copy: Copy a Model

API Documentation: https://github.com/jmorganca/ollama/blob/main/docs/api.md#copy-a-model

result = client.copy(
  { source: 'llama2',
    destination: 'llama2-backup' }
)

Result:

true

If the source model does not exist:

begin
  result = client.copy(
    { source: 'purple',
      destination: 'purple-backup' }
  )
rescue Ollama::Errors::OllamaError => error
  puts error.class # Ollama::Errors::RequestError
  puts error.message # 'the server responded with status 404'

  puts error.payload
  # { source: 'purple',
  #   destination: 'purple-backup',
  #   ...
  # }

  puts error.request.inspect
  # #<Faraday::ResourceNotFound response={:status=>404, :headers...
end

delete: Delete a Model

API Documentation: https://github.com/jmorganca/ollama/blob/main/docs/api.md#delete-a-model

result = client.delete(
  { name: 'llama2' }
)

Result:

true

If the model does not exist:

begin
  result = client.delete(
    { name: 'llama2' }
  )
rescue Ollama::Errors::OllamaError => error
  puts error.class # Ollama::Errors::RequestError
  puts error.message # 'the server responded with status 404'

  puts error.payload
  # { name: 'llama2',
  #   ...
  # }

  puts error.request.inspect
  # #<Faraday::ResourceNotFound response={:status=>404, :headers...
end

pull: Pull a Model

API Documentation: https://github.com/jmorganca/ollama/blob/main/docs/api.md#pull-a-model

result = client.pull(
  { name: 'llama2' }
) do |event, raw|
  puts event
end

Event:

{ 'status' => 'pulling manifest' }

Result:

[{ 'status' => 'pulling manifest' },
 { 'status' => 'pulling 4eca7304a07a',
   'digest' =>
   'sha256:4eca7304a07a42c48887f159ef5ad82ed5a5bd30fe52db4aadae1dd938e26f70',
   'total' => 1_602_463_008,
   'completed' => 1_602_463_008 },
 # ...
 { 'status' => 'verifying sha256 digest' },
 { 'status' => 'writing manifest' },
 { 'status' => 'removing any unused layers' },
 { 'status' => 'success' }]
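
Events that carry 'total' and 'completed' can be turned into a progress line inside the block. A minimal sketch, assuming events shaped like the ones above; `progress_line` is a hypothetical helper name and the sample events use made-up sizes:

```ruby
# Hypothetical helper: formats a pull/push event as a progress line.
# Events without a positive 'total' are returned as the bare status.
def progress_line(event)
  total = event['total']
  return event['status'] unless total&.positive?

  percent = 100.0 * event.fetch('completed', 0) / total
  format('%s: %.1f%%', event['status'], percent)
end

puts progress_line({ 'status' => 'pulling manifest' })
puts progress_line({ 'status' => 'pulling 4eca7304a07a',
                     'total' => 200,
                     'completed' => 50 })
```

Inside a `client.pull` block, this could be called as `puts progress_line(event)`.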

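push: Push a Model

Documentation: API and Publishing Your Model.

You need to create an account at https://ollama.ai and add your public key at https://ollama.ai/settings/keys.

Your keys are located in /usr/share/ollama/.ollama/. You may need to copy them to your user directory: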

sudo cp /usr/share/ollama/.ollama/id_ed25519 ~/.ollama/
sudo cp /usr/share/ollama/.ollama/id_ed25519.pub ~/.ollama/

Copy your model to your user namespace:

client.copy(
  { source: 'mario',
    destination: 'your-user/mario' }
)

And push it:

result = client.push(
  { name: 'your-user/mario' }
) do |event, raw|
  puts event
end

Event:

{ 'status' => 'retrieving manifest' }

Result:

[{ 'status' => 'retrieving manifest' },
 { 'status' => 'pushing 4eca7304a07a',
   'digest' =>
   'sha256:4eca7304a07a42c48887f159ef5ad82ed5a5bd30fe52db4aadae1dd938e26f70',
   'total' => 1_602_463_008,
   'completed' => 1_602_463_008 },
 # ...
 { 'status' => 'pushing e8bcbb2eebad',
   'digest' =>
   'sha256:e8bcbb2eebad88c2fa64bc32939162c064be96e70ff36aff566718fc9186b427',
   'total' => 555,
   'completed' => 555 },
 { 'status' => 'pushing manifest' },
 { 'status' => 'success' }]

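Modes

Text

You can use the generate or chat methods for text.

Image

A black and white image of an old piano. The piano is an upright model, with the keys on the right side of the image. The piano is sitting on a tiled floor. There is a small round object on top of the piano.

Courtesy of Unsplash

You need to choose a model that supports images, like LLaVA or bakllava, and encode the image as Base64.

Depending on your hardware, some models that support images can be slow, so you may want to increase the client timeout: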

client = Ollama.new(
  credentials: { address: 'http://localhost:11434' },
  options: {
    server_sent_events: true,
    connection: { request: { timeout: 120, read_timeout: 120 } } }
)

Using the generate method:

require 'base64'

client.generate(
  { model: 'llava',
    prompt: 'Please describe this image.',
    images: [Base64.strict_encode64(File.read('piano.jpg'))] }
) do |event, raw|
  print event['response']
end

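Output:

The image is a black and white photo of an old piano, which appears to be in need of maintenance. A chair is placed right next to the piano. Apart from that, there are no other objects or people visible in the scene.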
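
The encoding step on its own can be sketched as below. `encode_image` is a hypothetical helper name; it uses File.binread rather than File.read as a precaution so the bytes are read unmodified on every platform, and the temporary file with four sample bytes stands in for a real image:

```ruby
require 'base64'
require 'tempfile'

# Hypothetical helper: reads a file as raw bytes and returns its Base64
# representation without line breaks, as expected in the images array.
def encode_image(path)
  Base64.strict_encode64(File.binread(path))
end

# A temporary file holding the first four bytes of a typical JPEG header,
# used here only so the example runs without a real image.
Tempfile.create(['sample', '.bin']) do |file|
  file.binmode
  file.write([0xFF, 0xD8, 0xFF, 0xE0].pack('C*'))
  file.flush
  puts encode_image(file.path) # prints "/9j/4A=="
end
```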

Using the chat method:

require 'base64'

result = client.chat(
  { model: 'llava',
    messages: [
      { role: 'user',
        content: 'Please describe this image.',
        images: [Base64.strict_encode64(File.read('piano.jpg'))] }
    ] }
) do |event, raw|
  puts event
end

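Output:

The image features an old piano sitting on a wooden floor, with black keys. Next to the piano there is another keyboard, possibly used for playing music.

On top of the piano, two mice are placed in different positions. These mice might be used to control the music being played, or they may simply be decorative. The overall atmosphere seems to focus on artistic expression through this unique instrument.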

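Streaming and Server-Sent Events (SSE)

Server-Sent Events (SSE) is a technology that allows certain endpoints to offer streaming capabilities, such as creating the impression that "the model is typing along with you" rather than delivering the entire answer at once.

You can set up the client to use Server-Sent Events (SSE) for all supported endpoints: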

client = Ollama.new(
  credentials: { address: 'http://localhost:11434' },
  options: { server_sent_events: true }
)

Or, you can decide on a per-request basis:

result = client.generate(
  { model: 'llama2',
    prompt: 'Hi!' },
  server_sent_events: true
) do |event, raw|
  puts event
end

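With Server-Sent Events (SSE) enabled, you can use a block to receive partial results via events. This is particularly useful for methods that offer streaming capabilities, such as generate (see Receiving Stream Events).

Server-Sent Events (SSE) Hang

Method calls will hang until the server-sent events finish, so even without providing a block, you can obtain the final results of the received events (see Receiving Stream Events).

New Functionalities and APIs

Ollama may launch a new endpoint that we haven't covered in the Gem yet. If that's the case, you may still be able to use it through the request method. For example, generate is just a wrapper for api/generate, which you can call directly like this: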

result = client.request(
  'api/generate',
  { model: 'llama2',
    prompt: 'Hi!' },
  request_method: 'POST', server_sent_events: true
)

Request Options

Adapter

The gem uses Faraday with the Typhoeus adapter by default.

You can use a different adapter if you want:

require 'faraday/net_http'

client = Ollama.new(
  credentials: { address: 'http://localhost:11434' },
  options: { connection: { adapter: :net_http } }
)

Timeout

You can set the maximum number of seconds to wait for the request to complete with the timeout option:

client = Ollama.new(
  credentials: { address: 'http://localhost:11434' },
  options: { connection: { request: { timeout: 5 } } }
)

You may also have more granular control over Faraday's request options if you prefer:

client = Ollama.new(
  credentials: { address: 'http://localhost:11434' },
  options: {
    connection: {
      request: {
        timeout: 5,
        open_timeout: 5,
        read_timeout: 5,
        write_timeout: 5
      }
    }
  }
)

Error Handling

Rescuing

require 'ollama-ai'

begin
  client.generate(
    { model: 'llama2',
      prompt: 'Hi!' }
  )
rescue Ollama::Errors::OllamaError => error
  puts error.class # Ollama::Errors::RequestError
  puts error.message # 'the server responded with status 500'

  puts error.payload
  # { model: 'llama2',
  #   prompt: 'Hi!',
  #   ...
  # }

  puts error.request.inspect
  # #<Faraday::ServerError response={:status=>500, :headers...
end

For Short

require 'ollama-ai/errors'

begin
  client.generate(
    { model: 'llama2',
      prompt: 'Hi!' }
  )
rescue OllamaError => error
  puts error.class # Ollama::Errors::RequestError
end

Errors

OllamaError

BlockWithoutServerSentEventsError

RequestError
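
Transient failures are sometimes handled by retrying the call with an increasing delay. This is a general pattern, not something the gem provides; the sketch below is plain Ruby, and `with_retries` and its parameters are hypothetical names introduced for illustration:

```ruby
# Hypothetical helper: runs the block, retrying when one of the given
# error classes is raised. The delay doubles after each failed attempt.
# The last error is re-raised once the attempts are exhausted.
def with_retries(attempts: 3, base_delay: 0.5, on: [StandardError])
  tries = 0
  begin
    tries += 1
    yield
  rescue *on
    raise if tries >= attempts

    sleep(base_delay * (2**(tries - 1)))
    retry
  end
end

# Demonstration with a block that fails twice before succeeding.
calls = 0
value = with_retries(attempts: 3, base_delay: 0) do
  calls += 1
  raise 'temporary failure' if calls < 3

  :ok
end

puts "#{value} after #{calls} calls" # prints "ok after 3 calls"
```

With the gem loaded, the error list could be narrowed, for example `on: [Ollama::Errors::RequestError]`, so that only request failures are retried.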

Development

bundle
rubocop -A

bundle exec ruby spec/tasks/run-client.rb
bundle exec ruby spec/tasks/test-encoding.rb

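Purpose

This Gem is designed to provide low-level access to Ollama, enabling people to build abstractions on top of it. If you are interested in more high-level abstractions or more user-friendly tools, you may want to consider Nano Bots 💎 🤖.

Publish to RubyGems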

gem build ollama-ai.gemspec

gem signin

gem push ollama-ai-1.3.0.gem

Updating the README

Install Babashka:

curl -s https://raw.githubusercontent.com/babashka/babashka/master/install | sudo bash

Update the template.md file and then:

bb tasks/generate-readme.clj

Trick for automatically updating the README.md when template.md changes:

sudo pacman -S inotify-tools # Arch / Manjaro
sudo apt-get install inotify-tools # Debian / Ubuntu / Raspberry Pi OS
sudo dnf install inotify-tools # Fedora / CentOS / RHEL

while inotifywait -e modify template.md; do bb tasks/generate-readme.clj; done

Trick for Markdown Live Preview:

pip install -U markdown_live_preview

mlp README.md -p 8076

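Resources and References

These resources and references may be useful throughout your learning process:

Disclaimer

This is not an official Ollama project, nor is it affiliated with Ollama in any way.

This software is distributed under the MIT License. This license includes a disclaimer of warranty. Moreover, the authors assume no responsibility for any damage or costs that may result from using this project. Use the Ollama AI Ruby Gem at your own risk.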
