---
read_when:
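    - 'Adding or modifying `openclaw infer` commands'
    - Designing stable headless capability automation
summary: Inference-first CLI for provider-backed model, image, audio, TTS, video, web, and embedding workflows
title: Inference CLI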
x-i18n:
    generated_at: "2026-05-10T19:28:34Z"
    model: gpt-5.5
    provider: openai
    source_hash: 05496c5278650c30e5a52dceba105b703258040765f0a3f75268bb514270f15d
    source_path: cli/infer.md
    workflow: 16
---

`openclaw infer` is the canonical headless surface for provider-backed inference workflows.

It intentionally exposes capability families rather than raw Gateway RPC names or raw agent tool IDs.

## Turn infer into a skill

Copy and paste this into an agent:

```text
Read https://docs.openclaw.ai/cli/infer, then create a skill that routes my common workflows to `openclaw infer`.
Focus on model runs, image generation, video generation, audio transcription, TTS, web search, and embeddings.
```

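A good infer-based skill should:

- map common user intents to the correct infer subcommand
- include a few canonical infer examples for the workflows it covers
- prefer `openclaw infer ...` in examples and suggestions
- avoid re-documenting the entire infer surface inside the skill body

Typical infer-focused skill coverage: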

- `openclaw infer model run`
- `openclaw infer image generate`
- `openclaw infer audio transcribe`
- `openclaw infer tts convert`
- `openclaw infer web search`
- `openclaw infer embedding create`

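## Why use infer

`openclaw infer` provides one consistent CLI for provider-backed inference tasks inside OpenClaw.

Benefits:

- Use the providers and models already configured in OpenClaw instead of wiring up one-off wrappers for each backend.
- Keep model, image, audio transcription, TTS, video, web, and embedding workflows under one command tree.
- Use a stable `--json` output shape for scripts, automation, and agent-driven workflows.
- Prefer a first-party OpenClaw surface when the task is fundamentally "run inference".
- Use the normal local path, without requiring the Gateway, for most infer commands.

For end-to-end provider checks, prefer `openclaw infer ...` once the lower-level provider tests pass. It exercises the shipped CLI, config loading, default-agent resolution, bundled plugin activation, and the shared capability runtime before the provider request is made.

## Command tree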

```text
openclaw infer
  list
  inspect

  model
    run
    list
    inspect
    providers
    auth login
    auth logout
    auth status

  image
    generate
    edit
    describe
    describe-many
    providers

  audio
    transcribe
    providers

  tts
    convert
    voices
    providers
    status
    enable
    disable
    set-provider

  video
    generate
    describe
    providers

  web
    search
    fetch
    providers

  embedding
    create
    providers
```

## Common tasks

This table maps common inference tasks to the corresponding infer command.

| Task                                | Command                                                                                       | Notes                                                 |
| ----------------------------------- | --------------------------------------------------------------------------------------------- | ----------------------------------------------------- |
| Run a text/model prompt             | `openclaw infer model run --prompt "..." --json`                                              | Uses the normal local path by default                 |
| Run a model prompt against an image | `openclaw infer model run --prompt "Describe this" --file ./image.png --model provider/model` | Repeat `--file` for multiple image inputs             |
| Generate an image                   | `openclaw infer image generate --prompt "..." --json`                                         | Use `image edit` when starting from an existing file  |
| Describe an image file              | `openclaw infer image describe --file ./image.png --prompt "..." --json`                      | `--model` must be an image-capable `<provider/model>` |
| Transcribe audio                    | `openclaw infer audio transcribe --file ./memo.m4a --json`                                    | `--model` must be `<provider/model>`                  |
| Synthesize speech                   | `openclaw infer tts convert --text "..." --output ./speech.mp3 --json`                        | `tts status` defaults to the Gateway                  |
| Generate a video                    | `openclaw infer video generate --prompt "..." --json`                                         | Supports provider hints such as `--resolution`        |
| Describe a video file               | `openclaw infer video describe --file ./clip.mp4 --json`                                      | `--model` must be `<provider/model>`                  |
| Search the web                      | `openclaw infer web search --query "..." --json`                                              |                                                       |
| Fetch a web page                    | `openclaw infer web fetch --url https://example.com --json`                                   |                                                       |
| Create embeddings                   | `openclaw infer embedding create --text "..." --json`                                         |                                                       |

## Behavior

- `openclaw infer ...` is the primary CLI surface for these workflows.
- Use `--json` when the output will be consumed by another command or script.
- Use `--provider` or `--model provider/model` when a specific backend is required.
- Use `model run --thinking <level>` to pass a one-shot thinking/reasoning level (`off`, `minimal`, `low`, `medium`, `high`, `adaptive`, `xhigh`, or `max`) while keeping the run raw.
- For `image describe`, `audio transcribe`, and `video describe`, `--model` must use the `<provider/model>` form.
- For `image describe`, an explicit `--model` runs that provider/model directly. The model must be image-capable in the model catalog or provider config. `codex/<model>` runs a bounded Codex app-server image-understanding turn. `openai-codex/<model>` uses the OpenAI Codex OAuth provider path.
- Stateless execution commands default to local.
- Gateway-managed state commands default to gateway.
- The normal local path does not require the Gateway to be running.
- Local `model run` is a lean one-shot provider completion. It resolves the configured agent model and auth, but does not start a chat-agent turn, load tools, or launch bundled MCP servers.
- `model run --file` accepts image files, detects their MIME type, and sends them to the selected model together with the supplied prompt. Repeat `--file` for multiple images.
- `model run --file` rejects non-image inputs. Use `infer audio transcribe` for audio files and `infer video describe` for video files.
- `model run --gateway` exercises Gateway routing, saved auth, provider selection, and the embedded runtime, but still behaves as a raw model probe. It sends the supplied prompt and any image attachments without a prior session transcript, bootstrap/AGENTS context, context-engine assembly, tools, or bundled MCP servers.
- `model run --gateway --model <provider/model>` requires a trusted operator gateway credential, because the request asks the Gateway to run a one-shot provider/model override.
- Local `model run --thinking` uses the lean provider completion path. Provider-specific levels such as `adaptive` and `max` are mapped to the nearest portable simple-completion level.
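
The `--file` repetition described above can be wrapped in a small shell function. This is a minimal sketch, not part of OpenClaw: `model_run_with_images` is a hypothetical helper name, and it relies only on the `model run` flags shown on this page.

```bash
# Hypothetical helper (not part of OpenClaw): run one prompt against several
# images by repeating --file once per path.
# Usage: model_run_with_images <prompt> <provider/model> [image...]
model_run_with_images() {
  if [ "$#" -lt 2 ]; then
    echo "usage: model_run_with_images <prompt> <provider/model> [image...]" >&2
    return 2
  fi
  local prompt="$1" model="$2" file
  shift 2
  # Rewrite the positional parameters from "a b" to "--file a --file b".
  for file in "$@"; do
    set -- "$@" --file "$file"
    shift
  done
  openclaw infer model run --prompt "$prompt" --model "$model" "$@" --json
}
```

For example, `model_run_with_images "Compare these screenshots" google/gemini-2.5-flash ./before.png ./after.png` expands to a single `model run` call with two `--file` flags. The selected model still has to support image input.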

## Model

Use `model` for provider-backed text inference and model/provider inspection.

```bash
openclaw infer model run --prompt "Reply with exactly: smoke-ok" --json
openclaw infer model run --prompt "Summarize this changelog entry" --model openai/gpt-5.4 --json
openclaw infer model run --prompt "Describe this image in one sentence" --file ./photo.jpg --model google/gemini-2.5-flash --json
openclaw infer model run --prompt "Use more reasoning here" --thinking high --json
openclaw infer model providers --json
openclaw infer model inspect --name gpt-5.5 --json
```

To smoke-test a specific provider without starting the Gateway or loading the full agent tool surface, use a full `<provider/model>` ref:

```bash
openclaw infer model run --local --model anthropic/claude-sonnet-4-6 --prompt "Reply with exactly: pong" --json
openclaw infer model run --local --model cerebras/zai-glm-4.7 --prompt "Reply with exactly: pong" --json
openclaw infer model run --local --model google/gemini-2.5-flash --prompt "Reply with exactly: pong" --json
openclaw infer model run --local --model groq/llama-3.1-8b-instant --prompt "Reply with exactly: pong" --json
openclaw infer model run --local --model mistral/mistral-medium-3-5 --prompt "Reply with exactly: pong" --json
openclaw infer model run --local --model mistral/mistral-small-latest --prompt "Reply with exactly: pong" --json
openclaw infer model run --local --model openai/gpt-4.1 --prompt "Reply with exactly: pong" --json
openclaw infer model run --local --model ollama/qwen2.5vl:7b --prompt "Describe this image." --file ./photo.jpg --json
```

Notes:

- Local `model run` is the narrowest CLI smoke test for provider/model/auth health: for non-Codex providers, it sends only the supplied prompt to the selected model.
- Local `model run --model <provider/model>` can use the exact bundled static catalog row from `models list --all` even before that provider has been written to config. Provider auth is still required. Without credentials, the run fails with an auth error rather than `Unknown model`.
- For Mistral Medium 3.5 reasoning probes, leave temperature unset/default. Mistral rejects `reasoning_effort="high"` combined with `temperature: 0`. Use `mistral/mistral-medium-3-5` with the default temperature or a non-zero reasoning-mode value such as `0.7`.
- `openai-codex/*` local probes are a narrow exception. OpenClaw adds a minimal system instruction so that the Codex Responses transport can set its required `instructions` field, without adding full agent context, tools, memory, or a session transcript.
- Local `model run --file` keeps that lean path and attaches the image content directly to a single user message. Common image files such as PNG, JPEG, and WebP work when their MIME type is detected as `image/*`. Unsupported or unrecognized files fail before the provider is called.
- `model run --file` is best when you want to test the selected multimodal text model directly. Use `infer image describe` when you want OpenClaw's image-understanding provider selection and default image-model routing.
- The selected model must support image input. Text-only models may reject the request at the provider layer.
- `model run --prompt` must contain non-whitespace text. Empty prompts are rejected before a local provider or the Gateway is called.
- Local `model run` exits non-zero when the provider returns no text output, so unreachable local providers and empty completions do not look like successful probes.
- Use `model run --gateway` when you need to test Gateway routing, agent-runtime setup, or Gateway-managed provider state while keeping the model input raw. Use `openclaw agent` or a chat surface when you need full agent context, tools, memory, and a session transcript.
- `model auth login`, `model auth logout`, and `model auth status` manage saved provider auth state.
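
Because local `model run` exits non-zero when a probe fails or returns no text, the one-line smoke commands above can be looped over in a script. The following is a sketch under that assumption; `smoke_models` is a hypothetical helper, not an OpenClaw command, and it discards the JSON output and looks only at the exit status.

```bash
# Hypothetical helper (not part of OpenClaw): run the "pong" probe against
# each <provider/model> ref and report which ones exit non-zero.
smoke_models() {
  local ref failed=0
  for ref in "$@"; do
    if openclaw infer model run --local --model "$ref" \
        --prompt "Reply with exactly: pong" --json >/dev/null 2>&1; then
      echo "ok   $ref"
    else
      echo "FAIL $ref"
      failed=$((failed + 1))
    fi
  done
  # Return non-zero if any probe failed, so CI can gate on the result.
  [ "$failed" -eq 0 ]
}
```

A call such as `smoke_models openai/gpt-4.1 google/gemini-2.5-flash` prints one line per ref. Each provider still needs valid auth for its probe to pass.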

## Image

Use `image` for generation, editing, and description.

```bash
openclaw infer image generate --prompt "friendly lobster illustration" --json
openclaw infer image generate --prompt "cinematic product photo of headphones" --json
openclaw infer image generate --model openai/gpt-image-1.5 --output-format png --background transparent --prompt "simple red circle sticker on a transparent background" --json
openclaw infer image generate --prompt "slow image backend" --timeout-ms 180000 --json
openclaw infer image edit --file ./logo.png --model openai/gpt-image-1.5 --output-format png --background transparent --prompt "keep the logo, remove the background" --json
openclaw infer image edit --file ./poster.png --prompt "make this a vertical story ad" --size 2160x3840 --aspect-ratio 9:16 --resolution 4K --json
openclaw infer image describe --file ./photo.jpg --json
openclaw infer image describe --file ./receipt.jpg --prompt "Extract the merchant, date, and total" --json
openclaw infer image describe-many --file ./before.png --file ./after.png --prompt "Compare the screenshots and list visible UI changes" --json
openclaw infer image describe --file ./ui-screenshot.png --model openai/gpt-4.1-mini --json
openclaw infer image describe --file ./photo.jpg --model ollama/qwen2.5vl:7b --prompt "Describe the image in one sentence" --timeout-ms 300000 --json
```

Notes:

- Use `image edit` when starting from an existing input file.
- Use `--size`, `--aspect-ratio`, or `--resolution` with `image edit` for
  providers/models that support geometry hints on reference-image edits.
- For transparent-background OpenAI PNG output, use
  `--output-format png --background transparent` together with `--model openai/gpt-image-1.5`.
  `--openai-background` remains available as an OpenAI-specific alias. Providers that do not declare
  background support report the hint as an ignored override.
- Use `image providers --json` to see which bundled image providers are discoverable, configured, and selected, and which generate/edit capabilities each provider exposes.
- Use `image generate --model <provider/model> --json` as the narrowest live
  CLI smoke test for image-generation changes. Example:

  ```bash
  openclaw infer image providers --json
  openclaw infer image generate \
    --model google/gemini-3.1-flash-image-preview \
    --prompt "Minimal flat test image: one blue square on a white background, no text." \
    --output ./openclaw-infer-image-smoke.png \
    --json
  ```

  The JSON response reports `ok`, `provider`, `model`, `attempts`, and the written
  output paths. When `--output` is set, the final extension may follow the
  MIME type returned by the provider.

- For `image describe` and `image describe-many`, use `--prompt` to give the vision model task-specific instructions such as OCR, comparison, UI inspection, or concise captioning.
- Use `--timeout-ms` for slow local vision models or a cold-start Ollama.
- For `image describe`, `--model` must be an image-capable `<provider/model>`.
- For local Ollama vision models, pull the model first and set `OLLAMA_API_KEY` to any placeholder value, such as `ollama-local`. See [Ollama](/ja-JP/providers/ollama#vision-and-image-description).
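
The Ollama notes above can be combined into one wrapper. This is a sketch, assuming the model has already been pulled (for example with `ollama pull qwen2.5vl:7b`); `describe_with_ollama` is a hypothetical helper name, and the 300000 ms timeout mirrors the example earlier in this section rather than a recommended value.

```bash
# Hypothetical helper (not part of OpenClaw): describe an image with a local
# Ollama vision model. Assumes the model was pulled beforehand.
# Usage: describe_with_ollama <image> [ollama-model]
describe_with_ollama() {
  local file="$1" model="${2:-qwen2.5vl:7b}"
  # Any placeholder value is enough; keep an existing value if one is set.
  export OLLAMA_API_KEY="${OLLAMA_API_KEY:-ollama-local}"
  # A generous timeout covers slow local models and cold starts.
  openclaw infer image describe --file "$file" \
    --model "ollama/$model" --timeout-ms 300000 --json
}
```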

## Audio

Use `audio` for file transcription.

```bash
openclaw infer audio transcribe --file ./memo.m4a --json
openclaw infer audio transcribe --file ./team-sync.m4a --language en --prompt "Focus on names and action items" --json
openclaw infer audio transcribe --file ./memo.m4a --model openai/whisper-1 --json
```

Notes:

- `audio transcribe` is for file transcription, not real-time session management.
- `--model` must be `<provider/model>`.

## TTS

Use `tts` for speech synthesis and TTS provider state.

```bash
openclaw infer tts convert --text "hello from openclaw" --output ./hello.mp3 --json
openclaw infer tts convert --text "Your build is complete" --output ./build-complete.mp3 --json
openclaw infer tts providers --json
openclaw infer tts status --json
```

Notes:

- `tts status` defaults to the Gateway because it reflects Gateway-managed TTS state.
- Use `tts providers`, `tts voices`, and `tts set-provider` to inspect and configure TTS behavior.

## Video

Use `video` for generation and description.

```bash
openclaw infer video generate --prompt "cinematic sunset over the ocean" --json
openclaw infer video generate --prompt "slow drone shot over a forest lake" --resolution 768P --duration 6 --json
openclaw infer video describe --file ./clip.mp4 --json
openclaw infer video describe --file ./clip.mp4 --model openai/gpt-4.1-mini --json
```

Notes:

- `video generate` accepts `--size`, `--aspect-ratio`, `--resolution`, `--duration`, `--audio`, `--watermark`, and `--timeout-ms`, and forwards them to the video-generation runtime.
- For `video describe`, `--model` must be `<provider/model>`.

## Web

Use `web` for search and fetch workflows.

```bash
openclaw infer web search --query "OpenClaw docs" --json
openclaw infer web search --query "OpenClaw infer web providers" --json
openclaw infer web fetch --url https://docs.openclaw.ai/cli/infer --json
openclaw infer web providers --json
```

Notes:

- Use `web providers` to inspect available, configured, and selected providers.

## Embedding

Use `embedding` for vector creation and embedding-provider inspection.

```bash
openclaw infer embedding create --text "friendly lobster" --json
openclaw infer embedding create --text "customer support ticket: delayed shipment" --model openai/text-embedding-3-large --json
openclaw infer embedding providers --json
```

## JSON output

Infer commands normalize JSON output under a shared envelope:

```json
{
  "ok": true,
  "capability": "image.generate",
  "transport": "local",
  "provider": "openai",
  "model": "gpt-image-2",
  "attempts": [],
  "outputs": []
}
```

The top-level fields are stable:

- `ok`
- `capability`
- `transport`
- `provider`
- `model`
- `attempts`
- `outputs`
- `error`

For generated-media commands, `outputs` contains the files written by OpenClaw. In automation,
use the `path`, `mimeType`, `size`, and media-specific dimensions from that array
instead of parsing human-readable stdout.
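
One way to consume the envelope from a script is sketched below. It assumes `jq` is installed and that `--json` writes only the envelope to stdout. `list_output_paths` is a hypothetical helper, and because the shape of `error` is not specified here, the sketch prints it verbatim.

```bash
# Hypothetical helper (not part of OpenClaw): read one infer JSON envelope on
# stdin and print the path of each written output file, one per line.
# Assumes jq is installed.
list_output_paths() {
  local envelope
  envelope=$(cat)
  # Treat anything other than "ok": true as a failure and surface the error.
  if [ "$(printf '%s' "$envelope" | jq -r '.ok')" != "true" ]; then
    printf '%s' "$envelope" | jq -r '.error // "unknown error"' >&2
    return 1
  fi
  printf '%s' "$envelope" | jq -r '.outputs[]? | .path'
}
```

A typical call would be `openclaw infer image generate --prompt "friendly lobster illustration" --json | list_output_paths`.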

## Common pitfalls

```bash
# Bad
openclaw infer media image generate --prompt "friendly lobster"

# Good
openclaw infer image generate --prompt "friendly lobster"
```

```bash
# Bad
openclaw infer audio transcribe --file ./memo.m4a --model whisper-1 --json

# Good
openclaw infer audio transcribe --file ./memo.m4a --model openai/whisper-1 --json
```
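
A script can catch the second pitfall before it calls the CLI. The check below is a minimal sketch: `is_model_ref` is a hypothetical helper that only tests for a non-empty provider and model separated by a slash, and the CLI's own validation may be stricter.

```bash
# Hypothetical pre-flight check (not part of OpenClaw): accept only refs of
# the form <provider>/<model> with a non-empty provider and model.
is_model_ref() {
  case "$1" in
    "" | /* | */) return 1 ;;  # empty, missing provider, or missing model
    */*) return 0 ;;           # a slash between two non-empty parts
    *) return 1 ;;             # bare model name such as whisper-1
  esac
}
```

For example, `is_model_ref "$MODEL" || echo "expected <provider/model>, got: $MODEL" >&2` flags `whisper-1` but accepts `openai/whisper-1`.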

## Notes

- `openclaw capability ...` is an alias for `openclaw infer ...`.

## Related

- [CLI reference](/ja-JP/cli)
- [Models](/ja-JP/concepts/models)