Eval Results Query - Weights & Biases Documentation

curl --request POST \
  --url https://api.example.com/v2/{entity}/{project}/eval_results/query \
  --header 'Authorization: Basic <encoded-value>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "evaluation_call_ids": ["<string>"],
  "evaluation_run_ids": ["<string>"],
  "filters": [
    {
      "query": {
        "$expr": {
          "$and": [
            { "$literal": "<string>" }
          ]
        }
      },
      "evaluation_call_id": "<string>"
    }
  ],
  "include_predict_and_score_children": true,
  "include_raw_data_rows": false,
  "include_rows": true,
  "include_summary": false,
  "limit": 123,
  "offset": 0,
  "require_intersection": false,
  "resolve_row_refs": false,
  "sort_by": [
    {
      "field": "<string>",
      "evaluation_call_id": "<string>",
      "mode": "value"
    }
  ],
  "summary_require_intersection": true
}
'

{
  "rows": [
    {
      "row_digest": "<string>",
      "evaluations": [
        {
          "evaluation_call_id": "<string>",
          "trials": [
            {
              "predict_and_score_call_id": "<string>",
              "model_latency_seconds": 123,
              "model_output": null,
              "predict_call_id": "<string>",
              "scorer_call_ids": {},
              "scores": {},
              "total_tokens": 123
            }
          ]
        }
      ],
      "raw_data_row": null
    }
  ],
  "total_rows": 123,
  "summary": {
    "evaluations": [
      {
        "evaluation_call_id": "<string>",
        "display_name": "<string>",
        "evaluation_ref": "<string>",
        "model_ref": "<string>",
        "scorer_stats": [
          {
            "scorer_key": "<string>",
            "numeric_count": 0,
            "numeric_mean": 123,
            "pass_known_count": 0,
            "pass_rate": 123,
            "pass_signal_coverage": 123,
            "pass_true_count": 0,
            "path": "<string>",
            "trial_count": 0
          }
        ],
        "started_at": "<string>",
        "trace_id": "<string>",
        "trial_count": 0
      }
    ],
    "row_count": 0
  },
  "warnings": ["<string>"]
}
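As a rough Python equivalent of the curl request above, the following sketch builds (but does not send) the POST request using only the standard library. The helper name build_eval_results_request and the entity, project, and ID values are illustrative assumptions, not part of the API; api.example.com is the placeholder host from the example.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict

BASE_URL = "https://api.example.com"  # placeholder host from the example above


def build_eval_results_request(
    entity: str,
    project: str,
    body: Dict[str, Any],
    authorization: str,
) -> urllib.request.Request:
    """Build a POST request for /v2/{entity}/{project}/eval_results/query."""
    path = "/v2/{}/{}/eval_results/query".format(
        urllib.parse.quote(entity, safe=""),
        urllib.parse.quote(project, safe=""),
    )
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": authorization,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical usage; sending is commented out so the sketch has no side effects.
request = build_eval_results_request(
    "my-entity",
    "my-project",
    {"evaluation_call_ids": ["<evaluation-call-id>"], "include_summary": True},
    "Basic <encoded-value>",
)
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Fields left out of the body fall back to the defaults listed under Body.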

Authorizations

Authorization

string

header

required

Basic authentication header of the form Basic <encoded-value>, where <encoded-value> is the base64-encoded string username:password.
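A minimal sketch of how the header value described above could be assembled in Python. The function name basic_auth_header and the sample credentials are hypothetical; which username and password to use depends on your deployment and is not specified here.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Return 'Basic <encoded-value>' where the value is base64("username:password")."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


# Hypothetical credentials, for illustration only.
header_value = basic_auth_header("user", "pass")
```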

Path Parameters

entity

string

required

project

string

required

Body

application/json

evaluation_call_ids

string[] | null

Evaluation root call IDs to include.

evaluation_run_ids

string[] | null

Alias for the evaluation call IDs, as used by the Evaluation Runs API.

filters

EvalResultsFilter · object[] | null

Filters applied to the grouped rows. Multiple filters are combined with AND.

include_predict_and_score_children

boolean

default: true

If true (the default), fetches the child calls (predict/score) of each predict_and_score call to obtain predict_call_id, scorer_call_ids, and more accurate latency and token data. If false, these fields are derived from the predict_and_score call itself, and predict_call_id and scorer_call_ids will be null/empty.

include_raw_data_rows

boolean

default: false

If true, populates raw_data_row on each result row. Inline rows are returned as dict values; dataset reference rows are returned as ref strings unless resolve_row_refs is also true.

include_rows

boolean

default: true

If true, includes the grouped row/trial data in rows and computes total_rows for the requested row-level view.

include_summary

boolean

default: false

If true, includes aggregated scorer/evaluation summary data in summary.

limit

integer | null

Optional row-level page size, applied after grouping and intersection.

offset

integer

default: 0

Optional row-level page offset, applied after grouping and intersection.
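The limit and offset fields can be combined to page through rows. The sketch below is one way to do that, assuming that total_rows reports the number of rows available before paging (the reference does not spell this out, so the loop also stops on an empty page). The names iter_rows and post_query are hypothetical; post_query stands for any function that sends a request body to this endpoint and returns the decoded JSON response.

```python
from typing import Any, Callable, Dict, Iterator


def iter_rows(
    post_query: Callable[[Dict[str, Any]], Dict[str, Any]],
    base_body: Dict[str, Any],
    page_size: int = 100,
) -> Iterator[Dict[str, Any]]:
    """Yield result rows page by page using limit/offset."""
    offset = 0
    while True:
        body = {**base_body, "include_rows": True, "limit": page_size, "offset": offset}
        response = post_query(body)
        rows = response.get("rows", [])
        yield from rows
        offset += len(rows)
        # Stop on an empty page, or once total_rows (assumed pre-paging count) is reached.
        if not rows or offset >= response.get("total_rows", 0):
            break
```

Because paging is applied after grouping and intersection, the same filters and intersection settings should be sent with every page so that offsets refer to the same row set.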

require_intersection

boolean

default: false

If true, includes only rows that are present in all requested evaluations.
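To illustrate the intersection semantics, the following client-side sketch filters response-shaped rows so that only rows appearing in every requested evaluation remain. It is an illustration of the behavior described above, not the server's implementation; the function name rows_in_all_evaluations and the sample IDs are hypothetical.

```python
from typing import Any, Dict, Iterable, List


def rows_in_all_evaluations(
    rows: Iterable[Dict[str, Any]],
    evaluation_call_ids: Iterable[str],
) -> List[Dict[str, Any]]:
    """Keep only rows that have an entry for every requested evaluation call ID."""
    required = set(evaluation_call_ids)
    kept = []
    for row in rows:
        present = {ev["evaluation_call_id"] for ev in row.get("evaluations", [])}
        if required <= present:  # every requested evaluation covers this row
            kept.append(row)
    return kept


sample_rows = [
    {"row_digest": "a", "evaluations": [{"evaluation_call_id": "e1"}, {"evaluation_call_id": "e2"}]},
    {"row_digest": "b", "evaluations": [{"evaluation_call_id": "e1"}]},
]
shared = rows_in_all_evaluations(sample_rows, ["e1", "e2"])
```

Here only row "a" is kept, because row "b" is missing from evaluation "e2".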

resolve_row_refs

boolean

default: false

If true (requires include_raw_data_rows to be true as well), resolves dataset row reference strings to the actual row data via table lookup. If false, dataset row refs are returned as-is.
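Since resolve_row_refs depends on include_raw_data_rows, a client may want to check the combination before sending a request. The helper below is a hypothetical sketch of such a check; how the server responds to the invalid combination is not stated in this reference, so the client-side error is an assumption made for convenience.

```python
from typing import Dict


def raw_row_options(
    include_raw_data_rows: bool = False,
    resolve_row_refs: bool = False,
) -> Dict[str, bool]:
    """Return the two raw-row flags, rejecting resolve_row_refs without include_raw_data_rows."""
    if resolve_row_refs and not include_raw_data_rows:
        raise ValueError("resolve_row_refs requires include_raw_data_rows to be true")
    return {
        "include_raw_data_rows": include_raw_data_rows,
        "resolve_row_refs": resolve_row_refs,
    }


# Ask for raw rows with dataset refs resolved to actual row data.
options = raw_row_options(include_raw_data_rows=True, resolve_row_refs=True)
```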

sort_by

EvalResultsSortBy · object[] | null

Sort specification for the result rows. Supported field prefixes are "scores.", "inputs.", and "outputs.". If null, rows are sorted by row_digest in ascending order.
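A small sketch of building one sort_by entry while checking the field prefix on the client. The helper name sort_spec and the field "scores.accuracy" are hypothetical; the "value" mode is taken from the example request, and other modes are not covered here.

```python
from typing import Dict, Optional

SUPPORTED_PREFIXES = ("scores.", "inputs.", "outputs.")


def sort_spec(
    field: str,
    evaluation_call_id: Optional[str] = None,
    mode: str = "value",
) -> Dict[str, str]:
    """Build one sort_by entry, rejecting fields without a supported prefix."""
    if not field.startswith(SUPPORTED_PREFIXES):
        raise ValueError(f"field must start with one of {SUPPORTED_PREFIXES}: {field!r}")
    spec = {"field": field, "mode": mode}
    if evaluation_call_id is not None:
        spec["evaluation_call_id"] = evaluation_call_id
    return spec


# Hypothetical scorer field; real field names depend on your scorers.
sort_by = [sort_spec("scores.accuracy", evaluation_call_id="<evaluation-call-id>")]
```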

summary_require_intersection

boolean | null

Optional intersection behavior for the summary section. If null, the value of require_intersection is used.
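The fallback rule can be written out as a one-line function. This is only a restatement of the description above in Python, with a hypothetical function name.

```python
from typing import Optional


def effective_summary_intersection(
    require_intersection: bool = False,
    summary_require_intersection: Optional[bool] = None,
) -> bool:
    """Return the intersection setting that applies to the summary section."""
    if summary_require_intersection is None:
        return require_intersection  # null falls back to the row-level setting
    return summary_require_intersection
```

This makes it possible, for example, to list all rows (require_intersection false) while computing summary statistics only over rows shared by every evaluation (summary_require_intersection true).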

Response

Successful Response

rows

EvalResultsRow · object[]

required

total_rows

integer

required

summary

EvalResultsSummaryRes · object

warnings

string[]

Non-fatal warnings (for example, a failure to resolve dataset row refs).
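To show how the nested response shape (rows, then evaluations, then trials) might be consumed, here is a minimal sketch that flattens a decoded response into one record per trial and collects any warnings. The function name flatten_trials and the sample values are hypothetical; the field names follow the example response at the top of this page.

```python
from typing import Any, Dict, List, Tuple


def flatten_trials(response: Dict[str, Any]) -> Tuple[List[Dict[str, Any]], List[str]]:
    """Return (one record per trial, warnings) from a decoded query response."""
    records = []
    for row in response.get("rows", []):
        for evaluation in row.get("evaluations", []):
            for trial in evaluation.get("trials", []):
                records.append(
                    {
                        "row_digest": row["row_digest"],
                        "evaluation_call_id": evaluation["evaluation_call_id"],
                        "scores": trial.get("scores", {}),
                        "model_latency_seconds": trial.get("model_latency_seconds"),
                        "total_tokens": trial.get("total_tokens"),
                    }
                )
    # summary may be absent when include_summary is false; warnings may be absent too.
    return records, list(response.get("warnings") or [])


# Hand-written sample in the documented shape, for illustration only.
sample_response = {
    "rows": [
        {
            "row_digest": "abc",
            "evaluations": [
                {
                    "evaluation_call_id": "e1",
                    "trials": [
                        {"scores": {"accuracy": 1.0}, "model_latency_seconds": 0.5, "total_tokens": 42}
                    ],
                }
            ],
            "raw_data_row": None,
        }
    ],
    "total_rows": 1,
    "warnings": ["example warning"],
}
trial_records, response_warnings = flatten_trials(sample_response)
```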