weave / Evaluation

Sets up an evaluation, which includes a set of scorers and a dataset. Calling evaluation.evaluate({ model }) passes rows from the dataset into the model, matching the names of the dataset's columns to the argument names in model.predict. It then calls all of the scorers and saves the results in weave.

Example

import * as weave from 'weave';

// Collect your examples into a dataset
const dataset = new weave.Dataset({
  id: 'my-dataset',
  rows: [
    { question: 'What is the capital of France?', expected: 'Paris' },
    { question: 'Who wrote "To Kill a Mockingbird"?', expected: 'Harper Lee' },
    { question: 'What is the square root of 64?', expected: '8' },
  ],
});

// Define any custom scoring function
const scoringFunction = weave.op(function isEqual({ modelOutput, datasetRow }) {
  // Strict equality avoids accidental type coercion between output and label
  return modelOutput === datasetRow.expected;
});

// Define the function to evaluate
const model = weave.op(async function alwaysParisModel({ question }) {
  return 'Paris';
});

// Start evaluating
const evaluation = new weave.Evaluation({
  id: 'my-evaluation',
  dataset: dataset,
  scorers: [scoringFunction],
});

const results = await evaluation.evaluate({ model });

Type parameters

| Name | Type |
| :--- | :--- |
| R | extends DatasetRow |
| E | extends DatasetRow |
| M | M |

Hierarchy

WeaveObject

  ↳ Evaluation

Constructors

constructor

new Evaluation<R, E, M>(parameters): Evaluation<R, E, M>

Type parameters

| Name | Type |
| :--- | :--- |
| R | extends DatasetRow |
| E | extends DatasetRow |
| M | M |

Parameters

| Name | Type |
| :--- | :--- |
| parameters | EvaluationParameters<R, E, M> |

Returns

Evaluation<R, E, M>

Overrides

WeaveObject.constructor

Defined in

evaluation.ts:148

Properties

__savedRef

Optional __savedRef: ObjectRef | Promise<ObjectRef>

Inherited from

WeaveObject.__savedRef

Defined in

weaveObject.ts:49

Accessors

description

get description(): undefined | string

Returns

undefined | string

Inherited from

WeaveObject.description

Defined in

weaveObject.ts:76

name

get name(): string

Returns

string

Inherited from

WeaveObject.name

Defined in

weaveObject.ts:72

Methods

evaluate

evaluate(«destructured»): Promise<Record<string, any>>

Parameters

| Name | Type | Default value |
| :--- | :--- | :--- |
| «destructured» | Object | undefined |
| › maxConcurrency? | number | 5 |
| › model | WeaveCallable<(...args: [{ datasetRow: R }]) => Promise<M>> | undefined |
| › nTrials? | number | 1 |

Returns

Promise<Record<string, any>>

Defined in

evaluation.ts:163
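
The parameter table above gives only the names and defaults of maxConcurrency and nTrials, so the following is a rough, dependency-free sketch of what such options commonly mean: every dataset row is repeated nTrials times, and at most maxConcurrency model calls are in flight at once. The helpers mapWithConcurrency, expandTrials, and demo are hypothetical names introduced for illustration; they are not part of weave and may differ from how evaluate is actually implemented.

```typescript
// Hypothetical helper (not a weave API): run `task` over `items`
// with at most `limit` calls in flight at any time.
async function mapWithConcurrency<T, U>(
  items: T[],
  limit: number,
  task: (item: T) => Promise<U>,
): Promise<U[]> {
  const results: U[] = new Array(items.length);
  let next = 0;

  // Each worker repeatedly claims the next unprocessed index.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await task(items[index]);
    }
  }

  const workerCount = Math.max(1, Math.min(limit, items.length));
  const workers: Promise<void>[] = [];
  for (let i = 0; i < workerCount; i++) {
    workers.push(worker());
  }
  await Promise.all(workers);
  return results;
}

// Assumed semantics of nTrials: the whole dataset is repeated nTrials times.
function expandTrials<R>(rows: R[], nTrials: number): R[] {
  const expanded: R[] = [];
  for (let trial = 0; trial < nTrials; trial++) {
    for (const row of rows) {
      expanded.push(row);
    }
  }
  return expanded;
}

// Small demonstration: 3 rows, 2 trials, concurrency limit of 2.
// `peak` records the highest number of simultaneous task calls observed.
async function demo(): Promise<{ outputs: string[]; peak: number }> {
  const rows = ['a', 'b', 'c'];
  let inFlight = 0;
  let peak = 0;
  const outputs = await mapWithConcurrency(expandTrials(rows, 2), 2, async (row) => {
    inFlight++;
    peak = Math.max(peak, inFlight);
    await Promise.resolve(); // stand-in for an asynchronous model call
    inFlight--;
    return row.toUpperCase();
  });
  return { outputs, peak };
}
```

Results are stored by index, so the output order matches the input order even though calls may finish out of order. With a real model, a lower limit is mainly useful for staying under provider rate limits.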

predictAndScore

predictAndScore(«destructured»): Promise<{ model_latency: number; model_output: any; model_success: boolean; scores: { [key: string]: any } }>

Parameters

| Name | Type |
| :--- | :--- |
| «destructured» | Object |
| › columnMapping? | ColumnMapping<R, E> |
| › example | R |
| › model | WeaveCallable<(...args: [{ datasetRow: E }]) => Promise<M>> |

Returns

Promise<{ model_latency: number; model_output: any; model_success: boolean; scores: { [key: string]: any } }>

Defined in

evaluation.ts:231
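
To make the documented return shape more concrete, here is a minimal, dependency-free sketch of a function that produces an object with the same four fields. It is an illustration under assumptions, not weave's implementation: predictAndScoreSketch, Row, and Scorer are hypothetical names, the latency unit (milliseconds here), the use of scorer names as keys of scores, and the choice to skip scorers when the model throws are all guesses. Column mapping is omitted.

```typescript
type Row = Record<string, unknown>;

// Mirrors the field names in the documented return type of predictAndScore.
interface PredictAndScoreResult {
  model_success: boolean;
  model_output: unknown;
  model_latency: number;
  scores: { [key: string]: unknown };
}

// Scorers receive the model output and the dataset row, as in the example
// at the top of this page.
type Scorer = (args: { modelOutput: unknown; datasetRow: Row }) => unknown;

// Hypothetical sketch (not a weave API): call the model on one example,
// time the call, then run each scorer on the output.
async function predictAndScoreSketch(
  model: (args: { datasetRow: Row }) => Promise<unknown>,
  example: Row,
  scorers: Record<string, Scorer>,
): Promise<PredictAndScoreResult> {
  const start = Date.now();
  let modelOutput: unknown = null;
  let modelError = false;
  try {
    modelOutput = await model({ datasetRow: example });
  } catch (err) {
    // Assumption: a throwing model is recorded as a failure, not rethrown.
    modelError = true;
  }
  const modelLatency = Date.now() - start; // milliseconds (assumed unit)

  const scores: { [key: string]: unknown } = {};
  if (!modelError) {
    for (const name of Object.keys(scorers)) {
      // Awaiting lets scorers be either synchronous or asynchronous.
      scores[name] = await scorers[name]({ modelOutput, datasetRow: example });
    }
  }

  return {
    model_success: !modelError,
    model_output: modelOutput,
    model_latency: modelLatency,
    scores,
  };
}
```

The model here takes { datasetRow }, following the signature in the parameter table above.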

saveAttrs

saveAttrs(): Object

Returns

Object

Inherited from

WeaveObject.saveAttrs

Defined in

weaveObject.ts:53