All functions

EvalReporters EvalProgressReporter EvalCompactProgressReporter
Test reporters for LLM evaluation
evaluate() evaluate_active_file()
Evaluate LLM performance
expect_r_code()
Check if input is syntactically valid R code
grade_output()
Grade model outputs
grade_queue()
Grade an evaluation data frame
input() output()
Flag model inputs and outputs
judges()
Define judge models
results_read()
Interface with eval results