**task 3: debate reflection experiments**

- [ ] total experiments:
    - setting 3.1: Select the best-performing pair from Setting 2.1, then initialize the core evaluator and the critic in one of the following two ways:
        - setting 3.1.1: Self-evaluation. Use the same LLM from the selected pair as both the core evaluator and the critic. total experiments: 1 combination x ___ datasets = ___ total runs
        - setting 3.1.2: Cross-evaluation. Assign one LLM from the selected pair as the core evaluator and the other as the critic. total experiments: 1 combination x ___ datasets = ___ total runs
    - Further settings will be defined once the results from task 3 are available.
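
The run count for settings 3.1.1 and 3.1.2 can be sketched as a small enumeration. This is only an illustration under stated assumptions: the names `Run`, `plan_runs`, `model_a`, `model_b`, `dataset_1`, and `dataset_2` are hypothetical placeholders, and the choice of which LLM in the pair acts as the evaluator (here, the first one) is not fixed by the issue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    setting: str    # "3.1.1" (self-evaluation) or "3.1.2" (cross-evaluation)
    evaluator: str  # LLM acting as the core evaluator
    critic: str     # LLM acting as the critic
    dataset: str


def plan_runs(pair, datasets):
    """Enumerate one run per (setting, dataset) for the selected pair."""
    first, second = pair
    # Assumption: the first model of the pair is the core evaluator in both settings.
    combos = [
        ("3.1.1", first, first),   # self-evaluation: same LLM in both roles
        ("3.1.2", first, second),  # cross-evaluation: the other LLM is the critic
    ]
    return [Run(s, e, c, d) for s, e, c in combos for d in datasets]


# Placeholder pair and dataset names; the real ones come from Setting 2.1.
runs = plan_runs(("model_a", "model_b"), ["dataset_1", "dataset_2"])
for run in runs:
    print(run)
```

Each setting contributes 1 combination per dataset, so the list length is twice the number of datasets passed in.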

task 3: debate reflection experiments #6
