**task 3: debate reflection experiments** #6

@jd-coderepos

  • Total experiments:

           - Setting 3.1: Select the best-performing pair from Setting 2.1, then initialize the core evaluator and the critic in one of the two ways below.
           - Setting 3.1.1 (self-evaluation): Use the same LLM from the selected pair as both the core evaluator and the critic. Total experiments: 1 combination x ___ datasets = ___ total runs.
           - Setting 3.1.2 (cross-evaluation): Assign one LLM from the selected pair as the core evaluator and the other as the critic. Total experiments: 1 combination x ___ datasets = ___ total runs.
           - Further settings will be defined once the results from Task 3 are available.
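
The run counts above follow from a simple enumeration: each setting contributes one evaluator/critic combination, repeated once per dataset. A minimal Python sketch of that bookkeeping is given below. The names `Run` and `plan_runs`, the model labels, and the dataset labels are hypothetical placeholders, not part of the project code. The issue does not say which LLM of the pair takes which role, so the sketch assumes the first element of the pair acts as the core evaluator in both settings.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Run:
    """One planned experiment run (hypothetical structure for illustration)."""
    setting: str
    evaluator: str
    critic: str
    dataset: str


def plan_runs(pair: Tuple[str, str], datasets: Sequence[str]) -> List[Run]:
    """Enumerate the runs for Settings 3.1.1 and 3.1.2.

    Assumption: the first LLM of the pair is the core evaluator in both
    settings; the issue leaves this choice open.
    """
    first, second = pair
    combinations = [
        ("3.1.1 self-evaluation", first, first),    # same LLM in both roles
        ("3.1.2 cross-evaluation", first, second),  # one LLM per role
    ]
    # One combination per setting, repeated for every dataset.
    return [
        Run(setting, evaluator, critic, dataset)
        for setting, evaluator, critic in combinations
        for dataset in datasets
    ]


if __name__ == "__main__":
    # Placeholder names only; the real pair comes from Setting 2.1.
    for run in plan_runs(("model_a", "model_b"), ["dataset_1", "dataset_2"]):
        print(run)
```

With this layout, each setting yields exactly as many runs as there are datasets, matching the "1 combination x number of datasets" formula in the list above.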
    
