Support In-Context-Learning (ICL) in agents #1064

@qingyun-wu

Scenario: Interactive example selection. More specifically, the AssistantAgent can ask for examples at any time during its interaction with the UserProxyAgent.

### Tasks
- [x] Review PR https://github.com/microsoft/FLAML/pull/1056
- [x] Write user code example for the above scenario
- [x] Enable example selection (could start with a vanilla example selection method, e.g., random) in `UserProxyAgent`.
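
A minimal sketch of the vanilla (random) selection mentioned above, written in plain Python rather than against the actual agent classes. The trigger string `REQUEST_EXAMPLES`, the `ExamplePool` class, and the `user_proxy_reply` function are hypothetical names for illustration only; how the real `UserProxyAgent` would detect a request is not specified in this issue.

```python
import random

# Hypothetical trigger the assistant would emit to ask for examples (assumption).
REQUEST_TRIGGER = "REQUEST_EXAMPLES"


class ExamplePool:
    """Holds (question, answer) pairs and hands out random, unseen ones."""

    def __init__(self, examples, seed=None):
        self._examples = list(examples)
        self._rng = random.Random(seed)
        self._served = set()  # indices already returned

    def select(self, k=2):
        # Sample only from examples that have not been served yet.
        remaining = [i for i in range(len(self._examples)) if i not in self._served]
        chosen = self._rng.sample(remaining, min(k, len(remaining)))
        self._served.update(chosen)
        return [self._examples[i] for i in chosen]


def user_proxy_reply(assistant_message, pool, k=2):
    """Return formatted examples if the assistant asked for them, else None."""
    if REQUEST_TRIGGER not in assistant_message:
        return None
    examples = pool.select(k)
    if not examples:
        return "No more examples are available."
    return "\n\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
```

Because the pool tracks which examples were already served, the assistant can issue the trigger repeatedly during a conversation and receive fresh examples each time until the pool is exhausted.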
### Tasks: June 10-13
- [x] Conduct manual initial experiments to verify whether the LLM can request new examples as planned. The hypothesis is that the LLM can use triggers and prompts to evaluate and request examples on its own. We need to determine whether the LLM possesses this capability.
- [ ] Not urgent, but keep in mind: consider whether to add ICL as a "skill" of the AssistantAgent, aligning it with the UserProxyAgent's skills. One possible solution is to include a prompt in the system message when the UserProxyAgent is detected to be working on ICL.
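
One way the "prompt in the system message" idea could look, as a rough sketch. The prompt wording, the `REQUEST_EXAMPLES` trigger, and the `build_system_message` helper are all assumptions made for illustration; none of them come from the existing agent code.

```python
# Hypothetical base prompt and ICL add-on (wording is an assumption).
BASE_SYSTEM_MESSAGE = "You are a helpful AI assistant."
ICL_SKILL_PROMPT = (
    "If worked examples would help you solve the task, reply with "
    "REQUEST_EXAMPLES and briefly describe the kind of examples you need."
)


def build_system_message(base, user_proxy_supports_icl):
    """Append the ICL instructions only when the user proxy can serve examples."""
    if not user_proxy_supports_icl:
        return base
    return f"{base}\n\n{ICL_SKILL_PROMPT}"
```

Keeping the ICL instructions in a separate string means the assistant's default prompt is unchanged whenever the user proxy is not working on ICL.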
### Tasks: June 14-20
- [x] Prompt engineering: find the prompt that best elicits from the LLM which kinds of exemplars are most useful
- [x] Find more datasets @tong
- [x] Implement the first algorithm (LLM information + coverage), and have some initial results
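
The issue does not spell out the "LLM information + coverage" algorithm, so the following is only one possible reading: the LLM names the topics it wants exemplars for, and a greedy set-cover pass picks exemplars that cover as many of those topics as possible. The function name, the tag-based representation, and the greedy strategy are all assumptions, not the implemented method.

```python
def greedy_coverage_select(requested, candidates, k):
    """Greedily pick up to k exemplars covering the requested topics.

    requested:  iterable of topic tags the LLM asked for (assumed input format).
    candidates: dict mapping exemplar name -> iterable of topic tags.
    Returns (selected names in pick order, set of topics left uncovered).
    """
    uncovered = set(requested)
    available = {name: set(tags) for name, tags in candidates.items()}
    selected = []
    while uncovered and available and len(selected) < k:
        # Choose the exemplar that covers the most still-uncovered topics.
        best = max(available, key=lambda name: len(available[name] & uncovered))
        gain = available[best] & uncovered
        if not gain:
            break  # no remaining exemplar adds coverage
        selected.append(best)
        uncovered -= gain
        del available[best]
    return selected, uncovered
```

The returned set of uncovered topics makes it easy to report back to the assistant which requested topics have no matching exemplar.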
### Tasks: June 20-27
- [x] Study the paper https://arxiv.org/abs/2304.11406 and figure out whether we can use the personalized datasets
- [x] @littlelittlecloud Check ICL on the Math dataset without voting
- [ ] @tongwu2020 Think about methods for improving performance on the Math dataset
- [x] Manually check the ICL results of the Math dataset
- [x] Ask for the API Bank dataset
### Tasks: June 27
- [ ] @littlelittlecloud @jtongxin @tongwu2020 Design the experiments for Math
- [ ] @jtongxin Study the performance of GPT-4/3.5 on the LaMP dataset


Labels: enhancement
