diff --git a/ideas/2026.md b/ideas/2026.md
index f661223..05f48e0 100644
--- a/ideas/2026.md
+++ b/ideas/2026.md
@@ -300,3 +300,42 @@ Let's make more use of the Ryan(Tm) update bot!
 - add updateScripts
 - add tests to existing scripts that are failing
 - create a metric to analyze existing script pass/fail/update-time
+
+## Generative Nix: Surveying LLM Proficiency In NixOS
+
+Effort: small (90 hours)
+
+LLMs have gone from toys to industry-standard tooling but their ability to program or debug Nix hasn't really been studied; some folks have good luck with them and some don't, and so it'd be helpful to know where support for Nix in these LLMs stands.
+
+For this project, the mentee would:
+- Select a handful of commercial and open-source state-of-the-art LLMs
+- Select a handful of representative tasks using Nix, Nixpkgs, and NixOS
+- Select performance criteria
+- Conduct experiments to benchmark performance of each LLM against each task
+
+Deliverables:
+- Selection of benchmark tasks and criteria for LLM usage on Nix/NixOS/Nixpkgs
+- Blog post describing results of commercial and open-source state-of-the-art LLMs on those tests
+
+Skills required:
+- Basic experience with LLMs
+- Basic scripting knowledge
+- Access to commercial LLMs and open-source LLMs
+- Ability to design and conduct experiments
+- Ability to write and communicate clearly
+
+Skills suggested:
+- Basic knowledge of Nix
+- Basic knowledge of NixOS systems administration
+- Basic knowledge of Nixpkgs
+
+Possible Mentors:
+- [@crertel](https://github.com/crertel)
+
+Difficulty: medium
+
+References:
+- https://haskellforall.com/2026/02/my-experience-with-vibe-coding
+
+Prior efforts:
+- None, really.