Johannes Ackermann JohannesAck

PhD student at the University of Tokyo working on Reinforcement Learning and broader Machine Learning

53 followers · 24 following

Pinned

gradientregularization_trl Public

Implementation for our paper "Gradient Regularization prevents Reward Hacking in RLHF and RLVR". Implemented for TRL and Hugging Face Transformers.

Python 12
OffPolicyCorrectedRewardModeling Public

Implementation for our COLM paper "Off-Policy Corrected Reward Modeling for RLHF"

Python 8
tf2multiagentrl Public

Clean implementation of Multi-Agent Reinforcement Learning methods (MADDPG, MATD3, MASAC, MAD4PG) in TensorFlow 2.x

Python 172 33
OfflineRLStructuredNonstationarity Public

Implementation for RLC paper "Offline Reinforcement Learning from Datasets with Structured Non-Stationarity".

Python 7