Towards Uniformly Superhuman Autonomy via Subdominance Minimization

Brian Ziebart, Sanjiban Choudhury, Xinyan Yan, Paul Vernaza. Proceedings of the 39th International Conference on Machine Learning, PMLR 162:27654-27670, 2022.

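The paper assumes that the demonstrations contain a variety of suboptimal solutions. It is an IRL (inverse reinforcement learning) method that aims to be "superhuman": it tries to find solutions that Pareto-dominate as many of the demonstrations as possible.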
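To make the notion of Pareto dominance concrete, here is a minimal sketch, assuming each trajectory is summarized by a vector of per-feature costs where lower is better. The function names, the `margin` parameter, and the toy numbers are hypothetical illustrations and are not taken from the paper; the hinge-style `subdominance` is only a simplified measure in the spirit of the title, and the paper's exact definition may differ.

```python
from typing import Sequence


def pareto_dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if cost vector `a` is no worse than `b` on every feature
    and strictly better on at least one (lower cost is better)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def dominated_fraction(candidate: Sequence[float],
                       demos: Sequence[Sequence[float]]) -> float:
    """Fraction of demonstrations that `candidate` Pareto-dominates."""
    return sum(pareto_dominates(candidate, d) for d in demos) / len(demos)


def subdominance(candidate: Sequence[float],
                 demo: Sequence[float],
                 margin: float = 0.0) -> float:
    """Simplified hinge-style measure (an assumption, not the paper's exact form):
    sum over features of how far `candidate` exceeds `demo`, plus a margin."""
    return sum(max(0.0, c - d + margin) for c, d in zip(candidate, demo))


# Toy data: four demonstrations with two cost features each (hypothetical).
demos = [[3.0, 2.0], [1.0, 5.0], [4.0, 4.0], [2.0, 1.0]]
candidate = [2.0, 2.0]

fraction = dominated_fraction(candidate, demos)
gaps = [subdominance(candidate, d) for d in demos]
print(fraction, gaps)
```

In this toy case the candidate dominates two of the four demonstrations, and the nonzero hinge terms show on which demonstrations it still falls short. Minimizing the total of such terms pushes a policy toward dominating more demonstrations.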

The key point is probably Fig. 3.

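① The method achieved superhuman behavior at a rate of 78% (measured as the number of demonstrations that are Pareto-dominated?). MaxEnt IRL reached 50%, and only 72% even after data cleaning.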

② Were only the features that matter for Pareto dominance selected?
