Preferences are central to decision making by both machines and humans. Representing, learning, and reasoning with preferences is an important area of study both within computer science and across the social sciences. When we give our preferences to an AI system, we expect the system to make decisions or recommendations that are consistent with those preferences, but the decisions should also adhere to certain norms, guidelines, and ethical principles. Hence, when working with preferences it is necessary to understand and compute a metric (distance) between preferences, especially if we encode both the user preferences and the ethical systems in the same formalism. In this paper we investigate the use of CP-nets as a formalism for representing orderings over actions for AI systems. We leverage a recently proposed metric for CP-nets and propose a neural network architecture, CPMetric, to learn an approximation of that metric. Using these two tools we look at how one can build a fast and flexible value alignment system. (This is an expanded version of our paper, “Metric Learning for Value Alignment”. In this version we have added the classification and regression results and significantly expanded the description of the CPMetric network.)
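As a rough illustration of what a distance between preferences can mean, the sketch below computes the Kendall tau distance between two total orderings over a small set of actions: the number of action pairs that the two orderings rank in opposite order. This is an assumption chosen purely for illustration. It is not the CP-net metric or the CPMetric network discussed in this paper, and the function name `kendall_tau_distance` and the action names are hypothetical.

```python
from itertools import combinations


def kendall_tau_distance(order_a, order_b):
    """Count the item pairs that two total orderings rank in opposite order.

    Illustrative only: this works on plain rankings (most preferred first),
    not on CP-nets.
    """
    if len(set(order_a)) != len(order_a) or len(set(order_b)) != len(order_b):
        raise ValueError("orderings must not contain duplicates")
    if set(order_a) != set(order_b):
        raise ValueError("both orderings must rank the same items")

    # Position of each item in each ordering (0 = most preferred).
    pos_a = {item: i for i, item in enumerate(order_a)}
    pos_b = {item: i for i, item in enumerate(order_b)}

    # A pair is discordant when its relative order differs between the two.
    return sum(
        1
        for x, y in combinations(order_a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


# Hypothetical example: a user's ranking of actions vs. an ethical guideline's.
user_ranking = ["walk", "bike", "drive"]
ethical_ranking = ["bike", "walk", "drive"]

distance = kendall_tau_distance(user_ranking, ethical_ranking)
```

A distance of zero means the two orderings agree on every pair, and larger values mean more disagreement. A value alignment system could, under this simplified view, compare a candidate ordering against an ethical ordering and flag it when the distance exceeds a chosen threshold.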