Active preference learning based on radial basis functions
This paper proposes a method for solving optimization problems in which the decision-maker cannot evaluate the objective function, but can only express a preference such as "this is better than that" between two candidate decision vectors. The algorithm described in this paper aims at reaching the global optimizer by iteratively proposing a new comparison to the decision-maker, based on actively learning a surrogate of the latent (unknown and perhaps unquantifiable) objective function from past sampled decision vectors and pairwise preferences. The surrogate is fit by means of radial basis functions, under the constraint of satisfying, if possible, the preferences expressed by the decision-maker on existing samples. The surrogate is then used to propose a new sample of the decision vector for comparison with the current best candidate, according to one of two criteria: minimize a combination of the surrogate and an inverse distance weighting function, which balances exploitation of the surrogate against exploration of the decision space, or maximize a function related to the probability that the new candidate will be preferred. Compared to active preference learning based on Bayesian optimization, we show that, within the same number of comparisons, our approach gets closer to the global optimum and is computationally lighter. MATLAB and Python implementations of the algorithms described in the paper are available at http://cse.lab.imtlucca.it/ bemporad/idwgopt.
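To make the first acquisition criterion concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the surrogate coefficients `beta` have already been fit, and it assumes specific forms that the abstract does not state: an inverse quadratic radial basis function, an arctan-based inverse distance weighting exploration term that vanishes at existing samples, and an acquisition equal to the range-normalized surrogate minus `delta` times the exploration term. All function names, parameters, and the toy data are hypothetical.

```python
import math

def dist(x, y):
    """Euclidean distance between two decision vectors (tuples)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def rbf(d, eps=1.0):
    """Inverse quadratic radial basis function (an assumed choice)."""
    return 1.0 / (1.0 + (eps * d) ** 2)

def surrogate(x, X, beta):
    """RBF surrogate: sum_k beta[k] * rbf(||x - X[k]||)."""
    return sum(b * rbf(dist(x, xk)) for b, xk in zip(beta, X))

def violated(X, beta, prefs, sigma=0.0):
    """Return the preferences the surrogate fails to reproduce.

    A pair (i, j) means X[i] was preferred to X[j], so the surrogate
    should satisfy surrogate(X[i]) <= surrogate(X[j]) - sigma.
    """
    return [(i, j) for i, j in prefs
            if surrogate(X[i], X, beta) > surrogate(X[j], X, beta) - sigma]

def idw_exploration(x, X):
    """Exploration term: 0 at existing samples, positive elsewhere."""
    d2 = [dist(x, xk) ** 2 for xk in X]
    if min(d2) == 0.0:
        return 0.0
    return (2.0 / math.pi) * math.atan(1.0 / sum(1.0 / v for v in d2))

def acquisition(x, X, beta, delta=1.0):
    """Normalized surrogate minus delta times the exploration term."""
    vals = [surrogate(xk, X, beta) for xk in X]
    spread = (max(vals) - min(vals)) or 1.0  # avoid division by zero
    return surrogate(x, X, beta) / spread - delta * idw_exploration(x, X)

def propose(X, beta, candidates, delta=1.0):
    """Pick the candidate minimizing the acquisition (grid search here)."""
    return min(candidates, key=lambda x: acquisition(x, X, beta, delta))

# Toy 1-D data (hypothetical): the sample at 0.0 was preferred to both others.
X = [(-2.0,), (0.0,), (2.0,)]
beta = [1.0, -1.0, 1.0]          # assumed to come from a prior fit
prefs = [(1, 0), (1, 2)]
candidates = [(k * 0.5,) for k in range(-6, 7)]

print("violated preferences:", violated(X, beta, prefs))
print("exploit (delta=0):", propose(X, beta, candidates, delta=0.0))
print("explore (delta=10):", propose(X, beta, candidates, delta=10.0))
```

In this sketch a small `delta` keeps the proposal near the surrogate's minimizer, while a large `delta` pushes it toward regions far from existing samples. The proposed point would then be compared by the decision-maker against the current best candidate, and the new preference added before refitting the surrogate. A grid search stands in for whatever global optimizer is actually used to minimize the acquisition.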