Towards Explainable Metaheuristic: Mining Surrogate Fitness Models for Importance of Variables
Metaheuristic search algorithms look for solutions that either maximise or minimise a set of objectives, such as cost or performance. However most real-world optimisation problems consist of nonlinear problems with complex constraints and conflicting objectives. The process by which a GA arrives at a solution remains largely unexplained to the end-user. A poorly understood solution will dent the confidence a user has in the arrived at solution. We propose that investigation of the variables that strongly influence solution quality and their relationship would be a step toward providing an explanation of the near-optimal solution presented by a metaheuristic. Through the use of four benchmark problems we use the population data generated by a Genetic Algorithm (GA) to train a surrogate model, and investigate the learning of the search space by the surrogate model. We compare what the surrogate has learned after being trained on population data generated after the first generation and contrast this with a surrogate model trained on the population data from all generations. We show that the surrogate model picks out key characteristics of the problem as it is trained on population data from each generation. Through mining the surrogate model we can build a picture of the learning process of a GA, and thus an explanation of the solution presented by the GA. The aim being to build trust and confidence in the end-user about the solution presented by the GA, and encourage adoption of the model.
READ FULL TEXT