They perform several model runs (an ensemble) with slightly different configurations and then compare the predictions to a reference period, currently 1991-2020. When they say "above average", most model runs fell in the upper tercile of the reference period, i.e. above the value that separates the warmest (or wettest) third of those years from the rest. The stated probability is the fraction of runs that landed in that tercile.
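To make the arithmetic concrete, here is a minimal sketch of how such a tercile probability could be computed. It is not the forecasting centre's actual procedure: the function name `tercile_probabilities`, the 30 reference values, and the 10 ensemble values are all made up for illustration, and real systems typically add bias correction and calibration on top of a simple count.

```python
from statistics import quantiles


def tercile_probabilities(reference, ensemble):
    """Fraction of ensemble members below, within, and above the reference terciles.

    Hypothetical helper for illustration only.
    """
    # Two cut points that split the reference period into three equal thirds.
    lower, upper = quantiles(reference, n=3)
    n = len(ensemble)
    below = sum(1 for x in ensemble if x < lower)
    above = sum(1 for x in ensemble if x > upper)
    near = n - below - above
    return {"below": below / n, "near": near / n, "above": above / n}


# Made-up reference period: 30 seasonal mean temperatures (deg C), one per year.
reference = [10.0 + 0.1 * i for i in range(30)]

# Made-up ensemble: 10 model runs for the coming season.
ensemble = [11.2, 12.1, 12.4, 12.0, 11.8, 12.6, 12.3, 10.7, 12.2, 12.5]

probs = tercile_probabilities(reference, ensemble)
print(probs)
```

With this invented data, seven of the ten runs fall above the upper tercile boundary, so the forecast would read "above average" with a probability of about 70%.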