New video: Are you using the MNIST dataset to compare algorithms?

Presenter: David Issa Mattos (PhD student, Chalmers)

One of the most common tasks when developing a new tool is to benchmark it against competing tools. Both researchers and practitioners often look at benchmark results before selecting a tool, yet the quality of the benchmark greatly influences those results. In this presentation, we discuss how to evaluate whether your benchmark has an appropriate difficulty level and can differentiate between the competing tools. We draw an analogy with educational assessment and illustrate the evaluation of benchmarks with two cases: one on datasets for automated labeling algorithms and the other on optimization algorithms. In both cases, the benchmarks are either too easy or too difficult and therefore cannot differentiate the tools.
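
To give a rough feel for the "too easy or too difficult" problem the talk addresses, here is a minimal sketch in Python. It is not taken from the presentation; the pass rates, thresholds, and simulated data are all hypothetical. It checks whether paired pass/fail results of two tools on a shared benchmark show a ceiling or floor effect, the two situations in which the benchmark cannot tell the tools apart:

    # Hypothetical sketch: flag ceiling/floor effects in a benchmark.
    # All data below is simulated placeholder data, not real results.
    import random

    random.seed(0)
    n_items = 200
    results_a = [random.random() < 0.97 for _ in range(n_items)]  # tool A: ~97% pass
    results_b = [random.random() < 0.90 for _ in range(n_items)]  # tool B: ~90% pass

    pass_a, pass_b = sum(results_a), sum(results_b)
    print(f"pass rate A: {pass_a / n_items:.2f}, pass rate B: {pass_b / n_items:.2f}")

    if min(pass_a, pass_b) > 0.95 * n_items:
        # Ceiling effect: both tools pass almost every item.
        print("Benchmark looks too easy to differentiate the tools.")
    elif max(pass_a, pass_b) < 0.05 * n_items:
        # Floor effect: both tools fail almost every item.
        print("Benchmark looks too hard to differentiate the tools.")
    else:
        # Items where exactly one tool passes carry the discriminating
        # signal, e.g. as input to a McNemar test on the paired outcomes.
        a_only = sum(a and not b for a, b in zip(results_a, results_b))
        b_only = sum(b and not a for a, b in zip(results_a, results_b))
        print(f"discordant items: A-only pass = {a_only}, B-only pass = {b_only}")

The presentation itself goes further than this, using the education analogy to assess benchmark difficulty; the sketch only illustrates why a benchmark at the ceiling or the floor carries no discriminating information.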

Link to the recorded presentation on YouTube:


Make sure you and your colleagues are registered on the SC_BB mailing list. You can do so at the following link: