by The Bayesian Observer

One of the problems for practitioners (as well as researchers) of machine learning is that there is no well-defined ‘frontier’ of machine learning algorithms that perform ‘best’ in a given problem domain. There are some guidelines as to what is likely to work well in a given setting, but nothing is guaranteed, and other approaches often outperform the well-known ones. In fact, a large number of papers at machine learning conferences seem to be based on this kind of surprise (‘look, I tried an autoencoder on this problem and it outperforms the algorithms people have been using for decades!’). It is hard to have a universal frontier of this type because much of the performance of a machine learning algorithm depends on the structure of the data it is run on. For this reason I was very happy to come across MLcomp, a website that allows people to upload algorithms (coded in any language) and datasets, and to compare the performance of their algorithm on all the datasets in the system, as well as the performance of any of the algorithms in the system on their own dataset. This is immensely useful because it serves as an empirical frontier of the performance of machine learning algorithms on different types of datasets. MLcomp runs the algorithms on computational resources hosted on a standard Amazon EC2 Linux instance, so users can test algorithms on datasets without providing their own hardware. And it is free.

There are at least a couple of reasons why something like MLcomp should be adopted by more people (current usage seems low, given how useful I think it can be). First, the utility of such a platform is proportional to the number of algorithms and datasets hosted on it, i.e. proportional to the number of people using it and contributing to it. I feel that if a certain threshold number of people begin using it actively, then its utility as a platform will take off rapidly. Second, it can prove to be an invaluable resource for ensemble-based learning: upload your dataset, run a number of algorithms on it, download the code of the algorithms that seem to do well, and use them in an ensemble to improve the performance of your supervised classifier / regressor.
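
To make the ensemble idea concrete, here is a minimal sketch in plain Python. It does not use MLcomp's interface at all: the `models` dictionary is a hypothetical stand-in for algorithms you might have downloaded, the toy validation data is made up for illustration, and majority voting is just one simple way to combine classifiers. The sketch ranks the candidates by validation accuracy, keeps the best few, and lets them vote.

```python
from collections import Counter


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def select_top_models(models, X_val, y_val, k=3):
    """Return the names of the k models with the highest validation accuracy."""
    def score(item):
        _, predict = item
        return accuracy([predict(x) for x in X_val], y_val)

    ranked = sorted(models.items(), key=score, reverse=True)
    return [name for name, _ in ranked[:k]]


def majority_vote(models, names, x):
    """Predict the label that most of the selected models agree on."""
    votes = Counter(models[name](x) for name in names)
    return votes.most_common(1)[0][0]


# Hypothetical stand-ins for downloaded classifiers: each maps a number to 0 or 1.
models = {
    "threshold_at_0": lambda x: int(x > 0),
    "threshold_at_1_5": lambda x: int(x > 1.5),
    "threshold_at_minus_1_5": lambda x: int(x > -1.5),
    "always_one": lambda x: 1,
}

# Made-up validation set: the true label is 1 exactly when x is positive.
X_val = [-3, -2, -1, 1, 2, 3]
y_val = [0, 0, 0, 1, 1, 1]

chosen = select_top_models(models, X_val, y_val, k=3)
ensemble_predictions = [majority_vote(models, chosen, x) for x in X_val]
ensemble_accuracy = accuracy(ensemble_predictions, y_val)
```

With an odd number of binary classifiers the vote can never tie, which is why `k=3` is used here. In practice you would select the models on one held-out set and measure the ensemble on another, since choosing and evaluating on the same data gives an optimistic estimate.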