Stop Bashing ML Hackathons Already, Because They Are Not Close To Real-World – Analytics India Magazine

For years, people have compared machine learning and data science hackathons with real-world projects. Yet the debate never ends, and it is often ambiguous.

Take online hackathon platforms like Kaggle or MachineHack, for instance. These platforms allow users to find and publish data sets, explore and build models in a web-based data science environment, collaborate with other data scientists and machine learning engineers, and enter competitions to solve data science and machine learning challenges at every experience level, from beginner to intermediate and expert.

Hackathon platforms have been serving as a test-bed for data scientists and machine learning professionals. As per Kaggle, more than 55 per cent of data scientists have less than three years of experience, and six per cent have been using machine learning for more than a decade.

Participating in hackathons brings far more gains than losses. Some of the benefits include:

In this article, we will talk about the differences between hackathon platforms and real-world machine learning projects and draw a clear distinction between the two.

Before we delve into the differences between hackathons and real-world machine learning projects, let's look at the lifecycle of a machine learning project. As explained by Steve Nouri, founder, AI4Diversity, it typically involves:

Many industry experts believe that hackathon platforms are an excellent way to experiment and learn. Still, they align with only a single stage of the ML lifecycle, i.e., training the model. When a data scientist builds a model in the real world and optimises a metric, they also need to consider the RoI, the cost of inference and re-training, and costs in general. That piece of the puzzle is completely missing on hackathon platforms.

To drive the adoption of an ML model among business stakeholders, it is important to think about interpretability as well, said Sushanth Dasari, data scientist at Trust. He added that interpretability drives many key decisions at each step of the lifecycle, which is never the case in a hackathon.

In real-world ML projects, 90 per cent of the time is spent on acquiring, cleaning and processing the data, often querying different databases and merging this data. The quality of the input data needs to be carefully assessed and checked for correctness, integrity, and consistency, said Daniele Gadler, data scientist at ONE LOGIC GmbH.
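The kind of checks Gadler mentions can be sketched in a few lines. This is only a minimal illustration, not a method described in the article: the two record sets, their field names and the four checks are hypothetical stand-ins for data pulled from different databases before merging.

```python
# Hypothetical rows standing in for data queried from two different databases.
customers = [
    {"customer_id": 1, "country": "IN"},
    {"customer_id": 2, "country": None},   # missing value (correctness)
    {"customer_id": 2, "country": "DE"},   # duplicate key (consistency)
]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 250.0},
    {"order_id": 11, "customer_id": 3, "amount": 99.0},   # no matching customer (integrity)
    {"order_id": 12, "customer_id": 2, "amount": -5.0},   # implausible amount (correctness)
]


def check_quality(customers, orders):
    """Return a dict mapping each issue name to the offending ids."""
    issues = {
        "missing_country": [],
        "duplicate_customer": [],
        "orphan_order": [],
        "negative_amount": [],
    }
    seen_ids = set()
    for row in customers:
        cid = row["customer_id"]
        if cid in seen_ids:
            issues["duplicate_customer"].append(cid)
        seen_ids.add(cid)
        if row["country"] is None:
            issues["missing_country"].append(cid)
    for row in orders:
        # An order must point at a customer that actually exists.
        if row["customer_id"] not in seen_ids:
            issues["orphan_order"].append(row["order_id"])
        if row["amount"] < 0:
            issues["negative_amount"].append(row["order_id"])
    return issues


report = check_quality(customers, orders)
for issue, ids in report.items():
    print(issue, ids)
```

In a competition, this step has usually been done by the organisers before the data set is published.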

Further, he said that once the ML model has been developed and deployed, a lot of time goes into monitoring it and re-training it on newly ingested data (MLOps). In hackathons, by contrast, the data is already provided and is generally cleaner than in real-world projects. Furthermore, there are no concerns about real-world issues such as model stability, maintainability and deployability. You can just focus on developing a huge, super-complex, unmaintainable model with the goal of obtaining the best performance on the data provided for the competition, hoping it will generalise to new, unseen data, said Gadler.

Joseph Wehbe, co-founder and CEO of DAIMLAS.com, said that on hackathon platforms time is wasted improving accuracy by 0.000001, which you do not do in the real world. A hackathon focuses on only one performance metric, whereas in the real world you focus on scalability, speed, deployment and cost. You don't learn how to clean raw data. You don't learn how to understand the business problem you are trying to solve, and you don't learn deployment skills or the team skills needed to interact with leadership, he added.
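The trade-off Wehbe describes can be made concrete with a small sketch. Everything here is an assumption for illustration: the `Candidate` fields, the two models and all the numbers are hypothetical, and the selection rule (take the most accurate model that fits a latency and cost budget) is just one simple way a team might weigh speed and cost against a leaderboard-style metric.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    accuracy: float       # offline metric, the only thing a leaderboard sees
    latency_ms: float     # hypothetical time per prediction in production
    monthly_cost: float   # hypothetical serving plus re-training cost


def pick_model(candidates, max_latency_ms, max_monthly_cost):
    """Return the most accurate candidate that fits the budget, or None."""
    feasible = [
        m for m in candidates
        if m.latency_ms <= max_latency_ms and m.monthly_cost <= max_monthly_cost
    ]
    if not feasible:
        return None
    return max(feasible, key=lambda m: m.accuracy)


# Hypothetical numbers: the ensemble wins on accuracy by a small margin
# but is far slower and more expensive to run.
candidates = [
    Candidate("large_ensemble", accuracy=0.9312, latency_ms=850.0, monthly_cost=4000.0),
    Candidate("small_gbm", accuracy=0.9290, latency_ms=40.0, monthly_cost=300.0),
]

leaderboard_winner = max(candidates, key=lambda m: m.accuracy)
chosen = pick_model(candidates, max_latency_ms=100.0, max_monthly_cost=1000.0)

print("leaderboard:", leaderboard_winner.name)
print("production:", chosen.name if chosen else "none fits the budget")
```

With these made-up figures, the model that would top a leaderboard is not the one that fits the production budget.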

Hackathon platforms like Kaggle and MachineHack push users to explore new problems, and they also help them understand the science well enough to do real-world work.

The work on hackathon platforms can be as real as real-world work; only the environment is different. What a gym is for athletes, hackathon platforms are for data scientists and machine learning professionals: a great place to practise and learn.

Amit Raja Naik is a senior writer at Analytics India Magazine, where he dives deep into the latest technology innovations. He is also a professional bass player.
