The Winners Of Weekend Hackathon - Tea Story at MachineHack – Analytics India Magazine

The Weekend Hackathon Edition #2, The Last Hacker Standing: Tea Story challenge, concluded successfully on 19 August 2021. The challenge involved building a time series model that produces forecasts for 29 weeks. It drew more than 240 participants, with over 110 solutions posted on the leaderboard.

Based on the leaderboard scores, the top four winners of the Tea Story Time Series Challenge will get free passes to the virtual Deep Learning DevCon 2021, to be held on 23-24 September 2021. Here, we look at the winners' journeys, solution approaches and experiences at MachineHack.

First prize Vybhav Nath C A

Vybhav Nath is a final-year student at IIT Madras. He entered this field during his second year of college and started participating in MachineHack hackathons last year. He plans to pursue a career in data science.

Approach

He says the problem was unique in that many columns in the test set had a lot of null values, which made it a challenging task. He restricted his preprocessing to imputation and replacing the N.S values. This was the first competition in which he did not use any ML model. Since many columns had null values, he interpolated the columns to get a fully populated test set. The final prediction was then simply the mean of these Price columns. He thinks this was a total "doosra" by the cool MachineHack team.
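
The idea of interpolating the price columns and averaging them can be sketched roughly as follows. This is a minimal illustration, not Vybhav's actual code: the column names and values are made up, and linear interpolation is an assumption, since the article does not say which interpolation method he used.

```python
import numpy as np
import pandas as pd

# Hypothetical test set: price columns with gaps and "N.S" placeholders.
test = pd.DataFrame({
    "price_a": [100.0, "N.S", 104.0, np.nan, 108.0],
    "price_b": [90.0, 92.0, np.nan, 96.0, "N.S"],
    "price_c": [np.nan, 110.0, 112.0, 114.0, 116.0],
})

# errors="coerce" turns any non-numeric entry, such as "N.S", into NaN.
prices = test.apply(pd.to_numeric, errors="coerce")

# Linear interpolation down each column (an assumed choice of method);
# limit_direction="both" also fills gaps at the start and end of a column.
filled = prices.interpolate(method="linear", limit_direction="both")

# No model: the prediction for each row is the mean of the price columns.
prediction = filled.mean(axis=1)
print(prediction)
```

No model is fitted here; the quality of the result depends entirely on how well the interpolation fills the gaps.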

Experience

He says, "I always participate in MH hackathons whenever possible. There is a wide variety of problems that test multiple areas. I also get to compete with many professionals, which I find a good indicator of where I stand among them."

Check out his solution here.

Second prize Shubham Bharadwaj

Shubham has been working as a data scientist for about seven years, handling large datasets throughout. He started off with SQL, then moved to big data analytics and data engineering, and finally to data science. He is new to hackathons, though: this is the fourth one he has participated in. He loves to solve complex problems.

Approach

The data provided was very raw: around 70 percent of the values in the test dataset were missing. From his point of view, finding the best imputation method was the backbone of this challenge.

Preprocessing steps followed:

1. Converting the columns to the correct data types.

2. Imputing the missing values. He tried various methods, such as filling the nulls with the mean of each column, the mean of each row, and MICE, but the best was the KNN imputer with n_neighbors set to 3.
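
A minimal sketch of KNN imputation with n_neighbors set to 3, using scikit-learn's KNNImputer. The matrix below is invented for illustration, and the article does not confirm which library Shubham used, so treat this as one plausible way to do it rather than his implementation.

```python
import numpy as np
from sklearn.impute import KNNImputer

# Hypothetical feature matrix: rows are weeks, columns are price series.
X = np.array([
    [100.0, 90.0, np.nan],
    [102.0, np.nan, 111.0],
    [np.nan, 94.0, 113.0],
    [106.0, 96.0, 115.0],
    [108.0, 98.0, np.nan],
])

# Each missing entry is replaced by the average of that column over the
# 3 nearest rows, with distance measured on the columns both rows have.
imputer = KNNImputer(n_neighbors=3)
X_filled = imputer.fit_transform(X)
print(X_filled)
```

Values that were already present are left untouched; only the NaN entries are filled.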

To remove the outliers, he used the IQR (interquartile range), which helped reduce the mean squared error.
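
An IQR-based outlier filter might look like the sketch below. The 1.5 multiplier is the conventional fence and is an assumption here, as is dropping the outliers rather than clipping them; the article states neither. The sample values are made up.

```python
import pandas as pd

# Hypothetical price series with one obvious outlier (250.0).
prices = pd.Series([101.0, 103.0, 99.0, 102.0, 100.0, 250.0, 98.0, 104.0])

q1 = prices.quantile(0.25)
q3 = prices.quantile(0.75)
iqr = q3 - q1

# Conventional fences at 1.5 * IQR beyond the quartiles (assumed multiplier).
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Keep only the values inside the fences.
cleaned = prices[prices.between(lower, upper)]
print(cleaned.tolist())
```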

The models he tried were logistic regression, then XGBRegressor, ARIMA, TPOT and finally H2OAutoML, which yielded the best result.

Experience

Shubham says, "I am new to the MachineHack family, and one thing is for sure: I am here to stay. It's a great place, and I have already learned so much. The datasets are of wide variety and the problem statements are unique, puzzling and complex. It's a must for every aspiring and professional data scientist to upskill themselves."

Check out his solution here.

Third prize Panshul Pant

Panshul is a Computer Science and Engineering graduate. He has picked up data science mostly from online platforms like Coursera, HackerEarth and MachineHack, and by watching videos on YouTube. Going through articles on websites like Analytics India Magazine has also helped him in this journey. This problem was based on a time series, which made it unique, though he solved it using machine learning algorithms rather than traditional time series methods.

Approach

There were certain string values such as N.S and No sale in all the numerical columns, which he changed to null values before imputing all the nulls. He tried various ways to impute the NaNs, such as zero, mean, forward-fill and backward-fill; of these, forward and backward filling performed significantly better. Exploring the data, he noticed that the prices increased over the months and years, showing a trend. The target column's values were also very closely related to the average of the prices in all the independent columns. He kept all the data, including the outliers, largely unchanged, as tree-based models are quite robust to outliers.
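
The cleaning step described above can be sketched as follows. The column names and values are hypothetical, and applying the forward fill before the backward fill is an assumption; the article only says that both methods were used.

```python
import pandas as pd

# Hypothetical raw columns: numbers stored as strings, mixed with the
# "N.S" and "No sale" placeholders mentioned in the article.
raw = pd.DataFrame({
    "price_a": ["N.S", "100", "No sale", "104", "106"],
    "price_b": ["90", "N.S", "N.S", "96", "No sale"],
})

# errors="coerce" converts the numeric strings and turns everything
# else into NaN.
numeric = raw.apply(pd.to_numeric, errors="coerce")

# Forward fill carries the last known price forward; the backward fill
# then covers any gap left at the very start of a column.
filled = numeric.ffill().bfill()
print(filled)
```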

As the prices were related to time, he also extracted time-based features, of which day of week proved useful. A feature holding the average of all the numerical columns was extremely useful for good predictions. He tried some aggregate-based features as well, but they were not of much help. For predictions he used the tree-based models LightGBM and XGBoost. A weighted average of the two gave the best results.
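
A rough sketch of the two useful features and the weighted blend is shown below. The data, the column names, the helper weighted_blend and the 0.6/0.4 weights are all hypothetical; the article does not give the actual weights. To keep the example self-contained, the two prediction arrays are stand-ins for the outputs of fitted LightGBM and XGBoost models rather than real model outputs.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a date column plus two numeric price columns.
df = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-04", "2021-01-06", "2021-01-08"]),
    "price_a": [100.0, 102.0, 104.0],
    "price_b": [90.0, 94.0, 98.0],
})

# Time-based feature: day of week (Monday is 0).
df["day_of_week"] = df["date"].dt.dayofweek

# Average-based feature: mean of all numeric price columns in each row.
price_cols = ["price_a", "price_b"]
df["row_avg"] = df[price_cols].mean(axis=1)


def weighted_blend(pred_a, pred_b, weight_a=0.6):
    """Weighted average of two models' predictions (weight is assumed)."""
    return weight_a * np.asarray(pred_a) + (1.0 - weight_a) * np.asarray(pred_b)


# Stand-in arrays for the predictions of the two tree-based models.
lgb_pred = np.array([95.0, 98.0, 101.0])
xgb_pred = np.array([96.0, 97.0, 102.0])
blended = weighted_blend(lgb_pred, xgb_pred, weight_a=0.6)
print(blended)
```

In practice the blend weight would be tuned on a validation set rather than fixed in advance.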

Experience

Panshul says, "It was definitely a valuable experience. The challenges set up by the organisers are always exciting and unique. Participating in these challenges has helped me hone my skills in this domain."

Check out his solution here.

Fourth prize Shweta Thakur

Shweta's fascination with data science started when she realised how numbers can guide decision making. She completed the PGP-DSBA course from Great Learning. Even though her professional work does not involve data science, she loves to challenge herself by working on data science projects and participating in hackathons.

Approach

Shweta says that the fact that it is a time series problem makes it unique. She observed the trend and seasonality in the dataset and the high correlation between various variables. She did not treat the outliers, but tried to treat the missing values with interpolation (linear, spline), ffill, bfill and replacement with other values from the dataset. Even though some of the features were not significant in identifying the target, removing them did not improve the RMSE. The only model she tried was SARIMAX.

Experience

Shweta says, "It was a great experience to compete with people from different backgrounds and expertise."

Check out her solution here.

Once again, join us in congratulating the winners of this exciting hackathon, who indeed were the Last Hackers Standing of Tea Story, Weekend Hackathon Edition #2. We will be back next week with the winning solutions of the ongoing challenge, Soccer Fever Hackathon.
