May 3rd, 2021

What it takes to become a World No 1 on Kaggle

Category: Data Science, Kaggle, Machine Learning, Makers

In conversation with Guanshuo Xu: A Data Scientist, Kaggle Competitions Grandmaster, and a Ph.D. in Electrical Engineering.

In this series of interviews, I present the stories of established Data Scientists and Kaggle Grandmasters at H2O.ai, who share their journey, inspirations, and accomplishments. The intention behind these interviews is to motivate and encourage others who want to understand what it takes to be a Kaggle Grandmaster.

In this article, I shall be sharing my interaction with Guanshuo Xu. He is a Kaggle Competitions Grandmaster and a Data Scientist at H2O.ai. Guanshuo obtained his Ph.D. in Electrical & Electronics Engineering at the New Jersey Institute of Technology, focusing on machine learning-based image forensics and steganalysis.

Guanshuo is a man of many accomplishments. His methods for real-world image tampering detection and localization won second place in the First IEEE Image Forensics Challenge. His architectural design of deep neural networks outperformed traditional feature-based methods for the first time in image steganalysis. More recently, Guanshuo also achieved the world number one rank in the Competitions tier on Kaggle, with wins in the ALASKA2 Image Steganalysis and RSNA STR Pulmonary Embolism Detection competitions.

Guanshuo also discusses his achievements on Kaggle in an interview at CTDS.show.


 

In this interview, we shall learn more about his academic background, his passion for Kaggle, and his journey to the number one title. Here is an excerpt from my conversation with Guanshuo:

You have a Ph.D. in Electrical Engineering. Did it somehow influence your decision to take up Machine Learning as a career?

Guanshuo: Yes, my doctoral research used machine learning techniques to solve problems like image tampering detection and hidden data detection. For example, my last Ph.D. research project was to use deep neural nets on image steganalysis. So my education and research are directly related to machine learning. Hence, machine learning was a natural choice of career for me.

How did you start with Kaggle, and what kept you motivated throughout your journey to Grandmaster?

Guanshuo: From the time I discovered Kaggle, I have been addicted to it. What keeps me competing on Kaggle is the combined satisfaction of winning competitions and prize money, learning new techniques, widening and deepening my understanding of machine learning, and building surprisingly effective models.

How does it feel to be World No 1 in Competitions? Does that bring extra pressure while competing?

The top 5 Kagglers in the Competitions category at the time of writing | Source: Kaggle’s website

Guanshuo: Honestly speaking, there is a lot more pressure to maintain the number one rank than to achieve it. This is because it requires “smoother” performance. Sometimes I have to participate in more competitions simultaneously than I used to.

How do you typically approach a Kaggle problem? 

A glimpse of Guanshuo’s competitions profile | Source: https://www.kaggle.com/wowfattie/competitions

Guanshuo: My approach varies based on the type of problem and the goal of the competition. Nowadays, I often spend days or even weeks understanding the data and the problem and thinking through a solution. That includes, for instance, guessing the distribution of the private test data, choosing a proper validation scheme, and planning detailed modeling steps. Once I have a decent picture of the overall approach, I start coding and modeling. This process helps me to gain more understanding and make corrections or adjustments, if necessary, to the overall approach.
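To make the idea of a validation scheme a little more concrete, here is a minimal sketch in Python of a shuffled k-fold split. It is only an illustration under my own assumptions, not Guanshuo’s actual method: the function name `k_fold_indices` and its parameters are hypothetical, and a real competition may call for a grouped, stratified, or time-based split instead.

```python
import random


def k_fold_indices(n_samples, n_folds=5, seed=0):
    """Return a list of (train_indices, valid_indices) pairs.

    Hypothetical helper for illustration only: every sample lands in
    exactly one validation fold.
    """
    indices = list(range(n_samples))
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(indices)
    # Strided slicing deals the shuffled indices into n_folds folds
    # whose sizes differ by at most one.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    splits = []
    for i, valid in enumerate(folds):
        # Training rows are all rows that are not in the current validation fold.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, valid))
    return splits


for fold, (train_idx, valid_idx) in enumerate(k_fold_indices(10, n_folds=5)):
    print(f"fold {fold}: train on {len(train_idx)} rows, validate on {len(valid_idx)} rows")
```

Scoring a model on each validation fold and averaging the results gives a rough local estimate of how it might behave on unseen test data.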

Could you give us a sneak peek into your toolkit, such as your favorite programming language, IDE, algorithms, etc.?

Guanshuo: As far as my toolkit is concerned, I mostly use gedit, Python, and PyTorch for deep learning.

The Data Science domain is rapidly evolving. How do you manage to keep up with all the latest developments?

Guanshuo: I get to know about most new tools and technologies through Kaggle, my colleagues, or simply by googling. As far as new developments in machine learning are concerned, it depends on the actual needs. I tend to filter out anything that is not immediately helpful and keep an eye on the potentially exciting stuff. Then I get back to it as and when needed.

A word of advice for the Data Science aspirants who have just started or wish to start their Data Science journey?

A virtual panel where Guanshuo, along with fellow H2O.ai Kaggle Grandmasters, shared his insights on Kaggle

Guanshuo: It basically depends on each person’s background and interests. However, finding a suitable platform to learn and develop skills can make things much easier in general. Taking part in Kaggle competitions can also be a helpful resource.


 

Achieving the world number one rank is no mean feat, and Guanshuo’s relentless attitude and hard work deserve all the credit. A peek into his winning solutions on Kaggle showcases his structured approach, which is an essential element of problem-solving.

About the Author

Parul Pandey

Parul focuses on the intersection of H2O.ai, data science and community. She works as a Principal Data Scientist and is also a Kaggle Grandmaster in the Notebooks category.
