Completed: October 2021 - January 2022

1st UNIST-POSTECH-KAIST Data Science Competition

Silver Prize (4th Place) - Shale gas production prediction and investment optimization

PyCaret, Binary Integer Programming, SHAP, AutoML, Data Analysis

Project Summary: Led Team JORDY to 4th place (Silver Prize) among 38 teams with 144 participants, developing machine learning solutions for shale gas production forecasting and investment optimization using Korea National Oil Corporation (KNOC) data.


The Energy Challenge

Korea's premier science universities hosted their inaugural data science competition, focusing on optimizing shale gas production and investment decisions using real-world KNOC data. Teams were challenged to predict production viability and develop optimal investment strategies balancing risk and return in the volatile energy market.

The Innovation

AutoML Production Prediction: We leveraged PyCaret's automated machine learning pipeline to build binary classification models for shale gas production viability. Our approach automated feature engineering, model selection, and hyperparameter optimization, achieving sMAPE scores below 25 across different production scenarios.
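For readers unfamiliar with the metric, here is a minimal sketch of sMAPE (symmetric mean absolute percentage error) in plain Python. It assumes the common definition that divides by the mean of the absolute actual and forecast values and ranges from 0 to 200; the competition's exact scoring formula is not stated here, and the sample numbers are made up for illustration.

```python
def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent (0 to 200)."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if not actual:
        raise ValueError("inputs must be non-empty")
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2.0
        # Assumed convention: a pair where both values are zero adds no error.
        if denom != 0.0:
            total += abs(f - a) / denom
    return 100.0 * total / len(actual)


# Hypothetical production values, not competition data.
actual = [100.0, 200.0, 300.0]
forecast = [110.0, 180.0, 300.0]
score = smape(actual, forecast)
print(f"sMAPE = {score:.2f}")
```

Lower is better under this definition: a perfect forecast scores 0, and a forecast of zero against a non-zero actual scores the maximum of 200.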

Investment Optimization Engine: Using binary integer programming, we developed an optimization framework to maximize expected well purchase profits. The system balanced production probability distributions, capital requirements, and risk tolerance through probability matching strategies, achieving a final profit projection of $937,816.
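The core idea of a binary integer program for well selection can be sketched as follows: each well gets a 0/1 decision variable, the objective is total expected profit, and a budget constraint caps total purchase cost. This is an illustrative toy, not the team's actual model: the function name `select_wells`, the data, and the single budget constraint are assumptions, and it solves the problem by exhaustive enumeration, which is only practical for a handful of wells (a real instance would use a dedicated integer programming solver).

```python
from itertools import product


def select_wells(expected_profit, cost, budget):
    """Pick x_i in {0, 1} to maximize sum(expected_profit_i * x_i)
    subject to sum(cost_i * x_i) <= budget, by brute-force enumeration."""
    n = len(expected_profit)
    best_value, best_choice = 0.0, (0,) * n  # buying nothing is always feasible
    for choice in product((0, 1), repeat=n):
        total_cost = sum(c * x for c, x in zip(cost, choice))
        if total_cost > budget:
            continue  # violates the capital constraint
        value = sum(v * x for v, x in zip(expected_profit, choice))
        if value > best_value:
            best_value, best_choice = value, choice
    return best_choice, best_value


# Hypothetical wells: success probability, revenue if productive, purchase cost.
success_prob = [0.9, 0.6, 0.8, 0.5]
revenue = [100.0, 300.0, 150.0, 400.0]
cost = [50.0, 120.0, 70.0, 150.0]

# Expected profit of buying each well = P(success) * revenue - cost.
expected_profit = [p * r - c for p, r, c in zip(success_prob, revenue, cost)]

choice, value = select_wells(expected_profit, cost, budget=250.0)
print("buy wells:", [i for i, x in enumerate(choice) if x], "expected profit:", value)
```

With these made-up numbers the budget of 250 admits the first three wells (total cost 240) but not the fourth, which illustrates how the capital constraint, rather than per-well profit alone, shapes the selection.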

Advanced Data Engineering: Our preprocessing pipeline included temporal feature engineering, sophisticated categorical encoding for stimulation fluids and proppants, SHAP-based feature importance analysis, and systematic correlation-based feature removal.
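Correlation-based feature removal can be sketched as a greedy filter: walk through the features in order and keep one only if its absolute Pearson correlation with every already-kept feature stays below a threshold. The threshold of 0.95, the helper names, and the column names below are assumptions for illustration; the actual pipeline's rule and cutoff are not specified here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column has no defined correlation; treat as 0
    return cov / sqrt(var_x * var_y)


def drop_correlated(features, threshold=0.95):
    """Return the names of features kept after greedy correlation filtering."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical columns: the second is a unit conversion of the first.
features = {
    "lateral_length_m": [1000.0, 1500.0, 2000.0, 2500.0],
    "lateral_length_ft": [3280.8, 4921.2, 6561.6, 8202.0],
    "proppant_mass": [5.0, 3.0, 6.0, 4.0],
}
kept = drop_correlated(features)
print(kept)
```

Because the filter is greedy, the result depends on column order: the first of a correlated pair survives. Ordering columns by an importance score first (for example, SHAP-based) is one way to decide which of the pair to keep.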

Recognition and Impact

Our comprehensive approach earned us 4th place and the Silver Prize along with $800 in prize money. As team leader and Q&A responder, I contributed innovative suggestions during the final presentation, including Reynolds number integration for enhanced fluid dynamics modeling and ESG management frameworks for environmental impact assessment.

The project was recognized for demonstrating how academic research can translate into practical energy sector solutions, establishing methodologies for investment portfolio optimization and production forecasting.

Industry Applications

This project demonstrated practical applications for energy companies in investment decision-making, production planning, and environmental compliance. Our work showed how data science can transform traditional energy operations by providing quantitative frameworks for well acquisition decisions and sustainable production strategies.