Phuc (Patrick) Nguyen


Resume | LinkedIn | GitHub

I am currently pursuing a master's degree in Data Science at the NOVA Information Management School. Before the MSc, I was a management consultant at KPMG working on supply chain management.

My areas of interest are MLOps, Business Intelligence, and Data Engineering.

Portfolio

Hello,

I’m Phuc, or Patrick, born and raised in a beautiful town in the middle of Vietnam. I am an engineer currently pursuing a master's degree in Data Science and a data analytics enthusiast with a strong business sense. I thrive on digging into complex business problems, producing insightful, strategic recommendations, and implementing data-driven solutions.

In my free time, I play a lot of football, both casually and competitively (as a wing-back). I am also a gaming nerd (FIFA and CS:GO).

Updating …

Projects list

Projects | Tag | Type
BI Digital Transformation | BI, DE | Professional
E-commerce Operations Dashboard | BI, DE, Visualization | Educational
Vessel Operations Dashboard | BI, DE, Visualization | Educational
Supply Chain KPIs & Dashboard design | Management, BI, Visualization | Professional
Supply Chain Operations excellence | Management, SCM | Professional
Cryptocurrency forecasting | DL | Educational
Customer Segmentation for Marketing | ML | Educational
FIFA Dashboard visualization | Visualization | Educational
Forecasting Model for multi-stores company | ML, MLOps | Educational
Income prediction modelling | ML | Educational
Market Basket Analysis | ML, MLOps | Educational
Metrics for Machine Translation | DL, NLP | Educational
Online shop database design | DE | Educational
Recommendation system for Online Shopping | ML | Educational
(*) Abbreviations: BI: Business Intelligence | DE: Data Engineering | DL: Deep Learning | ML: Machine Learning | NLP: Natural Language Processing | SCM: Supply Chain Management

E-commerce Operations Dashboard

View on GitHub | Power BI Dashboard | Open Report

We designed and implemented a dashboard that supports both management and operational analytics and decision-making. The project deployed a streamlined data integration, transformation, and modelling process with advanced analytics features and storytelling techniques.


This project is meant to reinforce the conceptual knowledge acquired throughout the course and deliver an end-to-end self-service BI solution to support the analytical capability of Sales and Supply Chain Management.



BI Digital Transformation

A digital transformation project that aimed to improve decision-making and synchronize data sources for a large steel manufacturer in Vietnam.


Key tasks:


Vessel Operations Dashboard

Power BI Dashboard

This is an educational project that aimed to design and implement a dashboard supporting vessel operations management. The dashboard covers operation time, a summary of trips, a timeline for each vessel, and an emission report.



Customer Segmentation for Marketing

View on GitHub | Open Notebook | Open Report

This is a data mining project with the objective of developing a customer segmentation strategy for the Paralyzed Veterans of America (PVA). Using a dataset provided by this non-profit organization, our main goal was to better understand how its donors behave and to identify the different segments of donors and potential donors within its database.


This analysis relied on the preprocessing techniques and clustering models learnt during the semester. They helped us acquire a better business understanding of this sector, group donors by their behaviour, and gain insights into the best approach to maximize donation amounts and the best way to describe each customer segment.


Finally, we explained the clusters we created and proposed a detailed marketing idea for each one.
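
To give a feel for the clustering step, here is a minimal sketch, not the project's actual code. The text above does not name the clustering model, so K-means is an assumption here, and the donor features (months since last gift, average gift amount) and their values are hypothetical.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain K-means on a list of numeric tuples.

    Assumption for reproducibility: the first k points are used as the
    initial centroids instead of a random initialisation.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


# Hypothetical donors: (months since last gift, average gift amount)
donors = [(1, 50), (2, 55), (20, 5), (22, 8), (1.5, 60), (21, 6)]
labels, centroids = kmeans(donors, k=2)
```

In practice the features would be scaled before clustering, since K-means is distance-based and a large-valued feature would otherwise dominate.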



Market Basket Analysis

THE FAMOUS INSTACART PROBLEM


View on GitHub | Open Notebook | Open Report | Open Presentation

The project objective is to explore the famous Instacart dataset. First, we carried out exploratory data analysis (EDA) on different aspects, such as buying patterns by product, hour of day, day of week, and product category. The insights gathered from the analysis were then used to apply the Apriori algorithm with two different approaches, inter-department and intra-department, to predict the customer's basket.


Finally, a dashboard combining the EDA and a recommendation system built from the results of the Apriori algorithm was deployed to help business stakeholders improve the customer experience.
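
The quantities behind Apriori-style rules (support, confidence, and lift) can be sketched as follows. This is a simplified illustration that only looks at item pairs, not the full Apriori algorithm or the project's code; the function name `pair_rules`, the threshold, and the toy baskets are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.3):
    """Return rules A -> B for item pairs that meet min_support.

    Simplification: only single items and pairs are counted, which
    corresponds to the first two levels of Apriori.
    """
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates within a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):  # emit both rule directions
            confidence = count / item_counts[x]  # P(y | x)
            lift = confidence / (item_counts[y] / n)  # P(y | x) / P(y)
            rules.append({
                "antecedent": x,
                "consequent": y,
                "support": support,
                "confidence": confidence,
                "lift": lift,
            })
    return sorted(rules, key=lambda r: r["lift"], reverse=True)


# Hypothetical toy baskets, not Instacart data
baskets = [
    ["banana", "milk", "bread"],
    ["banana", "milk"],
    ["banana", "yogurt"],
    ["milk", "bread"],
    ["banana", "milk", "yogurt"],
]
rules = pair_rules(baskets, min_support=0.4)
```

A lift above 1 suggests the two items appear together more often than their individual popularity would predict, which is what makes a rule useful for recommendations.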



Predictive modeling

View on GitHub | Open Report

The project presents the results of predicting income from the proposed dataset with supervised machine learning algorithms: Logistic Regression, Linear Discriminant Analysis, K-Nearest Neighbors, Decision Tree, Gaussian Naive Bayes, Random Forest, Support Vector Machine, Gradient Boosting, and AdaBoost. We compared their performance to find the best-performing classifier and then used stacking to increase its predictive power.


The best model turned out to be the stacking model. However, considering efficiency in training and prediction time, the final choice was the Gradient Boosting model with the GLMM encoding method.


Keywords: GLMM, Logistic Regression, Random Forest, Gradient Boosting, and Stacking classifier.


FOOTBALL GENERATIONS IN COMPARISON

Open Web App | View on GitHub | Open Report

The dashboard aims to explore some questions that every football lover has in mind. What will the next generation of football players look like? What will happen after the decade of Messi and Ronaldo?


The dashboard was created with Plotly, an interactive graphing library for Python. With Plotly, we developed the main interactions needed to tell the story we wanted to convey. The second objective of this project was to create a more pleasant experience for the user, so to improve the layout we used HTML, CSS, and Bootstrap, a CSS framework. In this way, we achieved a better organization of our Dash app, and users can navigate through the sections as on a normal web page.



Machine translation

A regression approach on the Most Recent Metrics for Machine Translation

View on GitHub | Open Report

The purpose of this project is to create a metric that best correlates with human assessments of machine translation quality. We tried different available scoring metrics, such as ROUGE, BLEU, and BLEURT (among the most advanced metrics), with different n-gram settings. Finally, we used an ensemble approach in which the scores generated by the metrics serve as features for a regressor (Gradient Boosting models) that predicts the final score.


The results showed that the ensemble model significantly improved on the original scoring metrics. However, performance and time constraints remain a problem with this approach.
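
As an illustration of how n-gram overlap scores can become features for a regressor, here is a minimal sketch. It computes clipped n-gram precision, the core ingredient of BLEU, rather than full BLEU, ROUGE, or BLEURT; the function names and the example sentences are hypothetical, and the regressor itself is not shown.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n):
    """Share of candidate n-grams that also occur in the reference.

    Each n-gram is credited at most as often as it appears in the
    reference (the "clipping" used in BLEU).
    """
    cand = Counter(ngrams(candidate.split(), n))
    ref = Counter(ngrams(reference.split(), n))
    total = sum(cand.values())
    if total == 0:
        return 0.0  # candidate is shorter than n tokens
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total


def metric_features(candidate, reference, max_n=2):
    """One feature per n-gram order; a row like this could feed a regressor."""
    return [clipped_precision(candidate, reference, n)
            for n in range(1, max_n + 1)]


# Hypothetical sentence pair
features = metric_features("the cat sat on the mat", "the cat is on the mat")
```

Each translation pair yields one such feature row, and the human assessment score is the regression target.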



Forecasting Cryptocurrency Prices Time Series

View on GitHub | Open Report

The project builds and benchmarks predictive models for time series forecasting using deep neural networks: LSTM, GRU, and Bi-LSTM. We tested the models with basic L1 regularization, used grid search for hyperparameter tuning, and used RMSE for model validation. Finally, we compared the results of univariate and multivariate time series forecasting.
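
Two pieces of this setup are easy to show in isolation: turning a price series into supervised windows, and scoring predictions with RMSE. The sketch below is illustrative only; the lookback length, the toy prices, and the naive "repeat the last value" baseline are assumptions, and the neural networks themselves are not shown.

```python
import math


def make_windows(series, lookback):
    """Split a series into (input window, next value) pairs."""
    inputs, targets = [], []
    for i in range(len(series) - lookback):
        inputs.append(series[i:i + lookback])
        targets.append(series[i + lookback])
    return inputs, targets


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical toy prices, not real market data
prices = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
windows, targets = make_windows(prices, lookback=3)

# Naive baseline: predict that the next price equals the last one seen.
baseline_predictions = [window[-1] for window in windows]
baseline_rmse = rmse(targets, baseline_predictions)
```

A baseline like this gives a reference RMSE that any LSTM, GRU, or Bi-LSTM model should beat to be worth its training cost.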



Introduction to Recommendation system

Open Notebook | View on GitHub | Open Report

The project applied several approaches, such as RFM analysis and customer clustering with K-Means. For the cold-start problem, the top 10 best-selling products are used to create a recommender system for new users. For the recommender system itself, we applied collaborative filtering and compared different models: Bayesian Personalized Ranking (BPR), Logistic Matrix Factorization (LMF), and Alternating Least Squares (ALS). The evaluation metrics used are Precision at K and hit rate.
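
The two evaluation metrics can be sketched in a few lines. This is one common way to define them, not necessarily the exact definition used in the project; the function names, user IDs, and items are hypothetical.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that the user actually bought.

    Divides by k even when fewer than k items were recommended.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


def hit_rate(recs_by_user, relevant_by_user, k):
    """Share of users with at least one relevant item in their top-k."""
    users = list(recs_by_user)
    if not users:
        return 0.0
    hits = sum(
        1 for user in users
        if any(item in relevant_by_user.get(user, set())
               for item in recs_by_user[user][:k])
    )
    return hits / len(users)


# Hypothetical recommendations and held-out purchases
recs = {"u1": ["a", "b", "c"], "u2": ["d", "e", "f"]}
held_out = {"u1": {"a", "c"}, "u2": {"x"}}

p_at_3 = precision_at_k(recs["u1"], held_out["u1"], k=3)
hr_at_3 = hit_rate(recs, held_out, k=3)
```

Precision at K rewards getting many relevant items into the list, while hit rate only asks whether each user got at least one, so the two can rank models differently.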



© 2021 Phuc Nguyen. Powered by Jekyll and the Minimal Theme.