Projects

Exploring Happiness Correlations with Artificial Intelligence

  • Explored correlations between happiness scores and various socioeconomic factors using statistical analysis
  • Applied machine learning methods, including Decision Tree, Random Forest, and Extreme Gradient Boosting
  • Applied deep learning methods, including Multilayer Perceptron (MLP) and Convolutional Neural Network (CNN)
  • Predicted the happiness score and identified the most significant factors that contribute to happiness scores
  • Conducted error analysis to compare the performance of different models
  • Applied psychological theories and research to interpret the findings more comprehensively

Data Visualization on Data Science Job Salaries in Tableau

  • Performed data cleaning on ambiguous attributes, missing values, and irrelevant data
  • Created visualizations and calculated fields that showed the distribution, trends, and percentage changes of salaries by different attributes
  • Used parameters to create interactive visualizations that allowed users to filter the data by different attributes
  • Created a dashboard that combined multiple visualizations into a single view and allowed users to interactively explore the data with different filters
  • Applied actions to create interactivity between visualizations in the dashboard, such as highlighting the related data points in other visualizations when users selected a data point on the map or chart

Web Application of Event Check System

  • Developed an event check system for LCSD Cultural Programmes, enabling users to access event information across different locations
  • Implemented the system using React for the front-end and Node.js with Express for the back-end
  • Implemented user authentication and authorization to allow for different roles such as admins and general users
  • Utilized MongoDB as the database, containing user and event information
  • Enabled users to search locations by keyword or category, comment on events, and mark locations as favorites

A Java-Based Online Book Ordering System and Database Management

  • Implemented an Online Book Ordering System using Java and Oracle Database
  • Enabled users to log in to the main menu with different roles such as admins, general users, employees, or managers
  • Allowed admins to modify the database by initializing, dropping, or resetting all tables and records, and to generate an overview of all tables
  • Enabled general users to search the database for books by different attributes and to check their order history, along with the shipping status and details of their orders
  • Allowed employees and managers to update the shipping status of orders, view orders grouped by shipping status, and view the N most popular books

Predicting Diabetes with Machine Learning Models and SMOTE

  • Determined the best indicators for predicting whether a patient has diabetes and generated insights into each of these risk factors
  • Adopted the Synthetic Minority Oversampling Technique (SMOTE) to address the accuracy paradox caused by an imbalanced dataset
  • Applied machine learning models to classify diabetes, including Logistic Regression, Decision Tree, Random Forest, Support Vector Machine, Extreme Gradient Boosting, and K-Nearest Neighbors
  • Conducted error analysis to compare the performance of different models

Applications of Hidden Markov Models in Earthquake Prediction

  • Identified seismicity patterns to evaluate earthquake risk in earthquake-prone areas
  • Determined the best Hidden Markov Models for predicting earthquakes by global decoding (Viterbi Algorithm)
  • Compared models with different numbers of hidden states using model selection and residual diagnostics
  • Interpreted the meaning behind different hidden states and explored possible applications of the model
  • Predicted the years with high-frequency earthquakes

Classifying Water Potability with Classification Models

  • Classified water potability based on its features, such as substances and chemical elements, to determine whether the water is safe for human consumption
  • Preprocessed the data using procedures such as data cleaning, correlation analysis, and feature scaling
  • Compared the ROC curves and accuracy scores of machine learning models: K-Nearest Neighbors, Random Forest, Support Vector Machine, and Gaussian Process Classification
  • Identified the most suitable method through error analysis