About Me

Hi, my name is Min. I'm a Software Engineer at Leadgenius in Berkeley, CA.
I graduated from UC Berkeley in 2016 with a Bachelor's degree in computer science, statistics, and economics.

I'm fascinated by back-end development, data engineering, and machine learning. I'm currently excited to be building a data pipeline that ingests company data from a variety of sources to provide quality data to our customers.


Some of my (academic) interests are:

  • Data Engineering
  • Software Engineering
  • Statistical Analysis (and programming)
  • Machine Learning
  • Deep Learning

Related Coursework:

  • Database Systems (CS 186)
  • Artificial Intelligence (CS 188)
  • Machine Learning (CS 189)
  • Statistical Inference and Computing (STAT 135)
  • Modern Statistical Prediction and Machine Learning (STAT 154)
  • Reproducible and Collaborative Statistical Data Science (STAT 159)
  • Discrete Mathematics and Probability (CS 70, STAT 134)
  • Data Structures in Programming (CS 61B)
  • Computer Architecture (CS 61C)

Projects

Mixed Gamble Task

Investigated the relationship between subjects' brain activity and their behavior in 50/50 gambling situations using a whole-brain robust regression analysis in Python. (Python, NumPy, scikit-learn, pandas, NiBabel)

Prediction of Kobe Bryant's Performance in His Next Game

Applied least-squares regression to build statistical models, validated with cross-validation, to predict Kobe Bryant's performance in his next game. (R)

Probabilistic Modeling of Interactions on UC Berkeley Campus

Designed an independent research topic and hypothesis to predict the common routes of UC Berkeley undergraduates in different majors and their interactions on campus. (R)


Titanic Survival Classification (Kaggle)

Used an ensemble classifier, Random Forest, and Logistic Regression to successfully predict which kinds of passengers were likely to survive the Titanic disaster. (Python, scikit-learn, pandas)

Bag of Words: Sentiment Analysis (Kaggle)

Used a TF-IDF vectorizer and Google's word2vec, a deep-learning-inspired method that focuses on the meaning of words, to perform sentiment analysis and basic natural language processing on IMDb movie reviews. (Python, scikit-learn, word2vec)

Rossmann Drug Store Sales Prediction (Kaggle)

Increased drugstore sales prediction accuracy using a linear model with feature selection methods (stepwise selection, partial t-test), a parallel distributed Random Forest, and XGBoost. (R, H2O, XGBoost)




“Only those who will risk going too far can possibly find out how far one can go.”
- T. S. Eliot