Projects
Used machine learning methods to build, test and optimise a k-NN classifier for breast cancer diagnosis.
Python, scikit-learn, machine learning, feature selection, PCA, cross-validation, evaluation metrics, Pandas, IPython notebook
Identified the Enron employees most likely to have committed fraud, using machine learning on public Enron financial and email data.
Python, scikit-learn, machine learning, natural language processing, feature selection
Investigated a wine dataset using R and exploratory data analysis techniques, exploring both single variables and relationships between variables.
RStudio, R packages, plotting in R, exploratory data analysis techniques
Chose a region and used data munging techniques to assess the quality of the data for validity, accuracy, completeness, consistency and uniformity.
Python, data verification, data cleaning
Posed a question about a dataset, used NumPy and Pandas to answer it, and wrote a report to share the results.
Python, NumPy, Pandas, Matplotlib, IPython notebook