© 2019, DISCnet

DISCnet is the Data Intensive Science Centre in SEPnet, and an STFC Centre for Doctoral Training; a collaboration between the Universities of Southampton, Sussex, Portsmouth, Queen Mary University of London, and Open University


Machine Learning

Aims & Objectives

The course gives a practical introduction to machine learning. Through a combination of lectures and workshop-style exercises, students will become familiar with the basic concepts and be guided towards using machine learning techniques in their own research.

 

Day 1: Overview of Machine Learning

  • Success stories in machine learning

  • Failures of machine learning

  • Machine learning techniques

    • Linear Regression, MLP, SVMs, Decision Trees, Deep Learning

  • Machine learning problems

    • Supervised learning (regression/classification), Unsupervised learning (PCA/clustering), Semi-supervised learning, Reinforcement learning

  • Making sense of data

    • Types of data (images, text, numbers)

    • Data preparation, missing data

  • Common tools

    • MATLAB, Python

  • Homework

 

 

Day 2: Introduction to Machine Learning

  • The perceptron/Bayes optimal decisions

  • MLPs

  • Gradient learning, SGD, momentum

  • Evaluating performance

    • ROC curves

  • Homework

 

 

Day 3: Advanced Machine Learning

  • Generalisation

    • Bias-Variance Dilemma

  • Ensemble Techniques

    • AdaBoost, random forests

  • Kernel methods

    • SVM

    • kernels

  • Probabilistic techniques

    • Gaussian Processes

    • Graphical Models, LDA, MCMC

  • Homework

 

 

Day 4: Deep Learning

  • Why deep?

  • CNNs

  • LSTMs

  • GPU programming (libraries)

  • Keras tutorial

  • Homework

 

Day 5:  Practical Machine Learning

  • Workshop on data you provide

  • We will look at how to:

    • Analyse the problem

    • Visualise the data

    • Clean the data

    • Use machine learning libraries

    • Evaluate performance
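
The workshop steps above can be sketched end to end in a few lines of Python. This is only an illustrative sketch, not course material: the data are synthetic, the helper names (summarise, impute, fit_centroids, run_workflow) are made up for this example, a printed summary stands in for a real visualisation, and a hand-written nearest-centroid classifier stands in for a machine learning library.

```python
import random
import statistics


def summarise(rows):
    """Stand-in for visualisation: count rows, missing values, and classes."""
    missing = sum(1 for x, _ in rows if x is None)
    labels = sorted({y for _, y in rows})
    return {"rows": len(rows), "missing": missing, "labels": labels}


def impute(rows, fill):
    """Cleaning step: replace missing feature values (None) with `fill`."""
    return [(fill if x is None else x, y) for x, y in rows]


def fit_centroids(rows):
    """Nearest-centroid classifier: mean feature value per class label."""
    labels = {y for _, y in rows}
    return {c: statistics.fmean([x for x, y in rows if y == c]) for c in labels}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda c: abs(x - centroids[c]))


def accuracy(centroids, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(centroids, x) == y for x, y in rows) / len(rows)


def run_workflow(seed=0):
    rng = random.Random(seed)
    # Synthetic two-class data (hypothetical): one feature, label 0 or 1.
    rows = [(rng.gauss(0.0, 1.0), 0) for _ in range(100)]
    rows += [(rng.gauss(4.0, 1.0), 1) for _ in range(100)]
    # Knock out roughly 10% of the feature values to mimic missing data.
    rows = [(None if rng.random() < 0.1 else x, y) for x, y in rows]
    rng.shuffle(rows)

    print(summarise(rows))  # analyse / inspect the data

    # Split before cleaning so the test set does not influence the fill value.
    train, test = rows[:150], rows[150:]
    fill = statistics.fmean([x for x, _ in train if x is not None])
    train, test = impute(train, fill), impute(test, fill)

    centroids = fit_centroids(train)  # stands in for a library model
    return accuracy(centroids, test)  # evaluate on held-out data


if __name__ == "__main__":
    print(f"held-out accuracy: {run_workflow():.2f}")
```

With real data, the same order of operations would apply: inspect first, split before computing any cleaning statistics, and report performance only on data the model has not seen.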

 

Learning & Teaching Resources / links / background reading:

A notebook computer (laptop) will be required.

 

 

Examples

Some examples will be provided during the course. However, for Day 5 it would be very useful for students to bring their own data. Please make sure the data are on your notebooks or external drives, as access via the internet may be too slow.

 

Prerequisites / Linked Modules

Knowledge of Python via the Data Camp modules (DISC6002) and of statistics (DISC6003) is recommended, but not required.

 

 

Approximate number of hours:

taught material + exercises + self-study: 45 hours