AS.030.421 Data Science Tools for the Chemical and Materials Sciences

Course Webpage: http://occamy.chemistry.jhu.edu/courses/AS.030.421/fall_2020/index.php

Last Updated: December 15, 2020


FALL 2020


TOPIC

Advances in measurement techniques and simulations have driven an explosion in the variety, quality, and quantity of data collected when investigating chemical and materials processes. Advances in computing have made machine learning (ML) and related analytical methods practical for exploring and extracting meaning from this cornucopia of data, and data science has been called the fourth pillar of the scientific method. This course provides an introduction to modern tools of data science, including the Python programming language, Jupyter notebooks, ML algorithms and their practical implementation, and high-performance computing. It places specific emphasis on applying these tools to data of chemical relevance, including UV/Vis, IR, and NMR spectra, 3D micro-computed tomography, and physical property data such as specific heat, magnetization, and resistivity.

Note: In Fall 2020, students who are not physically present on the JHU campus can take this course through online synchronous lectures.


Class Times: TTh 12-1:15 PM Eastern Time
Classroom: Online. Log in to view details.

INSTRUCTOR:
Prof. Tyrel M. McQueen
mcqueen@jhu.edu
Zoom Office: 410-516-6201
Office Hours: by appointment or spontaneously arranged via email ("open door policy")

Grading: 60% Homework, 5% Class Participation, 15% Midterm, 20% Final Exam Presentation

The lowest homework score will be dropped.


Required Texts:

"None" (but the supplementary resources will be of value)


Supplementary Resources:
  1. SciServer
  2. Materials, Automated

Tentative Schedule (can and will change!)
Week 1: The Big Picture: Intersection of Data Science, Chemistry, and Materials [TBD]
Week 2: Introduction/Review of Basic Python, Programming Concepts, and Key Data Structures [TBD]
Week 3: Introduction/Review of Basic Python, Programming Concepts, and Key Data Structures [TBD]
Week 4: Numerical Solutions of Classic Chemical Kinetics Models [TBD]
Week 5: Automatic Peak Fitting (e.g. for UV/Vis) [TBD]
Week 6: Eliding Proprietary Data Formats (e.g. for IR/NMR), Midterm Exam (THURSDAY) [TBD]
Week 7: Generalized Automated Curve Fitting [TBD]
Week 8: Automating "Unautomatable" Tools (1 lecture, no class Thursday) [TBD]
Week 9: Effective Data Visualization (e.g. 3D micro-CT) [TBD]
Week 10: Quantification of Visualized Data (e.g. particle sizes) [TBD]
Week 11: Introduction to Machine Learning Methods [TBD]
Week 12: Sample Application of TensorFlow to Automated Cell Sorting [TBD]
Week 13: Further Applications of Techniques from this Course [TBD]
Week 14: Independent Projects [TBD]
Thursday, December 17th: Final Presentations, 9 AM - 11 AM

Lectures
  1. Introductions and Conceptual Overview
  2. Digital Precision and Algorithm Performance
  3. Python 101 Part 1
  4. Python 101 Part 2 (Lecture4.ipynb)
  5. Python 101 Part 3 (Lecture5.ipynb)
  6. Python 101 Part 4 (Lecture6.ipynb)
  7. Chemical Kinetics Part 1 of 2 (Lecture7.ipynb)
  8. Chemical Kinetics Part 2 of 2 (Lecture8.ipynb)
  9. UV/Vis Curve Fitting Part 1 of 2
  10. UV/Vis Curve Fitting Part 2 of 2 (Lecture10.ipynb)
  11. UV/Vis Q&A, General File Formats
  12. Generalized Automatic Curve Fitting Part 1 of 2 (Lecture12.ipynb)
  13. Generalized Automatic Curve Fitting Part 2 of 2 (Lecture13.ipynb)
  14. Automating Unautomatable Tools (Lecture14.ipynb)
  15. Data Visualization Part 1 of 2
  16. Data Visualization Part 2 of 2 (MaterialsAutomated9-uCT)
  17. Introduction to AI/ML
  18. AI/ML Practicals (Lecture18.ipynb)
  19. TensorFlow Introduction (Lecture19.ipynb)
  20. TensorFlow Sorting Part 1 of 2 (Lecture20.ipynb)
  21. TensorFlow Sorting Part 2 of 2 (Lecture21.ipynb, Lecture21b.ipynb)
  22. Other Uses of Machine Learning
  23. Other Uses of Machine Learning 2 (HW9 walkthrough, Lecture23.ipynb)
  24. How to Design Good Tools
  25. Final Thoughts

Homework Assignments

These will be posted weekly.


Exam Information
  1. Exam 1 (Thursday, October 8th): Signup Sheet
  2. Final Presentations: Thursday, December 17th, 9 AM - 11 AM (Rubric)

Handouts

These will be posted as mentioned in class.


Lecture Notes

As a matter of course policy, lecture notes are not available online. You are welcome to stop by Prof. McQueen's office to view them at any time.


Audit Policy

Graduate students are allowed to audit the course. Auditors are required to attend most of the lectures and are strongly encouraged to look at and complete the homework assignments.