AS.030.421 Data Science Tools for the Chemical and Materials Sciences

Course Webpage: http://occamy.chemistry.jhu.edu/courses/AS.030.421/fall_2021/index.php

Last Updated: December 2, 2021


FALL 2021


TOPIC

Advances in measurement techniques and simulations have driven an explosion in the variety, quality, and quantity of data collected when investigating chemical and materials processes. Advances in computing have made machine learning (ML) and related analytical methods practical tools for exploring and extracting meaning from this cornucopia of data, and data science has been called the fourth pillar of the scientific method. This course will provide an introduction to the modern tools of data science, including the Python programming language, Jupyter notebooks, ML algorithms and their practical implementation, and high-performance computing, with specific emphasis on applying these tools to data of chemical relevance, including UV/Vis, IR, and NMR spectra; 3D micro-computed tomography and hyperspectral imaging; and physical property measurements. Key aspects of data organization and curation will also be covered.


Class Times: TTh 12-1:15 PM Eastern Time
Classroom: In Person (Sometimes Online)

INSTRUCTOR:
Prof. Tyrel M. McQueen
mcqueen@jhu.edu
Office Hours: by appointment or spontaneously arranged via email ("open door policy")

Grading: 60% Homework, 5% Class Participation, 15% Midterm Exam, 20% Final Exam Project

Lowest homework score will be dropped.


Required Texts:

"None" (but the supplementary resources will be of value)


Supplementary Resources:
  1. SciServer
  2. Materials, Automated

Tentative Schedule (can and will change!)
Week 1: The Big Picture: Intersection of Data Science, Chemistry, and Materials [TBD]
Week 2: Introduction/Review of Basic Python, Programming Concepts, and Key Data Structures [TBD]
Week 3: Introduction/Review of Data Organization, Storage, and Curation Capabilities and Techniques [TBD]
Week 4: Numerical Solutions of Classic Chemical Kinetics Models [TBD]
Week 5: Kinetics Models to Peak Fitting [TBD]
Week 6: Automatic Peak Fitting (e.g. for UV/Vis) [TBD]
Week 7: Generalized Automated Curve Fitting, Midterm Exam (THURSDAY) [TBD]
Week 8: Generalized Automated Curve Fitting, Eliding Proprietary Data Formats (e.g. for IR/NMR) [No new homework]
Week 9: The Science of Colorimetry, Effective Data Visualization (e.g. 3D micro-CT, Hyperspectral) [TBD]
Week 10: Strategies for Automating "Unautomatable" Tools [TBD]
Week 11: Introduction to Machine Learning Methods [TBD]
Week 12: Sample Application of TensorFlow to Lego Sorting [TBD]
Week 13: Bug Hunting and Validation [TBD]
Week 14: Independent Projects [TBD]
Thursday, Dec. 16: Final Presentations, 2 PM - 5 PM

Lectures
  1. Introductions and Conceptual Overview
  2. Digital Precision and Algorithm Performance
  3. Python 101 Part 1 [Lecture3and4.ipynb]
  4. Python 101 Part 2 [Lecture4and5.ipynb, Lecture4TestData.dat]
  5. Python 101 Part 3 [Lecture4and5.ipynb, Sunscreen-2019.zip]
  6. Python 101 Part 4 [Lecture6.ipynb]
  7. Chemical Kinetics Part 1 [Lecture7.ipynb]
  8. Chemical Kinetics Part 2 [Lecture8.ipynb]
  9. Chemical Kinetics Part 3 [Lecture9.ipynb]
  10. UV/Vis Curve Fitting Part 1
  11. UV/Vis Curve Fitting Part 2 [Lecture11.ipynb]
  12. UV/Vis Curve Fitting Part 3
  13. Generalized Curve Fitting [Lecture13.ipynb]
  14. Curve Fitting Hints and Tricks
  15. Reading Unknown Data Files [Lecture15.ipynb, Nb3Br8.hs2, YM16_LAO[110]_1.img]
  16. Colorimetry and Good Data Visualization (Dr. Waters) [Slides]
  17. 3D Data Visualization [Lecture17.ipynb, Sample-uCT.h5, Sample-HySpec.zip]
  18. Automating Unautomatable Tools [Lecture18.ipynb, Sample-DIFFaX.zip]
  19. Automating Unautomatable Tools II [Lecture18and19.ipynb]
  20. Introduction to AI/ML
  21. Introduction to TensorFlow [Lecture21.ipynb, Face_Depixelizer_Eng.ipynb]
  22. Lego Brick Sorting Part 1 [Lego-Brick-Sorting.zip, Lecture22.ipynb]
  23. Lego Brick Sorting Part 2 [Lecture23.ipynb]
  24. Debugging and Reaction Prediction I [Lecture24.ipynb]
  25. Debugging and Reaction Prediction II [Lecture25.ipynb]

These will be posted weekly.


Homework Assignments

These will be posted weekly.


Exam Information
  1. Final Presentations: Thursday, December 16th, 2 PM - 5 PM (Rubric)

Handouts

These will be posted as mentioned in the class.


Lecture Notes

As a matter of course policy, lecture notes are not available online. You are welcome to stop by TMM's office to view them anytime.


Audit Policy

Graduate students are allowed to audit the course. It is required that you attend most of the lectures, and strongly recommended that you look at and complete the homework assignments.