Resources

Code/Software

The Webb group is ramping up efforts to more consistently maintain some of our routine in-house code, scripts, and software and to make them available to the community (at your own risk!). You can view publicly available repositories associated with the user webbtheosim; a brief summary of some of the repositories is below. Please note that these pages are not polished and are at various stages of development. If you are interested in tools associated with pages that seem underdeveloped, please contact Prof. Webb, and we can expedite updates or work with you directly.

  • featurization - scripts and data demonstrating and testing a range of polymer featurization techniques for machine learning. These are primarily associated with an article in MSDE published in 2022 but may have periodic updates.
  • PPH_public - scripts and public data associated with protein-polymer hybrid systems. 
  • data-seeding - developmental work exploring data selection methods for machine learning applications.
  • gendata - in-house code for building systems and preparing requisite input files for molecular simulation (primarily in LAMMPS).
  • polymerize - in-house code for building polymer chains of various types from composite monomers.

Data

Some of our work generates data that we think may be useful to other research groups for their own development. Below we provide links to and descriptions of this data.

  • Data on Enzyme Activity Retention in glucose oxidase, lipase, and horseradish peroxidase - This distribution contains experimentally measured data on the extent of enzyme activity retained after thermal stressing for three distinct enzymes: glucose oxidase, lipase, and horseradish peroxidase. The data is used to form conclusions and develop machine learning models as reported in the publication "Machine Learning on a Robotic Platform for the Design of Polymer-Protein Hybrids".
  • Data for Coarse-grained Intrinsically Disordered Proteins - This distribution compiles numerous physical properties for 2,585 intrinsically disordered proteins (IDPs) obtained by coarse-grained molecular dynamics simulation. This compilation comprises "Dataset A" as reported in "Featurization strategies for polymer sequence or composition design by machine learning".

Other