Challenge: Cluster a Compound Library
Write a Python function using RDKit that takes a list of SMILES strings and groups them into clusters based on pairwise Tanimoto similarity. Each cluster should contain molecules where every member has a Tanimoto similarity above 0.6 with at least one other member in the cluster.
- Parse each SMILES string into an RDKit molecule.
- Generate Morgan fingerprints for each molecule.
- Compare fingerprints pairwise using Tanimoto similarity.
- Group the molecules so that every member of a cluster has a Tanimoto similarity above 0.6 to at least one other member of that cluster.
- Return a list of clusters, where each cluster is a list of SMILES strings.
Solution
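One possible solution is sketched below. It is not the official answer, and it rests on several assumptions the task does not state: the Morgan fingerprints use radius 2 and 2048 bits, SMILES strings that RDKit cannot parse are skipped, and a molecule with no neighbor above the threshold is returned as a cluster of its own. The function name `cluster_smiles` and its parameters are illustrative. The grouping rule is read as single-linkage clustering: any two molecules whose similarity exceeds the threshold are linked, and each connected group becomes a cluster, here tracked with a small union-find structure.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import rdFingerprintGenerator


def cluster_smiles(smiles_list, threshold=0.6, radius=2, fp_size=2048):
    """Group SMILES strings into single-linkage clusters by Tanimoto similarity.

    radius and fp_size are assumed defaults; the task does not specify them.
    """
    generator = rdFingerprintGenerator.GetMorganGenerator(radius=radius, fpSize=fp_size)

    # Parse each SMILES string and fingerprint it.
    # Assumption: strings that fail to parse are silently skipped.
    kept, fps = [], []
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        if mol is None:
            continue
        kept.append(smi)
        fps.append(generator.GetFingerprint(mol))

    # Union-find: parent[i] points toward the representative of i's cluster.
    parent = list(range(len(kept)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Compare each fingerprint with all earlier ones and link pairs
    # whose similarity is strictly above the threshold.
    for i in range(1, len(fps)):
        sims = DataStructs.BulkTanimotoSimilarity(fps[i], fps[:i])
        for j, sim in enumerate(sims):
            if sim > threshold:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_i] = root_j

    # Collect members by representative, preserving input order.
    # Assumption: unlinked molecules are kept as single-member clusters.
    clusters = {}
    for i, smi in enumerate(kept):
        clusters.setdefault(find(i), []).append(smi)
    return list(clusters.values())
```

As a quick check, `cluster_smiles(["CCO", "OCC", "c1ccccc1"])` places `"CCO"` and `"OCC"` in the same cluster, since both strings describe ethanol and therefore yield identical fingerprints. Because linking is transitive, two members of one cluster may have a direct similarity below 0.6 as long as a chain of similar molecules connects them, which matches the "at least one other member" wording of the task.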