Lilien Lab
Department of Computer Science
Centre for Cellular and Biomolecular Research
University of Toronto

Downloads
  • We provide datasets for two ligand-size thresholds. The first treats a protein - small-molecule complex as valid if its binding ligand has at least 7 heavy atoms; the second requires at least 13 heavy atoms (see Step 2 of the description). When a complex has multiple ligands, we retain only those that meet the threshold.
  • There are two types of non-redundant lists: the protein non-redundant lists and the protein - small-molecule non-redundant lists. A protein non-redundant list considers only protein similarity for redundancy elimination. A protein - small-molecule list considers both protein and ligand similarity. For example, the complexes 1DVU, 1DVY and 1DVZ all share a similar protein. Their binding ligands, however, are different (see additional image); therefore, all three complexes may appear in a protein - small-molecule list.
  • Finally, we provide compressed files that contain the protein - small-molecule complexes, split into the protein structure in PDB format and the non-covalently bound ligands in SDF format. Because the protein structures do not change with the ligand-size threshold, we provide only the structures that correspond to the 7 heavy atom threshold (the 13 heavy atom set is a subset of these). For the ligands, we provide two separate archives, one for the 7 and one for the 13 heavy atom threshold.
Protein Non-redundant Lists
25% non-redundant protein list with minimal ligand size of 7 or 13 heavy atoms
  • Prot25_7HvyAtm_NonRedundant.list
  • Prot25_13HvyAtm_NonRedundant.list
50% non-redundant protein list with minimal ligand size of 7 or 13 heavy atoms
  • Prot50_7HvyAtm_NonRedundant.list
  • Prot50_13HvyAtm_NonRedundant.list
90% non-redundant protein list with minimal ligand size of 7 or 13 heavy atoms
  • Prot90_7HvyAtm_NonRedundant.list
  • Prot90_13HvyAtm_NonRedundant.list

Protein - Small-molecule Non-redundant Lists
25% non-redundant protein and 70% or 85% non-redundant ligand list with minimal ligand size of 7 or 13 heavy atoms
  • Prot25Lig70_7HvyAtm_NonRedundant.list
  • Prot25Lig70_13HvyAtm_NonRedundant.list
  • Prot25Lig85_7HvyAtm_NonRedundant.list
  • Prot25Lig85_13HvyAtm_NonRedundant.list
50% non-redundant protein and 70% or 85% non-redundant ligand list with minimal ligand size of 7 or 13 heavy atoms
  • Prot50Lig70_7HvyAtm_NonRedundant.list
  • Prot50Lig70_13HvyAtm_NonRedundant.list
  • Prot50Lig85_7HvyAtm_NonRedundant.list
  • Prot50Lig85_13HvyAtm_NonRedundant.list
90% non-redundant protein and 70% or 85% non-redundant ligand list with minimal ligand size of 7 or 13 heavy atoms
  • Prot90Lig70_7HvyAtm_NonRedundant.list
  • Prot90Lig70_13HvyAtm_NonRedundant.list
  • Prot90Lig85_7HvyAtm_NonRedundant.list
  • Prot90Lig85_13HvyAtm_NonRedundant.list
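
The list file names on this page follow a regular pattern: the protein cutoff, an optional ligand cutoff, and the heavy atom threshold. The sketch below rebuilds the names from those three parameters, which can help when scripting downloads or selecting a list programmatically. The function and constant names are hypothetical; only the resulting file names come from this page.

```python
from itertools import product

PROTEIN_CUTOFFS = (25, 50, 90)    # percent, protein redundancy cutoff
LIGAND_CUTOFFS = (70, 85)         # percent, ligand redundancy cutoff
HEAVY_ATOM_THRESHOLDS = (7, 13)   # minimal ligand size in heavy atoms


def list_filename(protein_cutoff, heavy_atoms, ligand_cutoff=None):
    """Build a list file name; omit ligand_cutoff for protein-only lists."""
    lig = "" if ligand_cutoff is None else f"Lig{ligand_cutoff}"
    return f"Prot{protein_cutoff}{lig}_{heavy_atoms}HvyAtm_NonRedundant.list"


def all_list_filenames():
    """Enumerate every list name shown on this page."""
    names = [
        list_filename(p, h)
        for p, h in product(PROTEIN_CUTOFFS, HEAVY_ATOM_THRESHOLDS)
    ]
    names += [
        list_filename(p, h, l)
        for p, l, h in product(PROTEIN_CUTOFFS, LIGAND_CUTOFFS, HEAVY_ATOM_THRESHOLDS)
    ]
    return names


print(list_filename(50, 13))       # Prot50_13HvyAtm_NonRedundant.list
print(list_filename(25, 7, 70))    # Prot25Lig70_7HvyAtm_NonRedundant.list
print(len(all_list_filenames()))   # 18
```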

Compressed files containing all proteins and ligands used for the selections of the non-redundant lists
Compressed files containing all ligand structures used for the selection of the non-redundant sets.
  • ligands_sdf_7A.tar.7z
  • ligands_sdf_13HA.tar.7z
Compressed files containing all protein structures (without their associated ligands) used for the selection of the non-redundant sets.
  • proteins_pdb_7HA.tar.7z

Note: the files that end with '7z' are compressed using the 7-zip utility.
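
As a rough sketch, a `.tar.7z` archive can be unpacked in two steps: 7-Zip removes the outer compression, then the inner tar file is expanded. The example assumes the `7z` command-line tool is installed and on the PATH, and that the inner tar file has the archive's name without the `.7z` suffix; neither is stated on this page. The function names are hypothetical.

```python
import subprocess
import tarfile
from pathlib import Path


def seven_zip_command(archive, out_dir):
    """Build the 7z call: x = extract, -o<dir> = output directory, -y = assume yes."""
    return ["7z", "x", str(archive), f"-o{out_dir}", "-y"]


def unpack(archive, out_dir):
    """Extract a .tar.7z archive into out_dir (requires the 7z executable)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Step 1: decompress the 7z layer, which should leave a .tar file in out_dir.
    subprocess.run(seven_zip_command(archive, out), check=True)
    # Step 2: expand the tar file (assumed name: archive name minus ".7z").
    inner_tar = out / Path(archive).name[: -len(".7z")]
    with tarfile.open(inner_tar) as tar:
        tar.extractall(out)


# Example call, assuming the archive has been downloaded to the working directory:
# unpack("ligands_sdf_13HA.tar.7z", "ligands_13HA")
```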