kinoml.modeling.OEModeling¶
Module Contents¶
- kinoml.modeling.OEModeling.logger¶
- kinoml.modeling.OEModeling.read_smiles(smiles: str, add_hydrogens: bool = True) openeye.oechem.OEGraphMol ¶
Read a molecule from a SMILES string. Explicit hydrogens will be added by default.
- Parameters
smiles (str) – SMILES string.
add_hydrogens (bool) – If explicit hydrogens should be added.
- Returns
molecule – The molecule as an OpenEye molecule.
- Return type
oechem.OEGraphMol
- Raises
ValueError – Could not interpret input SMILES.
- kinoml.modeling.OEModeling.read_molecules(path: Union[str, pathlib.Path], add_hydrogens: bool = False) List[openeye.oechem.OEGraphMol] ¶
Read molecules from a file. Explicit hydrogens will not be added by default.
- Parameters
path (str, pathlib.Path) – Path to molecule file.
add_hydrogens (bool) – If explicit hydrogens should be added.
- Returns
molecules – A list of molecules as OpenEye molecules.
- Return type
list of oechem.OEGraphMol
- Raises
ValueError – Given file does not contain valid molecules.
- kinoml.modeling.OEModeling.read_electron_density(path: Union[str, pathlib.Path]) openeye.oegrid.OESkewGrid ¶
Read electron density from a file.
- Parameters
path (str, pathlib.Path) – Path to electron density file.
- Returns
electron_density – The electron density as an OpenEye skew grid.
- Return type
oegrid.OESkewGrid or None
- Raises
ValueError – Not a valid electron density file or wrong format. Only MTZ is currently supported.
- kinoml.modeling.OEModeling.write_molecules(molecules: List[openeye.oechem.OEMolBase], path: Union[str, pathlib.Path])¶
Save molecules to file.
- Parameters
molecules (list of oechem.OEMolBase) – A list of OpenEye molecules for writing.
path (str, pathlib.Path) – File path for saving molecules.
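A minimal usage sketch of the reading and writing functions above, assuming KinoML and a licensed OpenEye toolkit are installed. The helper name `add_hydrogens_and_save` and its arguments are hypothetical; the imports sit inside the function so the block can be loaded without OpenEye.

```python
from pathlib import Path
from typing import Union


def add_hydrogens_and_save(in_path: Union[str, Path], out_path: Union[str, Path]) -> int:
    """Read molecules, add explicit hydrogens and save them (hypothetical helper)."""
    # Imported lazily because KinoML and the OpenEye toolkits are required here.
    from kinoml.modeling.OEModeling import read_molecules, write_molecules

    # read_molecules raises ValueError if the file contains no valid molecules.
    molecules = read_molecules(in_path, add_hydrogens=True)
    write_molecules(molecules, out_path)
    return len(molecules)
```

Note that `read_molecules` does not add hydrogens unless `add_hydrogens=True` is passed, whereas `read_smiles` adds them by default.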
- kinoml.modeling.OEModeling.select_chain(molecule: openeye.oechem.OEMolBase, chain_id: str) openeye.oechem.OEMolBase ¶
Select a chain from an OpenEye molecule.
- Parameters
molecule (oechem.OEMolBase) – An OpenEye molecule holding a molecular structure.
chain_id (str) – Chain identifier.
- Returns
selection – An OpenEye molecule holding the selected chain.
- Return type
oechem.OEMolBase
- Raises
ValueError – No atoms were found with given chain id.
- kinoml.modeling.OEModeling.select_altloc(molecule: openeye.oechem.OEMolBase, altloc_id: str, altloc_fallback: bool = True) openeye.oechem.OEMolBase ¶
Select an alternate location from an OpenEye molecule.
- Parameters
molecule (oechem.OEMolBase) – An OpenEye molecule holding a molecular structure.
altloc_id (str) – Alternate location identifier.
altloc_fallback (bool) – If the alternate location with the highest occupancy should be used for residues that do not contain the given alternate location identifier.
- Returns
selection – An OpenEye molecule holding the selected alternate location.
- Return type
oechem.OEMolBase
- Raises
ValueError – No atoms were found with given altloc id.
- kinoml.modeling.OEModeling.remove_non_protein(molecule: openeye.oechem.OEMolBase, exceptions: Union[None, List[str]] = None, remove_water: bool = False) openeye.oechem.OEMolBase ¶
Remove non-protein atoms from an OpenEye molecule. Water will be kept by default.
- Parameters
molecule (oechem.OEMolBase) – An OpenEye molecule holding a molecular structure.
exceptions (None or list of str) – Exceptions that should not be removed.
remove_water (bool) – If water should be removed.
- Returns
selection – An OpenEye molecule holding the filtered structure.
- Return type
oechem.OEMolBase
- kinoml.modeling.OEModeling.delete_residue(structure: openeye.oechem.OEMolBase, chain_id: str, residue_name: str, residue_id: int) openeye.oechem.OEGraphMol ¶
Delete a residue from an OpenEye molecule.
- Parameters
structure (oechem.OEMolBase) – An OpenEye molecule with residue information.
chain_id (str) – The chain id of the residue.
residue_name (str) – The residue name in three letter code.
residue_id (int) – The residue id.
- Returns
The OpenEye molecule without the residue.
- Return type
oechem.OEMolBase
- Raises
ValueError – Defined residue was not found in given structure.
- kinoml.modeling.OEModeling.get_expression_tags(structure: openeye.oechem.OEMolBase, labels: Iterable[str] = ('EXPRESSION TAG', 'CLONING ARTIFACT')) List[Dict] ¶
Get the chain id, residue name and residue id of residues in expression tags from a protein structure listed in the PDB header section “SEQADV”.
- Parameters
structure (oechem.OEMolBase) – An OpenEye molecule with associated PDB header section “SEQADV”.
labels (Iterable of str) – The ‘SEQADV’ labels defining expression tags. Default: (‘EXPRESSION TAG’, ‘CLONING ARTIFACT’).
- Returns
The chain id, residue name and residue id of residues in the expression tags.
- Return type
list of dict
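The idea behind reading expression tags from “SEQADV” records can be sketched in plain Python. This is not KinoML's implementation: it parses raw header lines using the fixed columns of the PDB format, and the function names, the dictionary keys and the placeholder PDB ID and accession are assumptions made for illustration.

```python
from typing import Dict, Iterable, List


def seqadv_line(res_name: str, chain_id: str, res_id: int, comment: str) -> str:
    """Build a fixed-column SEQADV record for the demo (IDs are placeholders)."""
    return (
        f"SEQADV 1ABC {res_name:>3} {chain_id} {res_id:>4}  "
        f"{'UNP':<4} {'P00000':<9} {'':3} {'':5} {comment:<21}"
    )


def parse_seqadv(
    header_lines: Iterable[str],
    labels: Iterable[str] = ("EXPRESSION TAG", "CLONING ARTIFACT"),
) -> List[Dict]:
    """Collect residues whose SEQADV comment matches one of the labels."""
    labels = tuple(labels)
    tags = []
    for line in header_lines:
        if not line.startswith("SEQADV"):
            continue
        comment = line[49:70].strip()  # columns 50-70: conflict comment
        if comment in labels:
            tags.append(
                {
                    "chain_id": line[16],  # column 17
                    "residue_name": line[12:15].strip(),  # columns 13-15
                    "residue_id": int(line[18:22]),  # columns 19-22
                }
            )
    return tags


header = [
    seqadv_line("GLY", "A", 499, "EXPRESSION TAG"),
    seqadv_line("ALA", "A", 600, "ENGINEERED MUTATION"),
    seqadv_line("SER", "B", -1, "CLONING ARTIFACT"),
]
expression_tags = parse_seqadv(header)
```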
- kinoml.modeling.OEModeling.assign_caps(structure: openeye.oechem.OEMolBase, real_termini: Union[Iterable[int] or None] = None) openeye.oechem.OEMolBase ¶
Cap N and C termini of the given input structure. Real termini can be protected from capping by providing the corresponding residue ids via the ‘real_termini’ argument.
- Parameters
structure (oechem.OEMolBase) – The OpenEye molecule holding the protein structure to cap.
real_termini (iterable of int or None) – The biologically relevant real termini that should be prevented from capping.
- Returns
structure – The OpenEye molecule holding the capped structure.
- Return type
oechem.OEMolBase
- kinoml.modeling.OEModeling.prepare_structure(structure: openeye.oechem.OEMolBase, has_ligand: bool = False, electron_density: Union[openeye.oegrid.OESkewGrid, None] = None, loop_db: Union[str, None] = None, ligand_name: Union[str, None] = None, chain_id: Union[str, None] = None, alternate_location: Union[str, None] = None, cap_termini: bool = True, real_termini: Union[List[int], None] = None) openeye.oechem.OEDesignUnit ¶
Prepare an OpenEye molecule holding a protein structure, optionally in complex with a ligand, for docking.
- Parameters
structure (oechem.OEMolBase) – An OpenEye molecule holding a structure with protein and optionally a ligand.
has_ligand (bool) – If structure contains a ligand that should be used in design unit generation.
electron_density (oegrid.OESkewGrid or None) – An OpenEye grid holding the electron density.
loop_db (str or None) – Path to OpenEye Spruce loop database. You can request a copy at https://www.eyesopen.com/database-downloads. A testing subset (3TPP) is available at https://docs.eyesopen.com/toolkits/python/sprucetk/examples_make_design_units.html.
ligand_name (str or None) – The name of the ligand located in the binding pocket of interest.
chain_id (str or None) – The chain id of interest. If chain id is None, best chain will be selected according to OESpruce.
alternate_location (str or None) – The alternate location of interest. If alternate location is None, best alternate location will be selected according to OEChem.
cap_termini (bool) – If termini should be capped with ACE and NME.
real_termini (list of int or None) – Residue numbers of biologically real termini that should not be capped with ACE and NME.
- Returns
design_unit – An OpenEye design unit holding the prepared structure with the highest quality among all identified design units.
- Return type
oechem.OEDesignUnit
- Raises
ValueError – No design unit found with given chain ID, ligand name and alternate location.
- kinoml.modeling.OEModeling.prepare_complex(protein_ligand_complex: openeye.oechem.OEMolBase, electron_density: Union[openeye.oegrid.OESkewGrid, None] = None, loop_db: Union[str, None] = None, ligand_name: Union[str, None] = None, chain_id: Union[str, None] = None, alternate_location: Union[str, None] = None, cap_termini: bool = True, real_termini: Union[List[int], None] = None) openeye.oechem.OEDesignUnit ¶
Prepare an OpenEye molecule holding a protein-ligand complex for docking.
- Parameters
protein_ligand_complex (oechem.OEMolBase) – An OpenEye molecule holding a structure with protein and ligand.
electron_density (oegrid.OESkewGrid or None) – An OpenEye grid holding the electron density.
loop_db (str or None) – Path to OpenEye Spruce loop database.
ligand_name (str or None) – The name of the ligand located in the binding pocket of interest.
chain_id (str or None) – The chain id of interest. If chain id is None, best chain will be selected according to OESpruce.
alternate_location (str or None) – The alternate location of interest. If alternate location is None, best alternate location will be selected according to OEChem.
cap_termini (bool) – If termini should be capped with ACE and NME.
real_termini (list of int or None) – Residue numbers of biologically real termini that should not be capped with ACE and NME.
- Returns
design_unit – An OpenEye design unit holding the prepared structure with the highest quality among all identified design units.
- Return type
oechem.OEDesignUnit
- Raises
ValueError – No design unit found with given chain ID, ligand name and alternate location.
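A hedged usage sketch of `prepare_complex`, built only from the signatures documented on this page. The wrapper name `prepare_complex_from_files` is hypothetical, and the sketch assumes that `read_molecules` can read the structure file and that the first molecule in that file is the complex of interest.

```python
from pathlib import Path
from typing import Optional, Union


def prepare_complex_from_files(
    structure_path: Union[str, Path],
    density_path: Optional[Union[str, Path]] = None,
    loop_db: Optional[str] = None,
    ligand_name: Optional[str] = None,
    chain_id: Optional[str] = None,
):
    """Load a complex (and optionally an MTZ density) and prepare it (hypothetical helper)."""
    # Imported lazily because KinoML and the OpenEye toolkits are required here.
    from kinoml.modeling.OEModeling import (
        prepare_complex,
        read_electron_density,
        read_molecules,
    )

    # Assumption: the first molecule in the file is the protein-ligand complex.
    structure = read_molecules(structure_path)[0]
    electron_density = read_electron_density(density_path) if density_path else None
    # Raises ValueError if no design unit matches chain, ligand and altloc.
    return prepare_complex(
        structure,
        electron_density=electron_density,
        loop_db=loop_db,
        ligand_name=ligand_name,
        chain_id=chain_id,
    )
```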
- kinoml.modeling.OEModeling.prepare_protein(protein: openeye.oechem.OEMolBase, loop_db: Union[str, None] = None, chain_id: Union[str, None] = None, alternate_location: Union[str, None] = None, cap_termini: bool = True, real_termini: Union[List[int], None] = None) openeye.oechem.OEDesignUnit ¶
Prepare an OpenEye molecule holding a protein structure for docking.
- Parameters
protein (oechem.OEMolBase) – An OpenEye molecule holding a structure with protein.
loop_db (str or None) – Path to OpenEye Spruce loop database.
chain_id (str or None) – The chain id of interest. If chain id is None, best chain will be selected according to OESpruce.
alternate_location (str or None) – The alternate location of interest. If alternate location is None, best alternate location will be selected according to OEChem.
cap_termini (bool) – If termini should be capped with ACE and NME.
real_termini (list of int or None) – Residue numbers of biologically real termini that should not be capped with ACE and NME.
- Returns
design_unit – An OpenEye design unit holding the prepared structure with the highest quality among all identified design units.
- Return type
oechem.OEDesignUnit or None
- Raises
ValueError – No design unit found with given chain ID, ligand name and alternate location.
- kinoml.modeling.OEModeling.generate_tautomers(molecule: Union[openeye.oechem.OEMolBase, openeye.oechem.OEMCMolBase], max_generate: int = 4096, max_return: int = 16, pKa_norm: bool = True) List[Union[openeye.oechem.OEMolBase, openeye.oechem.OEMCMolBase]] ¶
Generate reasonable tautomers of a given molecule.
- Parameters
molecule (oechem.OEMolBase or oechem.OEMCMolBase) – An OpenEye molecule.
max_generate (int) – Maximal number of tautomers to generate.
max_return (int) – Maximal number of tautomers to return.
pKa_norm (bool) – Assign the predominant ionization state at pH ~7.4.
- Returns
tautomers – A list of OpenEye molecules holding the tautomers.
- Return type
list of oechem.OEMolBase or oechem.OEMCMolBase
- kinoml.modeling.OEModeling.generate_enantiomers(molecule: openeye.oechem.OEMolBase, max_centers: int = 12, force_flip: bool = False, enumerate_nitrogens: bool = False) List[openeye.oechem.OEMolBase] ¶
Generate enantiomers of a given molecule.
- Parameters
molecule (oechem.OEMolBase) – An OpenEye molecule.
max_centers (int) – The maximal number of stereo centers to enumerate.
force_flip (bool) – If stereo centers with specified stereochemistry should also be enumerated.
enumerate_nitrogens (bool) – If nitrogens with invertible pyramidal geometry should be enumerated.
- Returns
enantiomers – A list of OpenEye molecules holding the enantiomers.
- Return type
list of oechem.OEMolBase
- kinoml.modeling.OEModeling.generate_conformations(molecule: openeye.oechem.OEMolBase, options: openeye.oeomega.OEOmegaOptions = oeomega.OEOmegaOptions(oeomega.OEOmegaSampling_Classic)) openeye.oechem.OEMCMolBase ¶
Generate conformations of a given molecule.
- Parameters
molecule (oechem.OEMolBase) – An OpenEye molecule.
options (oeomega.OEOmegaOptions, default=oeomega.OEOmegaOptions(oeomega.OEOmegaSampling_Classic)) – Options for generating conformations. If the given molecule is a macrocycle only the maximal number of conformations will be changed from the defaults defined in oeomega.OEMacrocycleOmegaOptions().
- Returns
conformations – An OpenEye multi-conformer molecule holding the generated conformations.
- Return type
oechem.OEMCMolBase
- kinoml.modeling.OEModeling.generate_reasonable_conformations(molecule: openeye.oechem.OEMolBase, options: openeye.oeomega.OEOmegaOptions = oeomega.OEOmegaOptions(oeomega.OEOmegaSampling_Classic), pKa_norm: bool = True) List[openeye.oechem.OEMCMolBase] ¶
Generate conformations of reasonable enantiomers and tautomers of a given molecule.
- Parameters
molecule (oechem.OEMolBase) – An OpenEye molecule.
options (oeomega.OEOmegaOptions, default=oeomega.OEOmegaOptions(oeomega.OEOmegaSampling_Classic)) – Options for generating conformations. If the given molecule is a macrocycle only the maximal number of conformations will be changed from the defaults defined in oeomega.OEMacrocycleOmegaOptions().
pKa_norm (bool) – Assign the predominant ionization state at pH ~7.4.
- Returns
conformations_ensemble – A list of OpenEye multi-conformer molecules.
- Return type
list of oechem.OEMCMolBase
- kinoml.modeling.OEModeling.overlay_molecules(reference_molecule: openeye.oechem.OEMolBase, fit_molecule: openeye.oechem.OEMCMolBase)¶
Overlay a multi-conformer molecule onto a single-conformer molecule and calculate the TanimotoCombo score.
- Parameters
reference_molecule (oechem.OEMolBase) – An OpenEye molecule holding a single conformation of the reference molecule for overlay.
fit_molecule (oechem.OEMCMolBase) – An OpenEye multi-conformer molecule holding the conformations of a molecule to fit during overlay.
- Returns
The TanimotoCombo score and the OpenEye molecules of the best overlay.
- Return type
float, list of oechem.OEGraphMol
- kinoml.modeling.OEModeling.enumerate_isomeric_smiles(molecule: openeye.oechem.OEMolBase) Set[str] ¶
Enumerate reasonable isomeric SMILES representations of a given OpenEye molecule.
- Parameters
molecule (oechem.OEMolBase) – An OpenEye molecule.
- Returns
smiles_set – A set of reasonable isomeric SMILES strings.
- Return type
set of str
- kinoml.modeling.OEModeling.are_identical_molecules(molecule1: openeye.oechem.OEMolBase, molecule2: openeye.oechem.OEMolBase) bool ¶
Check if two OpenEye molecules are identical.
- Parameters
molecule1 (oechem.OEMolBase) – The first OpenEye molecule.
molecule2 (oechem.OEMolBase) – The second OpenEye molecule.
- Returns
True if identical molecules, else False.
- Return type
bool
- kinoml.modeling.OEModeling.get_sequence(structure: openeye.oechem.OEMolBase) str ¶
Get the amino acid sequence with one letter characters of an OpenEye molecule. All residues not perceived as standard amino acids will receive the character ‘X’.
- Parameters
structure (oechem.OEMolBase) – An OpenEye molecule.
- Returns
sequence – The amino acid sequence with one letter characters.
- Return type
str
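The mapping described above can be illustrated without OpenEye. The sketch below works on a plain list of three letter residue names rather than on an OpenEye molecule; the function name `residues_to_sequence` is hypothetical.

```python
from typing import Iterable

# The 20 standard amino acids in three letter and one letter code.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def residues_to_sequence(residue_names: Iterable[str]) -> str:
    """Translate residue names to one letter codes, using 'X' for anything non-standard."""
    return "".join(THREE_TO_ONE.get(name.upper(), "X") for name in residue_names)


# PTR (phosphotyrosine) is not a standard amino acid and becomes 'X'.
sequence = residues_to_sequence(["MET", "GLY", "PTR", "ALA"])
```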
- kinoml.modeling.OEModeling.get_structure_sequence_alignment(structure: openeye.oechem.OEMolBase, sequence: str) Tuple[str, str] ¶
Generate an alignment between a protein structure and an amino acid sequence. The provided protein structure should only contain protein residues to prevent unexpected behavior. Also, this alignment was optimized for highly similar sequences, i.e., sequences differing by only a few mutations, deletions and insertions. Non-protein residues will be marked with “X”.
- Parameters
structure (oechem.OEMolBase) – An OpenEye molecule holding a protein structure.
sequence (str) – A one letter amino acid sequence.
- Returns
structure_sequence_aligned (str) – The aligned protein structure sequence with gaps denoted as “-”.
sequence_aligned (str) – The aligned amino acid sequence with gaps denoted as “-”.
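To show what the two returned strings look like, here is a generic Needleman-Wunsch global alignment in plain Python. The alignment method and scoring used by KinoML are not stated on this page, so the algorithm choice, the function name `align_sequences` and the scoring values are all assumptions.

```python
from typing import Tuple


def align_sequences(
    seq_a: str, seq_b: str, match: int = 1, mismatch: int = -1, gap: int = -1
) -> Tuple[str, str]:
    """Globally align two sequences; gaps are denoted as '-'."""
    n, m = len(seq_a), len(seq_b)
    # score[i][j] is the best score for aligning seq_a[:i] with seq_b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align both residues
                score[i - 1][j] + gap,  # gap in seq_b
                score[i][j - 1] + gap,  # gap in seq_a
            )

    # Trace back from the bottom right corner to recover one optimal alignment.
    aligned_a, aligned_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and seq_a[i - 1] == seq_b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            aligned_a.append(seq_a[i - 1])
            aligned_b.append(seq_b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(seq_a[i - 1])
            aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-")
            aligned_b.append(seq_b[j - 1])
            j -= 1
    return "".join(reversed(aligned_a)), "".join(reversed(aligned_b))


structure_aligned, sequence_aligned = align_sequences("ACDEFG", "ACDFG")
```

Both strings always have the same length, and removing the gap characters restores the original inputs.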
- kinoml.modeling.OEModeling.apply_deletions(target_structure: openeye.oechem.OEMolBase, template_sequence: str, delete_n_anchors: int = 2) openeye.oechem.OEMolBase ¶
Apply deletions to a protein structure according to an amino acid sequence. The provided protein structure should only contain protein residues to prevent unexpected behavior.
- Parameters
target_structure (oechem.OEMolBase) – An OpenEye molecule holding a protein structure for which deletions should be applied.
template_sequence (str) – A template one letter amino acid sequence, which holds potential deletions when compared to the target structure sequence.
delete_n_anchors (int) – Specify how many anchoring residues should be deleted at each side of the deletion. Important if connecting anchoring residues after deletion is intended, e.g. via apply_insertions. Only affects deletions in the middle of a sequence, not at the end or the beginning.
- Returns
structure_with_deletions – An OpenEye molecule holding the protein structure with applied deletions.
- Return type
oechem.OEMolBase
- Raises
ValueError – Negative values are not allowed for ‘delete_n_anchors’.
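The effect of ‘delete_n_anchors’ can be sketched on a pair of aligned sequences. This is one plausible reading of the description above, not KinoML's code: the function name `deletion_indices`, the zero-based residue indices and the rule for recognizing terminal deletions are assumptions.

```python
from typing import List


def deletion_indices(
    structure_aligned: str, template_aligned: str, delete_n_anchors: int = 2
) -> List[int]:
    """Return zero-based indices of structure residues to delete."""
    if delete_n_anchors < 0:
        raise ValueError("Negative values are not allowed for 'delete_n_anchors'.")
    if len(structure_aligned) != len(template_aligned):
        raise ValueError("Aligned sequences must have the same length.")

    n_residues = sum(char != "-" for char in structure_aligned)
    runs, current, index = [], [], 0
    for structure_char, template_char in zip(structure_aligned, template_aligned):
        if structure_char != "-" and template_char == "-":
            current.append(index)  # residue exists only in the structure
        elif current:
            runs.append(current)  # the run of deleted residues ended
            current = []
        if structure_char != "-":
            index += 1
    if current:
        runs.append(current)

    to_delete = set()
    for run in runs:
        first, last = run[0], run[-1]
        # Only deletions in the middle of the sequence get anchors removed.
        if first > 0 and last < n_residues - 1:
            first = max(0, first - delete_n_anchors)
            last = min(n_residues - 1, last + delete_n_anchors)
        to_delete.update(range(first, last + 1))
    return sorted(to_delete)


internal = deletion_indices("ACDEFGHIK", "ACD---HIK", delete_n_anchors=1)
terminal = deletion_indices("ACDEF", "--DEF")
```

Here `internal` covers the three deleted residues plus one anchor on each side, while `terminal` contains only the two N-terminal residues.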
- kinoml.modeling.OEModeling.apply_insertions(target_structure: openeye.oechem.OEMolBase, template_sequence: str, loop_db: Union[str, pathlib.Path], ligand: Union[openeye.oechem.OEMolBase, None] = None) openeye.oechem.OEMolBase ¶
Apply insertions to a protein structure according to an amino acid sequence. The provided protein structure should only contain protein residues to prevent unexpected behavior.
- Parameters
target_structure (oechem.OEMolBase) – An OpenEye molecule holding a protein structure for which insertions should be applied.
template_sequence (str) – A template one letter amino acid sequence, which holds potential insertions when compared to the target structure sequence.
loop_db (str or Path) – The path to the loop database used by OESpruce to model missing loops.
ligand (oechem.OEMolBase or None, default=None) – An OpenEye molecule that should be checked for heavy atom clashes with built insertions.
- Returns
structure_with_insertions – An OpenEye molecule holding the protein structure with applied insertions.
- Return type
oechem.OEMolBase
- kinoml.modeling.OEModeling.apply_mutations(target_structure: openeye.oechem.OEMolBase, template_sequence: str, fallback_delete: bool = True) openeye.oechem.OEMolBase ¶
Mutate a protein structure according to an amino acid sequence. The provided protein structure should only contain protein residues to prevent unexpected behavior. Residues that could not be mutated will be deleted by default.
- Parameters
target_structure (oechem.OEMolBase) – An OpenEye molecule holding a protein structure to mutate.
template_sequence (str) – A template one letter amino acid sequence, which holds potential mutations when compared to the target structure sequence.
fallback_delete (bool) – If the residue should be deleted if it could not be mutated.
- Returns
An OpenEye molecule holding the mutated protein structure.
- Return type
oechem.OEMolBase
- Raises
ValueError – Mutation {oeresidue.GetName()}{oeresidue.GetResidueNumber()}{three_letter_code} failed! Only raised when fallback_delete is set to False.
- kinoml.modeling.OEModeling.delete_partial_residues(structure: openeye.oechem.OEMolBase) openeye.oechem.OEMolBase ¶
Delete residues with missing sidechain or backbone atoms. The backbone is considered complete if atoms C, CA and N are present.
- Parameters
structure (oechem.OEMolBase) – An OpenEye molecule holding a protein structure.
- Returns
structure – An OpenEye molecule holding only residues with completely modeled backbone and side chain atoms.
- Return type
oechem.OEMolBase
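The backbone criterion stated above (atoms C, CA and N must be present) can be sketched on plain data. The side chain check is omitted because it needs per-residue atom templates; the function name and the dictionary layout are assumptions.

```python
from typing import Dict, Iterable, List, Tuple

# Backbone atoms that must be present according to the description above.
BACKBONE_ATOMS = frozenset({"C", "CA", "N"})


def residues_with_incomplete_backbone(
    residue_atoms: Dict[Tuple[str, int], Iterable[str]]
) -> List[Tuple[str, int]]:
    """Return (residue name, residue id) keys lacking any of C, CA or N."""
    return [
        residue
        for residue, atom_names in residue_atoms.items()
        if not BACKBONE_ATOMS.issubset(atom_names)
    ]


example = {
    ("GLY", 10): ["N", "CA", "C", "O"],
    ("ALA", 11): ["N", "CA", "CB"],  # backbone C is missing
}
incomplete = residues_with_incomplete_backbone(example)
```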
- kinoml.modeling.OEModeling.delete_short_protein_segments(structure: openeye.oechem.OEMolBase) openeye.oechem.OEMolBase ¶
Delete protein segments consisting of 3 or fewer residues.
- Parameters
structure (oechem.OEMolBase) – An OpenEye molecule holding a protein with possibly short segments.
- Returns
structure – An OpenEye molecule holding the protein without short segments.
- Return type
oechem.OEMolBase
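A plain-Python sketch of the filtering idea. It assumes that a segment is a run of consecutive residue IDs, which may differ from how KinoML detects segments (for example via bonded components); the function names are hypothetical.

```python
from typing import Iterable, List


def split_into_segments(residue_ids: Iterable[int]) -> List[List[int]]:
    """Group residue IDs into runs of consecutive numbers."""
    segments: List[List[int]] = []
    for residue_id in residue_ids:
        if segments and residue_id == segments[-1][-1] + 1:
            segments[-1].append(residue_id)  # continues the current segment
        else:
            segments.append([residue_id])  # a gap starts a new segment
    return segments


def drop_short_segments(residue_ids: Iterable[int], max_short_length: int = 3) -> List[int]:
    """Keep only residues in segments longer than max_short_length."""
    kept: List[int] = []
    for segment in split_into_segments(residue_ids):
        if len(segment) > max_short_length:
            kept.extend(segment)
    return kept


# The middle segment (10-12) has only 3 residues and is removed.
kept_ids = drop_short_segments([1, 2, 3, 4, 5, 10, 11, 12, 20, 21, 22, 23])
```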
- kinoml.modeling.OEModeling.delete_clashing_sidechains(structure: openeye.oechem.OEMolBase, cutoff: float = 2.0) openeye.oechem.OEMolBase ¶
Delete side chains that are clashing with other atoms of the given structure.
Note: Structures containing non-protein residues may lead to unexpected behavior, since those residues will also be deleted if they clash with other residues of the system. However, this behavior is important for checking PTMs for clashes as well.
- Parameters
structure (oechem.OEMolBase) – An OpenEye molecule holding a protein structure.
cutoff (float) – The distance cutoff in Å that is used for defining a heavy atom clash. Note: Values larger than 2.3 Å may lead to the deletion of residues involved in strong hydrogen bonds.
- Returns
processed_structure – An OpenEye molecule holding the protein structure without clashing sidechains.
- Return type
oechem.OEMolBase
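The cutoff criterion can be illustrated with a brute-force distance check. The sketch assumes the input contains only heavy atoms that are not covalently bonded to each other; a real implementation has to exclude bonded neighbors and atoms of the same residue. The function name is hypothetical and `math.dist` requires Python 3.8 or newer.

```python
import math
from typing import List, Sequence, Tuple


def find_clashing_pairs(
    coordinates: Sequence[Tuple[float, float, float]], cutoff: float = 2.0
) -> List[Tuple[int, int]]:
    """Return index pairs of atoms that are closer than the cutoff (in Å)."""
    pairs = []
    for i in range(len(coordinates)):
        for j in range(i + 1, len(coordinates)):
            if math.dist(coordinates[i], coordinates[j]) < cutoff:
                pairs.append((i, j))
    return pairs


# Atoms 0 and 1 are 1.5 Å apart and clash; atom 2 is far away from both.
clashes = find_clashing_pairs([(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (5.0, 0.0, 0.0)])
```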
- kinoml.modeling.OEModeling.get_atom_coordinates(molecule: openeye.oechem.OEMolBase) List[Tuple[float, float, float]] ¶
Retrieve the atom coordinates of an OpenEye molecule.
- Parameters
molecule (oechem.OEMolBase) – An OpenEye molecule for which the coordinates should be retrieved.
- Returns
coordinates – The coordinates of the given molecule atoms.
- Return type
list of tuple of float
- kinoml.modeling.OEModeling.renumber_structure(target_structure: openeye.oechem.OEMolBase, residue_ids: Iterable[int]) openeye.oechem.OEGraphMol ¶
Renumber the residues of a protein structure according to the given list of residue IDs.
- Parameters
target_structure (oechem.OEMolBase) – An OpenEye molecule holding the protein structure to renumber.
residue_ids (iterable of int) – An iterable of residue IDs matching the order of the target structure.
- Returns
renumbered_structure – An OpenEye molecule holding the renumbered protein structure.
- Return type
oechem.OEMolBase
- Raises
ValueError – Number of given residue IDs does not match number of residues in the given structure.
ValueError – Given residue IDs contain wrong types, only int is allowed.
- kinoml.modeling.OEModeling.superpose_proteins(reference_protein: openeye.oechem.OEMolBase, fit_protein: openeye.oechem.OEMolBase, residues: Iterable = tuple(), chain_id: str = ' ', insertion_code: str = ' ') openeye.oechem.OEMolBase ¶
Superpose a protein structure onto a reference protein. The superposition can be customized to consider only the specified residues.
- Parameters
reference_protein (oechem.OEMolBase) – An OpenEye molecule holding a protein structure which will be used as reference during superposition.
fit_protein (oechem.OEMolBase) – An OpenEye molecule holding a protein structure which will be superposed onto the reference protein.
residues (Iterable of str) – Residues that should be used during superposition in format “GLY123”.
chain_id (str) – Chain identifier for residues that should be used during superposition.
insertion_code (str) – Insertion code for residues that should be used during superposition.
- Returns
superposed_protein – An OpenEye molecule holding the superposed protein structure.
- Return type
oechem.OEMolBase
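Residues are passed in the format “GLY123”, i.e., a three letter residue name followed by the residue ID. A small parser for that format could look as follows; the function name, the regular expression and the error message are assumptions, not KinoML's own parsing.

```python
import re
from typing import Tuple

# Three characters for the residue name, then an integer residue ID.
_RESIDUE_PATTERN = re.compile(r"^([A-Za-z0-9]{3})(-?\d+)$")


def parse_residue(label: str) -> Tuple[str, int]:
    """Split a label such as 'GLY123' into ('GLY', 123)."""
    match = _RESIDUE_PATTERN.match(label)
    if match is None:
        raise ValueError(f"Residue {label!r} is not in the format 'GLY123'.")
    return match.group(1).upper(), int(match.group(2))


parsed = [parse_residue(label) for label in ("GLY123", "lys45")]
```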
- kinoml.modeling.OEModeling.update_residue_identifiers(structure: openeye.oechem.OEMolBase, keep_protein_residue_ids: bool = True, keep_chain_ids: bool = False) openeye.oechem.OEMolBase ¶
Update the atom, residue and chain IDs of the given molecular structure. All residues become part of chain A, unless ‘keep_chain_ids’ is set to True. Atom IDs will start from 1. Residue IDs will start from 1, unless ‘keep_protein_residue_ids’ is set to True. This is especially useful if molecules were merged, which can result in overlapping atom and residue IDs as well as separate chains.
- Parameters
structure (oechem.OEMolBase) – The OpenEye molecule structure for updating atom and residue ids.
keep_protein_residue_ids (bool) – If the protein residue IDs should be kept.
keep_chain_ids (bool) – If the chain IDs should be kept.
- Returns
structure – The OpenEye molecule structure with updated atom and residue ids.
- Return type
oechem.OEMolBase
- kinoml.modeling.OEModeling.split_molecule_components(molecule: openeye.oechem.OEMolBase) List[openeye.oechem.OEGraphMol] ¶
Split an OpenEye molecule into its bonded components.
- Parameters
molecule (oechem.OEMolBase) – An OpenEye molecule holding multiple components.
- Returns
components – A list of OpenEye molecules holding the split components.
- Return type
list of oechem.OEGraphMol
- kinoml.modeling.OEModeling.residue_ids_to_residue_names(structure: openeye.oechem.OEMolBase, residue_ids: List[int], chain_id: Union[None, str] = None) List[str] ¶
Get the corresponding residue names for a list of residue IDs and a given OpenEye molecule with residue information.
- Parameters
structure (oechem.OEMolBase) – An OpenEye molecule with residue information.
residue_ids (list of int) – A list of residue IDs.
chain_id (None or str) – The chain ID to filter for.
- Returns
residue_names – The corresponding residue names as three letter codes.
- Return type
list of str
- Raises
ValueError – No residue found for residue ID {resid}.
ValueError – Found multiple residues for residue ID {resid}.
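The lookup and its two error cases can be mirrored on plain tuples. This sketch does not use OpenEye; the function name and the `(chain_id, residue_id, residue_name)` layout are assumptions, while the error messages follow the ones listed above.

```python
from typing import List, Optional, Sequence, Tuple


def lookup_residue_names(
    residues: Sequence[Tuple[str, int, str]],
    residue_ids: List[int],
    chain_id: Optional[str] = None,
) -> List[str]:
    """Map residue IDs to three letter names, optionally filtering by chain."""
    names = []
    for resid in residue_ids:
        matches = [
            name
            for chain, rid, name in residues
            if rid == resid and (chain_id is None or chain == chain_id)
        ]
        if not matches:
            raise ValueError(f"No residue found for residue ID {resid}.")
        if len(matches) > 1:
            raise ValueError(f"Found multiple residues for residue ID {resid}.")
        names.append(matches[0])
    return names


residues = [("A", 1, "MET"), ("A", 2, "GLY"), ("B", 2, "ALA")]
names_chain_a = lookup_residue_names(residues, [1, 2], chain_id="A")
```

Without a chain filter, residue ID 2 is ambiguous in this example, which is the situation the second ValueError covers.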