kinoml.modeling.OEModeling

Module Contents

kinoml.modeling.OEModeling.logger
kinoml.modeling.OEModeling.read_smiles(smiles: str, add_hydrogens: bool = True) openeye.oechem.OEGraphMol

Read molecule from a smiles string. Explicit hydrogens will be added by default.

Parameters
  • smiles (str) – Smiles string.

  • add_hydrogens (bool) – If explicit hydrogens should be added.

Returns

molecule – A molecule as OpenEye molecules.

Return type

oechem.OEGraphMol

Raises

ValueError – Could not interpret input SMILES.

kinoml.modeling.OEModeling.read_molecules(path: Union[str, pathlib.Path], add_hydrogens: bool = False) List[openeye.oechem.OEGraphMol]

Read molecules from a file. Explicit hydrogens will not be added by default.

Parameters
  • path (str, pathlib.Path) – Path to molecule file.

  • add_hydrogens (bool) – If explicit hydrogens should be added.

Returns

molecules – A List of molecules as OpenEye molecules.

Return type

list of oechem.OEGraphMol

Raises

ValueError – Given file does not contain valid molecules.

kinoml.modeling.OEModeling.read_electron_density(path: Union[str, pathlib.Path]) openeye.oegrid.OESkewGrid

Read electron density from a file.

Parameters

path (str, pathlib.Path) – Path to electron density file.

Returns

electron_density – A List of molecules as OpenEye molecules.

Return type

oegrid.OESkewGrid or None

Raises

ValueError – Not a valid electron density file or wrong format. Only MTZ is currently supported.

kinoml.modeling.OEModeling.write_molecules(molecules: List[openeye.oechem.OEMolBase], path: Union[str, pathlib.Path])

Save molecules to file.

Parameters
  • molecules (list of oechem.OEMolBase) – A list of OpenEye molecules for writing.

  • path (str, pathlib.Path) – File path for saving molecules.

kinoml.modeling.OEModeling.select_chain(molecule: openeye.oechem.OEMolBase, chain_id: str) openeye.oechem.OEMolBase

Select a chain from an OpenEye molecule.

Parameters
  • molecule (oechem.OEMolBase) – An OpenEye molecule holding a molecular structure.

  • chain_id (str) – Chain identifier.

Returns

selection – An OpenEye molecule holding the selected chain.

Return type

oechem.OEMolBase

Raises

ValueError – No atoms were found with given chain id.

kinoml.modeling.OEModeling.select_altloc(molecule: openeye.oechem.OEMolBase, altloc_id: str, altloc_fallback: bool = True) openeye.oechem.OEMolBase

Select an alternate location from an OpenEye molecule.

Parameters
  • molecule (oechem.OEMolBase) – An OpenEye molecule holding a molecular structure.

  • altloc_id (str) – Alternate location identifier.

  • altloc_fallback (bool) – If the alternate location with the highest occupancy should be used for residues that do not contain the given alternate location identifier.

Returns

selection – An OpenEye molecule holding the selected alternate location.

Return type

oechem.OEMolBase

Raises

ValueError – No atoms were found with given altloc id.

kinoml.modeling.OEModeling.remove_non_protein(molecule: openeye.oechem.OEMolBase, exceptions: Union[None, List[str]] = None, remove_water: bool = False) openeye.oechem.OEMolBase

Remove non-protein atoms from an OpenEye molecule. Water will be kept by default.

Parameters
  • molecule (oechem.OEMolBase) – An OpenEye molecule holding a molecular structure.

  • exceptions (None or list of str) – Exceptions that should not be removed.

  • remove_water (bool) – If water should be removed.

Returns

selection – An OpenEye molecule holding the filtered structure.

Return type

oechem.OEMolBase

kinoml.modeling.OEModeling.delete_residue(structure: openeye.oechem.OEMolBase, chain_id: str, residue_name: str, residue_id: int) openeye.oechem.OEGraphMol

Delete a residue from an OpenEye molecule.

Parameters
  • structure (oechem.OEMolBase) – An OpenEye molecule with residue information.

  • chain_id (str) – The chain id of the residue

  • residue_name (str) – The residue name in three letter code.

  • residue_id (int) – The residue id.

Returns

The OpenEye molecule without the residue.

Return type

oechem.OEMolBase

Raises

ValueError – Defined residue was not found in given structure.

kinoml.modeling.OEModeling.get_expression_tags(structure: openeye.oechem.OEMolBase, labels: Iterable[str] = ('EXPRESSION TAG', 'CLONING ARTIFACT')) List[Dict]

Get the chain id, residue name and residue id of residues in expression tags from a protein structure listed in the PDB header section “SEQADV”.

Parameters
  • structure (oechem.OEMolBase) – An OpenEye molecule with associated PDB header section “SEQADV”.

  • labels (Iterable of str) – The ‘SEQADV’ labels defining expression tags. Default: (‘EXPRESSION TAG’, ‘CLONING ARTIFACT’).

Returns

The chain id, residue name and residue id of residues in the expression tags.

Return type

list of dict

kinoml.modeling.OEModeling.assign_caps(structure: openeye.oechem.OEMolBase, real_termini: Union[Iterable[int] or None] = None) openeye.oechem.OEMolBase

Cap N and C termini of the given input structure. Real termini can be protected from capping by providing the corresponding residue ids via the ‘real_termini’ argument.

Parameters
  • structure (oechem.OEMolBase) – The OpenEye molecule holding the protein structure to cap.

  • real_termini (iterable of int or None) – The biologically relevant real termini that should be prevented from capping.

Returns

structure – The OpenEye molecule holding the capped structure.

Return type

oechem.OEMolBase

kinoml.modeling.OEModeling.prepare_structure(structure: openeye.oechem.OEMolBase, has_ligand: bool = False, electron_density: Union[openeye.oegrid.OESkewGrid, None] = None, loop_db: Union[str, None] = None, ligand_name: Union[str, None] = None, chain_id: Union[str, None] = None, alternate_location: Union[str, None] = None, cap_termini: bool = True, real_termini: Union[List[int], None] = None) openeye.oechem.OEDesignUnit

Prepare an OpenEye molecule holding a protein ligand complex for docking.

Parameters
  • structure (oechem.OEMolBase) – An OpenEye molecule holding a structure with protein and optionally a ligand.

  • has_ligand (bool) – If structure contains a ligand that should be used in design unit generation.

  • electron_density (oegrid.OESkewGrid) – An OpenEye grid holding the electron density.

  • loop_db (str or None) – Path to OpenEye Spruce loop database. You can request a copy at https://www.eyesopen.com/database-downloads. A testing subset (3TPP) is available at https://docs.eyesopen.com/toolkits/python/sprucetk/examples_make_design_units.html.

  • ligand_name (str or None) – The name of the ligand located in the binding pocket of interest.

  • chain_id (str or None) – The chain id of interest. If chain id is None, best chain will be selected according to OESpruce.

  • alternate_location (str or None) – The alternate location of interest. If alternate location is None, best alternate location will be selected according to OEChem.

  • cap_termini (bool) – If termini should be capped with ACE and NME.

  • real_termini (list of int or None) – Residue numbers of biologically real termini will not be capped with ACE and NME.

Returns

design_unit – An OpenEye design unit holding the prepared structure with the highest quality among all identified design units.

Return type

oechem.OEDesignUnit

Raises

ValueError – No design unit found with given chain ID, ligand name and alternate location.

kinoml.modeling.OEModeling.prepare_complex(protein_ligand_complex: openeye.oechem.OEMolBase, electron_density: Union[openeye.oegrid.OESkewGrid, None] = None, loop_db: Union[str, None] = None, ligand_name: Union[str, None] = None, chain_id: Union[str, None] = None, alternate_location: Union[str, None] = None, cap_termini: bool = True, real_termini: Union[List[int], None] = None) openeye.oechem.OEDesignUnit

Prepare an OpenEye molecule holding a protein ligand complex for docking.

Parameters
  • protein_ligand_complex (oechem.OEMolBase) – An OpenEye molecule holding a structure with protein and ligand.

  • electron_density (oegrid.OESkewGrid) – An OpenEye grid holding the electron density.

  • loop_db (str or None) – Path to OpenEye Spruce loop database.

  • ligand_name (str or None) – The name of the ligand located in the binding pocket of interest.

  • chain_id (str or None) – The chain id of interest. If chain id is None, best chain will be selected according to OESpruce.

  • alternate_location (str or None) – The alternate location of interest. If alternate location is None, best alternate location will be selected according to OEChem.

  • cap_termini (bool) – If termini should be capped with ACE and NME.

  • real_termini (list of int or None) – Residue numbers of biologically real termini will not be capped with ACE and NME.

Returns

design_unit – An OpenEye design unit holding the prepared structure with the highest quality among all identified design units.

Return type

oechem.OEDesignUnit

Raises

ValueError – No design unit found with given chain ID, ligand name and alternate location.

kinoml.modeling.OEModeling.prepare_protein(protein: openeye.oechem.OEMolBase, loop_db: Union[str, None] = None, chain_id: Union[str, None] = None, alternate_location: Union[str, None] = None, cap_termini: bool = True, real_termini: Union[List[int], None] = None) openeye.oechem.OEDesignUnit

Prepare an OpenEye molecule holding a protein structure for docking.

Parameters
  • protein (oechem.OEMolBase) – An OpenEye molecule holding a structure with protein.

  • loop_db (str) – Path to OpenEye Spruce loop database.

  • chain_id (str or None) – The chain id of interest. If chain id is None, best chain will be selected according to OESpruce.

  • alternate_location (str or None) – The alternate location of interest. If alternate location is None, best alternate location will be selected according to OEChem.

  • cap_termini (bool) – If termini should be capped with ACE and NME.

  • real_termini (list of int or None) – Residue numbers of biologically real termini will not be capped with ACE and NME.

Returns

design_unit – An OpenEye design unit holding the prepared structure with the highest quality among all identified design units.

Return type

oechem.OEDesignUnit or None

Raises

ValueError – No design unit found with given chain ID, ligand name and alternate location.

kinoml.modeling.OEModeling.generate_tautomers(molecule: Union[openeye.oechem.OEMolBase, openeye.oechem.OEMCMolBase], max_generate: int = 4096, max_return: int = 16, pKa_norm: bool = True) List[Union[openeye.oechem.OEMolBase, openeye.oechem.OEMCMolBase]]

Generate reasonable tautomers of a given molecule.

Parameters
  • molecule (oechem.OEMolBase or oechem.OEMCMolBase) – An OpenEye molecule.

  • max_generate (int) – Maximal number of tautomers to generate.

  • max_return (int) – Maximal number of tautomers to return.

  • pKa_norm (bool) – Assign the predominant ionization state at pH ~7.4.

Returns

tautomers – A list of OpenEye molecules holding the tautomers.

Return type

list of oechem.OEMolBase or oechem.OEMCMolBase

kinoml.modeling.OEModeling.generate_enantiomers(molecule: openeye.oechem.OEMolBase, max_centers: int = 12, force_flip: bool = False, enumerate_nitrogens: bool = False) List[openeye.oechem.OEMolBase]

Generate enantiomers of a given molecule.

Parameters
  • molecule (oechem.OEMolBase) – An OpenEye molecule.

  • max_centers (int) – The maximal number of stereo centers to enumerate.

  • force_flip (bool) – If specified stereo centers should be enumerated.

  • enumerate_nitrogens (bool) – If nitrogens with invertible pyramidal geometry should be enumerated.

Returns

enantiomers – A list of OpenEye molecules holding the enantiomers.

Return type

list of oechem.OEMolBase

kinoml.modeling.OEModeling.generate_conformations(molecule: openeye.oechem.OEMolBase, options: openeye.oeomega.OEOmegaOptions = oeomega.OEOmegaOptions(oeomega.OEOmegaSampling_Classic)) openeye.oechem.OEMCMolBase

Generate conformations of a given molecule.

Parameters
  • molecule (oechem.OEMolBase) – An OpenEye molecule.

  • options (oeomega.OEOmegaOptions, default=oeomega.OEOmegaOptions(oeomega.OEOmegaSampling_Classic)) – Options for generating conformations. If the given molecule is a macrocycle only the maximal number of conformations will be changed from the defaults defined in oeomega.OEMacrocycleOmegaOptions().

Returns

conformations – An OpenEye multi-conformer molecule holding the generated conformations.

Return type

oechem.OEMCMolBase

kinoml.modeling.OEModeling.generate_reasonable_conformations(molecule: openeye.oechem.OEMolBase, options: openeye.oeomega.OEOmegaOptions = oeomega.OEOmegaOptions(oeomega.OEOmegaSampling_Classic), pKa_norm: bool = True) List[openeye.oechem.OEMCMolBase]

Generate conformations of reasonable enantiomers and tautomers of a given molecule.

Parameters
  • molecule (oechem.OEMolBase) – An OpenEye molecule.

  • options (oeomega.OEOmegaOptions, default=oeomega.OEOmegaOptions(oeomega.OEOmegaSampling_Classic)) – Options for generating conformations. If the given molecule is a macrocycle only the maximal number of conformations will be changed from the defaults defined in oeomega.OEMacrocycleOmegaOptions().

  • pKa_norm (bool) – Assign the predominant ionization state at pH ~7.4.

Returns

conformations_ensemble – A list of OpenEye multi-conformer molecules.

Return type

list of oechem.OEMCMolBase

kinoml.modeling.OEModeling.overlay_molecules(reference_molecule: openeye.oechem.OEMolBase, fit_molecule: openeye.oechem.OEMCMolBase)

Overlay a multi-conformer molecule to a single-conformer molecule and calculate the TanimotoCombo score.

Parameters
  • reference_molecule (oechem.OEMolBase) – An OpenEye molecule holding a single conformation of the reference molecule for overlay.

  • fit_molecule (oechem.OEMCMolBase) – An OpenEye multi-conformer molecule holding the conformations of a molecule to fit during overlay.

Returns

The TanimotoCombo score and the OpenEye molecules of the best overlay

Return type

float, list of oechem.OEGraphMol

kinoml.modeling.OEModeling.enumerate_isomeric_smiles(molecule: openeye.oechem.OEMolBase) Set[str]

Enumerate reasonable isomeric SMILES representations of a given OpenEye molecule.

Parameters

molecule (oechem.OEMolBase) – An OpenEye molecule.

Returns

smiles_set – A set of reasonable isomeric SMILES strings.

Return type

set of str

kinoml.modeling.OEModeling.are_identical_molecules(molecule1: openeye.oechem.OEMolBase, molecule2: openeye.oechem.OEMolBase) bool

Check if two OpenEye molecules are identical.

Parameters
  • molecule1 (oechem.OEMolBase) – The first OpenEye molecule.

  • molecule2 (oechem.OEMolBase) – The second OpenEye molecule.

Returns

True if identical molecules, else False.

Return type

bool

kinoml.modeling.OEModeling.get_sequence(structure: openeye.oechem.OEMolBase) str

Get the amino acid sequence with one letter characters of an OpenEye molecule. All residues not perceived as standard amino acid will receive the character ‘X’.

Parameters

structure (oechem.OEMolBase) – An OpenEye molecule.

Returns

sequence – The amino acid sequence with one letter characters.

Return type

str

kinoml.modeling.OEModeling.get_structure_sequence_alignment(structure: openeye.oechem.OEMolBase, sequence: str) Tuple[str, str]

Generate an alignment between a protein structure and an amino acid sequence. The provided protein structure should only contain protein residues to prevent unexpected behavior. Also, this alignment was optimized for highly similar sequences, i.e. only few mutations, deletions and insertions. Non protein residues will be marked with “X”.

Parameters
  • structure (oechem.OEMolBase) – An OpenEye molecule holding a protein structure.

  • sequence (str) – A one letter amino acid sequence.

Returns

  • structure_sequence_aligned (str) – The aligned protein structure sequence with gaps denoted as “-“.

  • sequence_aligned (str) – The aligned amino acid sequence with gaps denoted as “-“.

kinoml.modeling.OEModeling.apply_deletions(target_structure: openeye.oechem.OEMolBase, template_sequence: str, delete_n_anchors: int = 2) openeye.oechem.OEMolBase

Apply deletions to a protein structure according to an amino acid sequence. The provided protein structure should only contain protein residues to prevent unexpected behavior.

Parameters
  • target_structure (oechem.OEMolBase) – An OpenEye molecule holding a protein structure for which deletions should be applied.

  • template_sequence (str) – A template one letter amino acid sequence, which holds potential deletions when compared to the target structure sequence.

  • delete_n_anchors (int) – Specify how many anchoring residues should be deleted at each side of the deletion. Important if connecting anchoring residues after deletion is intended, e.g. via apply_insertion. Only affects deletions in the middle of a sequence, not at the end or the beginning.

Returns

structure_with_deletions – An OpenEye molecule holding the protein structure with applied deletions.

Return type

oechem.OEMolBase

Raises

ValueError – Negative values are not allowed for ‘delete_n_anchors’.

kinoml.modeling.OEModeling.apply_insertions(target_structure: openeye.oechem.OEMolBase, template_sequence: str, loop_db: Union[str, pathlib.Path], ligand: Union[openeye.oechem.OEMolBase, None] = None) openeye.oechem.OEMolBase

Apply insertions to a protein structure according to an amino acid sequence. The provided protein structure should only contain protein residues to prevent unexpected behavior.

Parameters
  • target_structure (oechem.OEMolBase) – An OpenEye molecule holding a protein structure for which insertions should be applied.

  • template_sequence (str) – A template one letter amino acid sequence, which holds potential insertions when compared to the target structure sequence.

  • loop_db (str or Path) – The path to the loop database used by OESpruce to model missing loops.

  • ligand (oechem.OEMolBase or None, default=None) – An OpenEye molecule that should be checked for heavy atom clashes with built insertions.

Returns

structure_with_insertions – An OpenEye molecule holding the protein structure with applied insertions.

Return type

oechem.OEMolBase

kinoml.modeling.OEModeling.apply_mutations(target_structure: openeye.oechem.OEMolBase, template_sequence: str, fallback_delete: bool = True) openeye.oechem.OEMolBase

Mutate a protein structure according to an amino acid sequence. The provided protein structure should only contain protein residues to prevent unexpected behavior. Residues that could not be mutated will be deleted by default.

Parameters
  • target_structure (oechem.OEMolBase) – An OpenEye molecule holding a protein structure to mutate.

  • template_sequence (str) – A template one letter amino acid sequence, which holds potential mutations when compared to the target structure sequence.

  • fallback_delete (bool) – If the residue should be deleted if it could not be mutated.

Returns

An OpenEye molecule holding the mutated protein structure.

Return type

oechem.OEMolBase

Raises

ValueError – Mutation {oeresidue.GetName()}{oeresidue.GetResidueNumber()}{three_letter_code} failed! Only raised when fallback_delete is set False.

kinoml.modeling.OEModeling.delete_partial_residues(structure: openeye.oechem.OEMolBase) openeye.oechem.OEMolBase

Delete residues with missing sidechain or backbone atoms. The backbone is considered complete if atoms C, CA and N are present.

Parameters

structure (oechem.OEMolBase) – An OpenEye molecule holding a protein structure.

Returns

structure – An OpenEye molecule holding only residues with completely modeled side chains.

Return type

oechem.OEMolBase

kinoml.modeling.OEModeling.delete_short_protein_segments(structure: openeye.oechem.OEMolBase) openeye.oechem.OEMolBase

Delete protein segments consisting of 3 or less residues.

Parameters

structure (oechem.OEMolBase) – An OpenEye molecule holding a protein with possibly short segments.

Returns

structure – An OpenEye molecule holding the protein without short segments.

Return type

oechem.OEMolBase

kinoml.modeling.OEModeling.delete_clashing_sidechains(structure: openeye.oechem.OEMolBase, cutoff: float = 2.0) openeye.oechem.OEMolBase

Delete side chains that are clashing with other atoms of the given structure.

Note: Structures containing non-protein residues may lead to unexpected behavior, since also those residues will be deleted if clashing with other residues of the system. However, this behavior is important to be able to also check PTMs for clashes.

Parameters
  • structure (oechem.OEMolBase) – An OpenEye molecule holding a protein structure.

  • cutoff (float) – The distance cutoff that is used for defining a heavy atom clash. Note: Going bigger than 2.3 A may lead to the deletion of residues involved in strong hydrogen bonds.

Returns

processed_structure – An OpenEye molecule holding the protein structure without clashing sidechains.

Return type

oechem.OEMolBase

kinoml.modeling.OEModeling.get_atom_coordinates(molecule: openeye.oechem.OEMolBase) List[Tuple[float, float, float]]

Retrieve the atom coordinates of an OpenEye molecule.

Parameters

molecule (oechem.OEMolBase) – An OpenEye molecule for which the coordinates should be retrieved.

Returns

coordinates – The coordinates of the given molecule atoms.

Return type

list of tuple of float

kinoml.modeling.OEModeling.renumber_structure(target_structure: openeye.oechem.OEMolBase, residue_ids: Iterable[int]) openeye.oechem.OEGraphMol

Renumber the residues of a protein structure according to the given list of residue IDs.

Parameters
  • target_structure (oechem.OEMolBase) – An OpenEye molecule holding the protein structure to renumber.

  • residue_ids (iterable of int) – An iterable of residue IDs matching the order of the target structure.

Returns

renumbered_structure – An OpenEye molecule holding the cropped protein structure.

Return type

oechem.OEMolBase

Raises
  • ValueError – Number of given residue IDs does not match number of residues in the given structure.

  • ValueError – Given residue IDs contain wrong types, only int is allowed.

kinoml.modeling.OEModeling.superpose_proteins(reference_protein: openeye.oechem.OEMolBase, fit_protein: openeye.oechem.OEMolBase, residues: Iterable = tuple(), chain_id: str = ' ', insertion_code: str = ' ') openeye.oechem.OEMolBase

Superpose a protein structure onto a reference protein. The superposition can be customized to consider only the specified residues.

Parameters
  • reference_protein (oechem.OEMolBase) – An OpenEye molecule holding a protein structure which will be used as reference during superposition.

  • fit_protein (oechem.OEMolBase) – An OpenEye molecule holding a protein structure which will be superposed onto the reference protein.

  • residues (Iterable of str) – Residues that should be used during superposition in format “GLY123”.

  • chain_id (str) – Chain identifier for residues that should be used during superposition.

  • insertion_code (str) – Insertion code for residues that should be used during superposition.

Returns

superposed_protein – An OpenEye molecule holding the superposed protein structure.

Return type

oechem.OEMolBase

kinoml.modeling.OEModeling.update_residue_identifiers(structure: openeye.oechem.OEMolBase, keep_protein_residue_ids: bool = True, keep_chain_ids: bool = False) openeye.oechem.OEMolBase

Update the atom, residue and chain IDs of the given molecular structure. All residues become part of chain A, unless ‘keep_chain_ids’ is set True. Atom IDs will start from 1. Residue IDs will start from 1, except ‘keep_protein_residue_ids’ is set True. This is especially useful, if molecules were merged, which can result in overlapping atom and residue IDs as well as separate chains.

Parameters
  • structure (oechem.OEMolBase) – The OpenEye molecule structure for updating atom and residue ids.

  • keep_protein_residue_ids (bool) – If the protein residues should be kept.

  • keep_chain_ids (bool) – If the chain IDS should be kept.

Returns

structure – The OpenEye molecule structure with updated atom and residue ids.

Return type

oechem.OEMolBase

kinoml.modeling.OEModeling.split_molecule_components(molecule: openeye.oechem.OEMolBase) List[openeye.oechem.OEGraphMol]

Split an OpenEye molecule into its bonded components.

Parameters

molecule (oechem.OEMolBase) – An OpenEye molecule holding multiple components.

Returns

components – A list of OpenEye molecules holding the split components.

Return type

list of oechem.OEGraphMol

kinoml.modeling.OEModeling.residue_ids_to_residue_names(structure: openeye.oechem.OEMolBase, residue_ids: List[int], chain_id: Union[None, str] = None) List[str]

Get the corresponding residue names for a list of residue IDs and a give OpenEye molecule with residue information.

Parameters
  • structure (oechem.OEMolBase) – An OpenEye molecule with residue information.

  • residue_ids (list of int) – A list of residue IDs.

  • chain_id (None or str) – The chain ID to filter for.

Returns

residue_names – The corresponding residue names as three letter codes.

Return type

list of str

Raises
  • ValueError – No residue found for residue ID {resid}.

  • ValueError – Found multiple residues for residue ID {resid}.