kinoml.core.proteins

MolecularComponent objects that represent protein-like entities.

Module Contents

kinoml.core.proteins.logger
class kinoml.core.proteins.Protein(pdb_id: str = '', molecule: Union[openeye.oechem.OEMol, openeye.oechem.OEGraphMol, MDAnalysis.core.universe.Universe, None] = None, toolkit: str = 'OpenEye', name: str = '', sequence: str = '', uniprot_id: str = '', ncbi_id: str = '', metadata: Union[dict, None] = None, **kwargs)

Bases: kinoml.core.components.BaseProtein, kinoml.core.sequences.AminoAcidSequence

Create a new Protein object. A molecular representation is accessible via the molecule attribute.

Examples

Create a protein from file with OpenEye toolkit molecular representation:

>>> protein = Protein.from_file("data/proteins/4f8o.pdb", name="4f8o")

Create a protein from file with MDAnalysis toolkit molecular representation:

>>> protein = Protein.from_file("data/proteins/4f8o.pdb", name="4f8o", toolkit="MDAnalysis")

Create a protein from an OpenEye molecule:

>>> from kinoml.modeling.OEModeling import read_molecules
>>> molecule = read_molecules("data/proteins/4f8o.pdb")[0]
>>> protein = Protein(molecule=molecule, name="4f8o")

Create a protein from PDB ID:

>>> protein = Protein.from_pdb("4f8o")

Create a protein from PDB ID with lazy instantiation:

>>> protein = Protein(pdb_id="4f8o")

Create a protein from PDB ID with lazy instantiation and access the complete wildtype amino acid sequence by providing a UniProt ID:

>>> protein = Protein(pdb_id="4f8o", uniprot_id="P31522")
>>> protein.sequence
property pdb_id

Decorate pdb_id to modify setter.

property molecule

Decorate molecule to modify setter and getter.

molecule()

Get the _molecule attribute. If the pdb_id attribute is given and _molecule is None, a new molecule representation will be created from the given pdb_id, e.g. in case of lazy instantiation. The toolkit being used depends on the toolkit attribute.

Returns

The molecular representation of the protein.

Return type

Universe or AtomGroup or oechem.OEMol or oechem.OEGraphMol or None
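The lazy-loading behaviour described above can be illustrated with a minimal, self-contained sketch. It is not KinoML's implementation: `LazyProtein`, `_load_from_pdb_id` and `load_count` are hypothetical names, and the loader returns a placeholder string rather than an OpenEye or MDAnalysis object.

```python
class LazyProtein:
    """Hypothetical stand-in for a lazily instantiated protein."""

    def __init__(self, pdb_id="", molecule=None, toolkit="OpenEye"):
        self.pdb_id = pdb_id
        self.toolkit = toolkit
        self._molecule = molecule
        self.load_count = 0  # counts loader calls, for illustration only

    def _load_from_pdb_id(self):
        # Assumption: a real loader would fetch the structure and build a
        # toolkit-specific object; here a placeholder string is returned.
        self.load_count += 1
        return f"<{self.toolkit} molecule for {self.pdb_id}>"

    @property
    def molecule(self):
        # Build the representation on first access, and only if a PDB ID is set.
        if self._molecule is None and self.pdb_id:
            self._molecule = self._load_from_pdb_id()
        return self._molecule

    @molecule.setter
    def molecule(self, value):
        self._molecule = value


protein = LazyProtein(pdb_id="4f8o", toolkit="MDAnalysis")
first = protein.molecule   # triggers the loader
second = protein.molecule  # reuses the cached object
empty = LazyProtein().molecule  # no PDB ID and no molecule: stays None
```

In this sketch the second access does not call the loader again, and an object created without a PDB ID or molecule returns None, mirroring the documented return type.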

classmethod from_file(file_path: Union[pathlib.Path, str], name: str = '', toolkit: str = 'OpenEye')

Create a Protein from file.

Parameters
  • file_path (pathlib.Path or str) – The path to the molecular file. Supported formats depend on the toolkit being used.

  • name (str, default="") – The name of the protein.

  • toolkit (str, default="OpenEye") – The toolkit to use for molecular representation.

classmethod from_pdb(pdb_id: str, name: str = '', toolkit: str = 'OpenEye')

Create a Protein from a PDB ID.

Parameters
  • pdb_id (str) – The PDB ID of the protein structure of interest.

  • name (str, default="") – The name of the protein.

  • toolkit (str, default="OpenEye") – The toolkit to use for molecular representation.

class kinoml.core.proteins.KLIFSKinase(pdb_id: str = '', molecule: Union[openeye.oechem.OEMol, openeye.oechem.OEGraphMol, MDAnalysis.core.universe.Universe, None] = None, toolkit: str = 'OpenEye', name: str = '', sequence: str = '', uniprot_id: str = '', ncbi_id: str = '', structure_klifs_id: Union[int, None] = None, kinase_klifs_id: Union[int, None] = None, kinase_klifs_sequence: str = '', structure_klifs_sequence: str = '', structure_klifs_residues: Union[pandas.DataFrame, None] = None, metadata: Union[dict, None] = None, **kwargs)

Bases: Protein

Create a new KLIFSKinase object. A molecular representation is accessible via the molecule attribute. Allows access to the sequence and residues of the KLIFS binding pocket.

Examples

Create a KLIFS kinase from PDB ID with lazy instantiation:

>>> kinase = KLIFSKinase(pdb_id="4yne")

Create a KLIFS kinase from PDB ID with lazy instantiation and gain access to the wildtype KLIFS pocket sequence by providing a UniProt ID:

>>> kinase = KLIFSKinase(pdb_id="4yne", uniprot_id="P04629")
>>> kinase.kinase_klifs_sequence

Create a KLIFS kinase from PDB ID with lazy instantiation and gain access to the wildtype KLIFS pocket sequence by providing a KLIFS-specific kinase ID:

>>> kinase = KLIFSKinase(pdb_id="4yne", kinase_klifs_id=480)
>>> kinase.kinase_klifs_sequence  # wildtype, does not need to match the given PDB structure

Create a KLIFS kinase from PDB ID with lazy instantiation and gain access to the KLIFS pocket sequence and residues of the structure by providing a KLIFS-specific structure ID:

>>> kinase = KLIFSKinase(pdb_id="4yne", structure_klifs_id=3620)
>>> kinase.kinase_klifs_sequence  # wildtype, does not need to match the given PDB structure
>>> kinase.structure_klifs_sequence  # specific to the structure
>>> kinase.structure_klifs_residues  # specific to the structure
property kinase_klifs_sequence

Decorate kinase_klifs_sequence to modify setter and getter.

property structure_klifs_sequence

Decorate structure_klifs_sequence to modify setter and getter.

property structure_klifs_residues

Decorate structure_klifs_residues to modify setter and getter.

_query_sequence_sources()

Query available sources for sequence details. Add additional methods below to allow fetching from other sources.

_query_klifs()

Query KLIFS for the UniProt ID, which allows fetching of the sequence.
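The relationship between the two private methods above can be sketched as a simple dispatch. This is an illustration under stated assumptions, not the library's code: `SequenceSourceSketch`, `_query_uniprot` and the `queried` list are hypothetical, the order in which sources are tried is assumed, and the UniProt ID assigned in `_query_klifs` is a hard-coded placeholder taken from the examples above rather than the result of a real KLIFS lookup.

```python
class SequenceSourceSketch:
    """Hypothetical illustration of dispatching to sequence sources."""

    def __init__(self, uniprot_id="", kinase_klifs_id=None):
        self.uniprot_id = uniprot_id
        self.kinase_klifs_id = kinase_klifs_id
        self.queried = []  # records which sources were consulted, in order

    def _query_uniprot(self):
        # Placeholder: a real method would fetch the sequence for uniprot_id.
        self.queried.append("uniprot")

    def _query_klifs(self):
        # KLIFS is asked for the UniProt ID, which then allows the sequence
        # to be fetched. The value below is a placeholder, not a real query.
        self.uniprot_id = "P04629"
        self.queried.append("klifs")
        self._query_uniprot()

    def _query_sequence_sources(self):
        # Assumed order: prefer a known UniProt ID, else fall back to KLIFS.
        if self.uniprot_id:
            self._query_uniprot()
        elif self.kinase_klifs_id is not None:
            self._query_klifs()
        # Further sources would be added as additional branches here.


direct = SequenceSourceSketch(uniprot_id="P04629")
direct._query_sequence_sources()

via_klifs = SequenceSourceSketch(kinase_klifs_id=480)
via_klifs._query_sequence_sources()
```

Here `direct.queried` records only the UniProt step, while `via_klifs.queried` records the KLIFS lookup followed by the UniProt step.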

kinase_klifs_sequence()

Get the _kinase_klifs_sequence attribute. Query KLIFS for the respective sequence if _kinase_klifs_sequence is an empty string.

Returns

The kinase KLIFS binding pocket sequence.

Return type

str

Raises

ValueError – To allow access to the kinase KLIFS sequence, the KLIFSKinase object needs to be initialized with one of the following attributes: kinase_klifs_sequence, kinase_klifs_id, structure_klifs_id, uniprot_id.
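The getter logic described above (return the cached value, otherwise query using whichever identifier is available, otherwise raise ValueError) can be sketched as a standalone function. `resolve_kinase_klifs_sequence`, `fetch` and `fake_fetch` are hypothetical names introduced for illustration; the order in which identifiers are tried is an assumption, and `fetch` stands in for the actual KLIFS query.

```python
def resolve_kinase_klifs_sequence(
    fetch,
    kinase_klifs_sequence="",
    kinase_klifs_id=None,
    structure_klifs_id=None,
    uniprot_id="",
):
    """Hypothetical helper mirroring the documented getter behaviour."""
    if kinase_klifs_sequence:
        return kinase_klifs_sequence  # already available, no query needed

    identifiers = {
        "kinase_klifs_id": kinase_klifs_id,
        "structure_klifs_id": structure_klifs_id,
        "uniprot_id": uniprot_id,
    }
    # Assumed priority: the first identifier that is set is used for the query.
    for name, value in identifiers.items():
        if value not in (None, ""):
            return fetch(name, value)

    raise ValueError(
        "One of kinase_klifs_sequence, kinase_klifs_id, structure_klifs_id "
        "or uniprot_id is required to access the kinase KLIFS sequence."
    )


def fake_fetch(name, value):
    # Stand-in for a remote query; returns a descriptive placeholder.
    return f"sequence via {name}={value}"


cached = resolve_kinase_klifs_sequence(fake_fetch, kinase_klifs_sequence="ABC")
queried = resolve_kinase_klifs_sequence(fake_fetch, structure_klifs_id=3620)
try:
    resolve_kinase_klifs_sequence(fake_fetch)
    error_message = ""
except ValueError as exc:
    error_message = str(exc)
```

The same guard pattern would apply to the structure-specific getters, with a shorter list of accepted attributes.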

structure_klifs_sequence()

Get the _structure_klifs_sequence attribute. Query KLIFS for the respective sequence if _structure_klifs_sequence is an empty string.

Returns

The structure-specific KLIFS binding pocket sequence.

Return type

str

Raises

ValueError – To allow access to the structure KLIFS sequence, the KLIFSKinase object needs to be initialized with one of the following attributes: structure_klifs_sequence, structure_klifs_id.

structure_klifs_residues()

Get the _structure_klifs_residues attribute. Query KLIFS for the respective residues if _structure_klifs_residues is None.

Returns

The structure-specific KLIFS residues, formatted as returned by opencadd.databases.klifs.

Return type

pd.DataFrame or None

Raises

ValueError – To allow access to structure KLIFS residues, the KLIFSKinase object needs to be initialized with one of the following attributes: structure_klifs_residues, structure_klifs_id.