TattaBio/NPannotator

NPannotator

A cheminformatics tool that assigns substrates to enzymatic domains in Type I Polyketide Synthase (T1PKS) biosynthetic gene clusters. Given a known product SMILES and a set of unordered domains, NPannotator determines the correct module ordering and substrate assignments by iteratively modifying starter/extender units and filtering via chemical similarity.
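The similarity-filtering idea can be illustrated with a small sketch. This is not NPannotator's implementation: the feature sets, the candidate names, and the `filter_candidates` helper are all hypothetical, and plain Python sets stand in for the molecular fingerprints that a real pipeline would more likely compute with RDKit. It only shows how candidates might be ranked against a target by Tanimoto similarity, keeping the best few.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[str], b: Set[str]) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(a | b)


def filter_candidates(
    target: Set[str],
    candidates: Dict[str, Set[str]],
    keep: int = 2,
) -> List[Tuple[str, float]]:
    """Score each candidate against the target and keep the top `keep`."""
    scored = [(name, tanimoto(target, feats)) for name, feats in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:keep]


# Toy feature sets (hypothetical stand-ins for real fingerprints).
target_features = {"C=O", "C-OH", "C=C", "CH3"}
candidates = {
    "candidate_a": {"C=O", "C-OH", "C=C", "CH3"},
    "candidate_b": {"C=O", "C-OH", "CH3"},
    "candidate_c": {"C=O", "NH2"},
}

best = filter_candidates(target_features, candidates, keep=2)
print(best)  # [('candidate_a', 1.0), ('candidate_b', 0.75)]
```

Repeating a filter like this after each modification step is one way to keep the number of intermediate candidates manageable.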

Installation

pip install -e .

Key dependencies (rdkit, pandas, numpy) are installed automatically. The retrotide package (which provides the bcs module for domain/module/cluster objects) is also installed from GitHub.

Usage

import bcs
import pandas as pd
from NPannotator import Annotator

# Define target domains (unordered); each inner list is one module
domains = [[bcs.AT],
           [bcs.AT, bcs.KR],
           [bcs.AT, bcs.KR, bcs.DH],
           [bcs.AT, bcs.KR, bcs.DH, bcs.ER]]

target_SMILES = "..."  # SMILES string of the known product

# scaffoldsDB must be defined before this point; how it is built or
# loaded is not shown here, so this line is only a placeholder
scaffoldsDB = ...

# Initialize and run the annotation pipeline
annotator = Annotator(target_SMILES=target_SMILES,
                      target_domains=domains,
                      scaffoldsDB=scaffoldsDB)

results_df = annotator.RunPipeline()
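
Because the modules are supplied unordered, the number of possible module orderings grows factorially with the number of modules. The sketch below makes that concrete for the four modules in the example, using plain strings as stand-ins for the `bcs` domain classes. It is an illustration of the search space only, not a description of how NPannotator searches it.

```python
from itertools import permutations

# Strings stand in for the bcs domain classes (an assumption made so the
# snippet runs without retrotide installed).
modules = [
    ("AT",),
    ("AT", "KR"),
    ("AT", "KR", "DH"),
    ("AT", "KR", "DH", "ER"),
]

# Every possible linear arrangement of the four modules.
orderings = list(permutations(modules))
print(len(orderings))  # 24
```

With only four modules, brute-force enumeration is trivial. Larger clusters make exhaustive enumeration expensive, which is presumably why candidates are filtered by similarity at each iteration.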

Running Tests

pytest tests/

About

Automated cheminformatics-based annotation of natural product gene clusters
