BACKGROUND
Whole genome sequencing provides a complete snapshot of an organism’s genetic code, allowing evolutionary relationships to be inferred by quantifying nucleotide similarity. However, reference databases are biased toward well-studied species because species representation follows a long-tail distribution and sampling is uneven. These imbalances risk species misidentification and hinder the characterization of novel taxa.
SUMMARY OF TECHNOLOGY
OSU researchers have developed TEPI (Taxonomy-Aware Embedding and Pseudo-Imaging), a genome classification approach that combines embedding and imaging techniques to improve species recognition. An embedding space computed from k-mer co-occurrences captures hierarchical relationships between species. Genomes are then represented as pseudo-images and classified with convolutional neural networks (CNNs), supporting efficient and accurate species recognition.
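The idea of turning k-mer co-occurrences into an image-like input can be illustrated with a minimal Python sketch. This is not the TEPI implementation: the function names, the window-based co-occurrence definition, the default k-mer length, and the max-scaling step are all assumptions chosen for illustration, and the taxonomy-aware embedding step is not modeled here.

```python
from itertools import product

ALPHABET = "ACGT"


def kmer_index(k):
    """Map every k-mer over ACGT to a row/column index (hypothetical helper)."""
    return {"".join(p): i for i, p in enumerate(product(ALPHABET, repeat=k))}


def cooccurrence_matrix(seq, k=2, window=3):
    """Count how often k-mer j starts within `window` positions after k-mer i.

    Assumption: co-occurrence is defined by a forward sliding window.
    """
    index = kmer_index(k)
    n = len(index)
    counts = [[0] * n for _ in range(n)]
    # K-mers containing a non-ACGT symbol (e.g. N) map to None and are skipped.
    kmers = [index.get(seq[i:i + k]) for i in range(len(seq) - k + 1)]
    for i, a in enumerate(kmers):
        if a is None:
            continue
        for b in kmers[i + 1:i + 1 + window]:
            if b is not None:
                counts[a][b] += 1
    return counts


def to_pseudo_image(counts):
    """Scale counts to [0, 1] so the matrix can be read as a grayscale image."""
    peak = max(max(row) for row in counts)
    if peak == 0:
        return [[0.0] * len(row) for row in counts]
    return [[c / peak for c in row] for row in counts]


# Toy sequence, for illustration only; the result is a 16 x 16 grid for k=2.
image = to_pseudo_image(cooccurrence_matrix("ACGTACGTNACGT", k=2, window=3))
```

A grid like `image` could then be passed to a CNN as a single-channel input. How TEPI actually derives its embedding and lays out its pseudo-images is not specified in this summary.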
POTENTIAL AREAS OF APPLICATION
MAIN ADVANTAGES
STAGE OF DEVELOPMENT
https://innovations-okstate.technologypublisher.com/files/sites/image1517.png