AI RESEARCH
An Imbalanced Dataset with Multiple Feature Representations for Studying Quality Control of Next-Generation Sequencing
arXiv cs.LG
arXiv:2604.04981v1 Announce Type: cross
Next-generation sequencing (NGS) is a key technique for studying the DNA and RNA of organisms. However, identifying quality problems in NGS data across different experimental settings remains challenging. To develop automated quality-control tools, researchers require datasets with features that capture the characteristics of quality problems. Existing NGS repositories, however, offer only a limited number of quality-related features.