Hybrid Quantum–Classical Benchmarks for Synthetic DNA Risk Classification: A Fully Simulated Study
Abstract
Hybrid quantum–classical architectures offer a promising direction for sequence modeling, yet their empirical behavior on realistic bioinformatics tasks remains insufficiently documented. Here, we present a fully simulated, reproducible benchmark for hybrid quantum machine learning applied to synthetic DNA disease-risk classification. A dataset of 5,000 sequences (200 bp each) across five balanced classes was generated using motif-injection rules with controlled noise, and 74 biological features were extracted from each sequence. Three classical baselines (a CNN, an attention model, and a classical ensemble) were compared against two quantum-hybrid models incorporating a 4-qubit ZZFeatureMap + RealAmplitudes ansatz, executed exclusively on the Qiskit Aer simulator with 1,024 shots. The attention model achieved the highest test accuracy (51.7%), while the quantum-hybrid models performed comparably (51.1–51.3%), showing no measurable quantum advantage under these settings. The study establishes an honest, fully reproducible baseline for QML in genomics, highlighting current limitations of small-qubit encodings and motivating future work with trainable variational circuits and larger biological datasets.
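The motif-injection step described in the abstract can be illustrated with a minimal sketch. This is not the study's generator: the motifs in `MOTIFS`, the 5% `noise_rate`, the point-substitution noise model, and the function names `make_sequence` and `make_dataset` are all hypothetical, chosen only to show one way of building balanced, labelled synthetic sequences.

```python
import random

BASES = "ACGT"

# Hypothetical class-specific motifs (assumption): the abstract does not
# list the motifs or injection rules actually used in the study.
MOTIFS = {
    0: "TATAAA",
    1: "GGGCGG",
    2: "CACGTG",
    3: "TTGACA",
    4: "AATAAA",
}


def make_sequence(label, length=200, noise_rate=0.05, rng=random):
    """Build one random DNA string, inject the class motif, then add noise."""
    seq = [rng.choice(BASES) for _ in range(length)]

    # Overwrite a random window with the motif for this class.
    motif = MOTIFS[label]
    start = rng.randrange(length - len(motif) + 1)
    seq[start:start + len(motif)] = motif

    # Assumed noise model: each position is independently replaced by a
    # random base with probability noise_rate (this can corrupt the motif).
    for i in range(length):
        if rng.random() < noise_rate:
            seq[i] = rng.choice(BASES)
    return "".join(seq)


def make_dataset(n_per_class=1000, length=200, noise_rate=0.05, seed=0):
    """Return a shuffled, class-balanced list of (sequence, label) pairs."""
    rng = random.Random(seed)  # fixed seed keeps the dataset reproducible
    data = [
        (make_sequence(label, length, noise_rate, rng), label)
        for label in MOTIFS
        for _ in range(n_per_class)
    ]
    rng.shuffle(data)
    return data
```

With the defaults, `make_dataset()` yields 5,000 sequences of 200 bp across five balanced classes, matching the sizes stated in the abstract. Raising `noise_rate` corrupts more motif positions, which is one plausible way to control how difficult the classification task is.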