Machine learning (ML) is a data mining technique for recognizing patterns in large-scale data sets. Its capability in Big Data analysis was exemplified by the Go matches between Google’s artificial intelligence program AlphaGo and world-class Go players such as Lee Sedol. We presented an ML-based methodology, termed mlDNA, for large-scale integrative analysis of transcriptome data via comparison of gene coexpression networks (Figure 3). mlDNA substantially outperformed traditional statistical testing–based differential expression analysis in identifying stress-related genes, with markedly improved prediction accuracy. Some of the mlDNA predictions have been validated with phenotyping experiments.

Figure. Statistics of the Candidate Stress-Related Genes Predicted by mlDNA.

Chuang Ma, Xiangfeng Wang. Machine learning-based differential network analysis: a study of stress-responsive transcriptomes in Arabidopsis. Plant Cell, 2014, 26(2):520-537.