Traces of Aging-associated Diseases in DNA

Dr Can Yang is developing statistical and machine learning methods for large-scale genomic data analysis. His team is actively collaborating with companies and hospitals to create a platform for large-scale health data analytics in support of healthy aging. By analysing DNA data from the whole genome, he and his team seek to relate genetic variations to complex aging-related diseases and to identify the phenotypes of conditions such as cardiovascular disease and osteoarthritis in aging people. Dr Yang is collaborating with a company called WEGENE to generate DNA data collected from hospitals in mainland China, and is conducting multiple collaborative projects with the Shenzhen Research Institute of Big Data and Nanjing University hospital. After collecting genomic data together with phenotype information, his team performs statistical analysis to identify disease risk genes, especially for aging phenotypes. They have also built risk prediction models to identify individuals at higher risk of complex diseases, such as cardiovascular disease or osteoarthritis.

The distinctive characteristic of Dr Yang’s research is large-scale data analysis for the East Asian population, for which DNA data is still very limited. His collaborative projects provide, for the first time, a substantial sample size for the East Asian population, which he and his team are integrating with Caucasian data from UK Biobank to facilitate cross-population analysis. The UK Biobank data contains many different types of phenotypes that are associated with aging. Successful cross-population analysis not only greatly improves statistical accuracy for the East Asian population, but also provides unprecedented opportunities to understand how the genetic basis of aging diseases differs between these two populations.

Besides cross-population analysis, Dr Yang’s team is developing machine learning methods for the analysis of imaging data and multi-omics data (i.e. genomic data and transcriptome data). These projects include identifying the genetic signature of the human face through AI-driven phenotyping, and applying deep learning to the analysis of 3D medical imaging data.