The First Chromosome-Level Genome Assembly of an Individual from India

Category
Genome Assembly, Genome Resequencing, Next-Generation Sequencing
About This Project

Case Study: The First Chromosome-Level Genome Assembly of an Individual from Karnataka, India*

PI: Dr. Bibha Choudhary, IBAB Bengaluru

Nucleome Informatics is proud to have contributed to a landmark study that presents the first chromosome-level genome assembly of an individual from Karnataka, India (KIn1). This achievement marks a significant advancement in human genomics and provides a valuable reference for understanding genetic diversity within South Asian populations.

Project Overview

The KIn1 genome assembly was generated using a de novo approach with:

  • 12× PacBio HiFi sequencing data generated in-house by Nucleome Informatics
  • 30× Hi-C sequencing data from the most closely related sample available by ethnicity (HG04712 from Andhra Pradesh), sourced from the International Genome Sample Resource (IGSR)

This resulted in a highly contiguous genome with:

  • An N50 of 137 Mb and an L50 of 9, close to the maximum achievable N50 of 147 Mb and the minimum achievable L50 of 8 for a human genome
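
For readers less familiar with these contiguity metrics, the following is a minimal sketch of how N50 and L50 are commonly computed from a list of scaffold lengths. It is an illustration only, not the pipeline used in the study; the function name `n50_l50` and the toy lengths are assumptions made for this example.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig or scaffold lengths.

    N50: length of the shortest sequence in the smallest set of longest
         sequences that together cover at least half of the assembly.
    L50: number of sequences in that set.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")

    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2

    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Toy scaffold lengths (hypothetical values, not KIn1 data).
toy_lengths = [100, 80, 60, 40, 20]
n50, l50 = n50_l50(toy_lengths)
print(f"N50 = {n50}, L50 = {l50}")
```

A higher N50 and a lower L50 both indicate that more of the assembly sits in fewer, longer sequences, which is why values near the chromosome-scale limits quoted above point to a highly contiguous assembly.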

Comparative Genomics and Structural Variations

The KIn1 assembly was compared with genomes of individuals from diverse ethnic backgrounds, including:

  • Puerto Rican (PR1)
  • Ashkenazi Jewish (Ash1)
  • Han Chinese (Han1)
  • Northern European (T2T)

Key findings from this comparison include:

  • 80-90% of the assembly aligning to the reference chromosomes, with a near-collinear structure
  • A unique inversion in chromosome 7 within the euchromatic region
  • Major translocations in chromosomes 17 and 20

To strengthen the comparative analysis, a second chromosome-level genome assembly was produced for a Punjabi individual from Lahore (HG03492).

Population Genetics Insights

With KIn1 as the reference genome, an analysis of individuals in the IGSR database showed that:

  • Indian Telugu from the UK (ITU) is the closest ethnic match
  • Sri Lankan Tamil from the UK (STU) follows as another representative of South Asian ancestry

This study highlights the genomic uniqueness of the Indian population and underscores the need for more population-specific reference genomes to support medical genetics, ancestry studies, and precision medicine.

Impact and Future Applications

The successful completion of this chromosome-level human genome assembly showcases the power of long-read sequencing (PacBio HiFi) and Hi-C scaffolding for generating high-quality genomes. This first-of-its-kind assembly for Karnataka will:

  • Aid in genetic disease research and precision medicine
  • Provide insights into South Asian ancestry and population genetics
  • Serve as a reference for future human genome sequencing projects in India

Nucleome Informatics: Advancing Human Genomics

As a leading provider of next-generation sequencing (NGS) and genome assembly services, Nucleome Informatics continues to push the boundaries of human genomics research. Our collaboration with IBAB Bengaluru on this project reinforces our commitment to enabling cutting-edge genomic discoveries in India.

For more details on Nucleome’s sequencing solutions, visit our website.

doi: https://doi.org/10.1101/2024.08.18.608448

*This article is a preprint and has not been certified by peer review.