Segmentation of genomic data through multivariate statistical approaches: comparative analysis

Anjum, Arfa and Jaggi, Seema and Lall, Shwetank and Varghese, Eldho and Rai, Anil and Bhowmik, Arpan and Mishra, Dwijesh Chandra (2022) Segmentation of genomic data through multivariate statistical approaches: comparative analysis. Indian Journal of Agricultural Sciences, 92 (7). pp. 92-96. ISSN 0019-5022

Official URL: https://epubs.icar.org.in/index.php/IJAgS/article/...

    Abstract

    Segmenting a series of measurements along a genome into regions with distinct characteristics is widely used to identify functional components of the genome. Most research on biological data segmentation focuses on the statistical problem of identifying break-points or change-points in simulated data using a single variable. Although various strategies for finding change-points in a multivariate setup through simulation are available, work on segmenting actual multivariate genomic data is limited, because genomic data are very large and highly variable. Therefore, a study was carried out at the ICAR-Indian Agricultural Statistics Research Institute, New Delhi, during 2021 to identify the best multivariate statistical method for segmenting the sequences, which may influence the properties or function of a sequence, into homogeneous segments. Such segmentation reduces the volume of data and eases further analysis of the segments to determine their actual properties. Genomic data of rice (Oryza sativa L.) were used for a comparative analysis of several multivariate approaches, and agglomerative sequential clustering was found to be the most acceptable because of its low computational cost and feasibility.
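
The abstract does not spell out the clustering procedure, so the following is only a minimal sketch of one common form of agglomerative sequential clustering, not the authors' implementation. It assumes that every position starts as its own segment, that only adjacent segments may be merged, and that the merge cost is the Ward-style increase in within-segment sum of squares. The function name `sequential_agglomerative_segments` and the toy data are hypothetical.

```python
import numpy as np

def sequential_agglomerative_segments(X, n_segments):
    """Greedily merge adjacent segments until n_segments remain.

    X: (n_positions, n_variables) array ordered along the sequence.
    Returns a list of (start, end) index pairs, with end exclusive.
    """
    X = np.asarray(X, dtype=float)
    if n_segments < 1:
        raise ValueError("n_segments must be at least 1")
    # Each segment is stored as (start, end, size, mean vector).
    segs = [(i, i + 1, 1, X[i].copy()) for i in range(len(X))]
    while len(segs) > n_segments:
        # Assumed criterion: Ward-style cost of merging each pair of
        # neighbours, i.e. the increase in within-segment sum of squares.
        costs = [
            (a[2] * b[2]) / (a[2] + b[2]) * np.sum((a[3] - b[3]) ** 2)
            for a, b in zip(segs[:-1], segs[1:])
        ]
        j = int(np.argmin(costs))  # cheapest adjacent pair
        a, b = segs[j], segs[j + 1]
        n = a[2] + b[2]
        mean = (a[2] * a[3] + b[2] * b[3]) / n  # size-weighted mean
        segs[j:j + 2] = [(a[0], b[1], n, mean)]
    return [(s, e) for s, e, _, _ in segs]

# Toy example: three constant blocks of two variables each.
toy = np.vstack([
    np.zeros((5, 2)),
    np.full((4, 2), 10.0),
    np.full((6, 2), -5.0),
])
print(sequential_agglomerative_segments(toy, 3))
# → [(0, 5), (5, 9), (9, 15)]
```

Because merges are restricted to neighbours, each pass evaluates only one cost per boundary, which is in keeping with the low computational cost the abstract attributes to this approach. The number of segments is supplied by the caller here; how the study chose it is not stated in the abstract.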

    Item Type: Article
    Uncontrolled Keywords: Genome; Segmentation; Multivariate analysis; Sequential clustering
    Subjects: Marine Fisheries > System analysis
    Divisions: CMFRI-Kochi > Marine Capture > Fishery Resource Assessment
    Depositing User: Arun Surendran
    Date Deposited: 30 Jul 2022 05:11
    Last Modified: 30 Jul 2022 05:31
    URI: http://eprints.cmfri.org.in/id/eprint/16126
