• kautau@lemmy.world (11 upvotes, 10 hours ago)

    Maybe at some point we’ll have version control for all DNA mapping, so that each minor change is a commit hash and each major release is a tag.

    • tetris11@lemmy.ml (3 upvotes, edited, 3 hours ago)

      We do. The major versions get tagged releases such as mm7, mm8, and mm9, each defined by the build current at the time, and there are minor patch releases too, like mm10p14, as new sequences come in.

      https://hgdownload.cse.ucsc.edu/goldenpath/mm10/bigZips/

      For example, say you have 5 sequences: CAT, ATC, ATCG, CGT, and ATATA.

      One way of overlapping them to build a transcriptome looks like this:

      5 sequences:     ATATA
                     CG-T  ATC
                   ATCG CAT
      
        Reference: ATCGATATATC
      

      ATCGATATATC isn’t the only solution to these sequences, but the more sequences you have to overlap, the more the uncertainty goes down.
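
The overlap idea can be tried out with a toy script. This is only a minimal sketch, not how real assemblers or the UCSC builds work: it assumes exact matches only (no gaps or mismatches, unlike the CG-T placement above) and uses a simple greedy rule that always merges the pair with the longest suffix/prefix overlap. The names `overlap` and `greedy_assemble` are made up for illustration.

```python
from itertools import permutations


def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str]) -> str:
    """Greedily merge reads into one string that contains all of them.

    Assumption: reads are error-free, so only exact overlaps count.
    """
    # Drop duplicates, then drop reads fully contained in another read.
    reads = list(dict.fromkeys(reads))
    reads = [r for r in reads if not any(r != s and r in s for s in reads)]

    while len(reads) > 1:
        # Find the ordered pair (i, j) with the largest overlap.
        best_k, best_i, best_j = -1, 0, 1
        for i, j in permutations(range(len(reads)), 2):
            k = overlap(reads[i], reads[j])
            if k > best_k:
                best_k, best_i, best_j = k, i, j

        # Merge the pair, keeping the shared part only once.
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads)
                 if idx not in (best_i, best_j)] + [merged]

    return reads[0]


if __name__ == "__main__":
    reads = ["CAT", "ATC", "ATCG", "CGT", "ATATA"]
    print(greedy_assemble(reads))
```

On the five sequences above this gives an 11-letter string that contains every read exactly, but it is not ATCGATATATC, which is the point: with so few reads there are several equally plausible answers, and the choice depends on tie-breaking and on whether you allow gaps.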