Tools for Comparative Genomics

VISTA Genome Alignment downloads

Notes for this run:

  1. MGSCv3: Mouse Genome Sequencing Consortium version 3, also distributed by UCSC under the label Mouse Feb. 2002 (mm2). Human NCBI build 30, also distributed by UCSC under the label Human June 2002 Freeze (hg12). Both genomes were downloaded from UCSC.
    Note that UCSC's recent corrections to the RepeatMasker tracks for both the mouse and human genomes have been taken into account in this run.
  2. This run was processed on Nov 3, 2002 and is labeled run #5. Please refer to this run by this number and date.
  3. This run was processed using the global alignment tool AVID, version 2.0 build 2. AVID has its own Web site; please send requests regarding AVID to its authors.
  4. How to cite this whole genome alignment:
    • Berkeley Genome Pipeline, Mouse/Human whole genome alignments, run#5 Nov 3, 2002
    • O.Couronne, A.Poliakov, N.Bray, T.Ishkhanov, D.Ryaboy, E.Rubin, L.Pachter, I.Dubchak. Strategies and Tools for Whole Genome Alignments. Genome Research, 13 (2003) 73.
    • Bray, N., Dubchak, I. and Pachter, L. AVID: A Global Alignment Program. Genome Research, 13 (2003) 97.
  5. These data may be freely downloaded, processed, and repackaged, provided that they carry the exact citation above. Results published using this alignment should include this acknowledgment as well.

Sources of the mouse data - Specific Conditions for use:

Mouse genome sequence information is released weekly into a public repository maintained by EBI and NCBI. This data is made available before scientific publication with the following understanding:

  1. The data may be freely downloaded, used in analyses, and repackaged in databases.
  2. Users are free to use the data in scientific papers analyzing particular genes and regions if the providers of this data (the Mouse Sequencing Consortium) are properly acknowledged.
  3. The Centers producing the data reserve the right to publish the initial large-scale analyses of the dataset, including large-scale identification of regions of evolutionary conservation and large-scale genomic assembly. Large-scale refers to regions with size on the order of a chromosome (that is, 30 Mb or more).
  4. Any redistribution of the data should carry this notice.