Tag: Comparative Genomics Resource (CGR)

Get Faster, More Focused Search Results with NCBI’s New BLAST Core Nucleotide Database (core_nt)

Effective August 2024, core_nt will become the default 

Interested in faster nucleotide BLAST searches with more focused search results? As previously announced, NCBI has been re-evaluating the BLAST nucleotide database (nt) to make it more compact and more efficient. Thanks to your feedback, NCBI’s BLAST is excited to introduce the core nucleotide database (core_nt), an alternative to the default nt database that contains better-defined content and is less than half the size. 

Benefits of BLAST core_nt over nt
  • Enables faster searches  
  • Returns similar top results for most searches 
  • Reduces redundancy for some highly represented organisms 
  • Is easier to download and requires less storage space for standalone BLAST
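
For standalone BLAST users, switching databases amounts to changing the value passed to `-db`. The sketch below is a minimal illustration, not an NCBI-provided script: it assumes BLAST+ is installed, that a local copy of core_nt has already been downloaded, and that `query.fasta` and `hits.tsv` are hypothetical file names. The helper `build_blastn_command` is likewise a hypothetical name introduced here.

```python
import shlex


def build_blastn_command(query_fasta, out_path, db="core_nt"):
    """Return the argument list for a blastn search against `db`.

    Only standard blastn options are used: -db, -query, -outfmt, -out.
    Output format 6 is the tabular format.
    """
    return [
        "blastn",
        "-db", db,              # database name; pass "nt" to compare against the full database
        "-query", query_fasta,  # hypothetical input FASTA file
        "-outfmt", "6",         # tabular output
        "-out", out_path,       # hypothetical output file
    ]


if __name__ == "__main__":
    # Print the command rather than running it, so the sketch works
    # even on a machine without BLAST+ or the database installed.
    print(shlex.join(build_blastn_command("query.fasta", "hits.tsv")))
```

Building the same command with `db="nt"` and comparing run time and top hits is one way to check how similar the results are for your own queries.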

Continue reading “Get Faster, More Focused Search Results with NCBI’s New BLAST Core Nucleotide Database (core_nt)”

RefSeq Release 225 Now Available!

Check out RefSeq release 225, now available online and from the FTP site. You can access RefSeq data through NCBI Datasets.

What’s included in this release?

As of July 8, 2024, this full release incorporates genomic, transcript, and protein data containing:

  • 448,507,905 records
  • 334,845,613 proteins
  • 63,542,774 RNAs
  • Sequences from 152,668 organisms

The release is provided in several directories, both as a complete dataset and divided into logical groupings. Continue reading “RefSeq Release 225 Now Available!”

Ortholog Groups Added for ~2 Million Insect Genes

Find evolutionarily related genes across insects and other arthropods on our new Ortholog webpages

NCBI recently released a set of orthologs for approximately 2 million insect genes. You can now find and access the orthologous genes, transcripts, and proteins by searching for a species and gene name in NCBI All Databases, NCBI Gene, or NCBI Datasets. As previously described, these orthologs are based on comparisons to the Drosophila melanogaster annotated genome. Using Drosophila gene nomenclature for orthologs should lead to more informative gene symbols for insects and other arthropods. Continue reading “Ortholog Groups Added for ~2 Million Insect Genes”

New! RefSeq Release 224

Check out RefSeq release 224, now available online and from the FTP site. You can access RefSeq data through NCBI Datasets.

What’s included in this release?

As of May 6, 2024, this full release incorporates genomic, transcript, and protein data containing:

  • 435,879,646 records
  • 324,246,652 proteins
  • 62,348,147 RNAs
  • Sequences from 150,742 organisms

The release is provided in several directories, both as a complete dataset and divided into logical groupings. Continue reading “New! RefSeq Release 224”

Cleaner BLAST Databases for More Accurate Results

Removing contaminated sequences using NCBI quality assurance tools 

Do you use BLAST to identify a sequence or the evolutionary scope of a gene? That can be challenging if contaminated and misclassified sequences are in the BLAST databases and show up in your search results. To address this problem, we now use the NCBI quality assurance tools listed below to systematically remove these misleading sequences from the default nucleotide (nt) and protein (nr) BLAST databases.  Continue reading “Cleaner BLAST Databases for More Accurate Results”

Conserved Domain Database Version 3.21 Now Available!

Check out the newly released Conserved Domain Database (CDD) version 3.21. Updated content is available on the CDD FTP site.

What’s New?

Continue reading “Conserved Domain Database Version 3.21 Now Available!”

Browse Taxonomy Records with NCBI Datasets

New & improved NCBI Datasets Taxonomy pages and command-line service 

NCBI Datasets is excited to introduce new features on our Taxonomy pages that make it easier for you to access, browse, and download taxonomic information about organisms at any taxonomic level.

What’s new?
  • Explore Taxonomy records with an updated look and feel  
  • Access and download taxonomic metadata from the web or with our updated command-line interface (CLI) tools

Continue reading “Browse Taxonomy Records with NCBI Datasets”

New RefSeq Annotations Now Available!

In February and March, the NCBI Eukaryotic Genome Annotation Pipeline released forty-six new annotations in RefSeq!

New Annotations
  • Aedes albopictus (Asian tiger mosquito)
  • Anolis carolinensis (green anole)
  • Armigeres subalbatus (mosquito)
  • Bacillus rossius redtenbacheri (walking stick)
  • Bolinopsis microptera (comb jelly)
  • Bombyx mori (domestic silkworm)
  • Bubalus kerabau (carabao)
  • Candoia aspera (snake)
  • Cavia porcellus (domestic guinea pig)

Continue reading “New RefSeq Annotations Now Available!”

Now Available: RefSeq Release 223

Check out RefSeq release 223, now available online and from the FTP site. You can access RefSeq data through NCBI Datasets.

What’s included in this release?

As of March 4, 2024, this full release incorporates genomic, transcript, and protein data containing:

  • 425,594,654 records
  • 316,329,937 proteins
  • 60,886,133 RNAs
  • Sequences from 147,591 organisms
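
The totals reported on this page for releases 223, 224, and 225 can be compared directly to see how much RefSeq grew between releases. The following is a small sketch of that arithmetic. The figures are copied from the three release announcements above; the dictionary layout and the function names `growth` and `percent_growth` are illustrative choices, not part of any NCBI tool.

```python
# Totals as stated in the release announcements on this page.
RELEASES = {
    223: {"records": 425_594_654, "proteins": 316_329_937,
          "rnas": 60_886_133, "organisms": 147_591},
    224: {"records": 435_879_646, "proteins": 324_246_652,
          "rnas": 62_348_147, "organisms": 150_742},
    225: {"records": 448_507_905, "proteins": 334_845_613,
          "rnas": 63_542_774, "organisms": 152_668},
}


def growth(older, newer):
    """Absolute increase per category from release `older` to `newer`."""
    return {key: RELEASES[newer][key] - RELEASES[older][key]
            for key in RELEASES[older]}


def percent_growth(older, newer):
    """Increase per category as a percentage of the older release's total."""
    return {key: 100 * delta / RELEASES[older][key]
            for key, delta in growth(older, newer).items()}


if __name__ == "__main__":
    for older, newer in [(223, 224), (224, 225)]:
        print(f"Release {older} to {newer}:")
        deltas = growth(older, newer)
        percents = percent_growth(older, newer)
        for key in deltas:
            print(f"  {key}: +{deltas[key]:,} ({percents[key]:.2f}%)")
```

Because the three releases are roughly two months apart (March 4, May 6, and July 8, 2024), the per-release differences are comparable without further normalization.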

Continue reading “Now Available: RefSeq Release 223”

Foreign Contamination Screen Tool: Now Available in Galaxy!

Check out our latest enhancements 

Do you submit genome assembly data to GenBank? If so, try out NCBI’s Foreign Contamination Screen (FCS) tool, a quality assurance process that you can run yourself. We will screen all prokaryotic and eukaryotic genome submissions to GenBank with this tool, but we encourage you to screen your data before submitting to save time. FCS offers sensitive contaminant detection to increase the quality of your genome submissions to GenBank. As part of our ongoing effort to improve your experience, we recently made several enhancements.  Continue reading “Foreign Contamination Screen Tool: Now Available in Galaxy!”