Introduction

Welcome to the Bionode documentation! Here we document only some of the currently more stable modules. If the module you are looking for is not here, please check for its GitHub repository in the list below and read the README.md file.

Bionode modules list

Bionode modules can be used as command line tools or JavaScript libraries! You can view code examples in the dark area to the right, and you can switch the programming language of the examples with the tabs in the top right.

Installation

Make sure you have the latest Node.JS installed. You can then install each module with npm (see shell examples on the right).

# Install module as command line tool available in PATH
# using npm -g option
npm install bionode-ncbi -g

# Install locally in the node_modules folder of a project
# for usage as a library. See the JavaScript tab.
npm install bionode-ncbi

bionode-ncbi

General usage

bionode-ncbi <command> [arguments] --limit [num] --pretty

## Options
--stdin, -s         Read STDIN  
--limit, -l         Limit number of results  
--throughput, -t    Number of items per API request  
--pretty, -p        Print human readable output instead of NDJSON  
export DEBUG='*'    Debug mode

# Display CLI help
bionode-ncbi --help

var ncbi = require('bionode-ncbi')

// Callback pattern
ncbi.search('genome', 'human', callback)

// Event pattern
ncbi.search('genome', 'human').on('data', console.log)

// Pipe pattern
var JSONStream = require('JSONStream')

ncbi.search('genome', 'human')
.pipe(JSONStream.stringify())
.pipe(process.stdout)

Search

Takes a database name and a query term. Returns the metadata.

For a list of NCBI database that can be used, see this documentation’s appendix

Usage

search <db> [term]

Parameter	Default	Description
db	none	One of these
term	none	Species, dataset ID, etc

bionode-ncbi search taxonomy 'solenopsis invicta' --limit 1 --pretty

ncbi.search('taxonomy', ''solenopsis invicta'').on('data', console.log)

{
  "uid": "13686",
  "status": "active",
  "rank": "species",
  "division": "ants",
  "scientificname": "Solenopsis invicta",
  "commonname": "red fire ant",
  "taxid": 13686,
  "akataxid": "",
  "genus": "Solenopsis",
  "species": "invicta",
  "subsp": "",
  "modificationdate": "2015/09/16 00:00",
  "genbankdivision": "Invertebrates"
}

// Arguments can be passed as an object instead:
ncbi.search({ db: 'sra', term: 'solenopsis' })
.on('data', console.log)

// Advanced options can be passed using the previous syntax:
var options = {
  db: 'assembly', // database to search
  term: 'human',  // optional term for search
  limit: 500,     // optional limit of NCBI results
  throughput: 100 // optional number of items per request
}

// The search term can also be passed with write:
var search = ncbi.search('sra').on('data', console.log)
search.write('solenopsis')

// Or piped, for example, from a file:
var split = require('split')

fs.createReadStream('searchTerms.txt')
.pipe(split())
.pipe(search)

Fetch

Takes a database name and a query term. Returns the data.

Usage

fetch <db> [term]

Parameter	Default	Description
db	none	One of these
term	none	Species, dataset ID, etc

bionode-ncbi fetch nucest p53 -l 1 --pretty

ncbi.fetch('nucest', 'p53').on('data', console.log)

{
  "id": "JZ923713.1 clone 186 Pelteobagrus fulvidraco spleen cDNA library Tachysurus fulvidraco cDNA similar to p53, mRNA sequence",
  "seq": "ACTCCACAACTTCACCCTGCACTTCCAGAAGTCTAGTACGGCCAAATCAGTCACCTGCACGTACTCCCCGGAGCTGAATAAACTCTTCTGTCAGTTAGCTAAGACGTGCCCTGTGCTCATGGCAGTGAGTTTTTCTCCACCACATGGTTCTGTGCTCAGAGCCACTGCTGTGT"
}

// With advanced parameters for sequence databases (all are optional):
var opts = {
   db: 'nucest',
   term: 'guillardia_theta',
   strand: 1,
   complexity: 4
 }
ncbi.fetch(opts).on('data', console.log)

For some databases there are multiple return types. A default one will be chosen automatically, however it is possible to specify this via the rettype option. The NCBI website provides a list of databasese supported by efetch here: http://www.ncbi.nlm.nih.gov/books/NBK25497/table/chapter2.T._entrez_unique_identifiers_ui/?report=objectonly

URLs

Takes either sra or assembly database name and query term. Returns URLs of datasets.

Usage

urls <dlsource> [term]

bionode-ncbi urls assembly human -l 1 -p

ncbi.urls('assembly', 'human').on('data', console.log)

{
  "uid": "1075781",
  "assembly_report": {
    "txt": "http://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/856/865/GCA_001856865.2_ASM185686v2/GCA_00185686
5.2_ASM185686v2_assembly_report.txt"
  },
  "assembly_stats": {
    "txt": "http://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/856/865/GCA_001856865.2_ASM185686v2/GCA_00185686
5.2_ASM185686v2_assembly_stats.txt"
  },
  "cds_from_genomic": {
    "fna": "http://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/856/865/GCA_001856865.2_ASM185686v2/GCA_00185686
5.2_ASM185686v2_cds_from_genomic.fna.gz"
  },
  "feature_table": {
    "txt": "http://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/856/865/GCA_001856865.2_ASM185686v2/GCA_00185686
5.2_ASM185686v2_feature_table.txt.gz"
  },
  "genomic": {
    "fna": "http://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/856/865/GCA_001856865.2_ASM185686v2/GCA_001856865.2_ASM185686v2_genomic.fna.gz",
    "gbff": "http://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/856/865/GCA_001856865.2_ASM185686v2/GCA_001856865.2_ASM185686v2_genomic.gbff.gz",
    "gff": "http://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/856/865/GCA_001856865.2_ASM185686v2/GCA_00185686
5.2_ASM185686v2_genomic.gff.gz"
  },
  "protein": {
    "faa": "http://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/856/865/GCA_001856865.2_ASM185686v2/GCA_001856865.2_ASM185686v2_protein.faa.gz",
    "gpff": "http://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/856/865/GCA_001856865.2_ASM185686v2/GCA_0018568
65.2_ASM185686v2_protein.gpff.gz"
  },
  "rna_from_genomic": {
    "fna": "http://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/856/865/GCA_001856865.2_ASM185686v2/GCA_00185686
5.2_ASM185686v2_rna_from_genomic.fna.gz"
  },
  "wgsmaster": {
    "gbff": "http://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/856/865/GCA_001856865.2_ASM185686v2/GCA_0018568
65.2_ASM185686v2_wgsmaster.gbff.gz"
  },
  "README": {
    "txt": "http://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/856/865/GCA_001856865.2_ASM185686v2/README.txt"
  },
  "annotation_hashes": {
    "txt": "http://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/856/865/GCA_001856865.2_ASM185686v2/annotation_hashes.txt"
  },
  "assembly_status": {
    "txt": "http://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/856/865/GCA_001856865.2_ASM185686v2/assembly_status.txt"
  },
  "md5checksums": {
    "txt": "http://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/856/865/GCA_001856865.2_ASM185686v2/md5checksums.txt"
  }
}

# The following examples requires a json parser
bionode-ncbi urls assembly human -l 1 -p | json genomic.fna
# Returns: http://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/856/865/GCA_001856865.2_ASM185686v2/GCA_001856865.2_ASM185686v2_genomic.fna.gz

Download

Takes either sra or assembly db name and query term. Downloads the corresponding SRA or assembly (genomic.fna) file into a folder named after the unique ID (UID).

Usage

download <dlsource> [term]

bionode-ncbi download assembly 'solenopsis invicta'

ncbi.download('assembly', 'solenopsis invicta').on('data', console.log)
.on('end', function(path) { console.log('File saved at ' + path) }

{"uid":"244018","url":"http://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/188/075/GCF_000188075.1_Si_gnG/GCF_000188075.1_Si_gnG_genomic.fna.gz","path":"244018/GCF_000188075.1_Si_gnG_genomic.fna.gz","status":"downloading","total":557056,"progress":0.4775266121244368,"speed":131072}

bionode-ncbi download assembly 'solenopsis invicta' --pretty
# Downloading GCF_000188075.1_Si_gnG_genomic.fna.gz
# [>                                            ] 0.7% of 116.65 MB (192.51 kB/s)

Link

Returns a unique ID (UID) from a destination database linked to another UID from a source database.

Usage

link <srcDB> <destDB> [srcUID]

bionode-ncbi link assembly bioproject 244018 --pretty

ncbi.link('assembly', 'bioproject', 244018).on('data', console.log)

{
  "srcDB": "assembly",
  "destDB": "bioproject",
  "srcUID": "244018",
  "destUIDs": [
    "268798",
    "49629"
  ]
}

Expand

Takes a property (e.g. biosample) and an optional destination property (e.g. sample) and looks for a field named property+id (e.g. biosampleid) in the Streamed object. Then it will do a ncbi.search for that id and save the result under Streamed object.property.

Usage

expand <property> [destProperty]

bionode-ncbi search genome 'solenopsis invicta' -l 1 | \
bionode-ncbi expand tax -s --pretty

ncbi.search('genome', 'solenopsis invicta').pipe(ncbi.expand('tax'))

{
  "uid": "2938",
  "organism_name": "Solenopsis invicta",
  "organism_kingdom": "Eukaryota",
  "organism_group": "",
  "organism_subgroup": "Insects",
  "defline": "Solenopsis invicta overview",
  "projectid": 49663,
  "project_accession": "PRJNA49663",
  "status": "Draft",
  "number_of_chromosomes": "0",
  "number_of_plasmids": "0",
  "number_of_organelles": "1",
  "assembly_name": "Si_gnG",
  "assembly_accession": "GCA_000188075.1",
  "assemblyid": 244018,
  "create_date": "2011/02/03 00:00",
  "options": "",
  "weight": "",
  "chromosome_assemblies": "0",
  "scaffold_assemblies": "1",
  "sra_genomes": "0",
  "taxid": 13686,
  "tax": {
    "uid": "13686",
    "status": "active",
    "rank": "species",
    "division": "ants",
    "scientificname": "Solenopsis invicta",
    "commonname": "red fire ant",
    "taxid": 13686,
    "akataxid": "",
    "genus": "Solenopsis",
    "species": "invicta",
    "subsp": "",
    "modificationdate": "2015/09/16 00:00",
    "genbankdivision": "Invertebrates"
  }
}

Plink

Similar to Link but takes the srcUID from a property of the Streamed object and attaches the result to a property with the name of the destination DB.

Usage

bionode-ncbi plink <property> <destDB>

bionode-ncbi search genome 'solenopsis invicta' -l 1 | \
bionode-ncbi expand tax -s | \
bionode-ncbi plink tax sra -s --pretty

ncbi.search('genome', 'solenopsis invicta')
.pipe(ncbi.expand('tax'))
.pipe(ncbi.plink('tax', 'sra')

{ "uid":"2938",
  "organism_name":"Solenopsis invicta",
  "organism_kingdom":"Eukaryota",
  "organism_group":"",
  "organism_subgroup":"Insects",
  "defline":"Solenopsis invicta overview",
  "projectid":49663,
  "project_accession":"PRJNA49663",
  "status":"Draft",
  "number_of_chromosomes":"0",
  "number_of_plasmids":"0",
  "number_of_organelles":"1",
  "assembly_name":"Si_gnG",
  "assembly_accession":"GCA_000188075.1",
  "assemblyid":244018,
  "create_date":"2011/02/03 00:00",
  "options":"",
  "weight":"",
  "chromosome_assemblies":"0",
  "scaffold_assemblies":"1",
  "sra_genomes":"0",
  "taxid":13686,
  "tax":{"uid":"13686",
  "status":"active",
  "rank":"species",
  "division":"ants",
  "scientificname":"Solenopsis invicta",
  "commonname":"red fire ant",
  "taxid":13686,
  "akataxid":"",
  "genus":"Solenopsis",
  "species":"invicta",
  "subsp":"",
  "modificationdate":"2015/09/16 00:00",
  "genbankdivision":"Invertebrates"},
  "sraid":["2209130","2209129","2209128","2209127","2209126","2209125","2209124","2209123","2209122","2209121","2209120","2209119","1094137","1094136","1094135","280243","280116","280115","280114","280113","280112","280111","280110","280109","280108","280107","280099","280098","280097","279869","279868","279867","279866","279865","279864","279863","279040","278922","278818","278816","278808","278807","278806","278805","278802","225471","225470","225469","225468","25256","25255","25254","25253","25252","25251","24418","23953","23952","23951","23920","23919","23918","23917","23914","23912","23468","23459","23457"]}

bionode-fasta

Streamable FASTA parser.

Usage

# bionode-fasta [options] [input file] [output file]
bionode-fasta input.fasta.gz output.json
# You can also use fasta files compressed with gzip
# If no output is provided, the result will be printed to stdout
# Options: -p, --path: Includes the path of the original file as a property of the output objects

// Returns a Writable Stream that parses a FASTA content Buffer
// into a JSON Buffer

var fasta = require('bionode-fasta')

fs.createReadStream('./input.fasta')
.pipe(fasta())
.pipe(process.stdout)

// Can also parse content from filenames Strings
// streamed to it

fs.createReadStream('./fasta-list.txt')
.pipe(split())
.pipe(fasta({filenameMode: true}))
.pipe(process.stdout)

{ "id": "contig1", "seq": "AGTCATGACTGACGTACGCATG" }
{ "id": "contig2", "seq": "ATGTACGTACTGCATGC" }

bionode-fasta input.fasta.gz output.json --path

// When filenames are Streamed like in the previous example,
// or passed directly to the parser Stream, they can be added
// to the output Objects
fasta({includePath: true}, './input.fasta')
.pipe(process.stdout)

{ "id": "contig1",
  "seq": "AGTCATGACTGACGTACGCATG",
  "path": "./input.fasta" }

// The output from the parser can also be available
// as Objects instead of Buffers

fasta({objectMode: true}, './input.fasta')
.on('data', console.log)

// Shortcut version of the previous example
fasta.obj('./input.fasta').on('data', console.log)

// Callback style can also be used, however they might
// not be the best for large files
fasta.obj('./input.fasta', function(data) {
  console.log(data)
})

bionode-seq

Module for DNA, RNA and protein sequences manipulation

Usage

This method currently only works as a JavaScript library and doesn’t provide a CLI interface (see issue #5).

checkType

Check sequence type

Takes a sequence string and checks if it’s DNA, RNA or protein. Follows IUPAC notation which allows ambiguous sequence notation. In this case the sequence is labelled as ambiguous nucleotide rather than amino acid sequence.

seq.checkType("ATGACCCTGAGAAGAGCACCG");
// Returns: "dna"
seq.checkType("AUGACCCUGAAGGUGAAUGAA");
// Returns: "rna"
seq.checkType("MAYKSGKRPTFFEVFKAHCSDS");
// Returns: "protein"
seq.checkType("AMTGACCCTGAGAAGAGCACCG");
// Returns: "ambiguousDna"
seq.checkType("AMUGACCCUGAAGGUGAAUGAA");
// Returns: "ambiguousRna"

Takes a sequence type argument and returns a function to complement bases.

reverse

Reverse sequence

Takes sequence string and returns the reverse sequence.

seq.reverse("ATGACCCTGAAGGTGAA");
// "AAGTGGAAGTCCCAGTA"

complement

(reverse) complement sequence

Takes a sequence string and optional boolean for reverse, and returns its complement.

seq.complement("ATGACCCTGAAGGTGAA");
// "TACTGGGACTTCCACTT"
seq.complement("ATGACCCTGAAGGTGAA", true);
// "TTCACCTTCAGGGTCAT"
//Alias
seq.reverseComplement("ATGACCCTGAAGGTGAA");
// "TTCACCTTCAGGGTCAT"

Takes a sequence string and returns the reverse complement (syntax sugar).

getTranscribedBase

Transcribe base

Takes a base character and returns the transcript base.

seq.getTranscribedBase("A");
// "U"
seq.getTranscribedBase("T");
// "A"
seq.getTranscribedBase("t");
// "a"
seq.getTranscribedBase("C");
// "G"

getTranslatedAA

Get codon amino acid

Takes an RNA codon and returns the translated amino acid.

seq.getTranslatedAA("AUG");
// "M"
seq.getTranslatedAA("GCU");
// "A"
seq.getTranslatedAA("CUU");
// "L"

removeIntrons

Remove introns

Take a sequence and an array of exonsRanges and removes them.

seq.removeIntrons("ATGACCCTGAAGGTGAATGACAG", [[1, 8]]);
// "TGACCCT"
seq.removeIntrons("ATGACCCTGAAGGTGAATGACAG", [[2, 9], [12, 20]]);
// "GACCCTGGTGAATGA"

transcribe

Transcribe sequence

Takes a sequence string and returns the transcribed sequence (dna <-> rna). If an array of exons is given, the introns will be removed from the sequence.

seq.transcribe("ATGACCCTGAAGGTGAA");
// "AUGACCCUGAAGGUGAA"
seq.transcribe("AUGACCCUGAAGGUGAA"); //reverse
// "ATGACCCTGAAGGTGAA"

translate

Translate sequence

Takes a DNA or RNA sequence and translates it to protein If an array of exons is given, the introns will be removed from the sequence.

seq.translate("ATGACCCTGAAGGTGAATGACAGGAAGCCCAAC"); //dna
// "MTLKVNDRKPN"
seq.translate("AUGACCCUGAAGGUGAAUGACAGGAAGCCCAAC"); //rna
// "MTLKVNDRKPN"
seq.translate("ATGACCCTGAAGGTGAATGACAGGAAGCC", [[3, 21]]);
// "LKVND"

reverseExons

Reverse exons

Takes an array of exons and the length of the reference and returns inverted coordinates.

seq.reverseExons([[2,8]], 20);
// [ [ 12, 18 ] ]
seq.reverseExons([[10,45], [65,105]], 180);
// [ [ 135, 170 ], [ 75, 115 ] ]

checkCanonicalTranslationStartSite

Find non-canonical translation start site

Takes a sequence and returns boolean for canonical translation start site.

seq.checkCanonicalTranslationStartSite("ATGACCCTGAAGGT");
// true
seq.checkCanonicalTranslationStartSite("AATGACCCTGAAGGT");
// false

getReadingFrames

Get reading frames

Takes a sequence and returns an array with the six possible Reading Frames (+1, +2, +3, -1, -2, -3).

seq.getReadingFrames("ATGACCCTGAAGGTGAATGACAGGAAGCCCAAC");
// [ 'ATGACCCTGAAGGTGAATGACAGGAAGCCCAAC',
//   'TGACCCTGAAGGTGAATGACAGGAAGCCCAAC',
//   'GACCCTGAAGGTGAATGACAGGAAGCCCAAC',
//   'GTTGGGCTTCCTGTCATTCACCTTCAGGGTCAT',
//   'TTGGGCTTCCTGTCATTCACCTTCAGGGTCAT',
//   'TGGGCTTCCTGTCATTCACCTTCAGGGTCAT' ]

getOpenReadingFrames

Get open reading frames

Takes a Reading Frame sequence and returns an array of Open Reading Frames.

seq.getOpenReadingFrames("ATGACCCTGAAGGTGAATGACAGGAAGCCCAAC");
// [ 'ATGACCCTGAAGGTGAATGACAGGAAGCCCAAC' ]
seq.getOpenReadingFrames("AUGACCCUGAAGGUGAAUGACAGGAAGCCCAAC");
// [ 'AUGACCCUGAAGGUGAAUGACAGGAAGCCCAAC' ]
seq.getOpenReadingFrames("ATGAGAAGCCCAACATGAGGACTGA");
// [ 'ATGAGAAGCCCAACATGA', 'GGACTGA' ]

getAllOpenReadingFrames

Get all open reading frames

Takes a sequence and returns all Open Reading Frames in the six Reading Frames.

seq.getAllOpenReadingFrames("ATGACCCTGAAGGTGAATGACA");
// [ [ 'ATGACCCTGAAGGTGAATGACA' ],
//  [ 'TGA', 'CCCTGA', 'AGGTGA', 'ATGACA' ],
//  [ 'GACCCTGAAGGTGAATGA', 'CA' ],
//  [ 'TGTCATTCACCTTCAGGGTCAT' ],
//  [ 'GTCATTCACCTTCAGGGTCAT' ],
//  [ 'TCATTCACCTTCAGGGTCAT' ] ]

findLongestOpenReadingFrame

Find longest open reading frame

Takes a sequence and returns the longest ORF from all six reading frames and corresponding frame symbol (+1, +2, +3, -1, -2, -3). If a frame symbol is specified, only look for longest ORF on that frame. When sorting ORFs, if there’s a tie, choose the one that starts with start codon Methionine. If there’s still a tie, return one randomly.

seq.findLongestOpenReadingFrame("ATGACCCTGAAGGTGAATGACA");
// [ 'ATGACCCTGAAGGTGAATGACA', '+1' ]
seq.findLongestOpenReadingFrame("ATGACCCTGAAGGTGAATGACA", "-1");
// "TGTCATTCACCTTCAGGGTCAT"

Appendix

bionode-ncbi databases

Database	Contains
gquery	All Databases
assembly	Genome assembly information
bioproject	Data related to a single initiative
biosample	Biological source materials used in experimental assays
biosystems	Literature, small molecules, and sequence data grouped by biological relationships
books	Books and documents in life science and healthcare
clinvar	Genomic variation and its relationship to human health
clone	Clones and libraries, including sequence data, map positions and distributor information
cdd	Conserved Domains. Annotation of functional units in proteins.
gap	Genotypes and Phenotypes (dbGaP). Interaction of genotype and phenotype in Humans
dbvar	Genomic structural variation – insertions, deletions, duplications, inversions, mobile element insertions, translocations, and complex chromosomal rearrangements
nucest	Expressed Sequence Tags (dbEST). Short single-read transcript sequences from GenBank
gene	Genes, focusing on genomes that have been completely sequenced
genome	Genomes including sequences, maps, chromosomes, assemblies, and annotations
gds	GEO DataSets. Curated gene expression and molecular abundance DataSets assembled from the Gene Expression Omnibus (GEO)
geoprofiles	GEO Profiles. Individual gene expression and molecular abundance Profiles assembled from the Gene Expression Omnibus (GEO)
nucgss	Database of Genome Survey Sequences (dbGSS). A division of GenBank that contains short single-pass reads of genomic DNA.
gtr	Genetic Testing Registry (GTR). A voluntary registry of genetic tests and laboratories
homologene	A gene homology tool that compares nucleotide sequences between pairs of organisms in order to identify putative orthologs
medgen	A portal to information about medical genetics
mesh	MeSH (Medical Subject Headings) is the NLM controlled vocabulary thesaurus used for indexing articles for PubMed
ncbisearch	NCBI Web Site
nlmcatalog	NLM bibliographic data for journals, books, audiovisuals, computer software, electronic resources and other materials
nuccore	Nucleotide. A collection of sequences from several sources, including GenBank, RefSeq, TPA and PDB
omim	Online Mendelian Inheritance in Man (OMIM). Human genes and genetic disorders
pmc	PubMed Central (PMC). Full-text biomedical and life sciences journal literature, including clinical medicine and public health.
popset	Related DNA sequences that originate from comparative studies
probe	Nucleic acid reagents designed for use in a wide variety of biomedical research applications, together with information on reagent distributors, probe effectiveness, and computed sequence similarities
protein	Protein sequence records from a variety of sources, including GenPept, RefSeq, Swiss-Prot, PIR, PRF, and PDB.
proteinclusters	Protein sequences (clusters), consisting of Reference Sequence proteins encoded by complete prokaryotic and organelle plasmids and genomes
pcassay	PubChem BioAssay. Bioactivity assays used to screen the chemical substances contained in the PubChem Substance database
pccompound	PubChem Compound. Contains unique, validated chemical structures (small molecules)
pcsubstance	PubChem Substance. Samples and links to biological screening results that are available in PubChem BioAssay
pubmed	Citations and abstracts for biomedical literature from MEDLINE and additional life science journals
pubmedhealth	Clinical effectiveness reviews and other resources to help consumers and clinicians use and understand clinical research results. These are drawn from the NCBI Bookshelf and PubMed, including published systematic reviews from organizations such as the Agency for Health Care Research and Quality, The Cochrane Collaboration, and others (see complete listing). Links to full text articles are provided when available.
snp	A variety of tools are available for searching the SNP database, allowing search by genotype, method, population, submitter, markers and sequence similarity using BLAST. These are linked under “"Search”“ on the left side bar of the dbSNP main page.
sparcle	SPARCLE (Subfamily Protein Architecture Labeling Engine) is a resource for the functional characterization and labeling of protein sequences that have been grouped by their characteristic conserved domain architecture
sra	Sequence Read Archive (SRA). Sequencing data from the next generation of sequencing platforms
structure	Structure (Molecular Modeling Database). Macromolecular 3D structures derived from the Protein Data Bank
taxonomy	Phylogenetic lineages of more than 160,000 organisms
toolkit	The NCBI C++ Toolkit
toolkitall	The NCBI C++ Toolkit
toolkitbook	The NCBI C++ Toolkit
toolkitbookgh	The NCBI C++ Toolkit
unigene	Expressed Sequence Tags (ESTs) organized by organism, tissue type and developmental stage