HUMAnN

The HMP Unified Metabolic Analysis Network

What is HUMAnN?

HUMAnN (HMP Unified Metabolic Analysis Network) is a pipeline for efficiently and accurately profiling the presence/absence and abundance of microbial pathways in a community from metagenomic or metatranscriptomic sequencing data.

HUMAnN answers the question: “What metabolic functions is the microbial community performing?”

Current version: HUMAnN 3 (also called HUMAnN3)


When to Use HUMAnN

Use HUMAnN when you have shotgun metagenomic (or metatranscriptomic) data and you want to quantify:

  • Gene families: protein-coding genes, grouped into UniRef clusters
  • Metabolic pathways: abundance of pathways defined in the MetaCyc database
  • Pathway coverage: how completely each pathway is present

Note

For 16S amplicon data, use PICRUSt2 instead for functional prediction.


Installation

Via pip

pip install humann

Databases

HUMAnN requires reference databases (ChocoPhlAn and UniRef). Download them after installation:

humann_databases --download chocophlan full /path/to/databases
humann_databases --download uniref uniref90_diamond /path/to/databases

Basic Usage

humann \
  --input sample.fastq.gz \
  --output output_directory/ \
  --threads 8
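
For more than one sample, the same command is typically repeated per input file. The following is a minimal dry-run sketch of such a loop: it only prints the commands rather than running HUMAnN. The helper name build_humann_cmd, the reads/ layout, and the sample file names are hypothetical and used for illustration only.

```shell
set -eu

# Hypothetical helper: print (do not run) the humann command for one FASTQ file.
# Only the --input, --output and --threads options are used.
build_humann_cmd() {
  local fastq="$1" out_dir="$2" threads="$3"
  printf 'humann --input %s --output %s --threads %s\n' \
    "$fastq" "$out_dir" "$threads"
}

# Hypothetical sample files; replace with your own paths.
for fastq in reads/sampleA.fastq.gz reads/sampleB.fastq.gz; do
  build_humann_cmd "$fastq" output_directory/ 8
done
```

Once the printed commands look right, they can be run directly, for example by piping the loop's output to sh.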

Key options

Option                  Description
--input                 Input FASTQ file (can be gzipped)
--output                Output directory
--threads               Number of CPU threads
--taxonomic-profile     Pre-computed MetaPhlAn profile (speeds up run)
--protein-database      Path to UniRef database
--nucleotide-database   Path to ChocoPhlAn database

Output Files

HUMAnN produces three main output tables:

File                    Contents
*_genefamilies.tsv      Abundance of UniRef90 gene families
*_pathabundance.tsv     Abundance of metabolic pathways
*_pathcoverage.tsv      Fraction of each pathway covered

Each table reports both community-level totals and per-species contributions (stratified output).
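
To make the stratified layout concrete, here is a small sketch that separates community-level rows from per-species rows. It assumes that stratified rows carry a "|" followed by the species name in the first column. The table contents, pathway ID, and species names are made up for illustration. HUMAnN ships its own utility for this task (humann_split_stratified_table); the awk version only shows the idea.

```shell
set -eu

workdir="$(mktemp -d)"
table="$workdir/demo_pathabundance.tsv"

# Made-up, tab-separated rows imitating the table layout (values are illustrative).
printf '# Pathway\tdemo_Abundance\n' > "$table"
printf 'PWY-0001: example pathway\t30\n' >> "$table"
printf 'PWY-0001: example pathway|g__GenusA.s__GenusA_sp1\t20\n' >> "$table"
printf 'PWY-0001: example pathway|g__GenusB.s__GenusB_sp1\t10\n' >> "$table"

# Community-level totals: keep the header and rows whose name has no "|".
awk -F'\t' 'NR == 1 || index($1, "|") == 0' "$table" > "$workdir/unstratified.tsv"

# Per-species contributions: keep the header and rows whose name contains "|".
awk -F'\t' 'NR == 1 || index($1, "|") > 0' "$table" > "$workdir/stratified.tsv"

# Count data rows (header excluded) in each output.
unstratified_rows="$(awk 'NR > 1 { n++ } END { print n + 0 }' "$workdir/unstratified.tsv")"
stratified_rows="$(awk 'NR > 1 { n++ } END { print n + 0 }' "$workdir/stratified.tsv")"
echo "community rows: $unstratified_rows, per-species rows: $stratified_rows"
```

In this toy table the two per-species values (20 and 10) add up to the community total (30).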

Renormalizing output

# Normalize to copies per million (CPM)
humann_renorm_table \
  --input sample_genefamilies.tsv \
  --output sample_genefamilies_cpm.tsv \
  --units cpm
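
As a worked illustration of the arithmetic, the sketch below computes copies per million on a made-up table, assuming CPM means each value divided by the column total and multiplied by one million. It is not a replacement for humann_renorm_table: it ignores stratified rows and special rows such as UNMAPPED, and the gene family names and values are hypothetical.

```shell
set -eu

workdir="$(mktemp -d)"
table="$workdir/demo_genefamilies.tsv"

# Made-up, tab-separated abundances (illustrative values only).
printf '# Gene Family\tdemo_Abundance-RPKs\n' > "$table"
printf 'UniRef90_A\t6\n' >> "$table"
printf 'UniRef90_B\t3\n' >> "$table"
printf 'UniRef90_C\t1\n' >> "$table"

# Two passes over the same file: the first sums the column,
# the second rescales each value to copies per million.
cpm_table="$(awk -F'\t' '
  NR == FNR { if (FNR > 1) total += $2; next }   # pass 1: column total
  FNR == 1  { print; next }                      # pass 2: keep the header
  { printf "%s\t%.0f\n", $1, $2 / total * 1000000 }
' "$table" "$table")"

printf '%s\n' "$cpm_table"
```

With a column total of 10, the values 6, 3 and 1 become 600000, 300000 and 100000, which sum to one million.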

Joining multiple samples

humann_join_tables \
  --input output_directory/ \
  --output all_samples_pathabundance.tsv \
  --file_name pathabundance

Tips & Gotchas

Tip

Speed up runs by providing a pre-computed MetaPhlAn profile with --taxonomic-profile. This skips the MetaPhlAn step.

Warning

Memory requirements: the UniRef90 database can require 40+ GB of RAM for DIAMOND alignment. Consider using UniRef50 (uniref50_diamond) on smaller machines.

Tip

Low-depth samples may produce uninformative results. HUMAnN works best with at least 10 million reads.
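
A quick way to check sequencing depth before a run is to count the reads in the FASTQ file. The sketch below assumes a standard four-line-per-record FASTQ (no wrapped sequence lines). The helper name count_reads and the two-read demo file are hypothetical; the 10 million threshold is the figure from the tip above.

```shell
set -eu

workdir="$(mktemp -d)"
fastq="$workdir/demo.fastq.gz"

# Tiny made-up FASTQ with two reads, gzipped like a typical input file.
printf '@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nIIII\n' | gzip -c > "$fastq"

# Hypothetical helper: number of reads = number of lines / 4.
count_reads() {
  gzip -cd "$1" | awk 'END { printf "%d\n", NR / 4 }'
}

min_reads=10000000
n_reads="$(count_reads "$fastq")"

if [ "$n_reads" -lt "$min_reads" ]; then
  echo "warning: $fastq has $n_reads reads (fewer than $min_reads)" >&2
fi
```

For paired-end data, apply the same count to each file.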


Further Reading