PhyloPhlAn
High-resolution Microbial Phylogenetics
What is PhyloPhlAn?
PhyloPhlAn is a tool for high-resolution microbial phylogeny reconstruction, genome characterization, and taxonomic assignment of microbial genomes. It uses a large database of universal single-copy marker proteins to build comprehensive and accurate phylogenetic trees from whole genomes or metagenome-assembled genomes (MAGs).
PhyloPhlAn answers the question: "Where does this genome fit on the tree of life?"
Current version: PhyloPhlAn 3
- GitHub
- Documentation
- Paper: Asnicar et al. 2020, Nature Communications
When to Use PhyloPhlAn
Use PhyloPhlAn when you want to:
- Place new genomes or MAGs on a reference phylogenetic tree
- Characterize the taxonomy of novel organisms
- Build high-quality, large-scale phylogenetic trees
- Assign taxonomy to unannotated genomes
Installation
Via conda (recommended)
conda create -n phylophlan -c bioconda -c conda-forge phylophlan
conda activate phylophlan
Via pip
pip install phylophlan
External dependencies
PhyloPhlAn requires:
- mash: for initial genome sketching and clustering
- muscle or mafft: for multiple sequence alignment
- trimal: for alignment trimming
- raxml or iqtree: for phylogenetic tree inference
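Before installing anything, it can help to see which of these tools are already on your PATH. The following is a minimal sketch, not part of PhyloPhlAn: `missing_tools` is a hypothetical helper, and the executable names passed to it are assumptions that may differ on your system (RAxML, for example, is often installed as `raxmlHPC`).

```shell
#!/usr/bin/env bash
# Print every command name given as an argument that is NOT found on PATH.
# Returns 0 when all commands are present, 1 otherwise.
missing_tools() {
    local rc=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"
}

# Executable names are assumptions; adjust them to match your installation.
missing_tools mash mafft trimal \
    || echo "The tools listed above were not found on PATH" >&2
```

Passing the names as arguments keeps the helper independent of any particular aligner or tree-inference choice.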
Install all at once via conda:
conda install -c bioconda mash muscle mafft trimal raxml iqtree
Basic Usage
Database setup
# List available databases
phylophlan --database_list
# Download a database (e.g., for genome phylogeny)
phylophlan_setup_database \
-d phylophlan \
--database_folder databases/
Phylogenetic tree construction
phylophlan \
-i genome_folder/ \
-d phylophlan \
--databases_folder databases/ \
-o output_tree/ \
--diversity medium \
--nproc 8 \
-f supermatrix_aa.cfg
Metagenomic strain tracking
phylophlan_metagenomic \
-i mag_folder/ \
-d SGB.Jan19 \
--databases_folder databases/ \
-o output_sgbs/ \
--nproc 8
Key Modes
| Mode | Command | Use case |
|---|---|---|
| Genome phylogeny | phylophlan | Build trees from genomes |
| MAG classification | phylophlan_metagenomic | Assign MAGs to SGBs |
| Database creation | phylophlan_setup_database | Build custom databases |
Output Files
| File | Contents |
|---|---|
| *.tre | Newick-format phylogenetic tree |
| *.xml | PhyloXML-format tree |
| *_refined.tre | Tree after outlier removal |
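A quick sanity check on a Newick output is to confirm that the number of leaves matches the number of input genomes. The sketch below relies on a general property of Newick rather than anything PhyloPhlAn-specific: each internal node with k children contributes k-1 commas, so the leaf count equals the comma count plus one. It assumes the file holds a single tree and that no label contains a comma. `count_leaves` and the toy tree are illustrative only.

```shell
#!/usr/bin/env bash
# Count the leaves of a single Newick tree stored in the file "$1".
# leaves = commas + 1, assuming no commas appear inside labels.
count_leaves() {
    local commas
    commas=$(tr -cd ',' < "$1" | wc -c)
    echo $((commas + 1))
}

# Toy tree standing in for a real output file (hypothetical content).
toy_tree=$(mktemp)
printf '((A:0.1,B:0.2):0.05,(C:0.3,D:0.4):0.07,E:0.5);\n' > "$toy_tree"

count_leaves "$toy_tree"   # prints 5
```

If the count is lower than the number of genomes you supplied, some inputs were probably dropped before tree inference, and the run log is the place to look.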
Tips & Gotchas
Choose the right diversity setting: use --diversity low for species- and strain-level phylogenies, medium for genus- and family-level phylogenies, and high for tree-of-life or other higher-rank phylogenies.
Input genome quality matters: highly fragmented or low-quality genomes (e.g., MAGs with <50% completeness) may produce poorly placed branches. Filter by completeness using CheckM first.
PhyloPhlAn integrates with MetaPhlAn: MetaPhlAn's SGB (Species-level Genome Bins) taxonomy is based on PhyloPhlAn trees.
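The completeness filter suggested above can be sketched with awk. This example assumes a simplified, hypothetical quality table: tab-separated, one header row, and the columns genome_id, completeness, contamination. Real CheckM output has more columns, so the column index would need adjusting. `filter_by_completeness` is an illustrative helper, not a CheckM or PhyloPhlAn command.

```shell
#!/usr/bin/env bash
# Print the IDs of genomes whose completeness is at least a threshold.
# Assumed layout (hypothetical): genome_id <TAB> completeness <TAB> contamination,
# with a header on the first line.
filter_by_completeness() {
    local table=$1 min_completeness=$2
    awk -F'\t' -v min="$min_completeness" \
        'NR > 1 && $2 + 0 >= min + 0 { print $1 }' "$table"
}

# Toy table with made-up values, for illustration only.
quality_table=$(mktemp)
printf 'genome_id\tcompleteness\tcontamination\n' >  "$quality_table"
printf 'mag_001\t97.5\t1.2\n'                     >> "$quality_table"
printf 'mag_002\t42.0\t0.8\n'                     >> "$quality_table"
printf 'mag_003\t50.0\t3.1\n'                     >> "$quality_table"

filter_by_completeness "$quality_table" 50   # prints mag_001 and mag_003
```

The resulting IDs can then be used to copy or symlink only the passing genomes into the input folder.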