

WAAFLE integrates gene sequence homology and taxonomic provenance to identify metagenomic contigs explained by pairs of microbial clades but not by single clades. More specifically, for each locus in a contig, WAAFLE identifies the best hit to each species in a pangenome database. WAAFLE then looks for a species whose minimum per-locus score exceeds a lenient homology threshold (k1). If one or more species meet this criterion, then the contig is assigned to the species with the best average score. Otherwise, the process is repeated for pairs of species. If all per-locus scores for a pair of species exceed a stringent homology threshold (k2), then the contig is considered a putative LGT between those species.

Both examples consider contigs with six protein-coding loci (determined by WAAFLE itself or by an independent ORF-calling program such as Prodigal). In Example 1, genes from species C are able to explain all of the loci reasonably well (with scores exceeding k1). Hence, WAAFLE will report this contig as a one-species contig explained by species C. In Example 2, no single species can explain all of the loci (the minimum score for each species is below k1). However, the pair of species A and B has strong hits (> k2) to all loci, and so WAAFLE concludes that this contig may represent an A+B LGT. Given the AABBAA synteny pattern, a B-to-A transfer would appear to be the more likely mechanism.
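
The two-stage decision described here (a lenient single-species test against k1, then a stringent pairwise test against k2) can be illustrated with a minimal Python sketch. This is not WAAFLE's actual code. The function name `classify_contig`, the threshold values, and all scores are hypothetical. Three details are assumptions that the text does not state: a pair's per-locus score is taken as the better of the two species' scores, competing pairs are ranked by average score, and contigs that pass neither test are labeled "unexplained".

```python
from itertools import combinations
from statistics import mean


def classify_contig(scores, k1, k2):
    """Classify a contig from per-locus best-hit scores.

    scores: dict mapping species name -> list of per-locus scores,
            one entry per locus, same length for every species.
    k1:     lenient homology threshold (single-species test).
    k2:     stringent homology threshold (species-pair test).
    """
    # Stage 1: species whose *minimum* per-locus score exceeds k1.
    singles = [sp for sp, s in scores.items() if min(s) > k1]
    if singles:
        # Assign the contig to the species with the best average score.
        best = max(singles, key=lambda sp: mean(scores[sp]))
        return ("one-species", (best,))

    # Stage 2: repeat for pairs of species.
    # Assumption: a pair's score at a locus is the better of the two hits.
    candidates = []
    for a, b in combinations(sorted(scores), 2):
        per_locus = [max(x, y) for x, y in zip(scores[a], scores[b])]
        if min(per_locus) > k2:
            candidates.append((mean(per_locus), (a, b)))
    if candidates:
        # Assumption: rank competing pairs by their average score.
        return ("putative-LGT", max(candidates)[1])

    # Assumption: label for contigs that pass neither test.
    return ("unexplained", ())


# Hypothetical thresholds and scores for two six-locus contigs.
K1, K2 = 0.5, 0.8

example_1 = {
    "A": [0.90, 0.90, 0.20, 0.30, 0.90, 0.90],
    "B": [0.20, 0.10, 0.95, 0.90, 0.30, 0.20],
    "C": [0.70, 0.75, 0.70, 0.65, 0.70, 0.80],  # every locus above K1
}
example_2 = {
    "A": [0.95, 0.90, 0.10, 0.20, 0.90, 0.95],  # strong at loci 1, 2, 5, 6
    "B": [0.20, 0.10, 0.90, 0.95, 0.30, 0.20],  # strong at loci 3, 4
    "C": [0.60, 0.60, 0.30, 0.40, 0.60, 0.60],  # minimum below K1
}

print(classify_contig(example_1, K1, K2))  # ('one-species', ('C',))
print(classify_contig(example_2, K1, K2))  # ('putative-LGT', ('A', 'B'))
```

Testing single species first, and with the lenient threshold, makes the one-species explanation the default. A contig is only called a putative LGT when no single species can account for every locus and a pair clears the stricter bar at all loci.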
