Introduction

This report contains tables and plots to help interpret the results of wf-amplicon. The workflow was run in variant calling mode. The individual sections of the report summarize the outcomes of the different steps of the workflow (read filtering, mapping against the reference file containing the amplicon sequences, variant calling).

Note: If the sequence IDs in the reference file contained special characters, they were replaced with underscores.

The input data contained:

1 sample:
sample_bc20

1 amplicon:
folp1_amplicon_leprae

At a glance

Key results for the individual samples are shown below. You can use the dropdown menu to view the results for a different sample.

Reads: 2,306
Bases: 656,409
Mean length: 284.7
Mean quality: 11.7
Amplicons detected: 1 / 1
Mean coverage across all amplicons: 1697.6
Smallest mean coverage for any amplicon: 1697.6
SNVs: 1
Indels: 0

Preprocessing

The table below shows basic statistics for the raw reads, for the reads remaining after the initial filtering step (based on length and mean quality), and for the reads remaining after downsampling and trimming.

Condition             Reads  Bases    Min read length  Max read length  Mean quality
Raw                   2.3 k  903.0 k  112              1,465            10.6
Filtered              2.3 k  903.0 k  112              1,465            10.6
Downsampled, trimmed  2.3 k  656.4 k  46               1,349            11.7
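
The length and mean-quality filter described above can be sketched in Python as follows. This is a minimal illustration, not the workflow's own implementation: the function names `mean_quality` and `passes_filter` are hypothetical, and the sketch assumes that the mean quality is obtained by averaging the per-base error probabilities and converting the result back to the Phred scale (rather than averaging Phred scores directly). The default thresholds mirror `min_read_length` (80) and `min_read_qual` (10) from the workflow parameters.

```python
import math
from typing import Optional


def mean_quality(qual: str, offset: int = 33) -> float:
    """Mean Phred quality of a FASTQ quality string.

    Assumed method: average the per-base error probabilities,
    then convert the mean probability back to a Phred score.
    """
    if not qual:
        return 0.0
    probs = [10 ** (-(ord(c) - offset) / 10) for c in qual]
    return -10 * math.log10(sum(probs) / len(probs))


def passes_filter(
    seq: str,
    qual: str,
    min_len: int = 80,               # mirrors min_read_length
    max_len: Optional[int] = None,   # mirrors max_read_length (None = no limit)
    min_qual: float = 10.0,          # mirrors min_read_qual
) -> bool:
    """Return True if the read satisfies the length and quality thresholds."""
    if len(seq) < min_len:
        return False
    if max_len is not None and len(seq) > max_len:
        return False
    return mean_quality(qual) >= min_qual
```

With these thresholds, a read must be at least 80 bases long and have a mean quality of at least 10 to be kept. In this run the raw and filtered rows are identical, so no reads were removed at this step.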

The following plots show the read quality and length distributions as well as the base yield after filtering (but before downsampling / trimming) for each sample (use the dropdown menu to view the plots for the individual samples).

Summary

The two tables below (one per tab) briefly summarize the main results of mapping the reads to the provided amplicon references and subsequent variant calling. Percentages of unmapped reads are relative to the number of reads for that particular sample. Other percentages are relative to the total number of reads / bases including all samples.

Sample alias  Barcode    Type         Reads         Bases           Median read length  Amplicons  Unmapped   Variants (indels)
sample_bc20   barcode20  test_sample  2.3 k (100%)  656.4 k (100%)  317                 1          540 (23%)  1 (0)

Amplicon               Reads        Bases          Median read length  Samples  Mean cov.  Mean acc.  Variants (indels)
folp1_amplicon_leprae  1.8 k (77%)  570.6 k (87%)  319                 1        98.5       93.2       1 (0)
Unmapped               540 (23%)    85.9 k (13%)   141                 1        0.0        0.0        0 (0)
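
As a worked check of how the percentages above are derived, the sketch below recomputes them from the unrounded totals reported in the "At a glance" section (2,306 reads, 656,409 bases). The helper `pct` is hypothetical, the mapped read count (2,306 - 540 = 1,766) is inferred from the tables rather than stated in the report, and the per-amplicon base counts are the rounded values shown in the table. With a single sample, the per-sample and all-sample denominators coincide.

```python
def pct(part: float, total: float) -> int:
    """Percentage of `part` in `total`, rounded to the nearest integer."""
    return round(100 * part / total) if total else 0


total_reads = 2306
total_bases = 656_409
unmapped_reads = 540
mapped_reads = total_reads - unmapped_reads  # 1766, displayed as "1.8 k"

# Read percentages
unmapped_reads_pct = pct(unmapped_reads, total_reads)
mapped_reads_pct = pct(mapped_reads, total_reads)

# Base percentages, using the rounded base counts from the table
mapped_bases_pct = pct(570_600, total_bases)
unmapped_bases_pct = pct(85_900, total_bases)

print(mapped_reads_pct, unmapped_reads_pct, mapped_bases_pct, unmapped_bases_pct)
```

The recomputed values (77, 23, 87, 13) agree with the percentages shown in the tables.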

The following table breaks the results down further (one sample–amplicon combination per row).

Sample       Amplicon               Reads        Bases          Median read length  Mean cov.  Mean acc.  Variants (indels)
sample_bc20  folp1_amplicon_leprae  1.8 k (77%)  570.6 k (87%)  319                 98.5       93.2       1 (0)
sample_bc20  Unmapped               540 (23%)    85.9 k (13%)   141                 0.0        0.0        0 (0)

Depth of coverage

Coverage along the individual amplicons (use the dropdown menu to view the plots for the individual amplicons).

Variants

Haploid variant calling was performed with Medaka. Variants with low depth (i.e. depth below --min_coverage) are shown under the "Low depth" tab. The numbers in the "Depth" column refer to the sequencing depth used to perform variant calling.

Sample       Amplicon               Position  Ref. allele  Alt. allele  Type  Depth
sample_bc20  folp1_amplicon_leprae  89        C            T            SNP   300

Low depth:
Sample  Amplicon  Position  Ref. allele  Alt. allele  Type  Depth
(no variants)
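
The split between the main variant table and the "Low depth" tab can be illustrated as below. This is a sketch only: the `Variant` class and the `split_by_depth` function are hypothetical names, and the default threshold mirrors `min_coverage` (20) from the workflow parameters. The reported depth of 300 would be consistent with the `medaka_target_depth_per_strand` setting of 150 applied to both strands, although the report does not state this explicitly.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Variant:
    sample: str
    amplicon: str
    position: int
    ref: str
    alt: str
    depth: int


def split_by_depth(
    variants: List[Variant], min_coverage: int = 20
) -> Tuple[List[Variant], List[Variant]]:
    """Separate variants into (sufficient depth, low depth).

    A variant counts as low depth when its depth is smaller than min_coverage.
    """
    passed = [v for v in variants if v.depth >= min_coverage]
    low = [v for v in variants if v.depth < min_coverage]
    return passed, low


variants = [Variant("sample_bc20", "folp1_amplicon_leprae", 89, "C", "T", 300)]
passed, low_depth = split_by_depth(variants)
```

For this report, the single SNV at position 89 has a depth of 300 and therefore lands in the main table, leaving the low-depth list empty.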

Software versions

Name Version
python 3.8.19
fastcat 0.18.6
ezcharts 0.11.2
dominate 2.9.1
numpy 1.24.4
pandas 2.0.3
pysam 0.22.0
si-prefix 1.3.3
seqkit v2.8.2
porechop 0.2.4
samtools 1.19.2
minimap2 2.28-r1209
mosdepth 0.3.7
miniasm 0.3-r179
racon 1.5.0
csvtk 0.27.2
medaka 2.2.0

Workflow parameters

Key Value
fastq /data/new-hansen/lep_amr/runs/20260613_122631_02b071fd/work/input_fastq
reference /data/new-hansen/lep_amr/runs/20260613_122631_02b071fd/work/screening/selected_amplicons.fasta
out_dir /data/new-hansen/lep_amr/runs/20260613_122631_02b071fd/work/epi2me_output
sample None
sample_sheet /data/new-hansen/lep_amr/runs/20260613_122631_02b071fd/work/sample_sheet.csv
analyse_unclassified False
combine_results False
igv False
store_dir None
min_read_length 80
max_read_length None
min_read_qual 10
drop_frac_longest_reads 0
take_longest_remaining_reads False
reads_downsampling_size 0
min_n_reads 40
number_depth_windows 100
min_coverage 20
override_basecaller_cfg dna_r10.4.1_e8.2_400bps_hac@v4.2.0
medaka_target_depth_per_strand 150
force_spoa_length_threshold 2000
spoa_minimum_relative_coverage 0.15
spoa_max_allowed_read_length 5000
minimum_mean_depth 30
primary_alignments_threshold 0.7
threads 4
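
For downstream scripting, the key/value listing above can be read into a typed dictionary. The sketch below is an illustration under stated assumptions: `parse_value` and `parse_params` are hypothetical helpers, each line is assumed to hold a key, a single space, and a value, and the strings `None`, `True`/`False`, and numeric values are converted to Python values while everything else (such as paths) is kept as a string.

```python
from typing import Dict, Union

ParamValue = Union[None, bool, int, float, str]


def parse_value(raw: str) -> ParamValue:
    """Convert a raw table value to None, bool, int, float, or str."""
    if raw == "None":
        return None
    if raw in ("True", "False"):
        return raw == "True"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            continue
    return raw


def parse_params(text: str) -> Dict[str, ParamValue]:
    """Parse 'key value' lines, skipping blank lines and the header row."""
    params: Dict[str, ParamValue] = {}
    for line in text.strip().splitlines():
        key, _, raw = line.strip().partition(" ")
        if not key or (key, raw) == ("Key", "Value"):
            continue
        params[key] = parse_value(raw.strip())
    return params


example = """Key Value
min_read_length 80
max_read_length None
igv False
primary_alignments_threshold 0.7
override_basecaller_cfg dna_r10.4.1_e8.2_400bps_hac@v4.2.0
"""
params = parse_params(example)
```

Here `params["min_read_length"]` is the integer 80, `params["max_read_length"]` is `None`, and the basecaller configuration stays a plain string.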