Each event in the VCF is checked against a few independent noise-flag rules. A flag does not drop the event from the output: flagged events still appear in the HTML report, but they render greyed/dashed in the figures so the eye goes to the unflagged signal first.
The raise / lower columns describe what happens when you raise or lower each value. The direction is not the same for every option: raising --cov-ratio flags fewer events, raising --cov-vaf-max flags more, and raising --min-svlen hides more small events. Read each row on its own.

| flag | default | raise → | lower → |
|---|---|---|---|
| --cov-ratio | auto (= max(2.0, p99)) | passing a higher fixed number (e.g. --cov-ratio 4) flags fewer events; only extreme coverage spikes get caught. | passing a lower fixed number (e.g. --cov-ratio 2) flags more events; even mild coverage anomalies are flagged. The default auto adapts per sample. |
| --cov-vaf-max | 0.35 | flags more high-coverage events as noise (some real mosaic / subgermline events may get greyed). | flags fewer; only unambiguously low-VAF artefacts. |
| --mark-acrocentric | on for hg38, off for t2t | binary: on means chr13/14/15/21/22 p-arm-only BNDs render grey. Turn it off when investigating real acrocentric biology, or for T2T (where those p-arms are real, fully-resolved sequence). | |
| --min-svlen | 50 | shows only larger SVs everywhere (density tracks and stdout summaries); cleaner picture, but small events (Alu insertions, micro-deletions) drop out. | includes very small events (10–50 bp); polymorphism noise dominates. Set 0 to disable. |
| --focus-window | 1000 | accepts more candidate matches around a focus position; useful for IMPRECISE breakpoints. | tighter exact-position match. |

--cov-ratio + --cov-vaf-max

A long-read SV with max(COVERAGE) ≥ N × genome-median coverage AND VAF < threshold is the canonical repeat-collapse / mismapping signature: read pile-up is high (multiple repeat copies aligning to the same locus) but the variant fraction stays low (most reads agree with the reference). Default flag rule: VAF < 0.35 and the per-event max(COVERAGE) / median_coverage is at or above the threshold described next.

--cov-ratio defaults to auto, which sets the threshold per sample to max(2.0, p99 of the in-sample max-coverage / median-coverage distribution). This adapts to each sample’s coverage profile: samples with a heavy tail of coverage spikes get a higher cutoff, while clean samples fall to the 2.0× floor. The chosen value is printed at run start, e.g. --cov-ratio auto: 2.72x (p99 of 25,167 PASS events; floor 2.0x).

Pass a fixed number (integer or float, e.g. --cov-ratio 3 or --cov-ratio 2.5) to override auto. This reproduces a literature-style constant for samples where you would rather pin the threshold.
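
The auto threshold and the flag rule can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual code: the function names are hypothetical, and the linear-interpolation percentile is an assumption (the tool may compute p99 differently).

```python
import math

def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list.
    ASSUMPTION: the interpolation method is illustrative only."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lo = math.floor(rank)
    hi = math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)

def resolve_cov_ratio(cov_ratio, ratios, floor=2.0):
    """Turn a --cov-ratio value ('auto' or a fixed number) into a threshold.
    `ratios` holds per-event max(COVERAGE) / median_coverage values."""
    if cov_ratio == "auto":
        return max(floor, percentile(ratios, 99))
    return float(cov_ratio)

def is_coverage_noise(max_coverage, median_coverage, vaf, threshold, vaf_max=0.35):
    """Flag rule: low VAF AND coverage ratio at or above the threshold."""
    return vaf < vaf_max and (max_coverage / median_coverage) >= threshold
```

With a threshold of 2.72, an event at 3× median coverage and VAF 0.1 would be flagged, while the same event at VAF 0.5 would not.
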
--mark-acrocentric

On hg38, the p-arms of chr13/14/15/21/22 are largely unresolved (rDNA arrays, satellite DNA), so BNDs with both endpoints in those p-arms are almost always mapping artefacts. The flag is on by default for hg38 and off by default for T2T-CHM13v2.0, where those regions are properly resolved sequence.

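
As a rough sketch of the both-endpoints rule: the helper names below are hypothetical, and the p-arm boundaries are deliberately not hard-coded. The caller is assumed to supply them from a centromere or cytoband annotation for the reference in use.

```python
ACROCENTRIC = {"chr13", "chr14", "chr15", "chr21", "chr22"}

def default_mark_acrocentric(reference):
    """Documented default: on for hg38, off for t2t."""
    return reference == "hg38"

def in_acrocentric_p_arm(chrom, pos, p_arm_end):
    """True if (chrom, pos) lies on the p-arm of an acrocentric chromosome.
    ASSUMPTION: p_arm_end maps chromosome -> last p-arm coordinate and is
    provided by the caller; no real coordinates are embedded here."""
    return chrom in ACROCENTRIC and chrom in p_arm_end and pos <= p_arm_end[chrom]

def is_acrocentric_bnd(chrom1, pos1, chrom2, pos2, p_arm_end):
    """Flag a BND only when BOTH endpoints fall in acrocentric p-arms."""
    return (in_acrocentric_p_arm(chrom1, pos1, p_arm_end)
            and in_acrocentric_p_arm(chrom2, pos2, p_arm_end))
```

A BND with one endpoint on an acrocentric p-arm and the other elsewhere is left unflagged under this rule.
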
--min-svlen

Non-BND SVs (INS / DEL / DUP / INV) shorter than --min-svlen (default 50 bp) are dropped from every downstream consumer (density tracks and stdout summaries). BND records have no SVLEN and are not affected. Set --min-svlen 0 to keep every event.

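
The length filter amounts to a single comparison. In this sketch the function name is hypothetical; taking the absolute value (VCF deletions often carry a negative SVLEN) and keeping non-BND records with a missing SVLEN are both assumptions, not documented behaviour.

```python
def passes_min_svlen(svtype, svlen, min_svlen=50):
    """Return True if the record survives the --min-svlen filter."""
    # BND records have no SVLEN and are never filtered by length.
    if svtype == "BND":
        return True
    # ASSUMPTION: a non-BND record without SVLEN is kept rather than dropped.
    if svlen is None:
        return True
    # abs() because DEL records commonly store SVLEN as a negative number.
    return abs(svlen) >= min_svlen
```

With `min_svlen=0` every record passes, matching the "set 0 to keep every event" behaviour described above.
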
--focus-window

When --focus CHR:POS is given, BNDs whose endpoints fall within ±--focus-window bp (default 1000) of the requested coordinate are kept, and the rest are filtered out for that figure.
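
A minimal sketch of the window match follows. The helper names are hypothetical, and it assumes a BND is kept when at least one endpoint falls inside the window; the tool may instead require a specific endpoint.

```python
def parse_focus(focus):
    """Split a 'CHR:POS' string into (chrom, pos)."""
    chrom, pos = focus.rsplit(":", 1)
    return chrom, int(pos)

def bnd_matches_focus(endpoints, focus, window=1000):
    """endpoints: iterable of (chrom, pos) pairs for one BND.
    ASSUMPTION: one endpoint within +/- window bp is enough to keep it."""
    chrom, pos = parse_focus(focus)
    return any(c == chrom and abs(p - pos) <= window for c, p in endpoints)
```

Widening the window, for instance `bnd_matches_focus(endpoints, "chr1:10000", window=2000)`, admits breakpoints that an IMPRECISE call may have placed further from the requested position.
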