Refines breakpoints within a segment using minor allele frequency (MAF) data. If enough informative MAF sites are present, the segment is binned and can be split into finer regions using either stepwise merging or CBS (circular binary segmentation). Optionally, PON-based bias correction is applied to the resulting segments.
Usage
SearchBreakpoint(
seg_row,
maf,
pon_ref,
gender,
mergeai = 0.15,
snpmin = 3,
maxgap = 1e+06,
snpnum = 20,
maxbinsize = 1e+06,
minbinsize = 5e+05,
minsnpcov = 20,
segmethod = "cbs",
cbssmooth = "no"
)
Arguments
- seg_row
Data frame row (list or tibble row) representing a single segment. Must have columns: Sample, Chromosome, Start, End, Num_Probes, Segment_Mean, Segment_Mean_raw, Count, Baseline_cov, gatk_gender, pipeline_gender, size.
- maf
Data frame or tibble containing MAF data. Must include columns: Chromosome, Pos, maf.
- pon_ref
Data frame. Panel of normals (PON) reference used for bias correction (required for the bias correction step).
- gender
Character. If "female", the X chromosome is also processed.
- mergeai
Numeric. Threshold for the difference in MAF (gmm_mean) between adjacent segments to allow merging under "merge" mode segmentation.
- snpmin
Numeric. Minimum SNP count required for a segment to be considered a separate segment under "merge" mode segmentation.
- maxgap
Numeric. Maximum allowed gap between SNPs within a bin.
- snpnum
Integer. Target number of SNPs per bin.
- maxbinsize
Numeric. Maximum allowed bin size (bp).
- minbinsize
Numeric. Minimum allowed bin size (bp). The minimum segment size under "merge" mode is 2*minbinsize.
- minsnpcov
Integer. Minimum coverage required for a SNP site to be included.
- segmethod
Character. Segmentation method to use: if "merge", perform stepwise merging; if "cbs", perform CBS (circular binary segmentation).
- cbssmooth
Character. If using the "cbs" segmentation method, set to "yes" to apply smoothing before segmentation, or "no" to skip smoothing.
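As a rough illustration of the argument shapes described above, the sketch below builds a one-row segment and a MAF table with the documented column names. All values are invented, the chromosome naming ("1" rather than "chr1") is an assumption, and the layout of pon_ref is not documented here, so the call itself only runs if both SearchBreakpoint and a suitable pon_ref object already exist in the session.

```r
# Hypothetical single segment; every value is made up for illustration.
seg_row <- data.frame(
  Sample = "S1",
  Chromosome = "1",            # assumed naming style
  Start = 1000000,
  End = 9000000,
  Num_Probes = 120,
  Segment_Mean = -0.45,
  Segment_Mean_raw = -0.40,
  Count = 120,
  Baseline_cov = 150,
  gatk_gender = "female",
  pipeline_gender = "female",
  size = 8000000,
  stringsAsFactors = FALSE
)

# Hypothetical MAF table: 200 sorted positions inside the segment.
set.seed(1)
n_sites <- 200
maf <- data.frame(
  Chromosome = "1",
  Pos = seg_row$Start + sort(sample.int(seg_row$size, n_sites)) - 1,
  maf = round(runif(n_sites, 0.05, 0.5), 3),
  stringsAsFactors = FALSE
)

# Check that the documented columns are present before calling.
required_seg_cols <- c("Sample", "Chromosome", "Start", "End", "Num_Probes",
                       "Segment_Mean", "Segment_Mean_raw", "Count",
                       "Baseline_cov", "gatk_gender", "pipeline_gender", "size")
required_maf_cols <- c("Chromosome", "Pos", "maf")
stopifnot(all(required_seg_cols %in% names(seg_row)),
          all(required_maf_cols %in% names(maf)))

# Guarded call: skipped unless the function and a PON reference are available.
if (exists("SearchBreakpoint") && exists("pon_ref")) {
  refined <- SearchBreakpoint(
    seg_row = seg_row,
    maf = maf,
    pon_ref = pon_ref,
    gender = "female",
    segmethod = "cbs",
    cbssmooth = "no"
  )
  print(table(refined$BreakpointSource))
}
```

Switching to segmethod = "merge" would bring mergeai, snpmin and minbinsize into play instead of cbssmooth.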
Value
A data frame with the refined segment(s), including updated breakpoints, MAF metrics, and a BreakpointSource column indicating whether breakpoints were post-processed or from GATK.
Details
The function first bins the MAF data within the segment. If segmethod = "merge", segments are merged stepwise based on the MAF difference and SNP count. If segmethod = "cbs", CBS segmentation is performed on the binned MAF values, with optional smoothing. After segmentation, bias correction using the panel of normals can be applied. The function returns refined segments with updated metrics and a BreakpointSource label.
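To make the roles of mergeai and snpmin more concrete, here is a minimal sketch of one possible stepwise merging scheme over binned MAF values. It is not the package's implementation: the helper name merge_bins_sketch, the bin columns (start, end, maf_mean, n_snp) and the SNP-weighted mean used in place of gmm_mean are all assumptions, and the 2*minbinsize size rule is left out.

```r
# Illustrative only: greedy merging of adjacent bins.
# bins: data frame ordered by position with columns
#       start, end, maf_mean, n_snp (hypothetical layout).
merge_bins_sketch <- function(bins, mergeai = 0.15, snpmin = 3) {
  repeat {
    if (nrow(bins) < 2) break
    # Absolute MAF difference between each adjacent pair.
    diffs <- abs(diff(bins$maf_mean))
    # A pair is SNP-poor if either member has fewer than snpmin SNPs.
    small <- pmin(head(bins$n_snp, -1), tail(bins$n_snp, -1)) < snpmin
    mergeable <- diffs < mergeai | small
    if (!any(mergeable)) break
    # Merge the most similar mergeable pair first.
    i <- which(mergeable)[which.min(diffs[mergeable])]
    j <- i + 1
    n <- bins$n_snp[i] + bins$n_snp[j]
    # SNP-weighted mean stands in for the package's gmm_mean.
    bins$maf_mean[i] <- (bins$maf_mean[i] * bins$n_snp[i] +
                         bins$maf_mean[j] * bins$n_snp[j]) / n
    bins$n_snp[i] <- n
    bins$end[i] <- bins$end[j]
    bins <- bins[-j, , drop = FALSE]
  }
  rownames(bins) <- NULL
  bins
}

# Toy bins: two MAF levels plus one SNP-poor bin at the end.
bins <- data.frame(
  start = c(1, 101, 201, 301, 401),
  end = c(100, 200, 300, 400, 500),
  maf_mean = c(0.48, 0.45, 0.20, 0.22, 0.47),
  n_snp = c(20, 20, 20, 20, 2)
)
merged <- merge_bins_sketch(bins)
print(merged)
```

With these toy values the five bins collapse into two segments: the first two bins merge because their MAF values differ by less than mergeai, and the final bin is absorbed into its neighbour because it has fewer than snpmin SNPs, even though its MAF differs.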
