Reads and bins allelic imbalance (AI) data from a panel of normals (PoN), summarizes bin-level statistics, and estimates the beta-binomial dispersion (theta) for downstream modeling.

Usage

PONAIprocess(
  ai_pon_file,
  aitype,
  minsnpcov = 20,
  output,
  prefix,
  maxgap = 2e+06,
  maxbinsize = 5e+06,
  minbinsize = 5e+05,
  snpnum = 30,
  gender
)
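
A minimal sketch of a call, assuming the package that provides PONAIprocess() is attached. The file paths, the prefix, and the choice of "gatk" and "female" are hypothetical placeholders, and the call is guarded so the snippet does nothing when the function or the input files are missing.

```r
# Hypothetical PoN AI files; replace with real paths.
pon_files <- c("/data/pon/normal01.ai.tsv", "/data/pon/normal02.ai.tsv")

# ai_pon_file expects a text file with one PoN AI path per line.
list_file <- file.path(tempdir(), "ai_pon_files.txt")
writeLines(pon_files, list_file)

# Output directory for the processed PoN AI Rdata file.
out_dir <- file.path(tempdir(), "pon_ai")
dir.create(out_dir, showWarnings = FALSE)

# Run only when PONAIprocess() is available and the inputs exist.
if (exists("PONAIprocess") && all(file.exists(pon_files))) {
  PONAIprocess(
    ai_pon_file = list_file,
    aitype      = "gatk",
    minsnpcov   = 20,
    output      = out_dir,
    prefix      = "PoN_AI",   # hypothetical prefix
    maxgap      = 2e6,
    maxbinsize  = 5e6,
    minbinsize  = 5e5,
    snpnum      = 30,
    gender      = "female"
  )
}
```

Only output, prefix, ai_pon_file, aitype, and gender lack defaults in the signature above, so the remaining arguments can be omitted when the defaults are acceptable.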

Arguments

ai_pon_file

Character. Path to a text file listing PoN AI file paths (one per line).

aitype

Character. Type of AI input file (passed to ReadPonAI(), e.g., "gatk", "dragen", "other").

minsnpcov

Integer. Minimum SNP coverage to include a site in the AI calculation. (default: 20)

output

Character. Output directory for the processed PoN AI Rdata file.

prefix

Character. Prefix for the output file.

maxgap

Numeric. Maximum allowed gap (bp) between adjacent SNPs within a bin. (default: 2000000)

maxbinsize

Numeric. Maximum allowed bin size (bp). (default: 5000000)

minbinsize

Numeric. Minimum allowed bin size (bp). (default: 500000)

snpnum

Integer. Target number of SNPs per bin. (default: 30)

gender

Character. Gender of the samples, either "male" or "female".
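
To illustrate how maxgap, maxbinsize, minbinsize, and snpnum might interact, here is a rough sketch of a greedy binning pass over SNP positions on one chromosome. The function bin_snps() and its rules are assumptions made for illustration; the actual logic of BinMaf() is not described on this page and may differ (for example, in how bins smaller than minbinsize are handled).

```r
# Greedy binning of SNP positions on a single chromosome (illustrative only).
# A new bin starts when the gap to the previous SNP exceeds maxgap, when the
# bin would grow beyond maxbinsize, or when the bin already holds snpnum SNPs
# and spans at least minbinsize.
bin_snps <- function(pos, maxgap = 2e6, maxbinsize = 5e6,
                     minbinsize = 5e5, snpnum = 30) {
  pos <- sort(pos)
  bin_id <- integer(length(pos))
  current <- 1L
  start <- pos[1]
  count <- 0L
  for (i in seq_along(pos)) {
    if (count > 0L) {
      gap_too_big <- (pos[i] - pos[i - 1]) > maxgap
      too_wide <- (pos[i] - start) > maxbinsize
      full <- count >= snpnum && (pos[i - 1] - start) >= minbinsize
      if (gap_too_big || too_wide || full) {
        current <- current + 1L
        start <- pos[i]
        count <- 0L
      }
    }
    bin_id[i] <- current
    count <- count + 1L
  }
  bin_id
}

# Hypothetical positions: 100 evenly spaced SNPs, a large gap, then 10 more.
pos <- c(seq(1e5, by = 5e4, length.out = 100),
         2e7 + seq(0, by = 5e4, length.out = 10))
bins <- bin_snps(pos)
print(table(bins))
```

In this sketch a bin that is cut short by a large gap can end up narrower than minbinsize; a real implementation might merge or discard such bins.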

Value

Invisibly returns NULL. Saves an Rdata file containing the processed PoN reference (pon_ref) and the estimated dispersion parameters (theta_fit).

Details

This function reads all PoN AI files, bins the AI data using BinMaf(), summarizes bin-level BAF/MAF and depth statistics, and estimates beta-binomial over-dispersion (theta) stratified by depth. The reference table and theta estimates are saved as an Rdata file for use in downstream analysis.
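
The depth-stratified dispersion step can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's estimator: the parameterization (alpha = mu / theta, beta = (1 - mu) / theta), the fixed mean of 0.5, the depth breaks, and the helper names dbetabinom_log() and fit_theta() are all hypothetical, and the data are simulated.

```r
# Log-density of a beta-binomial with mean mu and over-dispersion theta,
# assuming alpha = mu / theta and beta = (1 - mu) / theta.
dbetabinom_log <- function(k, n, mu, theta) {
  a <- mu / theta
  b <- (1 - mu) / theta
  lchoose(n, k) + lbeta(k + a, n - k + b) - lbeta(a, b)
}

# Maximum-likelihood estimate of theta with mu held fixed,
# optimized on the log scale to keep theta positive.
fit_theta <- function(alt, depth, mu = 0.5) {
  nll <- function(log_theta) {
    -sum(dbetabinom_log(alt, depth, mu, exp(log_theta)))
  }
  exp(optimize(nll, interval = c(-12, 0))$minimum)
}

# Simulated heterozygous SNPs (hypothetical data, not from any PoN).
set.seed(1)
n_snp <- 3000
depth <- sample(20:200, n_snp, replace = TRUE)
true_theta <- 0.02
p <- rbeta(n_snp, 0.5 / true_theta, 0.5 / true_theta)
alt <- rbinom(n_snp, depth, p)

# Stratify SNPs by depth and fit theta within each stratum.
depth_bin <- cut(depth, breaks = c(20, 50, 100, 200), include.lowest = TRUE)
theta_by_depth <- sapply(split(seq_len(n_snp), depth_bin),
                         function(i) fit_theta(alt[i], depth[i]))
print(theta_by_depth)
```

Under this parameterization theta close to 0 corresponds to plain binomial sampling, and larger values mean more spread in the allele fraction than read depth alone would explain.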