Determines the optimal number of bins (strata) for grouping samples by SNP median depth, ensuring each bin is sufficiently populated and diverse for downstream modeling.
Usage
ChooseNbins(
  normals_dt,
  target_bins = 8,
  min_rows_per_stratum = 2000,
  min_unique_bins = 100,
  max_bins = 10,
  min_bins = 3
)
Arguments
- normals_dt
Data frame or data.table. Must contain depth information for each bin.
Value
Integer. The chosen number of bins that satisfies the population and diversity constraints.
Details
This function iteratively tests candidate numbers of bins (strata) and selects the best value based on SNP counts and sequencing depth.
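The page does not spell out the search itself, so the following is only a minimal base-R sketch of how such an iterative selection could work. It is not the package's implementation. The function name `choose_nbins_sketch`, the column names `snp_median_depth` and `bin_id`, the quantile-based stratification, the downward search from `target_bins`, and the fallback to `min_bins` are all assumptions made for illustration. It also assumes `min_unique_bins` counts distinct genomic bins within each depth stratum.

```r
# Hypothetical sketch, NOT the package's implementation.
# Assumes `dt` has one row per sample-by-genomic-bin observation, with
# columns `snp_median_depth` (numeric) and `bin_id` (genomic bin identifier).
choose_nbins_sketch <- function(dt,
                                target_bins = 8,
                                min_rows_per_stratum = 2000,
                                min_unique_bins = 100,
                                max_bins = 10,
                                min_bins = 3) {
  # Start from the target, clamped to [min_bins, max_bins], and step down.
  start <- max(min_bins, min(target_bins, max_bins))
  for (k in seq(start, min_bins, by = -1)) {
    # Quantile breakpoints give roughly equally populated depth strata.
    breaks <- unique(quantile(dt$snp_median_depth,
                              probs = seq(0, 1, length.out = k + 1),
                              na.rm = TRUE, names = FALSE))
    # Tied depths can collapse breakpoints; then k strata are not achievable.
    if (length(breaks) < k + 1) next
    stratum <- cut(dt$snp_median_depth, breaks = breaks,
                   include.lowest = TRUE, labels = FALSE)
    # Population check: every stratum must hold enough rows.
    rows_ok <- all(tabulate(stratum, nbins = k) >= min_rows_per_stratum)
    # Diversity check: every stratum must cover enough distinct genomic bins.
    uniq <- tapply(dt$bin_id, stratum, function(x) length(unique(x)))
    uniq_ok <- length(uniq) == k && all(uniq >= min_unique_bins)
    if (rows_ok && uniq_ok) return(as.integer(k))
  }
  # No candidate satisfied both constraints; fall back to the lower bound.
  as.integer(min_bins)
}
```

Under these assumptions, a call such as `choose_nbins_sketch(normals_dt)` returns the largest candidate at or below `target_bins` for which every depth stratum passes both checks. Stepping downward favors finer stratification whenever the data can support it.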