metaloci.utility_scripts package
Utility scripts that are useful alongside METALoci but are not part of the main pipeline.
Submodules
metaloci.utility_scripts.bts module
Find the best combination of persistence length and cut-off for your Hi-C, given a Hi-C resolution.
First check the maximum resolution your Hi-C allows, and supply it to this script. The script then determines, by computing layouts on a sample of regions, the best combination of persistence length and cut-off for your Hi-C. You can then use these parameters to run ‘metaloci layout’ on your regions of interest.
This can take from minutes to a few hours, depending on the resolution, the size of the regions and the number of CPUs available on your machine. Once you have run it for a specific Hi-C and a specific resolution, the parameters remain the same for other runs with other signals.
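The search described above can be pictured as a grid search over candidate values. The following is a minimal sketch only, not the code used by bts: the function names, the candidate values and the scoring function are all hypothetical, and it assumes a higher score is better, which this page does not state.

```python
from itertools import product


def grid_search(persistence_lengths, cutoffs, score):
    """Return the (persistence length, cut-off) pair with the highest score.

    `score` is a hypothetical callable standing in for whatever quality
    measure is computed on the sampled regions.
    """
    best = None
    for pl, cutoff in product(persistence_lengths, cutoffs):
        value = score(pl, cutoff)
        if best is None or value > best[0]:
            best = (value, pl, cutoff)
    if best is None:
        raise ValueError("no parameter combinations were supplied")
    return best[1], best[2]


def toy_score(pl, cutoff):
    # Made-up score that peaks at pl=9, cutoff=0.2 (illustration only).
    return -((pl - 9) ** 2) - (cutoff - 0.2) ** 2


best_pl, best_cutoff = grid_search([6, 9, 12], [0.15, 0.2, 0.25], toy_score)
```

Because the result depends only on the Hi-C and the resolution, a pair found this way would be computed once and reused for other signals.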
- metaloci.utility_scripts.bts.bts_ratio(hic_path: str, resolution: int, region: str, cutoff: float, pl: float) float
Calculate the ratio of the correlation between the linear and spherical layouts against the layout we actually want to use. This is a good measure to estimate the best set of parameters for the Kamada-Kawai layout.
- Returns:
linear_correlation / spherical_correlation – Ratio of the correlation between the linear and spherical layouts.
- Return type:
float
- metaloci.utility_scripts.bts.param_search(row: Series, args: Series, progress=None)
Test METALoci parameters to optimise Kamada-Kawai layout.
- Parameters:
row (pd.Series) – Row of the region file.
args (pd.Series) – Arguments to test.
progress (mp.Manager().dict) – Progress bar.
- metaloci.utility_scripts.bts.pl_estimation(row: Series, args: Series, progress=None) float
Wrapper function to count the number of interactions in a subsetted Hi-C matrix.
- Parameters:
row (pd.Series) – Row of the region file.
args (pd.Series) – Arguments to test.
progress (mp.Manager().dict) – Progress bar.
- Returns:
sum_dict – Dictionary with the sum of interactions for each set of parameters
- Return type:
dict
- metaloci.utility_scripts.bts.run(opts: list)
Main function to run bts.
- Parameters:
opts (list) – List of arguments
- metaloci.utility_scripts.bts.sum_hic_columns(hic_path: str, resolution: int, region: str, cutoff: float) float
Function to count the number of interactions in a subsetted Hi-C matrix.
- Parameters:
hic_path (str) – Path to the Hi-C file.
resolution (int) – Resolution of the Hi-C file.
region (str) – Region of interest.
cutoff (float) – Cutoff to use for the subset matrix.
- Returns:
median_sum – Median sum of the interactions in the subset matrix.
- Return type:
float
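As an illustration of what a "median sum of the interactions in the subset matrix" could look like, here is a minimal sketch on a plain list-of-lists matrix. It is not the METALoci implementation: the helper name is hypothetical, it does not read a Hi-C file, and it assumes the cutoff is the fraction of strongest non-zero contacts to keep.

```python
from statistics import median


def median_column_sum(matrix, cutoff):
    """Median of the per-column sums after keeping only the strongest contacts.

    Assumption: `cutoff` is the fraction (0 to 1) of non-zero contacts to keep.
    """
    values = sorted((v for row in matrix for v in row if v > 0), reverse=True)
    if not values:
        return 0.0
    keep = max(1, round(len(values) * cutoff))
    threshold = values[keep - 1]  # weakest contact that is still kept
    n_cols = len(matrix[0])
    # Contacts below the threshold contribute nothing to their column sum.
    sums = [
        sum(row[j] for row in matrix if row[j] >= threshold)
        for j in range(n_cols)
    ]
    return float(median(sums))


toy_matrix = [
    [0, 1, 2],
    [1, 0, 3],
    [2, 3, 0],
]
result = median_column_sum(toy_matrix, 0.5)
```

With the toy matrix above, contacts weaker than 2 are dropped, the column sums become 2, 3 and 5, and the median is 3.0.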
metaloci.utility_scripts.gene_selector module
This script parses the LMI information files created by METALoci.
The output file will contain the regions that pass the quadrant and p-value thresholds for a given signal. If a region does not pass these filters, the script outputs NA.
- metaloci.utility_scripts.gene_selector.run(opts: list)
Function to run this section of METALoci with the needed arguments.
- Parameters:
opts (list) – List of arguments
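The filtering rule described for this module can be sketched as follows. This is an illustration under assumptions, not the script's code: the record keys, the default quadrants, the default p-value threshold and the use of "less than or equal" are all hypothetical.

```python
def select_region(record, quadrants=(1, 3), pval_threshold=0.05):
    """Return the region name if the record passes both filters, else "NA".

    `record` is a hypothetical dict with the keys "region", "quadrant"
    and "p_value"; the real LMI information files may be laid out differently.
    """
    passes = (
        record["quadrant"] in quadrants
        and record["p_value"] <= pval_threshold
    )
    return record["region"] if passes else "NA"


records = [
    {"region": "region_a", "quadrant": 1, "p_value": 0.01},
    {"region": "region_b", "quadrant": 2, "p_value": 0.01},
    {"region": "region_c", "quadrant": 3, "p_value": 0.20},
]
selected = [select_region(r) for r in records]
```

Only the first record passes both filters; the second fails on the quadrant and the third on the p-value, so both are reported as NA.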
metaloci.utility_scripts.sniffer module
Takes a .gtf file or a .bed file and parses it into a region list, with a specific resolution and extension. The point of interest of each region is the middle bin.
- metaloci.utility_scripts.sniffer.run(opts: list)
Function to run this section of METALoci with the needed arguments.
- Parameters:
opts (list) – List of arguments.
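To make the "middle bin" idea concrete, the sketch below builds a region around a single genomic position. It is a hypothetical helper, not the sniffer code: the function name, the "chrom:start-end" string format and the clipping at position 0 are assumptions.

```python
def build_region(chrom, position, resolution, extension):
    """Return a region string and the index of the bin holding `position`.

    The region spans `extension` base pairs on each side of the bin that
    contains `position`, rounded down to whole bins of size `resolution`.
    """
    poi_bin_start = (position // resolution) * resolution
    flank_bins = extension // resolution
    # Clipping at 0 shortens the left flank, so near the chromosome start
    # the point of interest is no longer exactly the middle bin.
    start = max(0, poi_bin_start - flank_bins * resolution)
    end = poi_bin_start + (flank_bins + 1) * resolution
    poi_index = (poi_bin_start - start) // resolution
    return f"{chrom}:{start}-{end}", poi_index


region, poi = build_region("chr1", 5_234_567, 10_000, 2_000_000)
n_bins = (7_240_000 - 3_230_000) // 10_000
```

Here the region is chr1:3230000-7240000, which holds 401 bins of 10 kb, and the point of interest sits at index 200, the middle bin.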