metaloci.spatial_stats package
Modules that compute spatial statistics on Hi-C data. Currently, only Local Moran’s I is implemented.
Submodules
metaloci.spatial_stats.lmi module
Functions to compute Local Moran’s I for a given signal in a METALoci object.
- metaloci.spatial_stats.lmi.aggregate_signals(mlobject: MetalociObject)
Aggregates the signals in the signals_dict of the METALoci object.
- Parameters:
mlobject (mlo.MetalociObject) – METALoci object that contains the needed information
- metaloci.spatial_stats.lmi.compute_lmi(mlobject: MetalociObject, signal_type: str, neighbourhood: float, n_permutations: int = 9999, signipval: float = 0.05, silent: bool = False, poi_only=False, del_args=None) → DataFrame
Computes Local Moran’s I (LMI) for one signal type and returns, for each bin, the LMI value and its p-value together with the additional columns listed under Returns.
- Parameters:
mlobject (mlo.MetalociObject) – METALoci object with the signals for the region (MetalociObject.signals_dict) and with MetalociObject.lmi_geometry already calculated by lmi.construct_voronoi().
signal_type (str) – Name of the type of the signal to use for the computation.
neighbourhood (float) – Radius of the circle that determines the neighbourhood of each of the points in the Kamada-Kawai layout.
n_permutations (int, optional) – Number of permutations used in the randomization, by default 9999.
signipval (float, optional) – Significance threshold for the p-value, by default 0.05.
silent (bool, optional) – Controls the verbosity of the function (useful for multiprocessing), by default False.
- Returns:
DataFrame with ID, bin index, chromosome, start, end, signal value, Moran index, Moran quadrant, LMI score, LMI p-value and inverse of the LMI p-value for this signal.
- Return type:
pd.DataFrame
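To make the roles of neighbourhood, n_permutations and the Moran quadrant concrete, the following is a minimal pure-Python sketch of Local Moran’s I with distance-band neighbours and a conditional-permutation pseudo p-value. It is not the METALoci implementation: the function names, the row-standardised weights, the quadrant numbering (1 = high-high, 2 = low-high, 3 = low-low, 4 = high-low) and the folded p-value are assumptions chosen for illustration, and the library may scale or test the statistic differently.

```python
import math
import random


def distance_band_neighbours(coords, radius):
    """For each point, list the indices of the other points within `radius`."""
    neighbours = []
    for i, (xi, yi) in enumerate(coords):
        close = [
            j
            for j, (xj, yj) in enumerate(coords)
            if j != i and math.hypot(xi - xj, yi - yj) <= radius
        ]
        neighbours.append(close)
    return neighbours


def local_morans_i(values, coords, radius, n_permutations=999, seed=0):
    """Return (lmi, quadrant, p_value) lists with one entry per point."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    z = [(v - mean) / std for v in values]  # standardised signal
    neighbours = distance_band_neighbours(coords, radius)
    rng = random.Random(seed)

    lmi, quadrant, p_value = [], [], []
    for i in range(n):
        k = len(neighbours[i])
        if k == 0:
            # Isolated point: no spatial lag can be computed (sketch convention).
            lmi.append(0.0)
            quadrant.append(0)
            p_value.append(1.0)
            continue

        # Spatial lag: mean standardised signal over the neighbours.
        lag = sum(z[j] for j in neighbours[i]) / k
        observed = z[i] * lag

        # Quadrant of the Moran scatter plot (assumed numbering, see lead-in).
        if z[i] > 0:
            q = 1 if lag > 0 else 4
        else:
            q = 2 if lag > 0 else 3

        # Conditional permutation: keep z[i] fixed, draw random neighbours
        # from the remaining values, and count statistics >= the observed one.
        others = z[:i] + z[i + 1:]
        larger = 0
        for _ in range(n_permutations):
            sample = rng.sample(others, k)
            if z[i] * sum(sample) / k >= observed:
                larger += 1
        larger = min(larger, n_permutations - larger)  # fold to the nearer tail

        lmi.append(observed)
        quadrant.append(q)
        p_value.append((larger + 1) / (n_permutations + 1))
    return lmi, quadrant, p_value


# Two well-separated clusters: high signal near the origin, low signal far away.
example_coords = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10), (11, 10), (10, 11), (11, 11)]
example_values = [10, 11, 12, 13, 1, 2, 3, 4]
example_lmi, example_quadrant, example_p = local_morans_i(
    example_values, example_coords, radius=1.5, n_permutations=199
)
```

In this toy layout every point has positive LMI, because each bin sits among neighbours on the same side of the mean; the first cluster falls in the high-high quadrant and the second in the low-low quadrant.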
- metaloci.spatial_stats.lmi.construct_voronoi(mlobject: MetalociObject, buffer: float, del_args: Series = None) → DataFrame
Takes the Kamada-Kawai layout in a METALoci object and calculates the geometry of the Voronoi cell around each point of the layout, in order to make a gaudi plot.
- Parameters:
mlobject (mlo.MetalociObject) – METALoci object with Kamada-Kawai layout coordinates in it (MetalociObject.kk_coords).
buffer (float) – Distance to buffer around the point to be painted in the gaudi plot.
- Returns:
df_geometry – DataFrame with the geometry of each point of the gaudi plot, for each bin. Needed to draw the gaudi plot.
- Return type:
pd.DataFrame
- metaloci.spatial_stats.lmi.get_bed(mlobject: MetalociObject, lmi_geometry: DataFrame, neighbourhood: float, quadrants: list = None, signipval: float = 0.05, poi: int = None, silent: bool = True) → DataFrame
Get the bins that are significant in the Local Moran’s I for a given point of interest.
- Parameters:
mlobject (mlo.MetalociObject) – METALoci object with the needed information.
lmi_geometry (pd.DataFrame) – DataFrame with the geometry of the bins.
neighbourhood (float) – Radius of the circle that determines the neighbourhood of each of the points in the Kamada-Kawai layout.
bfact (float) – Factor to multiply the neighbourhood by.
quadrants (list, optional) – List of quadrants to consider, by default None.
signipval (float, optional) – Significance threshold for the p-value, by default 0.05.
poi (int, optional) – Point of interest, by default None.
silent (bool, optional) – Controls the verbosity of the function (useful for multiprocessing), by default True.
- Returns:
bed – DataFrame in BED format with the bins that are significant in the Local Moran’s I.
- Return type:
pd.DataFrame
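The selection by quadrants and signipval can be pictured with the short sketch below. It covers only the thresholding step, not the point-of-interest or neighbourhood logic, and it is not the library code: plain dicts stand in for DataFrame rows, the key names (chrom, start, end, quadrant, pvalue) are hypothetical, and treating quadrants=None as "no quadrant filter" and the threshold as inclusive are assumptions of this example.

```python
def significant_bins(rows, quadrants=None, signipval=0.05):
    """Return BED-like (chrom, start, end) tuples for the significant bins.

    `rows` is a list of dicts with hypothetical keys: chrom, start, end,
    quadrant and pvalue. `quadrants=None` means "keep every quadrant" here,
    which is an assumption of this sketch.
    """
    selected = []
    for row in rows:
        if row["pvalue"] > signipval:
            continue  # not significant at the chosen threshold
        if quadrants is not None and row["quadrant"] not in quadrants:
            continue  # significant, but in a quadrant that was not requested
        selected.append((row["chrom"], row["start"], row["end"]))
    # BED records are conventionally sorted by chromosome and start.
    return sorted(selected, key=lambda interval: (interval[0], interval[1]))


example_rows = [
    {"chrom": "chr1", "start": 20000, "end": 30000, "quadrant": 1, "pvalue": 0.01},
    {"chrom": "chr1", "start": 0, "end": 10000, "quadrant": 3, "pvalue": 0.04},
    {"chrom": "chr1", "start": 10000, "end": 20000, "quadrant": 1, "pvalue": 0.30},
    {"chrom": "chr1", "start": 30000, "end": 40000, "quadrant": 2, "pvalue": 0.02},
]
hh_and_ll = significant_bins(example_rows, quadrants=[1, 3])
all_quadrants = significant_bins(example_rows)
```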
- metaloci.spatial_stats.lmi.load_region_signals(mlobject: MetalociObject, signal_data: dict, signal_types: list) → dict
Subsets the signal data so that it contains only the signal for the region being processed.
- Parameters:
mlobject (mlo.MetalociObject) – METALoci object with kk_coords in it.
signal_data (dict) – Dictionary with signal for each chromosome. Each key is chrN, each value is a dataframe with its corresponding signal.
signal_types (list) – List of the signal types to compute.
- Returns:
signals_dict (dict) – Dictionary containing signal information, but only for the region being processed.
signal_types (list(str)) – List containing the signal types to compute for this region.
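As an illustration of what subsetting per-chromosome signal to a region involves, here is a minimal sketch under stated assumptions. It does not reproduce the library function: lists of dicts stand in for the per-chromosome DataFrames, the region is passed as explicit chrom, start and end arguments rather than read from a MetalociObject, and the key names and signal names are hypothetical.

```python
def subset_region_signals(signal_data, chrom, start, end, signal_types):
    """Keep only the bins overlapping [start, end) on `chrom`.

    `signal_data` maps a chromosome name to a list of bin dicts with
    hypothetical keys: start, end, plus one key per signal type.
    Returns a dict mapping each requested signal type to its values.
    """
    region_rows = [
        row
        for row in signal_data.get(chrom, [])
        if row["start"] < end and row["end"] > start  # half-open overlap test
    ]
    return {
        signal_type: [row[signal_type] for row in region_rows]
        for signal_type in signal_types
    }


example_signal_data = {
    "chr1": [
        {"start": 0, "end": 10000, "H3K27ac": 0.5, "CTCF": 1.0},
        {"start": 10000, "end": 20000, "H3K27ac": 1.5, "CTCF": 0.0},
        {"start": 20000, "end": 30000, "H3K27ac": 2.5, "CTCF": 3.0},
    ],
    "chr2": [
        {"start": 0, "end": 10000, "H3K27ac": 9.0, "CTCF": 9.0},
    ],
}
example_region_signals = subset_region_signals(
    example_signal_data, "chr1", 10000, 30000, ["H3K27ac"]
)
```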