metaloci.spatial_stats package

Modules with methods to compute spatial statistics on Hi-C data. For now, only Local Moran’s I is implemented.

Submodules

metaloci.spatial_stats.lmi module

Functions to compute Local Moran’s I for a given signal in a METALoci object.

metaloci.spatial_stats.lmi.aggregate_signals(mlobject: MetalociObject)

Aggregates the signals in the signals_dict of the METALoci object.

Parameters:

mlobject (mlo.MetalociObject) – METALoci object that contains the needed information

metaloci.spatial_stats.lmi.compute_lmi(mlobject: MetalociObject, signal_type: str, neighbourhood: float, n_permutations: int = 9999, signipval: float = 0.05, silent: bool = False, poi_only=False, del_args=None) → DataFrame

Computes the Local Moran’s Index (LMI) for one signal type and returns, for each bin, the LMI value and its p-value, together with the bin coordinates and related information.

Parameters:
  • mlobject (mlo.MetalociObject) –

    METALoci object with the signals for that region (MetalociObject.signals_dict) and with MetalociObject.lmi_geometry calculated with:

    lmi.construct_voronoi()

  • signal_type (str) – Name of the type of the signal to use for the computation.

  • neighbourhood (float) – Radius of the circle that determines the neighbourhood of each of the points in the Kamada-Kawai layout.

  • n_permutations (int, optional) – Number of permutations to do in the randomization, by default 9999.

  • signipval (float, optional) – Significance threshold for the p-value, by default 0.05.

  • silent (bool, optional) – Controls the verbosity of the function (useful for multiprocessing), by default False.

Returns:

DataFrame with ID, bin index, chromosome, start, end, value of signal, Moran index, Moran quadrant, LMI score, LMI p-value and LMI inverse of p-value for this signal.

Return type:

pd.DataFrame
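
To make the quantities above concrete, here is a minimal pure-Python sketch of a Local Moran’s I computation with a distance-band neighbourhood and conditional-permutation p-values. It is not METALoci’s implementation: the helper names (distance_band_weights, local_morans_i, permutation_pvalues), the row-standardised weights and the two-sided pseudo p-value are all assumptions chosen for illustration.

```python
import math
import random


def distance_band_weights(coords, radius):
    """Row-standardised weights: j neighbours i when it lies within `radius`."""
    weights = []
    for i, point in enumerate(coords):
        neighbours = [
            j for j, other in enumerate(coords)
            if j != i and math.dist(point, other) <= radius
        ]
        weights.append({j: 1.0 / len(neighbours) for j in neighbours})
    return weights


def _standardise(values):
    """Return the deviations from the mean and their second moment."""
    mean = sum(values) / len(values)
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / len(z)
    return z, m2


def local_morans_i(values, weights):
    """I_i = (z_i / m2) * sum_j w_ij * z_j for every observation i."""
    z, m2 = _standardise(values)
    return [
        z[i] / m2 * sum(w * z[j] for j, w in weights[i].items())
        for i in range(len(values))
    ]


def permutation_pvalues(values, weights, n_permutations=999, seed=0):
    """Pseudo p-values: keep z_i fixed and draw its neighbours at random."""
    rng = random.Random(seed)
    z, m2 = _standardise(values)
    observed = local_morans_i(values, weights)
    pvalues = []
    for i in range(len(values)):
        k = len(weights[i])
        if k == 0:  # isolated point: nothing to test (assumed convention)
            pvalues.append(1.0)
            continue
        others = z[:i] + z[i + 1:]
        extreme = 0
        for _ in range(n_permutations):
            simulated = z[i] / m2 * sum(rng.sample(others, k)) / k
            if abs(simulated) >= abs(observed[i]):
                extreme += 1
        pvalues.append((extreme + 1) / (n_permutations + 1))
    return pvalues


# Toy layout: four points on a line, low signal on the left, high on the right.
coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
signal = [1.0, 1.0, 5.0, 5.0]
weights = distance_band_weights(coords, radius=1.0)
lmi = local_morans_i(signal, weights)
pvalues = permutation_pvalues(signal, weights, n_permutations=99)
```

In this toy layout the two end points have a single neighbour with a similar value, so their index is positive, while the two middle points sit between a low and a high neighbour and their index is zero. The (extreme + 1) / (n_permutations + 1) form keeps every pseudo p-value strictly above zero, which is one reason permutation counts such as 9999 are common.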

metaloci.spatial_stats.lmi.construct_voronoi(mlobject: MetalociObject, buffer: float, del_args: Series = None) → DataFrame

Takes the Kamada-Kawai layout in a METALoci object and calculates the geometry of the Voronoi polygon around each point of the layout, in order to make a gaudi plot.

Parameters:
  • mlobject (mlo.MetalociObject) – METALoci object with Kamada-Kawai layout coordinates in it (MetalociObject.kk_coords).

  • buffer (float) – Distance to buffer around the point to be painted in the gaudi plot.

Returns:

df_geometry – DataFrame with the geometry of each point of the gaudi plot, one row per bin. Needed for plotting the gaudi plot.

Return type:

pd.DataFrame
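
The idea of a Voronoi cell limited by a buffer can be illustrated without any geometry library: a location belongs to the cell of its nearest layout point, but only if it lies within the buffer distance of that point. The sketch below is a simplified point-membership test under that assumption, not the polygon construction that construct_voronoi() performs; nearest_site and the toy coordinates are hypothetical.

```python
import math


def nearest_site(point, sites, buffer):
    """Index of the buffered Voronoi cell containing `point`, else None."""
    distances = [math.dist(point, site) for site in sites]
    closest = min(range(len(sites)), key=distances.__getitem__)
    # Outside the buffer, the location is left unpainted.
    return closest if distances[closest] <= buffer else None


# Two layout points two units apart, each painted up to 0.75 units away.
sites = [(0.0, 0.0), (2.0, 0.0)]
inside_first = nearest_site((0.5, 0.0), sites, buffer=0.75)
inside_second = nearest_site((1.5, 0.2), sites, buffer=0.75)
in_the_gap = nearest_site((1.0, 0.0), sites, buffer=0.75)
```

With a buffer smaller than half the distance between two points, a gap remains between their cells; with a larger buffer, the Voronoi boundary decides which cell a location falls in.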

metaloci.spatial_stats.lmi.get_bed(mlobject: MetalociObject, lmi_geometry: DataFrame, neighbourhood: float, quadrants: list = None, signipval: float = 0.05, poi: int = None, silent: bool = True) → DataFrame

Gets the bins that are significant in the Local Moran’s I analysis for a given point of interest.

Parameters:
  • mlobject (mlo.MetalociObject) – METALoci object with the needed information.

  • lmi_geometry (pd.DataFrame) – DataFrame with the geometry of the bins.

  • neighbourhood (float) – Radius of the circle that determines the neighbourhood of each of the points in the Kamada-Kawai layout.

  • bfact (float) – Factor to multiply the neighbourhood by.

  • quadrants (list, optional) – List of quadrants to consider, by default None.

  • signipval (float, optional) – Significance threshold for the p-value, by default 0.05.

  • poi (int, optional) – Point of interest, by default None.

  • silent (bool, optional) – Controls the verbosity of the function (useful for multiprocessing), by default True.

Returns:

bed – DataFrame in BED format with the bins that are significant in the Local Moran’s I analysis.

Return type:

pd.DataFrame
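
As a rough illustration of the filtering described above, the sketch below keeps the bins whose p-value passes the threshold and whose Moran quadrant is in the requested list, then formats them as BED lines. It covers only that filtering step and ignores the point of interest and the neighbourhood; the function name, the dictionary keys and the use of `<=` for the threshold are assumptions, not METALoci’s actual column names or logic.

```python
def significant_bins_to_bed(rows, signipval=0.05, quadrants=None):
    """Return (chrom, start, end) tuples for bins passing both filters."""
    selected = []
    for row in rows:
        if row["p_value"] > signipval:
            continue  # not significant at the chosen threshold
        if quadrants is not None and row["quadrant"] not in quadrants:
            continue  # significant, but in a quadrant we did not ask for
        selected.append((row["chrom"], row["start"], row["end"]))
    return sorted(selected, key=lambda r: (r[0], r[1]))


# Hypothetical per-bin results for one region.
rows = [
    {"chrom": "chr1", "start": 0, "end": 10000, "quadrant": 1, "p_value": 0.01},
    {"chrom": "chr1", "start": 10000, "end": 20000, "quadrant": 3, "p_value": 0.20},
    {"chrom": "chr1", "start": 20000, "end": 30000, "quadrant": 2, "p_value": 0.03},
    {"chrom": "chr1", "start": 30000, "end": 40000, "quadrant": 1, "p_value": 0.04},
]
bed = significant_bins_to_bed(rows, signipval=0.05, quadrants=[1, 3])
bed_text = "\n".join("\t".join(str(field) for field in r) for r in bed)
```

BED intervals are conventionally zero-based and half-open, so the toy bins above do not overlap even though each `end` equals the next `start`.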

metaloci.spatial_stats.lmi.load_region_signals(mlobject: MetalociObject, signal_data: dict, signal_types: list) → dict

Subsets the signal data so that it contains only the signal for the region being processed.

Parameters:
  • mlobject (mlo.MetalociObject) – METALoci object with kk_coords in it.

  • signal_data (dict) – Dictionary with signal for each chromosome. Each key is chrN, each value is a dataframe with its corresponding signal.

  • signal_file (Path) – Path to text file with all the signal types to compute, one per line.

Returns:

  • signals_dict (dict) – Dictionary containing signal information, but only for the region being processed.

  • signal_types (list(str)) – List containing the signal types to compute for this region.
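
The subsetting step can be sketched as follows, using plain lists of dictionaries in place of the per-chromosome dataframes. The region string format (chrN:start-end), the helper names and the overlap rule are assumptions made for this example and may differ from what load_region_signals() does.

```python
def parse_region(region):
    """Split a 'chrN:start-end' string into (chrom, start, end)."""
    chrom, span = region.split(":")
    start, end = (int(value) for value in span.split("-"))
    return chrom, start, end


def subset_signals(signal_data, region, signal_types):
    """Keep, per signal type, the values of bins overlapping the region."""
    chrom, start, end = parse_region(region)
    signals = {name: [] for name in signal_types}
    for signal_bin in signal_data.get(chrom, []):
        # Half-open overlap test between the bin and the region.
        if signal_bin["end"] > start and signal_bin["start"] < end:
            for name in signal_types:
                signals[name].append(signal_bin[name])
    return signals


# Hypothetical signal data: one list of bins per chromosome.
signal_data = {
    "chr1": [
        {"start": 0, "end": 1000, "H3K27ac": 0.5, "CTCF": 1.0},
        {"start": 1000, "end": 2000, "H3K27ac": 1.5, "CTCF": 0.0},
        {"start": 2000, "end": 3000, "H3K27ac": 2.5, "CTCF": 3.0},
    ],
    "chr2": [{"start": 0, "end": 1000, "H3K27ac": 9.0, "CTCF": 9.0}],
}
region_signals = subset_signals(signal_data, "chr1:1000-3000", ["H3K27ac"])
```

Only the requested signal types are kept, so a region processed with a subset of signals carries no values for the others.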