cell2net.preprocessing.filter_motifs_by_genes
- cell2net.preprocessing.filter_motifs_by_genes(motifs, mdata, rna_mod='rna', inplace=True)
Filter motifs by matching their names to expressed gene names in the RNA modality.
This function identifies transcription factor (TF) motifs whose names match expressed genes in the RNA modality of a multimodal MuData object. Depending on inplace, the filtered motifs are either stored in mdata.uns["motifs"] or returned as a DataFrame.
- Parameters:
motifs (Iterable) – A collection of motif objects to be filtered. Can be obtained using get_motifs_from_jaspar. Each motif should have:
name: Motif name (string).
matrix_id: Unique identifier for the motif.
mdata (MuData) – A multimodal data object containing at least an RNA modality. The RNA modality should have genes stored in .var_names.
rna_mod (str (default: 'rna')) – The key for the RNA modality in mdata.
inplace (bool (default: True)) – If True, stores the filtered motifs in mdata.uns["motifs"]. If False, returns the filtered DataFrame.
- Return type:
DataFrame or None
- Returns:
If inplace=True: Returns None. The filtered motifs are stored in mdata.uns["motifs"].
If inplace=False: Returns a DataFrame with filtered motifs and their corresponding genes. The DataFrame has the following columns:
"motif_name": Name of the motif.
"motif_id": Unique identifier of the motif.
"gene_name": Name of the matching gene.
Notes
Gene names from the RNA modality are converted to uppercase for case-insensitive matching with motif names.
Duplicate motifs are removed based on their uppercased names, retaining the last occurrence.
The filtered DataFrame or stored result includes only motifs with names matching gene names.
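The matching rules in these notes can be illustrated with a small pandas sketch. This is not the cell2net implementation. The helper name sketch_filter_motifs, the Motif namedtuple, and the motif IDs are hypothetical, and storing the uppercased name in "gene_name" is an assumption made only for this illustration.

```python
from collections import namedtuple

import pandas as pd

# Hypothetical stand-in for motif objects; only .name and .matrix_id are used.
Motif = namedtuple("Motif", ["name", "matrix_id"])


def sketch_filter_motifs(motifs, gene_names):
    """Illustrative version of the rules described in Notes (not the library code)."""
    df = pd.DataFrame(
        {
            "motif_name": [m.name for m in motifs],
            "motif_id": [m.matrix_id for m in motifs],
        }
    )
    # Uppercase both sides so the comparison is case-insensitive.
    # Assumption: "gene_name" holds the uppercased name used for matching.
    df["gene_name"] = df["motif_name"].str.upper()
    expressed = {g.upper() for g in gene_names}
    # Drop duplicate motifs by uppercased name, keeping the last occurrence.
    df = df.drop_duplicates(subset="gene_name", keep="last")
    # Keep only motifs whose name matches an expressed gene.
    return df[df["gene_name"].isin(expressed)].reset_index(drop=True)


# Made-up motifs and gene names, for illustration only.
motifs = [
    Motif("Gata1", "M1"),
    Motif("GATA1", "M2"),  # duplicate of M1 after uppercasing; this one is kept
    Motif("SPI1", "M3"),
    Motif("FOXA2", "M4"),  # no matching gene, so it is filtered out
]
result = sketch_filter_motifs(motifs, ["Gata1", "Spi1", "Actb"])
print(result)
```

In this sketch, the two GATA1 entries collapse to the later one and FOXA2 is dropped because no gene with that name is present.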
Examples
Filter motifs by genes and store the results in mdata.uns:
>>> filter_motifs_by_genes(motifs, mdata, rna_mod="rna", inplace=True)
>>> mdata.uns["motifs"]
Filter motifs by genes and return the filtered DataFrame:
>>> df_filtered = filter_motifs_by_genes(motifs, mdata, inplace=False)
>>> print(df_filtered)
Access filtered motifs after storing in mdata:
>>> mdata.uns["motifs"]
>>> mdata.uns["motifs"].head()