cell2net.preprocessing.one_hot_to_seq

cell2net.preprocessing.one_hot_to_seq(one_hot)

Converts a one-hot encoded DNA matrix back to a nucleotide sequence.

Parameters:

one_hot (ndarray) – A NumPy array of shape (sequence_length, 4), where each row is the one-hot encoding of a single nucleotide. The example below uses the column order A, C, G, T.

Return type:

str

Returns:

The reconstructed DNA sequence, with one character (A, C, G, T, or N) per row of one_hot.

Notes

  • If a row has all zeros, the function assigns “N” to represent an unknown nucleotide.

  • The function assumes a valid one-hot encoding where each row has at most one “1”.
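
The behaviour described in these notes can be sketched in a few lines of NumPy. This is an illustrative re-implementation, not the cell2net source: the name one_hot_to_seq_sketch and the alphabet parameter are hypothetical, and the column order A, C, G, T is assumed from the example on this page.

```python
import numpy as np


def one_hot_to_seq_sketch(one_hot, alphabet="ACGT"):
    """Hypothetical stand-in for one_hot_to_seq, for illustration only."""
    one_hot = np.asarray(one_hot)
    # Position of the largest entry in each row; for a valid row this is
    # the column holding the single 1.
    idx = one_hot.argmax(axis=1)
    # Rows without any nonzero entry carry no base information.
    unknown = ~one_hot.any(axis=1)
    # Fancy indexing returns a new array, so the assignment below does not
    # modify the alphabet lookup table.
    letters = np.array(list(alphabet))[idx]
    letters[unknown] = "N"
    return "".join(letters)
```

For a row with more than one 1, this sketch silently picks the first such column, which is one reason the "at most one 1" assumption matters.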

Examples

Convert a one-hot encoded DNA sequence back to a string:

>>> import numpy as np
>>> from cell2net.preprocessing import one_hot_to_seq
>>> one_hot = np.array([
...     [1, 0, 0, 0],  # A
...     [0, 1, 0, 0],  # C
...     [0, 0, 0, 0],  # N (unknown base)
...     [0, 0, 1, 0],  # G
...     [0, 0, 0, 1]   # T
... ])
>>> one_hot_to_seq(one_hot)
'ACNGT'
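
Because the function assumes a valid encoding, it may be worth checking the input before decoding. The helper below is a minimal sketch of such a check; check_one_hot is a hypothetical name and is not part of cell2net.

```python
import numpy as np


def check_one_hot(one_hot):
    """Hypothetical helper: raise ValueError if the input breaks the documented assumptions."""
    one_hot = np.asarray(one_hot)
    # The documented shape is (sequence_length, 4).
    if one_hot.ndim != 2 or one_hot.shape[1] != 4:
        raise ValueError(f"expected shape (sequence_length, 4), got {one_hot.shape}")
    # A one-hot matrix contains only zeros and ones.
    if not np.isin(one_hot, (0, 1)).all():
        raise ValueError("entries must be 0 or 1")
    # Each row may contain at most one 1; all-zero rows are allowed (decoded as N).
    bad_rows = np.flatnonzero(one_hot.sum(axis=1) > 1)
    if bad_rows.size:
        raise ValueError(f"rows with more than one 1: {bad_rows.tolist()}")
    return one_hot
```

The helper returns the validated array, so a call such as one_hot_to_seq(check_one_hot(one_hot)) would fail early on malformed input instead of decoding it.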