pyranges.get_fasta
Module Contents
Functions
get_fasta(gr, path) | Get fasta sequence.
- pyranges.get_fasta.get_fasta(gr, path)
Get fasta sequence.
- Parameters:
gr (PyRanges) – Coordinates.
path (str) – Path to the FASTA file.
- Returns:
Sequences, one per interval.
- Return type:
Series
Note
Sorting the PyRanges is likely to improve the speed.
Warning
The sequence names in the FASTA headers must match the Chromosome names in gr.
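A name mismatch can be caught before calling get_fasta by comparing the two sets of names. The following is a minimal standard-library sketch, not part of pyranges: the helpers fasta_names and missing_names are hypothetical, and the sketch assumes that a sequence name is the first whitespace-delimited token after ">" in each header line.

```python
def fasta_names(path):
    """Return the sequence names found in the header lines of a FASTA file.

    Assumption: the name is the first whitespace-delimited token after '>'.
    """
    names = []
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:  # skip headers that carry no name at all
                    names.append(fields[0])
    return names


def missing_names(chromosomes, path):
    """Return the chromosome names that have no matching FASTA header."""
    return sorted(set(chromosomes) - set(fasta_names(path)))
```

Passing the chromosome names used in gr together with the FASTA path returns an empty list when every name has a matching header.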
Examples
>>> import pyranges as pr
>>> gr = pr.from_dict({"Chromosome": ["chr1", "chr1"],
...                    "Start": [5, 0], "End": [8, 5]})
>>> gr
+--------------+-----------+-----------+
| Chromosome   |     Start |       End |
| (category)   |   (int32) |   (int32) |
|--------------+-----------+-----------|
| chr1         |         5 |         8 |
| chr1         |         0 |         5 |
+--------------+-----------+-----------+
Unstranded PyRanges object has 2 rows and 3 columns from 1 chromosomes.
For printing, the PyRanges was sorted on Chromosome.
>>> tmp_handle = open("temp.fasta", "w+")
>>> _ = tmp_handle.write("> chr1\n")
>>> _ = tmp_handle.write("ATTACCAT")
>>> tmp_handle.close()
>>> seq = pr.get_fasta(gr, "temp.fasta")
>>> seq
0      CAT
1    ATTAC
dtype: object
>>> gr.seq = seq
>>> gr
+--------------+-----------+-----------+------------+
| Chromosome   |     Start |       End | seq        |
| (category)   |   (int32) |   (int32) | (object)   |
|--------------+-----------+-----------+------------|
| chr1         |         5 |         8 | CAT        |
| chr1         |         0 |         5 | ATTAC      |
+--------------+-----------+-----------+------------+
Unstranded PyRanges object has 2 rows and 4 columns from 1 chromosomes.
For printing, the PyRanges was sorted on Chromosome.
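The sequences in this example are consistent with 0-based, half-open intervals: Start is included and End is excluded, as in a Python slice. The sketch below reproduces the two results with plain string slicing. It only illustrates the coordinate convention and is not how get_fasta is implemented; slice_intervals is a hypothetical helper, and strand is ignored.

```python
def slice_intervals(sequence, starts, ends):
    """Slice one sequence per (start, end) pair, 0-based and half-open."""
    return [sequence[start:end] for start, end in zip(starts, ends)]


# The chr1 sequence and the Start/End columns from the example.
chr1 = "ATTACCAT"
result = slice_intervals(chr1, starts=[5, 0], ends=[8, 5])
print(result)  # ['CAT', 'ATTAC']
```

Each sequence has length End - Start, so the interval (5, 8) yields three bases and (0, 5) yields five.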