|Open In Colab| .. |Open In Colab| image:: https://colab.research.google.com/assets/colab-badge.svg :target: https://colab.research.google.com/drive/1tsbLl_yB9lcC_lo4sjwEoI0r1KAjPkoB CRISPR Reporter Screen analysis with BEAN ========================================= In this tutorial, we will cover \* Obtaining target variant editing rate from allele count information \* ReporterScreen object slicing (indexing) \* Allele filtering based on significance For the basic API of ReporterScreen object, see `perturb-tools tutorial of Screen API `__. .. code:: ipython3 ! pip install -q crispr-bean .. parsed-literal:: Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/ Collecting beans==0.1.2 Downloading beans-0.1.2.tar.gz (464 kB)  |████████████████████████████████| 464 kB 4.1 MB/s [?25h Installing build dependencies ... [?25l[?25hdone Getting requirements to build wheel ... [?25l[?25hdone Preparing wheel metadata ... [?25l[?25hdone Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from beans==0.1.2) (1.21.6) Collecting perturb-tools>=0.0.16 Downloading perturb-tools-0.1.4.tar.gz (147 kB)  |████████████████████████████████| 147 kB 51.2 MB/s [?25hCollecting matplotlib>=3.4 Downloading matplotlib-3.5.3-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.whl (11.2 MB)  |████████████████████████████████| 11.2 MB 51.2 MB/s [?25hCollecting anndata>=0.7.1 Downloading anndata-0.8.0-py3-none-any.whl (96 kB)  |████████████████████████████████| 96 kB 4.9 MB/s [?25hRequirement already satisfied: pandas>=1.1.2 in /usr/local/lib/python3.7/dist-packages (from perturb-tools>=0.0.16->beans==0.1.2) (1.3.5) Collecting biopython>=1.79 Downloading biopython-1.79-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.whl (2.3 MB)  |████████████████████████████████| 2.3 MB 57.1 MB/s [?25hCollecting pool-sharq>=0.0.12 Downloading pool_sharq-0.0.12-py3-none-any.whl (7.8 kB) Requirement already satisfied: plotly in /usr/local/lib/python3.7/dist-packages (from perturb-tools>=0.0.16->beans==0.1.2) (5.5.0) Requirement already satisfied: regex in /usr/local/lib/python3.7/dist-packages (from perturb-tools>=0.0.16->beans==0.1.2) (2022.6.2) Requirement already satisfied: typing_extensions in /usr/local/lib/python3.7/dist-packages (from anndata>=0.7.1->perturb-tools>=0.0.16->beans==0.1.2) (4.1.1) Requirement already satisfied: importlib_metadata>=0.7 in /usr/local/lib/python3.7/dist-packages (from anndata>=0.7.1->perturb-tools>=0.0.16->beans==0.1.2) (4.12.0) Requirement already satisfied: packaging>=20 in /usr/local/lib/python3.7/dist-packages (from anndata>=0.7.1->perturb-tools>=0.0.16->beans==0.1.2) (21.3) Requirement already satisfied: scipy>1.4 in /usr/local/lib/python3.7/dist-packages (from anndata>=0.7.1->perturb-tools>=0.0.16->beans==0.1.2) (1.7.3) Requirement already satisfied: natsort in /usr/local/lib/python3.7/dist-packages (from anndata>=0.7.1->perturb-tools>=0.0.16->beans==0.1.2) (5.5.0) Requirement already satisfied: h5py>=3 in /usr/local/lib/python3.7/dist-packages (from anndata>=0.7.1->perturb-tools>=0.0.16->beans==0.1.2) (3.1.0) Requirement already satisfied: cached-property in /usr/local/lib/python3.7/dist-packages (from h5py>=3->anndata>=0.7.1->perturb-tools>=0.0.16->beans==0.1.2) (1.5.2) Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.7/dist-packages (from importlib_metadata>=0.7->anndata>=0.7.1->perturb-tools>=0.0.16->beans==0.1.2) (3.8.1) Collecting fonttools>=4.22.0 Downloading fonttools-4.37.3-py3-none-any.whl (959 kB)  |████████████████████████████████| 959 kB 44.2 MB/s [?25hRequirement already satisfied: pyparsing>=2.2.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=3.4->perturb-tools>=0.0.16->beans==0.1.2) (3.0.9) Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=3.4->perturb-tools>=0.0.16->beans==0.1.2) (1.4.4) Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=3.4->perturb-tools>=0.0.16->beans==0.1.2) (0.11.0) Requirement already satisfied: pillow>=6.2.0 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=3.4->perturb-tools>=0.0.16->beans==0.1.2) (7.1.2) Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=3.4->perturb-tools>=0.0.16->beans==0.1.2) (2.8.2) Requirement already satisfied: pytz>=2017.3 in /usr/local/lib/python3.7/dist-packages (from pandas>=1.1.2->perturb-tools>=0.0.16->beans==0.1.2) (2022.2.1) Collecting licorice>=0.0.2 Downloading licorice-0.0.3-py3-none-any.whl (8.1 kB) Collecting requests>=2.26.0 Downloading requests-2.28.1-py3-none-any.whl (62 kB)  |████████████████████████████████| 62 kB 1.4 MB/s [?25hRequirement already satisfied: tqdm>=4.62.3 in /usr/local/lib/python3.7/dist-packages (from pool-sharq>=0.0.12->perturb-tools>=0.0.16->beans==0.1.2) (4.64.1) Collecting pyrequisites>=0.0.2 Downloading pyrequisites-0.0.2-py3-none-any.whl (4.1 kB) Collecting beautifulsoup4>=4.10.0 Downloading beautifulsoup4-4.11.1-py3-none-any.whl (128 kB)  |████████████████████████████████| 128 kB 66.4 MB/s [?25hCollecting soupsieve>1.2 Downloading soupsieve-2.3.2.post1-py3-none-any.whl (37 kB) Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.7->matplotlib>=3.4->perturb-tools>=0.0.16->beans==0.1.2) (1.15.0) Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests>=2.26.0->pool-sharq>=0.0.12->perturb-tools>=0.0.16->beans==0.1.2) (2.10) Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests>=2.26.0->pool-sharq>=0.0.12->perturb-tools>=0.0.16->beans==0.1.2) (1.24.3) Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests>=2.26.0->pool-sharq>=0.0.12->perturb-tools>=0.0.16->beans==0.1.2) (2022.6.15) Requirement already satisfied: charset-normalizer<3,>=2 in /usr/local/lib/python3.7/dist-packages (from requests>=2.26.0->pool-sharq>=0.0.12->perturb-tools>=0.0.16->beans==0.1.2) (2.1.1) Requirement already satisfied: tenacity>=6.2.0 in /usr/local/lib/python3.7/dist-packages (from plotly->perturb-tools>=0.0.16->beans==0.1.2) (8.0.1) Building wheels for collected packages: beans, perturb-tools Building wheel for beans (PEP 517) ... [?25l[?25hdone Created wheel for beans: filename=beans-0.1.2-cp37-cp37m-linux_x86_64.whl size=506725 sha256=43eb6b2b41a112f7ea2d37a17e1e9bd91dffd4599710f3958769f84e59b3e6b8 Stored in directory: /root/.cache/pip/wheels/68/81/58/49b78bc024e2a5312dc830797a25c15c7acda4225b5b25d232 Building wheel for perturb-tools (setup.py) ... [?25l[?25hdone Created wheel for perturb-tools: filename=perturb_tools-0.1.4-py3-none-any.whl size=54209 sha256=e383c22165c92cf7c22605aec1da8689f60d55d1c1ea530a925aa89e5ee6608d Stored in directory: /root/.cache/pip/wheels/35/75/47/47a3d25b1958a24d7d3ee488b8956a9f43bda0bb0761b4d67c Successfully built beans perturb-tools Installing collected packages: soupsieve, licorice, requests, pyrequisites, fonttools, beautifulsoup4, pool-sharq, matplotlib, biopython, anndata, perturb-tools, beans Attempting uninstall: requests Found existing installation: requests 2.23.0 Uninstalling requests-2.23.0: Successfully uninstalled requests-2.23.0 Attempting uninstall: beautifulsoup4 Found existing installation: beautifulsoup4 4.6.3 Uninstalling beautifulsoup4-4.6.3: Successfully uninstalled beautifulsoup4-4.6.3 Attempting uninstall: matplotlib Found existing installation: matplotlib 3.2.2 Uninstalling matplotlib-3.2.2: Successfully uninstalled matplotlib-3.2.2 Successfully installed anndata-0.8.0 beautifulsoup4-4.11.1 beans-0.1.2 biopython-1.79 fonttools-4.37.3 licorice-0.0.3 matplotlib-3.5.3 perturb-tools-0.1.4 pool-sharq-0.0.12 pyrequisites-0.0.2 requests-2.28.1 soupsieve-2.3.2.post1 .. code:: ipython3 import numpy as np import pandas as pd import seaborn as sns import matplotlib.pyplot as plt import bean as be .. code:: ipython3 !gdown 18Azb8YmmMvFZo9urc2TxZr540xXWOv_v .. parsed-literal:: Downloading... From: https://drive.google.com/uc?id=18Azb8YmmMvFZo9urc2TxZr540xXWOv_v To: /content/bean_count_072121_ABE_topbot_LDLvar.h5ad 100% 40.2M/40.2M [00:00<00:00, 160MB/s] .. code:: ipython3 bdata = be.read_h5ad("bean_count_072121_ABE_topbot_LDLvar.h5ad") Data Structure -------------- ReporterScreen object is a shallow wrapper around `AnnData `__. More comprehensive data wrangling documentation can be found in their documentation. .. code:: ipython3 bdata - ``ReporterScreen.X``: guide count - ``ReporterScreen.guides``: guide metadata. Reference to ``AnnData.obs`` - ``ReporterScreen.samples``: sample/condition metadata. Reference to ``AnnData.var``. - ``ReporterScreen.layers["X_bcmatch"]``: barcode-matched guide counts - ``ReporterScreen.layers["edits"]``: edit counts - ``ReporterScreen.uns["allele_counts"]``: allele counts per guide and condition - ``ReporterScreen.uns["edit_counts"]``: edit counts per guide and condition ``.guides`` attribute contains the information about each guide. .. code:: ipython3 bdata.guides ``.samples`` attribute contains the sample and condition specific information. .. code:: ipython3 bdata.samples Per-guide allele count information is stored in ``.uns['allele_counts']``. .. code:: ipython3 bdata.uns["allele_counts"] .. raw:: html
guide allele rep1_bot rep2_bot rep3_VPA_bot rep4_VPA_bot rep1_bulk rep2_bulk rep3_VPA_bulk rep4_VPA_bulk rep1_top rep2_top rep3_VPA_top rep4_VPA_top
0 LDLR_SA_3_g5 26 13 9 13 60 21 27 31 32 26 43 43
1 LDLR_SA_3_g5 0:13:+:A>G 6 16 11 24 29 21 28 22 27 11 20 13
2 LDLR_SA_3_g5 -12:1:+:A>G,0:13:+:A>G 2 16 1 24 16 5 9 4 17 12 7 3
3 LDLR_SA_3_g5 0:13:+:A>G,7:20:+:A>G 2 0 0 0 0 0 0 9 0 0 0 0
4 LDLR_SA_3_g5 -12:1:+:A>G,-8:5:+:A>G,0:13:+:A>G,10:23:+:A>G 1 0 0 0 0 0 0 0 0 0 0 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
165506 2:164588224GAG_Maj_ABE_10_g3 -10:1:+:C>T,-8:3:+:A>T 0 0 0 0 0 0 0 0 0 0 0 1
165507 2:164588224GAG_Maj_ABE_10_g3 -8:3:+:A>T,-3:8:+:C>A,-1:10:+:G>A 0 0 0 0 0 0 0 0 0 0 0 1
165508 rs4921914_Min_ABE_501_g4 -9:3:+:A>G,-8:4:+:A>G,-3:9:+:A>G,2:14:+:A>G,4:... 0 0 0 0 0 0 0 0 0 0 0 1
165509 rs191388787_Maj_ABE_121_g2 -7:3:+:A>G,-5:5:+:A>G,-3:7:+:A>G 0 0 0 0 0 0 0 0 0 0 0 1
165510 rs113408797_Maj_ABE_41_g1 -8:1:+:A>G,-1:8:+:A>G,3:12:+:A>G,5:14:+:A>G,17... 0 0 0 0 0 0 0 0 0 0 0 1

165511 rows × 14 columns

Per-variant aggregated count information is stored in ``.uns['edit_counts']``. .. code:: ipython3 bdata.uns["edit_counts"] Changing column names --------------------- ``ReporterScreen.guides`` and ``ReporterScreen.var`` are equivalent to ``AnnData.obs`` and ``AnnData.var``, which are Pandas DataFrames. Column names can be changed as in Pandas: .. code:: ipython3 bdata.guides = bdata.guides.rename(columns={"Reporter":"reporter"}) New columns can be assigned likewise: .. code:: ipython3 bdata.samples["lower_quantile"] = [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.7, 0.7, 0.7, 0.7, ] Subsetting & addition --------------------- Works as anndata, supports allele & edit count operations. Subsetting & selection ~~~~~~~~~~~~~~~~~~~~~~ .. code:: ipython3 bdata.samples["replicate"], bdata.samples["sort"] = zip(*bdata.samples.index.map(lambda s: s.rsplit("_", 1))) .. code:: ipython3 bdata_subset = bdata[:10,bdata.samples.sort == "bulk"] .. code:: ipython3 bdata.uns["allele_counts"] = bdata.uns['allele_counts'].loc[bdata.uns['allele_counts'].allele.map(str) != ""] Getting edit rates from allele counts ------------------------------------- .. code:: ipython3 bdata.uns['allele_counts'] .. raw:: html
guide allele rep1_bot rep2_bot rep3_VPA_bot rep4_VPA_bot rep1_bulk rep2_bulk rep3_VPA_bulk rep4_VPA_bulk rep1_top rep2_top rep3_VPA_top rep4_VPA_top
1 LDLR_SA_3_g5 0:13:+:A>G 6 16 11 24 29 21 28 22 27 11 20 13
2 LDLR_SA_3_g5 -12:1:+:A>G,0:13:+:A>G 2 16 1 24 16 5 9 4 17 12 7 3
3 LDLR_SA_3_g5 0:13:+:A>G,7:20:+:A>G 2 0 0 0 0 0 0 9 0 0 0 0
4 LDLR_SA_3_g5 -12:1:+:A>G,-8:5:+:A>G,0:13:+:A>G,10:23:+:A>G 1 0 0 0 0 0 0 0 0 0 0 0
5 LDLR_SA_3_g5 -12:1:+:A>G,0:13:+:A>G,7:20:+:A>G 1 0 0 0 0 2 2 0 1 0 0 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
165506 2:164588224GAG_Maj_ABE_10_g3 -10:1:+:C>T,-8:3:+:A>T 0 0 0 0 0 0 0 0 0 0 0 1
165507 2:164588224GAG_Maj_ABE_10_g3 -8:3:+:A>T,-3:8:+:C>A,-1:10:+:G>A 0 0 0 0 0 0 0 0 0 0 0 1
165508 rs4921914_Min_ABE_501_g4 -9:3:+:A>G,-8:4:+:A>G,-3:9:+:A>G,2:14:+:A>G,4:... 0 0 0 0 0 0 0 0 0 0 0 1
165509 rs191388787_Maj_ABE_121_g2 -7:3:+:A>G,-5:5:+:A>G,-3:7:+:A>G 0 0 0 0 0 0 0 0 0 0 0 1
165510 rs113408797_Maj_ABE_41_g1 -8:1:+:A>G,-1:8:+:A>G,3:12:+:A>G,5:14:+:A>G,17... 0 0 0 0 0 0 0 0 0 0 0 1

162065 rows × 14 columns

.. code:: ipython3 bdata.uns["edit_counts"] = bdata.get_edit_from_allele() .. code:: ipython3 bdata.get_edit_mat_from_uns("A", "G", target_pos_col = "target_pos") .. parsed-literal:: New edit matrix saved in .layers['edits']. Returning old edits. .. parsed-literal:: array([[0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], ..., [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.]]) .. code:: ipython3 window_edit_rate= bdata.get_guide_edit_rate(normalize_by_editable_base = True, edited_base = "A", editable_base_start = 3, editable_base_end = 8, bcmatch_thres = 5, prior_weight = 1, return_result = True) .. code:: ipython3 plt.hist(window_edit_rate, bins=30) .. parsed-literal:: (array([579., 364., 299., 249., 221., 214., 191., 134., 159., 133., 97., 90., 83., 76., 70., 67., 48., 53., 45., 56., 38., 37., 32., 21., 24., 15., 8., 7., 1., 2.]), array([4.13052458e-04, 2.89016995e-02, 5.73903465e-02, 8.58789936e-02, 1.14367641e-01, 1.42856288e-01, 1.71344935e-01, 1.99833582e-01, 2.28322229e-01, 2.56810876e-01, 2.85299523e-01, 3.13788170e-01, 3.42276817e-01, 3.70765464e-01, 3.99254111e-01, 4.27742758e-01, 4.56231405e-01, 4.84720052e-01, 5.13208699e-01, 5.41697346e-01, 5.70185993e-01, 5.98674640e-01, 6.27163287e-01, 6.55651934e-01, 6.84140582e-01, 7.12629229e-01, 7.41117876e-01, 7.69606523e-01, 7.98095170e-01, 8.26583817e-01, 8.55072464e-01]), ) .. image:: ReporterScreen_api_files/ReporterScreen_api_35_1.png