- gene: genomic element that encodes various transcripts
- target: RNA transcript
- position: offset on a transcript which matches the end of the microRNA seed
- microRNA: small RNA of about 22 nucleotides
For each possible quadruplet, our model predicts an equilibrium concentration `quantity` according to its equilibrium constant $K_m$. In particular, our solution satisfies:
$K_m = \frac{[E_m][S_{t,p}]}{[E_mS_{t,p}]}$
Where $[E_m]$ is the free concentration of microRNA $m$, $[S_{t,p}]$ is the free concentration of target site $(t, p)$, and $[E_mS_{t,p}]$ is the concentration of the duplex formed at that particular location.
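As a worked illustration of this relation, consider the simplest case of one microRNA and one isolated target site with no competitors. This is a simplifying assumption rather than the model's actual solver, and the function name and concentrations below are hypothetical. Writing $E_0$ and $S_0$ for the total concentrations, the duplex concentration $C$ solves $C^2 - (E_0 + S_0 + K_m)C + E_0 S_0 = 0$, and the physical solution is the smaller root.

```python
import math


def duplex_concentration(e_total, s_total, km):
    """Equilibrium duplex concentration for one microRNA and one site.

    Solves km = (e_total - c) * (s_total - c) / c for c and keeps the
    smaller root, so that c never exceeds e_total or s_total.
    """
    b = e_total + s_total + km
    return (b - math.sqrt(b * b - 4.0 * e_total * s_total)) / 2.0


# Hypothetical concentrations in arbitrary but consistent units.
e_total, s_total, km = 100.0, 50.0, 10.0
c = duplex_concentration(e_total, s_total, km)

# Recover Km from the free concentrations to check the solution.
km_check = (e_total - c) * (s_total - c) / c
print(c, km_check)
```

The recomputed `km_check` should match the `km` that was supplied, up to floating-point error.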
%% Cell type:code id: tags:
``` python
microtargetome_df.head()
```
%% Cell type:code id: tags:
``` python
microtargetome_df.loc[:,:]
```
%% Cell type:markdown id: tags:
# NumPy
You can use [NumPy](https://numpy.org/) routines directly on your dataframes and series.
However, the equality does not hold exactly, because the available substrate concentration is more complicated to calculate: overlapping sites have to be accounted for.
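As a small illustration of applying NumPy routines to pandas objects, the sketch below uses a toy dataframe; the column names and values are hypothetical and do not come from `microtargetome_df`.

```python
import numpy as np
import pandas as pd

# Toy stand-in for a dataframe of predicted quantities (hypothetical values).
df = pd.DataFrame(
    {"quantity": [1.0, 2.0, 8.0], "score": [4.0, 16.0, 32.0]},
    index=["a", "b", "c"],
)

# Element-wise NumPy functions return pandas objects with the same labels.
log_quantity = np.log2(df["quantity"])  # Series indexed by a, b, c
log_df = np.log2(df)                    # DataFrame with the same columns

# Reductions work too and return a scalar.
total = np.sum(df["quantity"])
print(log_quantity, log_df, total, sep="\n")
```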
%% Cell type:markdown id: tags:
# Join, merge and concatenation
These three concepts are similar, but behave differently.
- joins are fast and work on indexes
- merges are slow and work on columns
- concat is similar to a join, but requires matching indexes and works with many dataframes and series
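As a quick illustration of the three operations on toy data, the sketch below combines two small tables keyed by a hypothetical `accession` column; none of these names or values come from miRBooking-scan.

```python
import pandas as pd

# Two toy tables keyed by a hypothetical microRNA accession.
left = pd.DataFrame(
    {"accession": ["m1", "m2", "m3"], "quantity_a": [10.0, 20.0, 30.0]}
)
right = pd.DataFrame(
    {"accession": ["m2", "m3", "m4"], "quantity_b": [2.0, 3.0, 4.0]}
)

# merge: matches rows on column values (inner join by default).
merged = left.merge(right, on="accession")

# join: matches rows on the index (left join by default).
joined = left.set_index("accession").join(right.set_index("accession"))

# concat: aligns any number of frames or series on their index.
concatenated = pd.concat(
    [left.set_index("accession"), right.set_index("accession")], axis=1
)

print(merged, joined, concatenated, sep="\n\n")
```

With `axis=1`, `pd.concat` fills rows that are missing from one of the frames with `NaN`, which is why matching indexes give the cleanest result.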
But first, let's automate the process of fetching data from miRBooking-scan so that we can study a couple of cell lines.
%% Cell type:code id: tags:
``` python
A = fetch_from_mirbooking_scan('ENCSR172GTQ')
B = fetch_from_mirbooking_scan('ENCSR066FYC')
```
%% Cell type:markdown id: tags:
# Efficiency
Efficiency measures the degree of specificity of an interaction.
Highly specific interactions can be completely inefficient if $K_m$ is high, and high-affinity interactions can end up being completely inefficient if they face strong competitors or bind many substrates.
If the enzyme is exclusive to its substrate, the efficiency will be very high. If the enzyme is shared among many substrates, its free concentration will be lower and the concentration of the formed complex $[ES]$ will be lower as well. Conversely, if many enzymes compete for a given substrate, the free concentration of the substrate will be lower and fewer complexes will form.
From a network perspective, it summarizes the local density surrounding an edge.
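The dilution effect described above can be sketched numerically. The example assumes one enzyme shared among `n` identical, non-overlapping substrates that all have the same total concentration and the same $K_m$. This is a deliberate simplification, not the model's solver, and the function name and values are hypothetical. Under these assumptions the complex concentration $C$ per substrate solves $nC^2 - (E_0 + nS_0 + K_m)C + E_0 S_0 = 0$.

```python
import math


def complex_per_substrate(e_total, s_total, km, n):
    """Complex concentration per substrate when n identical substrates
    share a single enzyme (smaller root of the quadratic above)."""
    b = e_total + n * s_total + km
    return (b - math.sqrt(b * b - 4.0 * n * e_total * s_total)) / (2.0 * n)


# Hypothetical concentrations in arbitrary but consistent units.
e_total, s_total, km = 100.0, 50.0, 10.0
counts = (1, 2, 5, 10)
per_substrate = [complex_per_substrate(e_total, s_total, km, n) for n in counts]

# The amount of complex formed on each substrate shrinks as n grows.
for n, c in zip(counts, per_substrate):
    print(n, round(c, 2))
```

The same affinity therefore yields less complex per substrate once the enzyme is shared, which is the behaviour the text attributes to competition.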
%% Cell type:code id: tags:
``` python
# It is best to introduce a function for this purpose
```
%% Cell type:markdown id: tags:
It's important here not to combine sponged fractions from different microRNAs because they are not compatible. The solution is to use a central tendency measure such as a mean or a median.
To make a clear point for question 2 about efficiency, we can clearly see that changes in interaction efficiency are not reflected in changes of substrate binding.
%% Cell type:code id: tags:
``` python
import matplotlib.pyplot as plt
# Raw strings keep the backslash in \log_2 intact for mathtext
plt.scatter(gene_log2fc, gene_efficiency_log2fc)
plt.xlabel(r'Number of occupants $\log_2$ fold-changes')
plt.ylabel(r'Efficiency $\log_2$ fold-changes')
```
%% Cell type:markdown id: tags:
## Detailed fold-changes
If we dig deeper, we can see that some microRNAs increase substantially.
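One way to rank such changes is sketched below on toy data. The microRNA names and quantities are hypothetical, and the two series stand in for per-microRNA totals from two cell lines.

```python
import numpy as np
import pandas as pd

# Hypothetical per-microRNA quantities in two conditions (toy values).
a = pd.Series({"mir-1": 10.0, "mir-2": 40.0, "mir-3": 5.0}, name="A")
b = pd.Series({"mir-1": 80.0, "mir-2": 40.0, "mir-3": 2.5}, name="B")

# log2 fold-change of B relative to A; the division aligns on the index.
log2fc = np.log2(b / a).sort_values(ascending=False)
print(log2fc)
```

Sorting in descending order puts the microRNAs with the largest increase at the top of the series.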