```
\n",
"#\n",
"# Python is sensitive to indentation, unlike other languages that\n",
"# use special symbols for end-of-line or to delimit blocks of code. Lines\n",
"# with the same indentation belong to the same \"block\". This will become\n",
"# more important when we get to control structures like if/else.\n",
"#\n",
"# You may know that Python has a special syntax for writing function\n",
"# documentation - but we're not going to use it (yet), so we can keep \n",
"# things bare-bones.\n",
"\n",
"# experiment() runs one complete mouse experiment, for known parameters,\n",
"# and returns the \"significance\" of the difference between the control\n",
"# and mutant mouse samples: the p-value according to a two-tailed t-test.\n",
"#\n",
"def experiment(mu0, mu1, sigma, N): \n",
" # Simulate N control and N mutant mouse times.\n",
" x0 = np.random.normal(mu0, sigma, N) # np.random.normal() is a call to a NumPy function to sample N numbers from a Gaussian.\n",
" x1 = np.random.normal(mu1, sigma, N) # So x0 is the control experiment (N observations), x1 is the mutant experiment.\n",
" # x0 is a vector, or array, of N numbers. x0[0] is the first value in it, a single number.\n",
"\n",
" # Calculate the significance (p-value) of the difference between means of the two samples.\n",
" # We'll worry later in the course about what a two-tailed t-test is actually doing.\n",
" # For now, all we're doing is testing the authors' assertions about the false positive\n",
" # and false negative rate of the t-test for n=8 mice.\n",
" #\n",
" t_stat, pvalue = stats.ttest_ind(x0, x1) # This is a call to the t-test in SciPy's stats module.\n",
" # It returns what Python calls a `tuple`: an ordered list of things.\n",
" # Here, that's a tuple of two numbers: the t-test statistic, and its p-value.\n",
" \n",
" # What the heck is a t statistic and a t-test, you wonder?\n",
" # Oh, ok, let's peek a little under the hood. We don't need this code for what we're doing today, but...\n",
" \n",
" mean0 = np.mean(x0) # np.mean() just calculates the mean (arithmetic average) of a vector, and returns it.\n",
" mean1 = np.mean(x1)\n",
" \n",
" var0 = np.var(x0, ddof=1) # and np.var() just calculates the variance.\n",
" var1 = np.var(x1, ddof=1) # `ddof=1` is an example of passing an optional argument to a function by name=value, \n",
" # rather than simply by order. ddof is \"degrees of freedom\", a technicality in the\n",
" # t-test calculation that we'll ignore for now, and get into later.\n",
" \n",
" # Here's the equation for the t statistic. SciPy's stats.ttest_ind function computed this,\n",
" # then computed how often you'd observe a t statistic at least this\n",
" # large in magnitude, if the two means were equal: that's the \"p-value\".\n",
" #\n",
" t_mine = (mean0 - mean1) / np.sqrt( var0 / N + var1 / N) # Difference of the sample *means*, not of the raw data arrays.\n",
" \n",
" # How do you print stuff in Python?\n",
" # It's as simple as \n",
" # print(\"hello world!\")\n",
" # but if you want to stick variables into what you print, you do something like:\n",
" # print(\"value 1 = {0} and value 2 = {1}\".format(v1, v2))\n",
" # and if you want to be super fancy and format your output variables to certain\n",
" # width and precision so tables look organized, something like {0:6.2f} means \"format \n",
" # variable 0 as a 6-wide, 2-digits-after-the-decimal, floating point number\".\n",
" #\n",
" # If you want to see a summary of each experiment that gets simulated, uncomment the\n",
" # line below; if you don't want to see a big table of 10,000 lines for 10,000 experiments,\n",
" # comment it out.\n",
" #print(\"{0:6.2f} {1:6.2f} {2:7.4f} {3:7.4f} {4:6.4f}\".format(mean0, mean1, t_stat, t_mine, pvalue))\n",
" \n",
" # To get a result back from a function, you 'return' it: either a single value, or a tuple.\n",
" return pvalue"
]
},
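{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# A quick, throwaway demo of the format() mini-language described in the\n",
"# comments above. The variables here are made up, just for illustration.\n",
"v1 = 3.14159\n",
"v2 = 1234.5\n",
"print(\"plain:     value 1 = {0} and value 2 = {1}\".format(v1, v2))\n",
"print(\"formatted: value 1 = {0:6.2f} and value 2 = {1:10.3f}\".format(v1, v2))"
]
},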
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### how are we doing?\n",
"\n",
"Seriously, despite how much stuff I just wrote, there's really not much code above; it's mostly comments.\n",
"Take your time in studying it. Now we're going to run many simulated experiments."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Negative control\n",
"\n",
"In our negative control, we simulate 10,000 experiments where there's no difference: both the control and mutant mice have the same mean $\\mu_0$. The authors said they chose a p-value threshold of 0.05, which means that they're going to tolerate a 5% false positive probability. This literally means that 5% of the time, you observe a \"significant\" difference, even though there is no real difference, just because of statistical fluctuation. If we're using the t-test properly, 5% of the 10,000 negative control experiments should come out \"significant\":"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"# A `for` loop iterates the same code block.\n",
"# `range(n_tot)` produces the integers from 0 to n_tot-1, in order.\n",
"# `for i in range(n_tot)` means run the code block with i=0, i=1, ..., i=n_tot-1.\n",
"# That is, run it 10,000 times if n_tot=10,000.\n",
"\n",
"n_sig = 0 # Initialize n_sig to zero, to use it as a counter.\n",
"for i in range(n_tot):\n",
" pvalue = experiment(mu0, mu0, sigma, N) # Do one experiment with identical means mu0; return p-value of observed difference.\n",
" if pvalue <= 0.05: # Authors say they're going to set their false positive p-value threshold to 0.05.\n",
" n_sig += 1 # So count how many \"significant\" results we get, with p-value <= 0.05."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"and is it 5%?"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Number of significant results = 456/10000 (0.046)\n"
]
}
],
"source": [
"print(\"Number of significant results = {0}/{1} ({2:.3f})\".format(n_sig, n_tot, n_sig/n_tot))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Yes, seems close enough. Cool, we're using the t-test properly."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Positive control\n",
"\n",
"Or call it our experiment, if you prefer: now we simulate 10,000 experiments where the control and mutant mice really do have different means $\\mu_0$ and $\\mu_1$, and ask how often we detect a \"significant\" result. The authors claimed this would be about 80%. Seems like BS to me. Is it? Same code, without the comments, and now calling `experiment()` with different true means:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"n_sig = 0 \n",
"for i in range(n_tot):\n",
" pvalue = experiment(mu0, mu1, sigma, N) \n",
" if pvalue <= 0.05: \n",
" n_sig += 1 "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"and is it 80%?..."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Number of significant results = 1559/10000 (0.156)\n"
]
}
],
"source": [
"print(\"Number of significant results = {0}/{1} ({2:.3f})\".format(n_sig, n_tot, n_sig/n_tot))"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"yeah... well... um.\n",
"\n",
"With n=8 animals, the authors are nowhere near being able to reliably detect an effect as small as what they say they want to detect. They're barely getting more significant results than they expect just by chance.\n",
"\n",
"If you treat statistics as black box recipes, it's way too easy to make silly mistakes. We're going to learn in the course how to do simple simulations as control experiments to double check that you're not making silly errors."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## stuff you can do\n",
"\n",
"How many mice do they really need to achieve an 80% probability of detecting a \"significant\" result when there's really a difference? Change the parameters and re-run the notebook, iteratively, until you home in on an answer.\n",
"\n",
"Conversely, if they're really only going to use n=8 control and mutant mice, how big does the effect size have to be, for them to detect it 80% of the time?\n",
"\n",
"\n",
"## stuff you can think about\n",
"\n",
"Is it ok to do science experiments where you have a 5% chance of reporting an effect, when no effect is present?\n",
"\n",
"Is it ok to do science experiments where you have a 20% chance of missing an effect that's really there?"
]
},
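{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# One possible sketch for the sample-size question above -- scaffolding,\n",
"# not the answer. It re-implements the simulation inline so this cell is\n",
"# self-contained; the parameter values below are placeholders, and you\n",
"# should substitute the mu0, mu1, sigma used earlier in the notebook.\n",
"import numpy as np\n",
"from scipy import stats\n",
"\n",
"demo_mu0, demo_mu1, demo_sigma = 1.0, 1.2, 0.4  # placeholder parameters\n",
"n_scan = 1000  # fewer replicates than 10,000, so the scan runs quickly\n",
"for trial_N in [8, 16, 32]:\n",
"    n_sig = 0\n",
"    for i in range(n_scan):\n",
"        x0 = np.random.normal(demo_mu0, demo_sigma, trial_N)\n",
"        x1 = np.random.normal(demo_mu1, demo_sigma, trial_N)\n",
"        t_stat, pvalue = stats.ttest_ind(x0, x1)\n",
"        if pvalue <= 0.05:\n",
"            n_sig += 1\n",
"    print(\"N = {0:3d}: power = {1:.3f}\".format(trial_N, n_sig/n_scan))"
]
},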
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.9"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
```