Hypothesis Testing Part I

Jonathan Schein
6 min read · Dec 23, 2020

Introduction:

In this blog I will write about the importance of experimental design and hypothesis testing. Hypothesis testing is at the heart of any experiment and is necessary for determining whether the results of scientific research are significant. As a data scientist, you will be asked to design, perform and analyze statistical tests, and this blog is a great way to begin your learning. I will explain the key concepts and give examples of how to perform certain tests.

  1. Experimental Design:

Experimental design is very important in order to draw the correct conclusions. It is the foundation of any good scientific research and is commonly used in many fields. Here I will explain how it applies to hypothesis testing in data science. The general structure of any experimental design is as follows:

1. Making an observation:

The first step is to observe something that you want to test. Keep observing until you can frame the question that you want to answer. Some examples are, “does this drug have an effect on X?” or “does the color of the website have an effect on sales?”. Usually you want the question to be very specific.

2. Examine the research:

Next you want to see what data and research is already out there and how it can help your experiment. This will help you form a strategy for approaching this experiment.

3. Form a hypothesis:

Contrary to popular belief, or what you might have learned in school, a hypothesis is not just an educated guess that you try to prove. In the case of hypothesis testing, it’s slightly more complicated than that. During this stage you will come up with two hypotheses, the alternative hypothesis and the null hypothesis.

4. Conduct an experiment:

One needs to be very meticulous in this step. You need to analyze the data that comes in and determine whether A has an effect on B or whether the result was just random. It is very common to reach false conclusions during this step if you are not careful to avoid common mistakes. Know that correlation does not necessarily imply causation.

5. Analyze results:

This step is for filtering out unnecessary noise and details and trying to determine if something is statistically significant.

6. Draw conclusions:

Assuming all the previous steps were done correctly, this step is pretty simple. All that is left to do is take the results from the analysis step and see whether your hypothesis was supported. There are two outcomes to this step: either you reject the null hypothesis or you fail to reject the null hypothesis.

Key Words:

1. Null hypothesis:

Definition: The null hypothesis states that there is no relationship between A and B.

Example: “There is no relationship between a medicine and a quick recovery from a sickness”.

Symbol: H0

Assumption: No difference between A and B (A = B).

Note: The null hypothesis is more conservative than the alternative hypothesis.

2. Alternative hypothesis:

Definition: The claim your experiment is designed to support; it states that there is a relationship between A and B.

Example: “This medicine helps the patient quickly recover from a sickness”.

Symbol: H1

Assumption: The claim you are trying to prove with an experiment.

3. P-values:

Definition: The probability of observing a test statistic at least as extreme as the one observed, by random chance, assuming that the null hypothesis is true.

Example: If you calculate a p-value and it comes out to .02, that is equivalent to saying that there is a 2% chance of getting results at least this extreme if the null hypothesis is true.

4. Alpha value:

Definition: The threshold at which you’re ok with rejecting the null hypothesis.

An alpha value can be anywhere between 0 and 1, but the most common alpha value is 0.05. This is essentially saying, “I’m okay with accepting my alternative hypothesis as true if there is less than a 5% chance that the results I’m seeing are actually due to randomness.”

Results: When conducting an experiment, you will compare your p-value to the alpha value. If p < alpha, you reject the null hypothesis and accept that there is a relationship between the dependent and independent variables (see the code sketch after the list below).

  • P < alpha: Reject null hypothesis, and accept alternative hypothesis
  • P >= alpha: Fail to reject the null hypothesis
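
To make the decision rule concrete, here is a minimal Python sketch (not from the original post; the z-statistic is a made-up number purely for illustration) that computes a p-value with scipy and compares it to alpha:

```python
from scipy import stats

# Hypothetical z-statistic from some test; the number is made up for illustration.
z_statistic = 2.1

# Two-sided p-value: the probability of a statistic at least this extreme
# (in either direction) if the null hypothesis is true.
p_value = 2 * stats.norm.sf(abs(z_statistic))

alpha = 0.05  # significance threshold chosen before running the experiment

if p_value < alpha:
    print(f"p = {p_value:.3f} < {alpha}: reject the null hypothesis")
else:
    print(f"p = {p_value:.3f} >= {alpha}: fail to reject the null hypothesis")
```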

5. One-tailed test:

When the region of rejection is on the right or left of the sampling distribution.

Example:

H0: a <= b — The treatment group given this weight loss drug will not lose more weight on average than the control group that was given a competitor’s weight loss drug.

H1: a > b — The treatment group given this weight loss drug will lose more weight on average than the control group that was given a competitor’s weight loss drug.

(Here a is the mean weight loss of the treatment group and b is the mean weight loss of the control group.)
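
As a rough sketch of how such a one-tailed test might be run, here is an example using scipy's ttest_ind (the weight-loss numbers are simulated, not real data, and the alternative= keyword assumes scipy 1.6 or later):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulated weight loss in kg; these numbers are invented for illustration.
treatment = rng.normal(loc=5.0, scale=2.0, size=40)  # our drug
control = rng.normal(loc=4.0, scale=2.0, size=40)    # competitor's drug

# One-tailed two-sample t-test: H1 is that the treatment mean is greater.
t_stat, p_value = stats.ttest_ind(treatment, control, alternative="greater")

print(f"t = {t_stat:.2f}, one-tailed p = {p_value:.3f}")
if p_value < 0.05:
    print("Reject H0: the treatment group lost more weight on average.")
else:
    print("Fail to reject H0: not enough evidence of a difference.")
```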

6. Two-tailed test:

When the region of rejection is on both sides of the sampling distribution.

Example:

H0: a = b — People in the experimental group that are administered this drug will lose the same amount of weight as the people in the control group.

H1: a != b — People in the experimental group that are administered this drug will not lose the same amount of weight as the people in the control group; the difference could go in either direction.
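
The same comparison as a two-tailed test, which is scipy's default; again, the data below are simulated for illustration only:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated weight loss in kg for both groups; invented for illustration.
experimental = rng.normal(loc=5.0, scale=2.0, size=40)
control = rng.normal(loc=5.0, scale=2.0, size=40)

# Two-tailed two-sample t-test (scipy's default): H1 is that the means differ
# in either direction.
t_stat, p_value = stats.ttest_ind(experimental, control)

print(f"t = {t_stat:.2f}, two-tailed p = {p_value:.3f}")
if p_value < 0.05:
    print("Reject H0: the two groups lost different amounts of weight on average.")
else:
    print("Fail to reject H0: no detectable difference between the groups.")
```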

Importance of the null hypothesis:

A good experiment does not prove a relationship between a dependent and an independent variable. Instead, it shows whether there is enough evidence to reject the claim that there is no relationship between the dependent and independent variables. That is why a null hypothesis is used: it lets us be very specific about what our findings do and do not say.

Conclusion:

In this blog post I went over how to create an experiment and how to interpret the results. In upcoming blog posts I will go into more detail about different tests that can be used and further explain one-tailed and two-tailed t-tests. I will now recap the noteworthy information.

Recap:

  • Make sure you have a clear and correct approach to experimental design to be able to accurately display your findings.
  • Begin any experiment by looking at what research has already been done and how it can inform your experiment.
  • Clearly state the alternative and null hypotheses.
  • Gather a valid control group.
  • Make sure that your sample size is big enough.
  • A good experiment has reproducible results.
  • P-value: the probability that an outcome at least as extreme as the observed one occurs under the null hypothesis.
  • Alpha value: threshold where you are comfortable rejecting the null hypothesis.
  • Most people choose 0.05 to be their alpha.
  • Statistical significance: combines effect size and sample size.
  • Effect size: measures the size of the difference between the two groups under observation.
  • One sample t-test: Determines whether a sample comes from a population with a specific mean.
  • Two sample t-test: Determines if two population means are equal.
  • Type I errors: False positives. When we reject a null hypothesis that is actually true.
  • Type II errors: False negatives. When we fail to reject a null hypothesis that is actually false.
  • Alpha is the likelihood that we get a type I error by random chance.
  • Resampling: used for improved precision in estimating sample statistics and for validating models by using random subsets. These methods include bootstrapping, jackknifing and permutation tests (see the bootstrap sketch after this list).
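
As a closing illustration, here is a minimal bootstrap sketch (with made-up data) showing how resampling with replacement can estimate the variability of a sample mean:

```python
import numpy as np

rng = np.random.default_rng(1)

# A made-up sample of 50 observations.
sample = rng.normal(loc=10.0, scale=3.0, size=50)

# Bootstrap: resample the data with replacement many times
# and record the statistic of interest (here, the mean).
boot_means = np.array([
    rng.choice(sample, size=sample.size, replace=True).mean()
    for _ in range(10_000)
])

# The percentile method gives an approximate 95% confidence interval.
lower, upper = np.percentile(boot_means, [2.5, 97.5])
print(f"Sample mean: {sample.mean():.2f}")
print(f"Bootstrap 95% CI for the mean: ({lower:.2f}, {upper:.2f})")
```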


Jonathan Schein

Data Scientist, Brandeis University Alum and Flatiron School Alum