Hypothesis Testing for Data Science

Jun 23, 2024 · 5 min read

Statistical test:

A procedure for determining whether there is enough evidence to support a claim, accounting for the inherent randomness in the observed data (i.e. a limited or noisy sample).

Intuition: A test gives you a principled way to weigh a belief against the evidence in a sample, and to decide whether the data support it while keeping the risk of a wrong decision at a level you choose.

Steps of Testing Hypotheses:

Formulate the null (H0) and alternative (Ha) hypotheses: Broadly, the alternative hypothesis is the statement you want to find evidence for.
Set the significance level (α) for the test: This is the probability of incorrectly rejecting H0 when it is in fact true. Broadly, it is the risk of a false rejection of H0 in favor of Ha that you are willing to accept. The confidence level is 1 − α.
Then, based on assumptions about the population and the data-generating process for your data:

A. Calculate the relevant test statistic in your sample.

B. Find the probability of observing a test statistic at least as extreme as yours if H0 is true (the p-value).

C. If that probability is smaller than your significance level, you can reject H0 in favor of Ha. Otherwise, you don’t have enough evidence for Ha.

Note: H0 must always contain equality (=). Ha always contains a difference (≠, >, <).

For example, to test whether the means (µ) of two groups are equal:
for a two-tailed test, we define H0: µ1 = µ2 and Ha: µ1≠µ2
for a one-tailed test, we define H0: µ1 = µ2 and Ha: µ1 > µ2 or Ha: µ1 < µ2.
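
To see how the choice between a one-tailed and a two-tailed Ha changes the result, here is a minimal Python sketch. It assumes the test statistic follows a standard normal distribution under H0. The function name `z_p_value` and the value z = 1.8 are made up for illustration.

```python
from statistics import NormalDist


def z_p_value(z, alternative="two-sided"):
    """p-value of a z statistic under H0, for the chosen form of Ha."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "two-sided":  # Ha: mu1 != mu2
        return 2 * (1 - std_normal.cdf(abs(z)))
    if alternative == "greater":    # Ha: mu1 > mu2
        return 1 - std_normal.cdf(z)
    if alternative == "less":       # Ha: mu1 < mu2
        return std_normal.cdf(z)
    raise ValueError(f"unknown alternative: {alternative}")


z = 1.8  # hypothetical test statistic
for alt in ("two-sided", "greater", "less"):
    print(alt, round(z_p_value(z, alt), 4))
```

With z = 1.8, the one-tailed p-value for Ha: µ1 > µ2 is about 0.036 and the two-tailed p-value is twice that, about 0.072. Against a 0.05 threshold the same data would reject H0 in the one-tailed test but not in the two-tailed one, which is why the form of Ha should be fixed before looking at the data.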

Significance Level (α): The significance level, often denoted by α (alpha), represents the probability of rejecting the null hypothesis when it is actually true. Commonly used significance levels include 0.05 and 0.01, indicating a 5% and 1% chance of Type I error, respectively.

P-value: The probability, assuming the Null Hypothesis is true, of obtaining a test statistic at least as extreme as the one observed. It is denoted by the letter p.

Critical Value: Denoted by C, it is the value in the distribution of the test statistic beyond which the Null Hypothesis is rejected. The test statistic is compared against it.

Now, assume we are running a two-tailed Z-test at 95% confidence. Then the level of significance is α = 5% = 0.05. Thus (1 − α) = 0.95 of the distribution lies in the center, and the remaining α = 0.05 is split equally between the two tails, so each tail holds α/2 = 0.025.

The critical value, i.e. Z95% or Zα/2 = 1.96, is read from the Z-score table.
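
Instead of a printed table, the same critical values can be looked up in code. A small sketch using `statistics.NormalDist` from Python's standard library (the variable names are illustrative):

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Two-tailed test: alpha/2 in each tail, so look up the 1 - alpha/2 quantile.
z_two_tailed = std_normal.inv_cdf(1 - alpha / 2)

# One-tailed test: all of alpha sits in a single tail.
z_one_tailed = std_normal.inv_cdf(1 - alpha)

print(round(z_two_tailed, 3), round(z_one_tailed, 3))  # 1.96 1.645
```

The one-tailed critical value is smaller because the whole 5% rejection region sits on one side of the distribution.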

Steps of Hypothesis Testing:

The steps of hypothesis testing typically involve the following process:

1. Formulate Hypotheses: State the null hypothesis and the alternative hypothesis.
2. Choose Significance Level (α): Select a significance level (α), which determines the threshold for rejecting the null hypothesis. Commonly used significance levels include 0.05 and 0.01.
3. Select Appropriate Test: Choose a statistical test based on the research question, type of data, and assumptions. Common tests include t-tests, chi-square tests, ANOVA, correlation tests, and regression analysis, among others.
4. Collect Data and Calculate Test Statistic: Collect relevant sample data and calculate the appropriate test statistic based on the chosen statistical test.
5. Determine Critical Region: Define the critical region or rejection region based on the chosen significance level and the distribution of the test statistic.
6. Calculate P-value: Determine the probability of observing a test statistic as extreme as, or more extreme than, the one obtained from the sample data, assuming the null hypothesis is true. The p-value is compared to the significance level to make decisions about the null hypothesis.
7. Make Decision: If the p-value is less than or equal to the significance level (p ≤ α), reject the null hypothesis in favor of the alternative hypothesis. If the p-value is greater than the significance level (p > α), fail to reject the null hypothesis.
8. Draw Conclusion: Interpret the results based on the decision made in step 7. Provide implications of the findings in the context of the research question or problem.
9. Check Assumptions and Validate Results: Assess whether the assumptions of the chosen statistical test are met. Validate the results by considering the reliability of the data and the appropriateness of the statistical analysis.

By following these steps systematically, researchers can conduct hypothesis tests, evaluate the evidence, and draw valid conclusions from their analyses.
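
The steps above can be sketched end to end with a one-sample, two-tailed Z-test. This is only an illustration under stated assumptions: the population standard deviation is taken as known (which is what justifies a Z-test), and all the numbers (hypothesized mean 100, σ = 15, n = 36, sample mean 105) as well as the function name `one_sample_z_test` are made up.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed Z-test of H0: mu = mu0 against Ha: mu != mu0."""
    std_normal = NormalDist()
    # Calculate the test statistic: distance from mu0 in standard errors.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Critical region: |z| beyond the 1 - alpha/2 quantile.
    critical = std_normal.inv_cdf(1 - alpha / 2)
    # P-value: probability of a statistic at least this extreme under H0.
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    # Decision: reject H0 when p <= alpha.
    reject_h0 = p_value <= alpha
    return z, critical, p_value, reject_h0


# Hypothetical data summary: H0 says the population mean is 100.
z, critical, p_value, reject_h0 = one_sample_z_test(
    sample_mean=105, mu0=100, sigma=15, n=36, alpha=0.05
)
print(f"z = {z:.2f}, critical = {critical:.2f}, p = {p_value:.4f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

Here the standard error is 15/√36 = 2.5, so z = 5/2.5 = 2.0 and the p-value is about 0.0455. Since 0.0455 ≤ 0.05, H0 is rejected at the 5% level, though it would not be at the 1% level.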

Decision Rules

A hypothesis test can be concluded in two ways: by comparing the test statistic with the critical value, or by comparing the p-value with the level of significance.

In both methods, we start by assuming the Null Hypothesis to be true, and then we reject the Null Hypothesis if we find enough evidence against it.

The decision rule for the Test-statistic method:

If the test statistic (t) is more extreme than the critical value C (|t| > C for a two-tailed test), we reject the Null Hypothesis.
Otherwise (|t| ≤ C for a two-tailed test), we fail to reject the Null Hypothesis.

The decision rule for the p-value method:

If p-value (p) > level of significance (α), we fail to reject the Null Hypothesis.
If p-value (p) ≤ level of significance (α), we reject the Null Hypothesis.
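
The two decision rules are two views of the same comparison, so they lead to the same conclusion. A short sketch, assuming a two-tailed Z-test (where "more extreme" means |z| > C); the function names and the z values are illustrative:

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def reject_by_critical_value(z, alpha=0.05):
    """Test-statistic method: compare |z| with the critical value."""
    critical = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return abs(z) > critical


def reject_by_p_value(z, alpha=0.05):
    """P-value method: compare the two-tailed p-value with alpha."""
    p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    return p_value <= alpha


# Hypothetical test statistics on both sides of the cutoff.
for z in (-2.5, -1.0, 0.3, 1.9, 2.1):
    print(z, reject_by_critical_value(z), reject_by_p_value(z))
```

For every z in the loop both functions agree: only −2.5 and 2.1 fall beyond ±1.96, and those are the only two with a p-value below 0.05.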

Type I error: Occurs when we reject a true Null Hypothesis. Its probability is denoted α.

Type II error: Occurs when we fail to reject a false Null Hypothesis. Its probability is denoted β.

Accuracy: Number of correct predictions / Total number of cases.
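
The Type I and Type II error rates can be made concrete with a small simulation. This is a rough sketch, not a rigorous power analysis: it assumes normally distributed data with known σ = 1, tests H0: µ = 0 with a two-tailed Z-test, and uses a made-up "true" mean of 0.5 for the case where H0 is false. The function name `rejection_rate` is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def rejection_rate(true_mean, n=30, alpha=0.05, trials=2000, seed=0):
    """Share of simulated two-tailed Z-tests of H0: mu = 0 that reject H0."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = sample_mean / (1.0 / sqrt(n))  # sigma = 1 is assumed known
        if abs(z) > critical:
            rejections += 1
    return rejections / trials


# H0 is true (mu really is 0): every rejection is a Type I error.
type_1_rate = rejection_rate(true_mean=0.0)

# H0 is false (mu is really 0.5): every non-rejection is a Type II error.
type_2_rate = 1 - rejection_rate(true_mean=0.5)

print(f"Type I rate  ~ {type_1_rate:.3f}")
print(f"Type II rate ~ {type_2_rate:.3f}")
```

The simulated Type I rate should land close to the chosen α of 0.05. The Type II rate depends on the sample size and on how far the true mean is from the hypothesized one, so it is not fixed by α alone.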

Hypothesis Testing if the Data is Continuous:

When dealing with continuous data, several common hypothesis tests are used, depending on characteristics of the data. Some of the most widely used hypothesis tests for continuous data include:

One-Sample t-test: Used to compare the mean of a single sample to a known value or hypothesized population mean.
Paired t-test: Compares the means of two related groups (e.g., before and after treatment) to determine if there is a significant difference.
Independent Samples t-test: Compares the means of two independent groups to determine if there is a significant difference between them.
Analysis of Variance (ANOVA): Used to compare means across three or more independent groups to determine if there are any statistically significant differences.
Correlation Test (Pearson’s correlation coefficient): Determines if there is a linear relationship between two continuous variables.
Regression Analysis: Evaluates the relationship between one dependent variable and one or more independent variables.
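
As one worked case from this list, here is a minimal paired t-test using only the Python standard library. The before/after measurements are invented for illustration. The standard library has no t distribution, so the sketch uses the test-statistic method with a critical value of 2.262 taken from a t table (9 degrees of freedom, two-tailed, α = 0.05).

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical measurements on the same 10 subjects.
before = [72, 75, 68, 80, 77, 70, 74, 69, 81, 73]
after = [74, 76, 71, 80, 79, 71, 78, 70, 83, 77]

# H0: the mean of the paired differences is 0. Ha: it is not 0.
differences = [a - b for a, b in zip(after, before)]
n = len(differences)

# t = mean difference / standard error; stdev() is the sample standard deviation.
t_stat = mean(differences) / (stdev(differences) / sqrt(n))

# Critical value from a t table: df = n - 1 = 9, two-tailed, alpha = 0.05.
T_CRITICAL = 2.262
reject_h0 = abs(t_stat) > T_CRITICAL

print(round(t_stat, 3), reject_h0)  # 4.743 True
```

The mean difference is 2.0 with a sample standard deviation of about 1.33, which gives t ≈ 4.74, well beyond 2.262. If SciPy is available, `scipy.stats.ttest_rel` returns the statistic together with a p-value, and `scipy.stats.ttest_1samp`, `scipy.stats.ttest_ind`, `scipy.stats.f_oneway` and `scipy.stats.pearsonr` cover the other tests in the list.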

Hypothesis Tests if the Data is Discrete:

When dealing with discrete data, several common hypothesis tests are used to analyze differences between groups, associations, or proportions. Some of the most widely used hypothesis tests for discrete data include:

Chi-Square Test of Independence: Determines whether there is a significant association between two categorical variables by comparing observed frequencies to expected frequencies.
Chi-Square Goodness-of-Fit Test: Assesses whether the observed frequency distribution of a single categorical variable differs significantly from a hypothesized or expected distribution.
Binomial Test: Determines whether the proportion of successes in a series of independent Bernoulli trials differs significantly from a hypothesized value.
Poisson Test: Tests whether the observed counts of events in a fixed interval of time or space follow a Poisson distribution, often used in count data analysis.
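
Two of these tests are simple enough to sketch with the Python standard library. The counts below are invented, and the function names are hypothetical. The chi-square function only handles a 2×2 table (1 degree of freedom, where the p-value has a closed form via `math.erfc`) and applies no continuity correction. The binomial function is an exact two-sided test that sums the probabilities of all outcomes no more likely than the observed one.

```python
from math import comb, erfc, sqrt


def chi_square_2x2(table):
    """Chi-square test of independence for a 2x2 table of counts."""
    (a, b), (c, d) = table
    total = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


def binomial_test_two_sided(k, n, p0=0.5):
    """Exact two-sided binomial test of H0: success probability = p0."""
    pmf = [comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return sum(p for p in pmf if p <= threshold)


# Hypothetical 2x2 table: rows are groups, columns are outcomes.
chi2, chi2_p = chi_square_2x2([[30, 20], [20, 30]])
print(f"chi2 = {chi2:.2f}, p = {chi2_p:.4f}")

# Hypothetical experiment: 15 successes in 20 trials, H0: p = 0.5.
binom_p = binomial_test_two_sided(15, 20, p0=0.5)
print(f"binomial p = {binom_p:.4f}")
```

In the 2×2 example every expected count is 25, so chi2 = 4 × (5² / 25) = 4.0 and p ≈ 0.0455. For the binomial example the p-value is about 0.0414. If SciPy is available, `scipy.stats.chi2_contingency` and `scipy.stats.binomtest` handle the general cases; note that `chi2_contingency` applies a continuity correction to 2×2 tables by default, so its p-value would differ somewhat from this sketch.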

This article underscores the pivotal role of hypothesis testing in data science for informed decision-making.

Thanks for reading this!!!!