## Libraries

First, you need to install the following libraries:

- `matplotlib` is used for creating the charts
- `pandas` is used to put the data into a dataframe
- `numpy` is used to generate some data

The **Student t-test** will be done using `scipy`: install it using the `pip install scipy` command.

```
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as stats
```

## Dataset

Before adding **statistical results** to a chart, we first need a dataframe with the values we want to compare.

In this post, we'll use *fake data* made of two groups, each drawn from a normal distribution: `GroupA` with a mean of 10 and `GroupB` with a mean of 40, both with a standard deviation of 10.

```
sample_size = 100
groupA = np.random.normal(10, 10, sample_size)
groupB = np.random.normal(40, 10, sample_size)
df = pd.DataFrame({'value': np.concatenate([groupA, groupB]),
                   'category': ['GroupA']*sample_size + ['GroupB']*sample_size})
```
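The values above change on every run because no random seed is set. If you want reproducible numbers, one option is a seeded NumPy generator. This is only a sketch: the seed `42` and the name `rng` are arbitrary choices of ours, not part of the original code.

```python
import numpy as np
import pandas as pd

# Seeded generator: the seed value 42 is an arbitrary choice for this sketch
rng = np.random.default_rng(42)

sample_size = 100
groupA = rng.normal(loc=10, scale=10, size=sample_size)  # mean 10, std 10
groupB = rng.normal(loc=40, scale=10, size=sample_size)  # mean 40, std 10

# Same layout as the dataframe used in the rest of the post
df = pd.DataFrame({
    'value': np.concatenate([groupA, groupB]),
    'category': ['GroupA'] * sample_size + ['GroupB'] * sample_size,
})

# Quick check of the mean of each group
print(df.groupby('category')['value'].mean())
```

Running this twice should give the same dataframe, which makes the statistics shown on the charts stable between runs.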

## Get statistical values

First, we'll start by retrieving the values we want to add to the plot: the **p-value** and the **t-statistic**. For this, we use the `ttest_rel()` function from `scipy`, which performs a paired t-test.

Also, we retrieve the **mean** of each group.

*Important: This post does not cover any statistical/math details*

```
# groups
groupA = df[df['category']=='GroupA']['value']
groupB = df[df['category']=='GroupB']['value']
# Perform a paired t-test
t_statistic, p_value = stats.ttest_rel(groupA, groupB)
# Get means
mean_groupA = groupA.mean()
mean_groupB = groupB.mean()
# Print the results
print("T-statistic:", t_statistic)
print("P-value:", p_value)
print("Mean groupA:", mean_groupA)
print("Mean groupB:", mean_groupB)
```

```
T-statistic: -21.422755428886422
P-value: 5.923713845065175e-39
Mean groupA: 10.795220725982492
Mean groupB: 39.594683650277446
```
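Note that `ttest_rel()` is a paired test: it treats the i-th value of the first group as matched with the i-th value of the second. When the two groups are independent samples, `scipy` also provides `ttest_ind()`. The sketch below is self-contained and uses its own seeded data (the seed and the variable names are our assumptions), so its numbers will differ from the output above.

```python
import numpy as np
import scipy.stats as stats

# Illustrative data only: the seed 0 is an arbitrary choice
rng = np.random.default_rng(0)
groupA = rng.normal(10, 10, 100)
groupB = rng.normal(40, 10, 100)

# Independent two-sample t-test (assumes equal variances by default)
t_ind, p_ind = stats.ttest_ind(groupA, groupB)

# Welch's variant, which does not assume equal variances
t_welch, p_welch = stats.ttest_ind(groupA, groupB, equal_var=False)

print("Independent t-test:", t_ind, p_ind)
print("Welch t-test:", t_welch, p_welch)
```

Which test is appropriate depends on how the data was collected. The rest of this post keeps the values returned by `ttest_rel()`.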

Let's **round them** to make the chart **more readable** at the end:

```
t_statistic = round(t_statistic,2)
p_value = round(p_value,5) # more decimals, since p-values are usually small
mean_groupA = round(mean_groupA,2)
mean_groupB = round(mean_groupB,2)
```
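With a p-value as small as the one printed earlier, `round(p_value, 5)` returns `0.0`, so the chart would display `p-value: 0.0`. One possible alternative, shown here as a small sketch, is to keep the raw value and format it in scientific notation when building the label.

```python
# Value copied from the output printed earlier in this post
p_value = 5.923713845065175e-39

# Rounding to 5 decimals loses all the information for such a small number
rounded = round(p_value, 5)

# Scientific notation with 2 decimals keeps the order of magnitude
p_value_text = f'p-value: {p_value:.2e}'

print(rounded)       # 0.0
print(p_value_text)  # p-value: 5.92e-39
```

The resulting `p_value_text` string can be passed to `ax.text()` in the same way as the labels used in the charts of this post.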

## Histogram with statistical elements

Now let's use the stats we got above and add them to the histograms of each group using the `text()` function from matplotlib.

```
# Get group names and define colors
group_name = df['category'].unique()
colors = ['purple', 'orange']
# Init plots
fig, ax = plt.subplots(figsize=(8,6))
# Create the histograms
for i, group in enumerate(group_name):
    # Filter on the group
    subgroup = df[df['category']==group]['value']
    # Add histogram of the subgroup, with the right color
    ax.hist(subgroup, bins=5, color=colors[i])
# Add a legend
ax.legend(group_name)
# Add the p value and the t
p_value_text = f'p-value: {p_value}'
ax.text(-12, 40, p_value_text, weight='bold')
t_value_text = f't-value: {t_statistic}'
ax.text(-12, 37, t_value_text, weight='bold')
# Add a title and axis label
ax.set_title('Student t-test between GroupA and GroupB')
ax.set_xlabel('Value')
ax.set_ylabel('Frequency')
# Show the plot
plt.show()
```

## Boxplot with statistical elements

Now let's use the stats we got above and add them to the boxplots of each group using the `text()` function from matplotlib.

For this graph, we'll also add the **average of each group** next to its associated boxplot.

*Warning: the positions of the texts need to be changed compared to the histogram plot.*

```
# Group our dataset with our 'Group' variable
grouped = df.groupby('category')['value']
# Init a figure and axes
fig, ax = plt.subplots(figsize=(8, 6))
# Create the plot with different colors for each group
boxplot = ax.boxplot(x=[group.values for name, group in grouped],
                     labels=grouped.groups.keys(),
                     patch_artist=True,
                     medianprops={'color': 'black'}
                     )
# Define colors for each group
colors = ['orange', 'purple']
# Assign colors to each box in the boxplot
for box, color in zip(boxplot['boxes'], colors):
    box.set_facecolor(color)
# Add the p value and the t
p_value_text = f'p-value: {p_value}'
ax.text(0.7, 50, p_value_text, weight='bold')
t_value_text = f't-value: {t_statistic}'
ax.text(0.7, 45, t_value_text, weight='bold')
# Add the mean for each group
ax.text(1.1, mean_groupA, f'Mean of Group A: {mean_groupA}')
ax.text(1.4, mean_groupB, f'Mean of Group B: {mean_groupB}')
# Add a title
ax.set_title('Student t-test between GroupA and GroupB')
# Add a legend
legend_labels = ['Group A', 'Group B']
legend_handles = [plt.Rectangle((0,0),1,1, color=color) for color in colors]
ax.legend(legend_handles, legend_labels)
# Display it
plt.show()
```
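The text positions used in this post are expressed in data coordinates, which is why they have to be adjusted from one chart to the other. A possible alternative is to pass `transform=ax.transAxes` to `text()`, so that the position is relative to the axes: `(0, 0)` is the bottom-left corner and `(1, 1)` the top-right. The sketch below uses placeholder data and a placeholder label, both of which are our assumptions.

```python
import matplotlib.pyplot as plt
import numpy as np

# Placeholder data for this sketch (arbitrary seed)
rng = np.random.default_rng(0)
values = rng.normal(10, 10, 100)

fig, ax = plt.subplots(figsize=(8, 6))
ax.hist(values, bins=5, color='purple')

# Position given in axes coordinates: 2% from the left, 95% from the bottom
annotation = ax.text(0.02, 0.95, 'p-value: (placeholder)',
                     transform=ax.transAxes, weight='bold', va='top')

plt.show()
```

With this approach, the same `(0.02, 0.95)` position puts the label in the top-left corner of both a histogram and a boxplot, whatever the range of the data.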

## Going further

This post explains how to represent the **results of a student t-test** in a histogram and a boxplot.