Python is not only great for data visualization but also well suited to statistical analysis, with a toolkit that helps both beginners and seasoned statisticians extract meaningful insights from complex datasets.

This section shows how to visualize the results of your statistical analysis, like Principal Component Analysis (PCA), linear modeling, ANOVA, t-tests and more.

It does not focus on how to run the tests, but on how to produce clean output that presents your findings in an appealing manner.

Principal Component Analysis (PCA)

Principal Component Analysis (PCA) is a technique widely used in data science to reduce the dimensionality of large datasets while preserving as much variance as possible. By transforming the original variables into a new set of orthogonal components, ordered by the amount of variance each one captures, PCA offers a concise yet informative view that makes high-dimensional data easier to visualize and analyze.
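To make the idea of orthogonal components concrete, here is a minimal sketch of PCA computed by hand with NumPy's singular value decomposition. The dataset is synthetic and made up purely for illustration (200 samples, 5 correlated variables), and the variable names are assumptions rather than anything taken from the gallery.

```python
import numpy as np

# Synthetic data, invented for illustration only:
# 5 observed variables driven by 2 hidden factors plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 5))

# PCA works on centered data.
X_centered = X - X.mean(axis=0)

# The rows of Vt are the principal directions, sorted by decreasing variance.
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
components = Vt[:2]

# Project the 5-dimensional data onto the first two components.
scores = X_centered @ components.T

# Share of the total variance carried by each component.
explained_ratio = S**2 / np.sum(S**2)

print("Projected shape:", scores.shape)
print("Variance kept by 2 components:", explained_ratio[:2].sum())
print("Components are orthonormal:",
      np.allclose(components @ components.T, np.eye(2)))
```

Because the synthetic data is built from two hidden factors, two components are enough to keep most of the variance. Real datasets usually need a look at the explained variance ratios before choosing how many components to keep.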

Python is a convenient tool for PCA thanks to the scikit-learn library.

The following post teaches how to perform a PCA with scikit-learn and focuses on how to build clean outputs using matplotlib.
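As a rough preview of that workflow, the sketch below runs a PCA with scikit-learn and plots the first two components with matplotlib. It is not the code from the linked post: the choice of the iris dataset, the standardization step, and the styling are assumptions made for this example.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Example dataset bundled with scikit-learn (an assumption for this sketch).
iris = load_iris()

# Standardize so that each variable contributes on the same scale.
X_scaled = StandardScaler().fit_transform(iris.data)

# Keep the first two principal components.
pca = PCA(n_components=2)
scores = pca.fit_transform(X_scaled)
ratio = pca.explained_variance_ratio_

# One scatter layer per species, so the legend is built automatically.
fig, ax = plt.subplots(figsize=(6, 5))
for label, name in enumerate(iris.target_names):
    mask = iris.target == label
    ax.scatter(scores[mask, 0], scores[mask, 1], label=name, alpha=0.8)

# Report the share of variance on each axis and remove visual clutter.
ax.set_xlabel(f"PC1 ({ratio[0]:.0%} of variance)")
ax.set_ylabel(f"PC2 ({ratio[1]:.0%} of variance)")
ax.set_title("Iris dataset projected on the first two principal components")
for side in ("top", "right"):
    ax.spines[side].set_visible(False)
ax.legend(frameon=False)
fig.tight_layout()
```

Call plt.show() to display the figure, or fig.savefig("pca_iris.png") to write it to a file (the file name is only an example). Putting the explained variance in the axis labels tells the reader how much information the 2D view retains.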


This document is a work by Yan Holtz.