The course provides the basic knowledge needed to understand the role of the Python language and its libraries in the main descriptive and predictive data analysis methods. In particular, it explains why Python plays such a key role in the data science community.
Modules
- Data-driven Decision Making for Data-driven organizations [pdf]
- Solving problems with Big Data, Data Science, and Python [pdf]
- Big Data technologies are like “crude oil”
- Data Science is “refining crude oil”
- Python is “the refinery”
- Data analysis: pandas, NumPy, SciPy, Scikit-learn, PySpark, PyTorch, and Keras
- Data visualization: Matplotlib and Seaborn
- Business-driven examples of data analysis in Python [git repository]
- Predict house prices with linear regression using NumPy and Seaborn [notebook on git]
- Classify wines as good or bad with logistic regression using Scikit-learn and Seaborn [notebook on git]
- Classify wine quality with decision trees and random forests using Scikit-learn [notebook on git]
- Cluster flowers with K-means using Scikit-learn and Seaborn [notebook on git]
- Interpret handwriting with deep learning (Scikit-learn, PyTorch, and Keras) [Databricks notebook]
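To give a flavor of the first example in the list, here is a minimal sketch of linear regression with NumPy. It is not taken from the course notebooks: the function names `fit_linear_regression` and `predict`, and the tiny noise-free dataset (price = 50 + 2 × area), are assumptions made up purely for illustration.

```python
import numpy as np


def fit_linear_regression(X, y):
    """Ordinary least squares fit; returns (intercept, coefficients)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first weight acts as the intercept.
    A = np.column_stack([np.ones(len(X)), X])
    weights, *_ = np.linalg.lstsq(A, y, rcond=None)
    return weights[0], weights[1:]


def predict(X, intercept, coef):
    """Apply the fitted line (or hyperplane) to new feature rows."""
    return intercept + np.asarray(X, dtype=float) @ coef


# Hypothetical, made-up data: price = 50 + 2 * area, with no noise.
area = np.array([[50.0], [80.0], [100.0], [120.0]])
price = 50.0 + 2.0 * area[:, 0]

intercept, coef = fit_linear_regression(area, price)
predicted = predict(np.array([[90.0]]), intercept, coef)
print(intercept, coef, predicted)
```

Because the toy data lie exactly on a line, the fit should recover an intercept close to 50 and a slope close to 2. Real housing data would be noisy, and the course notebook may well use a different approach.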
Resources for Learning Python
- Books
- MOOCs
- DataCamp course on “Intro to Python for Data Science”
- Coursera course on “Applied Data Science with Python Specialization”
- Codecademy course on “Learn Python 3”