Learn how to load, explore, transform, analyse, visualise and derive actionable insights from structured, semi-structured and unstructured data using industry-standard Python libraries for data analysis.
This hands-on course introduces three industry-standard Python libraries for data analysis: NumPy, Pandas and Matplotlib. You will learn how to design and build end-to-end data pipelines that load, explore, transform, merge, analyse and visualise real-world datasets in order to solve real-world business problems. The course also provides a foundation for more complex data engineering, such as building distributed and real-time data pipelines, and for developing data science models using statistical learning, machine learning and deep learning techniques.
- 1. The Power of Data
- 2. NumPy Part 1 - Managing Numbers
- 3. NumPy Part 2 - Advanced Functions
- 4. NumPy Part 3 - Descriptive Statistics
- 5. Pandas Part 1 - The Basics
- 6. Pandas Part 2 - Loading Data
- 7. Pandas Part 3 - Transforming Data
- 8. Matplotlib Part 1 - Graphs and Plots
- 9. Matplotlib Part 2 - Advanced Plots
- 10. Real-World Projects
- The ability to load structured data (e.g. CSV files, Microsoft Excel spreadsheets and relational database tables), semi-structured data (e.g. XML and JSON files) and unstructured data (e.g. images, audio and videos) into efficient in-memory data structures.
- The ability to design and build end-to-end data pipelines capable of loading, merging and transforming disparate datasets, and saving the transformed and modelled data into databases.
- The ability to analyse, visualise and derive actionable insights from disparate datasets, using techniques such as descriptive statistics, trend analysis and forecasting, in order to solve real-world business problems.
- An intermediate-level understanding of essential and industry-standard Python libraries for data analysis, namely NumPy, Pandas and Matplotlib.
- Foundational analytical coding knowledge from which to develop advanced data engineering pipelines, including distributed and real-time data pipelines, and data science models using advanced statistical learning, machine learning and deep learning techniques.
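
To give a flavour of the kind of end-to-end pipeline described above, here is a minimal sketch rather than actual course material. It loads a small CSV source and a small JSON source, merges them, applies a simple transformation, computes descriptive statistics and saves the result to a database. The inline data, the column names, the `run_pipeline` function and the `region_summary` table are all hypothetical, and an in-memory SQLite database stands in for a real one.

```python
import io
import sqlite3

import numpy as np
import pandas as pd

# Hypothetical inline data standing in for real CSV and JSON files.
ORDERS_CSV = """order_id,customer_id,amount
1,10,120.0
2,11,80.0
3,10,200.0
4,12,50.0
"""
CUSTOMERS_JSON = (
    '[{"customer_id": 10, "region": "North"},'
    ' {"customer_id": 11, "region": "South"},'
    ' {"customer_id": 12, "region": "South"}]'
)


def run_pipeline(conn):
    """Load, merge, transform, summarise and save the sample data."""
    # Load: structured (CSV) and semi-structured (JSON) data into DataFrames.
    orders = pd.read_csv(io.StringIO(ORDERS_CSV))
    customers = pd.read_json(io.StringIO(CUSTOMERS_JSON))

    # Merge: attach each customer's region to their orders.
    merged = orders.merge(customers, on="customer_id", how="left")

    # Transform: label each order by size using a vectorised NumPy condition.
    # The threshold of 100 is an arbitrary value chosen for illustration.
    merged["size"] = np.where(merged["amount"] >= 100, "large", "small")

    # Analyse: descriptive statistics per region.
    summary = merged.groupby("region", as_index=False).agg(
        total_amount=("amount", "sum"),
        mean_amount=("amount", "mean"),
        orders=("order_id", "count"),
    )

    # Save: write the summarised data to a database table.
    summary.to_sql("region_summary", conn, if_exists="replace", index=False)
    return summary


conn = sqlite3.connect(":memory:")
summary = run_pipeline(conn)
print(summary)
```

From here, a call such as `summary.plot(x="region", y="total_amount", kind="bar")` would draw the result with Matplotlib, assuming it is installed.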