Python is a popular programming language, valued for its ease of use and broad support for a variety of tasks. This advanced course equips data analysts to handle data at a larger scale and to produce interactive dashboards that communicate results effectively to business stakeholders.
Data science is a “concept to unify statistics, data analysis, machine learning and their related methods” in order to “understand and analyze actual phenomena” with data. It employs techniques and theories drawn from many fields within the context of mathematics, statistics, computer science, and information science.
Utilize NumPy and Dask to manipulate large data sets.
Generate visually compelling interactive dashboards using Plotly.
This course runs for 2 days.
The class will be conducted using Jupyter interactive notebooks. Participants are required to install Anaconda Python v3.8 prior to attending this course.
A machine running macOS, Unix, or Windows with at least 4 GB of RAM is recommended.
Python for Data Science (Beginner)
Numerical Python (NumPy)
– NumPy arrays
– Broadcasting rules
– Working with image data
We begin by taking a deeper look at Numerical Python (NumPy), a Python package that is frequently used to handle data in production systems and forms the basis for Pandas data frames. Participants will learn the basics of NumPy’s behavior, including its broadcasting rules and how to use them to their advantage. We also show how, in specific cases, working with NumPy arrays directly instead of Pandas data frames can speed up calculations.
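As a rough illustration of the broadcasting rules mentioned above, the sketch below centers the columns of a small array and applies per-row weights without writing any loops. The array values and the names `data`, `col_means`, and `row_weights` are made up for this example and are not part of the course material.

```python
import numpy as np

# Hypothetical data: 3 samples (rows) by 4 features (columns).
data = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [3.0, 6.0, 9.0, 12.0],
])

# Mean of each column; the result has shape (4,).
col_means = data.mean(axis=0)

# Shapes (3, 4) and (4,) are compatible: the (4,) array is
# stretched across all 3 rows, so each column is centered.
centered = data - col_means

# A (3, 1) column vector broadcasts across the 4 columns,
# so every row is scaled by its own weight.
row_weights = np.array([[1.0], [0.5], [0.25]])
weighted = data * row_weights

print(centered.shape, weighted.shape)
```

The same idea applies to image data: an array of shape (height, width, 3) can be multiplied by an array of shape (3,) to scale each colour channel separately.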
Working with large data sets
– In-memory vs on-disk
– Dask data frames
In this section we introduce Dask, a Python package that uses lazy evaluation to handle data frames that are too large to fit in memory. Dask provides a Pandas-friendly interface that allows for manipulation of large data sets without the need for distributed computing platforms such as Spark.
In this section, participants learn to build beautiful interactive dashboards from their data with the aid of a Python package called Plotly.
– Live coding exercises on a large dataset
In the final section for this course, participants will get their hands dirty by working on a large dataset to simulate a real-life data workflow and put their Python knowledge to the test.