Welcome to TED Learning!

  • 12-07, Binjai 8 Premium SOHO, No. 2 Lorong Binjai, 50450 Kuala Lumpur
  • Phone: +603-2742 1828
  • [email protected]

What Is Python?

Python is a popular programming language known for its ease of use and its broad support for a wide variety of tasks. This advanced course equips data analysts to handle data at a larger scale and to produce interactive dashboards that communicate results effectively to business stakeholders.

What Is Data Science?

Data science is a “concept to unify statistics, data analysis, machine learning and their related methods” in order to “understand and analyze actual phenomena” with data. It employs techniques and theories drawn from many fields within the context of mathematics, statistics, computer science, and information science.

Learning Objectives

Utilize NumPy and Dask to manipulate large data sets.

Generate visually compelling interactive dashboards using Plotly.

Other Requirements

Course Duration

This course runs for 2 days.

Technical Requirements

The class will be conducted using Jupyter interactive notebooks. Participants are required to install Anaconda Python v3.8 prior to attending this course.

A machine running macOS, Unix, or Windows with at least 4GB of RAM is recommended.


Python for Data Science (Beginner)

Course Outline

Numerical Python (NumPy)

– NumPy arrays

– Broadcasting rules

– Working with image data

We begin by taking a deeper look at Numerical Python (NumPy), a Python package that is frequently used to handle data in production systems and that underlies Pandas data frames. Participants will learn the basics of NumPy’s behavior, including its broadcasting rules and how to use them to their advantage. We also show how, in specific cases, calculations can be sped up by working with NumPy arrays directly rather than with regular Pandas data frames.
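
As a small illustration of the broadcasting rules mentioned above, the sketch below centers each column of an array and rescales each row without writing a loop. The array contents and the variable names (`data`, `col_means`, `row_totals`) are made up for this example and are not taken from the course material.

```python
import numpy as np

# Hypothetical data: 3 rows (samples) by 4 columns (features).
data = np.arange(12, dtype=float).reshape(3, 4)

# The column means have shape (4,). Broadcasting stretches them across
# all 3 rows, so the subtraction works without an explicit loop.
col_means = data.mean(axis=0)
centered = data - col_means

# The row totals must have shape (3, 1) to line up with the rows.
# keepdims=True keeps the reduced axis so that the shapes are compatible.
row_totals = data.sum(axis=1, keepdims=True)
row_share = data / row_totals

print(centered.shape)   # same shape as data
print(row_share.sum(axis=1))
```

Dropping `keepdims=True` would give `row_totals` the shape (3,), which cannot be broadcast against a (3, 4) array; checking shapes first is usually the quickest way to debug a broadcasting error.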

Working with large data sets

– In-memory vs on-disk

– Dask data frames

In this section we introduce Dask, a Python package that uses lazy evaluation to handle data frames that are too large to fit in memory. Dask provides a Pandas-friendly interface for manipulating large data sets without the need for a distributed computing platform such as Spark.

– Plotly
In this section, participants learn to build interactive dashboards from their data with the aid of a Python package called Plotly.

Practice Project
– Live coding exercises on a large dataset
In the final section of this course, participants will get their hands dirty by working on a large dataset that simulates a real-life data workflow, putting their Python knowledge to the test.