  • Length:
    6 Weeks
  • Effort:
    3–6 hours per week
  • Price:
    FREE
    Add a Verified Certificate for $249 USD
  • Institution
  • Subject:
  • Level:
    Introductory
  • Language:
    English
  • Video Transcript:
    English
  • Course Type:
    Self-paced on your time

Associated Programs:

About this course

Introduction to Text Analytics with Python is part one of the Text Analytics with Python professional certificate. This first course introduces the core techniques of natural language processing (NLP), presenting these data science techniques alongside the cognitive science that makes them possible.

How can we make sense of the incredible amount of knowledge that has been stored as text data? This course is a practical and scientific introduction to text analytics, which means you’ll learn how it works and why it works at the same time.

On the practical side, you’ll learn how to carry out an analysis in Python by creating machine learning pipelines for text classification and text similarity. These pipelines are automated workflows that go all the way from data collection to visualization. You’ll learn to use Python packages like pandas, scikit-learn, and TensorFlow.
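To give a feel for what text similarity and text classification mean in code, here is a minimal sketch. It is not taken from the course materials: it uses only the Python standard library rather than pandas, scikit-learn, or TensorFlow, and the function names (`tokenize`, `vectorize`, `cosine_similarity`, `classify`) and the toy training sentences are hypothetical, invented for illustration. Each document becomes a bag of word counts, two documents are compared by cosine similarity, and a new document receives the label of its most similar labelled example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def vectorize(text):
    """Represent a document as a bag of words: token -> count."""
    return Counter(tokenize(text))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


def classify(text, labelled_docs):
    """Return the label of the most similar labelled document (1-nearest neighbour)."""
    query = vectorize(text)
    scored = (
        (label, cosine_similarity(query, vectorize(doc)))
        for doc, label in labelled_docs
    )
    best_label, _ = max(scored, key=lambda pair: pair[1])
    return best_label


# Toy, made-up training data for illustration only.
TRAINING = [
    ("the team won the match after a late goal", "sport"),
    ("the striker scored twice in the final game", "sport"),
    ("the central bank raised interest rates again", "finance"),
    ("shares fell as investors worried about inflation", "finance"),
]

if __name__ == "__main__":
    print(classify("investors expect the bank to cut rates", TRAINING))
```

A real pipeline would typically add steps this sketch leaves out, such as collecting the data, weighting words by how informative they are, training a classifier on many examples, and visualizing the results.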

On the scientific side, you’ll learn what it means to understand language computationally. Artificial intelligence and humans don’t view documents in the same way. Sometimes AI sees patterns that are invisible to us. Sometimes AI misses the obvious. We have to understand the limits of a computational approach to language and the ethical guidelines for applying it to real-world problems. For example, we can identify individuals from their tweets, but we could never predict future criminal behaviour using social media.

This course will cover topics you may have heard of, like text processing, text mining, sentiment analysis, and topic modeling.

What you'll learn

1. Construct applications using unstructured data like news articles and tweets.

2. Apply machine learning classifiers to categorize documents by content and author.

3. Assess the scientific and ethical foundations of text analysis.

Module 1. Why Use Text Analytics?

Learn how artificial intelligence can help us work with language data

Module 2. Working with Text Data

Learn what language looks like to both humans and machines

Module 3. Text Classification

Learn how to use machine learning to categorize documents based on content, authorship, and sentiment

Meet your instructors

Jonathan Dunn
Lecturer
University of Canterbury

Tom Coupe
Associate Professor
University of Canterbury

Jeanette King
Professor
University of Canterbury

Girish Prayag
Professor
University of Canterbury

Pursue a Verified Certificate to highlight the knowledge and skills you gain
$249 USD

  • Official and Verified

    Receive an instructor-signed certificate with the institution's logo to verify your achievement and increase your job prospects

  • Easily Shareable

    Add the certificate to your CV or resume, or post it directly on LinkedIn

  • Proven Motivator

    Give yourself an additional incentive to complete the course

  • Support our Mission

    edX, a non-profit, relies on verified certificates to help fund free education for everyone globally