Python for Data Science: Text Summarizer

Create Sentence List

Next, we will create a list of the sentences in our story using the sent_tokenize function. sent_tokenize is part of the Natural Language Toolkit (NLTK) library for Python and is used to split text into sentences. Given a string as input, it returns a list in which each element is one sentence.

By default, the sent_tokenize() function relies on NLTK's pre-trained Punkt tokenizer. Punkt is built with an unsupervised machine learning algorithm, which means it does not need any labeled data to learn where sentences end. It uses punctuation marks, capitalization, and other heuristics to identify the boundaries between sentences.
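To get a feel for what "punctuation and capitalization heuristics" means, here is a deliberately naive sketch in plain Python. It is not the algorithm NLTK uses; the function name naive_sent_split and the single boundary rule are assumptions made up for this illustration.

```python
import re

# Illustrative rule only: a boundary is '.', '!' or '?' followed by
# whitespace and then an uppercase letter. This is NOT NLTK's Punkt algorithm.
BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def naive_sent_split(text):
    """Split text into sentences using the single rule above (hypothetical helper)."""
    return [part for part in BOUNDARY.split(text.strip()) if part]


print(naive_sent_split("The sun set. Was it late? Nobody knew."))
# The rule is too simple for abbreviations: "Dr." is cut off as its own sentence.
print(naive_sent_split("Dr. Smith arrived. He sat down."))
```

The second call shows why a fixed rule is not enough: the abbreviation "Dr." is treated as the end of a sentence. A trained tokenizer such as Punkt is designed to learn abbreviations like this from raw text, which is one reason to prefer sent_tokenize() over a hand-written pattern.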


  1. Use the sent_tokenize() function to extract the sentences from our story;
  2. Print the sentences.
