
Tweet Sentiment Analysis

Words Count

Now we would like to explore the most frequent words in our DataFrame. To do this, we will create a collection that stores the word counts and then plot the most common words.

Methods description

  • from collections import Counter; import nltk: Imports the Counter class from the collections module and the nltk library;
  • from nltk.corpus import stopwords: Imports a list of common stopwords from NLTK;
  • nltk.download("stopwords"): Downloads the stopwords dataset from NLTK;
  • def remove_stopword(x): This defines a function named remove_stopword that takes a list x as input and returns a new list with stopwords removed;
  • return [y for y in x if y not in stopwords.words("english")]: This list comprehension filters out stopwords from the input list x using NLTK's list of English stopwords;
  • Counter: A class from the collections module used to count occurrences of elements in a list or iterable;
  • stopwords.words("english"): A method from NLTK that returns a list of stopwords for the English language;
  • temp.most_common(25): Returns the 25 most common elements (words) and their counts from the Counter object temp;
  • temp.iloc[1:,:]: Indexes a DataFrame temp to exclude the first row and select all columns;
  • temp.style.background_gradient(...): Applies a background gradient style to a DataFrame temp.
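
Putting the methods above together, here is a minimal sketch of the stopword-removal and counting step. The DataFrame df and its text column are stand-ins for the course dataset (a tiny example is built inline so the snippet runs on its own); adjust the names to match your own data.

    from collections import Counter

    import nltk
    import pandas as pd
    from nltk.corpus import stopwords

    nltk.download("stopwords")

    def remove_stopword(x):
        # Keep only the tokens that are not English stopwords.
        # (Calling stopwords.words("english") once and storing it in a set
        # would be faster, but this mirrors the description above.)
        return [y for y in x if y not in stopwords.words("english")]

    # Stand-in for the tweets DataFrame used in the course
    df = pd.DataFrame({"text": ["I love this movie", "I hate rainy days", "love it love it"]})

    # Tokenize the tweets and remove stopwords
    df["temp_list"] = df["text"].str.lower().str.split()
    df["temp_list"] = df["temp_list"].apply(remove_stopword)

    # Flatten the token lists and count how often each word appears
    top = Counter(word for tokens in df["temp_list"] for word in tokens)
    print(top.most_common(25))  # the 25 most frequent words with their counts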

Task

Create a collection to count word occurrences using the Counter class (a possible solution is sketched after the steps below):

  1. Remove stopwords from our tweets texts.
  2. Create a collection.
  3. Create a DataFrame with the newly created list.
  4. Change the background color to "Blues".
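
The sketch below shows one way these steps could fit together. It assumes top is the Counter object from the previous sketch, and the column names are illustrative placeholders rather than the course's exact naming.

    import pandas as pd

    # Build a DataFrame from the 25 most common (word, count) pairs
    temp = pd.DataFrame(top.most_common(25), columns=["Common_words", "count"])

    # Exclude the first row (e.g. an unwanted token) while keeping all columns
    temp = temp.iloc[1:, :]

    # Apply a blue background gradient to the counts
    temp.style.background_gradient(cmap="Blues")

Note that background_gradient returns a Styler object, which renders as a colored table in a notebook; in a plain script you would need to export it (for example with to_html) to see the styling.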
