Learn Challenge: Tokenizing a Sentence | Text Preprocessing Fundamentals
Introduction to NLP

Challenge: Tokenizing a Sentence

Task


Your task is to split the given excerpt from 'The Adventure of the Musgrave Ritual' into sentences, and then tokenize the last sentence into words. Use the appropriate nltk functions for both steps. Retrieve the last sentence with a negative index and convert it to lowercase before tokenizing it.
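
The negative-index and lowercase steps can be tried on their own before involving nltk. The sketch below uses a hand-written list of sentences (hypothetical sample data, not the excerpt from the task) to show how index -1 selects the last element and how lower() normalizes it.

```python
# Hypothetical sample data standing in for the output of a sentence tokenizer
sample_sentences = ["First sentence.", "Second one.", "The LAST Sentence."]

# Index -1 counts from the end, so it always returns the last element,
# regardless of how many sentences the list holds
last_sentence = sample_sentences[-1]

# str.lower() returns a new lowercase string; the original is unchanged
lowered = last_sentence.lower()

print(last_sentence)
print(lowered)
```

Because -1 is relative to the end of the list, the same expression works no matter how many sentences the tokenizer finds.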

Solution

# Import necessary functions
from nltk.tokenize import sent_tokenize, word_tokenize
import nltk
nltk.download('punkt_tab')
text = 'Sherlock Holmes picked them up one by one, and laid them along the edge of the table. Then he reseated himself in the chair, and looked over with a gleam of satisfaction in his eyes. "These," said he, "are all that I have left to remind me of ‘The Adventure of the Musgrave Ritual’."'
# Tokenize the text into sentences
sentences = sent_tokenize(text)
# Tokenize the last sentence converted to lowercase
tokens = word_tokenize(sentences[-1].lower())
print(tokens)
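
To get a feel for the two-step pipeline without downloading any nltk resources, it can be approximated with regular expressions. This is only a rough sketch under simplifying assumptions: the patterns and the sample text are illustrative, and this is not how sent_tokenize or word_tokenize work internally.

```python
import re

# Hypothetical sample text, not the excerpt from the task
sample = "Holmes sat down. He smiled! These are ALL that I have left."

# Naive sentence split: break on whitespace that follows '.', '!' or '?'
rough_sentences = re.split(r"(?<=[.!?])\s+", sample)

# Negative index picks the last sentence; lowercase it before splitting
last_sentence = rough_sentences[-1].lower()

# Naive word split: runs of word characters, or single punctuation marks
rough_tokens = re.findall(r"\w+|[^\w\s]", last_sentence)

print(rough_sentences)
print(rough_tokens)
```

A split this simple breaks on abbreviations such as "Mr." and does nothing special with quotation marks, which is one reason to prefer the nltk functions for the actual task.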


Section 1. Chapter 4
# Import necessary functions
from ___ import ___, ___
import nltk
nltk.download('punkt_tab')
text = 'Sherlock Holmes picked them up one by one, and laid them along the edge of the table. Then he reseated himself in the chair, and looked over with a gleam of satisfaction in his eyes. "These," said he, "are all that I have left to remind me of ‘The Adventure of the Musgrave Ritual’."'
# Tokenize the text into sentences
sentences = ___(___)
# Tokenize the last sentence converted to lowercase
tokens = ___(___)
print(tokens)
