Challenge: Tokenizing a Sentence | Text Preprocessing Fundamentals
Introduction to NLP

Challenge: Tokenizing a Sentence

Task

Your task is to split the given excerpt from 'The Adventure of the Musgrave Ritual' into sentences, and then tokenize the last sentence into words. Use the appropriate nltk functions, retrieve the last sentence with a negative index, and convert it to lowercase before tokenizing it.

Solution

# Import necessary functions
from nltk.tokenize import sent_tokenize, word_tokenize
import nltk
# Download the Punkt tokenizer data used by sent_tokenize and word_tokenize
nltk.download('punkt_tab')
text = 'Sherlock Holmes picked them up one by one, and laid them along the edge of the table. Then he reseated himself in the chair, and looked over with a gleam of satisfaction in his eyes. "These," said he, "are all that I have left to remind me of ‘The Adventure of the Musgrave Ritual’."'
# Tokenize the text into sentences
sentences = sent_tokenize(text)
# Tokenize the last sentence converted to lowercase
tokens = word_tokenize(sentences[-1].lower())
print(tokens)
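
The two ideas the task relies on, negative indexing and lowercasing before tokenizing, can be tried without nltk. The sketch below is a simplified illustration only: the sample text is made up for the example, and the regular expressions are naive stand-ins for sent_tokenize and word_tokenize, not equivalents (they would mishandle abbreviations, quotes, and contractions).

```python
import re

# Short sample text (an assumption for illustration, not the exercise excerpt).
text = "Holmes picked them up. Then he sat down. These Are All That I Have Left."

# Naive sentence split: break on whitespace that follows ., ! or ?.
# This is a simplified stand-in for sent_tokenize, not an equivalent.
sentences = re.split(r"(?<=[.!?])\s+", text)

# A negative index counts from the end of the list, so -1 is the last sentence.
last_sentence = sentences[-1].lower()

# Naive word split: runs of word characters, or single punctuation marks.
tokens = re.findall(r"\w+|[^\w\s]", last_sentence)
print(tokens)
```

Lowercasing happens on the sentence string before it is split into words, which mirrors the order used in the solution above.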

Section 1. Chapter 4
# Import necessary functions
from ___ import ___, ___
import nltk
# Download the Punkt tokenizer data used by sent_tokenize and word_tokenize
nltk.download('punkt_tab')
text = 'Sherlock Holmes picked them up one by one, and laid them along the edge of the table. Then he reseated himself in the chair, and looked over with a gleam of satisfaction in his eyes. "These," said he, "are all that I have left to remind me of ‘The Adventure of the Musgrave Ritual’."'
# Tokenize the text into sentences
sentences = ___(___)
# Tokenize the last sentence converted to lowercase
tokens = ___(___)
print(tokens)