Challenge: Flag Duplicate Entries | Handling Missing and Duplicate Data
Python for Data Cleaning
Section 2. Chapter 6
Challenge: Flag Duplicate Entries


Flagging duplicates within a dataset is a crucial step in many data cleaning workflows, especially when you need to investigate or audit data quality rather than simply removing repeated entries. There are many situations where you might not want to drop duplicates immediately. For instance, you may want to review which records are repeated before deciding on the best course of action, or you may need to report on the prevalence of duplication in your data to stakeholders. Sometimes, duplicate entries can indicate data entry errors, system glitches, or even fraudulent activity, so keeping them flagged allows for further analysis and traceability. By adding a column to your dataset to indicate whether a row is a duplicate, you retain all original information while making it easy to filter, summarize, or visualize duplication patterns later in your workflow.
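As a rough illustration of flagging rather than dropping, the sketch below uses pandas' `DataFrame.duplicated` on a small made-up `orders` table (the data and variable names are hypothetical, not part of the lesson). It assumes a "duplicate" is a row whose values match another row in every column, and shows how the `keep` argument changes which copies get flagged.

```python
import pandas as pd

# Hypothetical example data, not taken from the lesson
orders = pd.DataFrame({
    "order_id": [101, 102, 102, 103],
    "amount": [20.0, 35.5, 35.5, 12.0],
})

# keep="first" (the default) flags only the later repeats of a row
later_repeats = orders.duplicated(keep="first")

# keep=False flags every copy of a repeated row, including the first one
all_copies = orders.duplicated(keep=False)

# assign() returns a new DataFrame, so no rows are dropped and `orders` is unchanged
flagged = orders.assign(is_duplicate=all_copies)

# The flag column makes filtering and summarizing straightforward
repeated_rows = flagged[flagged["is_duplicate"]]
duplicate_share = flagged["is_duplicate"].mean()

print(flagged)
print(f"Share of rows involved in duplication: {duplicate_share:.0%}")
```

Which `keep` setting is appropriate depends on the audit question: counting surplus rows usually calls for `keep="first"`, while reviewing every affected record calls for `keep=False`.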

import pandas as pd

data = {
    "id": [1, 2, 2, 3, 4, 4, 4],
    "name": ["Alice", "Bob", "Bob", "Charlie", "David", "David", "David"],
    "score": [85, 90, 90, 95, 80, 80, 80]
}

df = pd.DataFrame(data)
print(df)
Task


Write a function that adds a boolean column 'is_duplicate' to the DataFrame, marking rows as duplicates if they appear more than once.

  • The function must create a new column 'is_duplicate' in the DataFrame.
  • The column must be True for every row that appears more than once in the DataFrame (each copy, including the first occurrence), and False for unique rows.
  • All original columns and rows must be preserved in the returned DataFrame.
  • The function must not modify the input DataFrame in place; work on a copy and return it.

Solution
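The page's own solution is not reproduced here. One possible sketch follows, under two assumptions: a row counts as a duplicate when its values match another row in every column, and every copy of a repeated row (including the first) should be flagged. The function name `flag_duplicates` is a hypothetical choice, not something the task prescribes.

```python
import pandas as pd


def flag_duplicates(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with a boolean 'is_duplicate' column added."""
    # Work on a copy so the caller's DataFrame is left untouched
    result = df.copy()
    # keep=False marks every occurrence of a repeated row, not only the later ones
    result["is_duplicate"] = df.duplicated(keep=False)
    return result


# Same sample data as in the lesson
data = {
    "id": [1, 2, 2, 3, 4, 4, 4],
    "name": ["Alice", "Bob", "Bob", "Charlie", "David", "David", "David"],
    "score": [85, 90, 90, 95, 80, 80, 80],
}
df = pd.DataFrame(data)

flagged = flag_duplicates(df)
print(flagged)
```

With this data, only the rows for Alice and Charlie should come back unflagged. If the intent were to flag only the surplus copies, `keep="first"` would be the variation to try, and `duplicated` also accepts a `subset` argument to compare on selected columns only.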
