Learn Working with Raw Imported Data | Understanding Dirty Data
Clean Data in Excel

Working with Raw Imported Data


In real-world work, you will rarely build datasets from scratch in Excel. Most of the time, you will work with data that comes from external sources: CSV files, reports exported from other systems, or data copied from websites. This type of data is called raw imported data, and it is one of the main sources of problems in Excel.

The key issue is that Excel does not always correctly recognize the structure and types of imported data. Even if everything looks fine visually, the data may already be “dirty” the moment you open or paste it.

For example, when you open a CSV file, Excel automatically decides how to interpret each column. Sometimes it guesses correctly, but often it doesn't. Numbers may be stored as text, dates may be misinterpreted depending on regional settings (03/04/2024 can be read as March 4 or as April 3), and some values may lose their original format, such as a code like 00501 losing its leading zeros.
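The same guessing problem can be reproduced outside Excel. The following is a minimal Python sketch, not a description of Excel's internal logic: the CSV content, the column names, and the helper functions `read_rows`, `guess_number`, and `parse_date` are all invented for illustration. It shows how one raw value can end up with different meanings depending on how it is interpreted.

```python
import csv
import io
from datetime import datetime

# Hypothetical CSV content; column names and values are made up for this sketch.
RAW_CSV = 'zip_code,order_date,amount\n00501,03/04/2024,"1,250.50"\n'


def read_rows(text):
    """Read every field as a plain string, with no type guessing."""
    return list(csv.DictReader(io.StringIO(text)))


def guess_number(value):
    """Imitate automatic numeric conversion: leading zeros are lost."""
    return int(value)


def parse_date(value, day_first):
    """Parse an ambiguous date under one of two regional conventions."""
    fmt = "%d/%m/%Y" if day_first else "%m/%d/%Y"
    return datetime.strptime(value, fmt).date()


row = read_rows(RAW_CSV)[0]

# The raw text "00501" turns into the number 501 once it is treated as numeric.
zip_as_number = guess_number(row["zip_code"])

# The same text is a different calendar day under each convention.
month_first_date = parse_date(row["order_date"], day_first=False)
day_first_date = parse_date(row["order_date"], day_first=True)

print(row["zip_code"], "->", zip_as_number)
print(month_first_date, "vs", day_first_date)
```

Reading everything as text first and converting each column deliberately is one way to keep control over these decisions, which is the same idea as choosing column types yourself during an import instead of accepting the automatic guess.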

Copy-pasting data creates its own set of problems. Extra spaces often appear at the beginning or end of cells, invisible characters can be inserted, and formatting may become inconsistent. Data copied from websites is especially problematic, because it may include hidden characters left over from HTML, such as non-breaking spaces, which look like ordinary spaces in Excel but are not treated as such.
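To make the effect of invisible characters concrete, here is a small Python sketch of a cell-cleaning routine. The function name `clean_cell` and the sample value are hypothetical, and the rules it applies (drop invisible format characters, turn control characters and space variants into ordinary spaces, then collapse repeated spaces) are one reasonable choice rather than a standard.

```python
import unicodedata


def clean_cell(value):
    """Return the value with invisible characters and extra spaces removed."""
    cleaned = []
    for ch in value:
        category = unicodedata.category(ch)
        if category == "Cf":
            # Invisible format characters, e.g. zero-width space or a byte order mark.
            continue
        if category in ("Cc", "Zs"):
            # Control characters (tab, newline) and space variants such as the
            # non-breaking space are all turned into an ordinary space.
            cleaned.append(" ")
        else:
            cleaned.append(ch)
    # Trim both ends and collapse runs of spaces into a single space.
    return " ".join("".join(cleaned).split())


# Hypothetical pasted value: non-breaking space, zero-width space, trailing tab.
pasted = "\u00a0 Alice\u200b  Smith \t"
print(repr(pasted))
print(repr(clean_cell(pasted)))
```

Two cells that look identical on screen can still fail to match in a lookup or comparison until they have been normalized like this.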

Key Insight

Raw imported data should never be trusted immediately. Before using it, always assume that formats may be inconsistent and that values may have been interpreted incorrectly.

The first step is not analysis — it's checking and preparing the data.
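A "check first" pass can be as simple as counting warning signs in a column before doing anything else. The sketch below is written in Python under stated assumptions: the function `audit_column`, the report keys, and the sample values are all hypothetical, and the checks are only a starting point.

```python
def audit_column(values):
    """Count common warning signs in a list of cell values read as text."""
    report = {"blank": 0, "padded": 0, "non_breaking_space": 0, "looks_numeric": 0}
    for value in values:
        stripped = value.strip()
        if stripped == "":
            report["blank"] += 1
            continue
        if value != stripped:
            # Leading or trailing whitespace of any kind.
            report["padded"] += 1
        if "\u00a0" in value:
            report["non_breaking_space"] += 1
        try:
            float(stripped)
            report["looks_numeric"] += 1
        except ValueError:
            # Not a number, e.g. a placeholder such as "n/a".
            pass
    return report


# Hypothetical column as it might arrive after an import or paste.
sample = [" 42", "17", "n/a", "", "3.5\u00a0"]
report = audit_column(sample)
print(report)
```

A column where only some values look numeric, or where padded values appear at all, is a signal to clean the data before any analysis.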


Section 1. Chapter 3
