We will now spend two chapters on EDA. **Exploratory Data Analysis** (EDA) is an approach to analyzing and summarizing datasets in order to understand their main characteristics, patterns, and relationships. It is a crucial step in the data analysis process because it lets analysts gain insight into the data and spot potential problems before building models or making predictions.

The main goal of EDA is to explore the data and generate hypotheses about its underlying structure, rather than to confirm hypotheses formed in advance.

## Methods description
- `groupby("sentiment")`: This method is used to group the DataFrame `data` by the unique values in the "sentiment" column;
- `count()["text"]`: After grouping, the `count()` method counts the non-null values in each remaining column for every sentiment group, and `["text"]` selects only the "text" column of that result, giving the number of texts per sentiment;
- `reset_index()`: This method moves the "sentiment" group labels out of the index and into a regular column, producing a DataFrame with a new default integer index;
- `sort_values(by="text", ascending=False)`: This method sorts the DataFrame by the values in the "text" column in descending order (`ascending=False`), arranging the sentiment groups based on the count of texts associated with each sentiment;
- `temp.style.background_gradient(cmap="Purples")`: Finally, this applies a background gradient style to the DataFrame `temp` using the "Purples" colormap, with darker shades representing higher values in the DataFrame.
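The methods above are typically chained into a single expression. The following is a minimal sketch of how that chain might look, not necessarily the exact solution code for this chapter. The toy `data` DataFrame is a hypothetical stand-in for the tweet dataset; only the column names "text" and "sentiment" and the variable name `temp` come from the descriptions above.

```python
import pandas as pd

# Hypothetical toy data standing in for the real tweet dataset.
# One neutral tweet has a missing text to show that count() skips nulls.
data = pd.DataFrame({
    "text": ["good day", "so sad", "meh", "love it", None, "fine", "ok"],
    "sentiment": ["positive", "negative", "neutral", "positive",
                  "neutral", "neutral", "neutral"],
})

temp = (
    data.groupby("sentiment")                  # one group per sentiment label
    .count()["text"]                           # non-null texts per group (a Series)
    .reset_index()                             # "sentiment" becomes a regular column
    .sort_values(by="text", ascending=False)   # largest group first
)

print(temp)

# In a notebook, the styled table can then be displayed with the call below
# (pandas styling needs matplotlib and jinja2 installed):
# temp.style.background_gradient(cmap="Purples")
```

Note that the "neutral" group has four rows but a count of three, because `count()` ignores the missing text.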


In this project, we are going to classify tweets according to their sentiment.
