Summary  
This chapter explains how to load a CSV file into a Polars DataFrame, select specific columns, and filter rows based on conditions using Polars methods.

General domain of usage  
Data analysis

This video introduces you to the main syntax of the `polars` library for Python. You will see how to import polars, read a CSV file, select columns, filter rows, and display data. The instructor walks through each operation slowly, explaining what each line of code does and how the output looks. Short code examples are shown on screen, each with the expected output, so you can clearly see the effect of every command. By the end, you will understand how to perform basic data operations in polars and how its syntax compares to other data libraries.

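Efficient data manipulation is essential when working with large datasets. The `polars` library is designed for high-performance data processing and is widely used for handling large data in Python. In this chapter, you will learn how to load data, select specific columns, and filter rows with `polars`. These basic operations form the foundation for more complex data transformations.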

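The main `polars` functions for these basic operations are `pl.read_csv()`, which loads a CSV file into a DataFrame; `select()`, which picks columns; `filter()`, which keeps the rows that match a condition; and `head()`/`tail()`, which display the first/last rows of the DataFrame.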

import polars as pl

url = "https://staging-content-media-cdn.codefinity.com/b8f3c268-0e60-4ff0-a3ea-f145595033d8/section1/large_file.csv"

# Read data from a CSV file
df = pl.read_csv(url)

# Display the first 5 rows
print(df.head())

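This code imports the `polars` library and uses the `pl.read_csv()` function to read data from the CSV file at the address stored in `url`. The resulting **DataFrame** is stored in the variable `df`. Calling `df.head()` displays the first 5 rows of the DataFrame, which is a quick way to check the contents right after loading.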

import polars as pl

url = "https://staging-content-media-cdn.codefinity.com/b8f3c268-0e60-4ff0-a3ea-f145595033d8/section1/large_file.csv"

# Read data from a CSV file
df = pl.read_csv(url)

# Select the "Variable name" column
selected = df.select(["Variable name"])

print(selected)

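Here, the `select()` method picks only the `"Variable name"` column from the DataFrame. This creates a new DataFrame, **selected**, that contains just that column. To select several columns, list each name inside the brackets. Selecting columns is a common step when you want to focus on a specific part of the data for further analysis.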

Which method is used to read a CSV file in polars?
