FP-growth Algorithm.FP-tree

An FP-tree, or frequent pattern tree, is a data structure that represents a compact and efficient way to store transactional datasets, where each transaction consists of a set of items.
The FP-tree organizes transactions by their common items, linking them in a tree structure based on their frequency of occurrence in a particular item combination.

How to construct FP-tree

Scan the Database: Traverse the transaction database and count the frequency of each item among all dataset. Sort items in descending order of frequency;
Create the Root of the FP-Tree: Initialize an empty root node;
Scan the Database Again: For each transaction in the database, filter and sort the items according to their total frequency calculated on the first point of the algorithm;
Update the FP-Tree: For each filtered transaction:
- Start at the root node of the tree;
- If an item in the transaction is already a child node of the current node, increment its count;
- Otherwise, create a new child node with the item and set its count to 1;
- Move to the next item in the transaction and repeat the process until all items are processed.
Link Similar Items: After processing all transactions, link nodes representing the same item together by their link pointers;
Return the FP-Tree: The constructed FP-tree is ready for further mining.

FP-tree example

Let's consider the following dataset:

Let's arrange our transaction data in descending order of total frequency across the entire dataset, prioritizing items with higher frequency over those with lower frequency.
Total frequency table looks like this:

Item	Total Frequency
bread	6
eggs	6
butter	4
jam	4
milk	4
cheese	3

As a result we can create sorted transaction table:

The FP-tree for this dataset will look like this:

As a result we have a FP-tree that represents out transaction data. Red arrows are needed to connect similar products in different tree branches - this will be used in calculating support for frequent itemsets mining.

Все було зрозуміло?

Дякуємо за ваш відгук!

Секція 2. Розділ 4

Запитати АІ

Запитайте про що завгодно або спробуйте одне із запропонованих запитань, щоб почати наш чат

Зміст курсу

Association Rule Mining

1. Introduction to Association Rule Mining

Definition and Overview of ARM Frequent Itemsets and Association rules Support, Confidence, and Lift Measures Challenge: Metrics Calculation Apriori Principle and Its Significance

2. Mining Frequent Itemsets

3. Additional Applications of ARM

Exploratory Data Analysis Recommendation Systems Other Applications