  • Association Analysis
    MLAI/RecommendSystem 2022. 7. 7. 17:31

    Definition

    Association analysis is a rule-based model: an algorithm that finds out how one item relates to another. The association is examined in two forms.

    1. How often are the items purchased together? (Frequency)
    2. If someone bought item A, will they also buy item B?

    It is also called market basket analysis, because it resembles looking at which products are contained in one shopping basket.

    Ex) A famous anecdote: customers purchasing beer at Walmart showed a high tendency to buy diapers as well, so the store set up a strategy of displaying the two together.

    Minsup

    $$s(X)=\frac{\sigma(X)}{N}$$

    An itemset X is called frequent if $s(X)$ is greater than some user-defined threshold, minsup.
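As a rough illustration of the definition above, the sketch below computes $s(X)$ and keeps every itemset whose support exceeds minsup. The transactions mirror the example table later in this post; the function names `support` and `frequent_itemsets`, the `max_size` cap, and the threshold 0.5 are assumptions made for illustration. It enumerates candidates by brute force, so it is only suitable for tiny inputs and is not the Apriori algorithm.

```python
from itertools import combinations

# Toy transactions (same contents as the example table in this post).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diapers", "Beer", "Eggs"},
    {"Milk", "Diapers", "Beer", "Cola"},
    {"Bread", "Milk", "Diapers", "Beer"},
    {"Bread", "Milk", "Diapers", "Cola"},
]


def support(itemset, transactions):
    """s(X) = sigma(X) / N: fraction of transactions containing all of X."""
    itemset = set(itemset)
    count = sum(1 for t in transactions if itemset <= t)  # sigma(X)
    return count / len(transactions)


def frequent_itemsets(transactions, minsup, max_size=3):
    """Brute-force search: keep itemsets whose support is greater than minsup."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, max_size + 1):
        for candidate in combinations(items, k):
            s = support(candidate, transactions)
            if s > minsup:
                result[frozenset(candidate)] = s
    return result


# minsup = 0.5 is an arbitrary, hypothetical threshold.
frequent = frequent_itemsets(transactions, minsup=0.5)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

With this threshold, rare items such as Cola and Eggs drop out, and only the single items and pairs that appear in more than half of the baskets remain.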

    Association Rule

    Frequent itemset generation is the step whose objective is to find all the itemsets that satisfy the minsup threshold.

    An association rule is an implication expression of the form X→Y, where X and Y are disjoint itemsets (X∩Y=∅).

    The strength of an association rule can be measured in terms of its support and confidence. A rule that has very low support may occur simply by chance. Confidence measures the reliability of the inference made by a rule.

    Support

    For the rule A → B,

    $$support(A \rightarrow B)=P(A, B)$$

    OR

    $$s(X \rightarrow Y)=\frac{\sigma(X\cup Y)}{N}$$

    where $\sigma(X)$ is the support count of X and N is the number of transactions in the transaction set T.

    Confidence

    $$confidence(A \rightarrow B)=\frac{P(A,B)}{P(A)}$$

    Lift

    Lift measures whether A and B occur together more or less often than would be expected if they were independent.

    $$lift(A\rightarrow B)=\frac{P(A,B)}{P(A)\times P(B) }$$

    $$lift(A \rightarrow B)\begin{cases}
    = 1, & \text{if A and B are independent} \\
    > 1, & \text{if A and B are positively related} \\
    < 1, & \text{if A and B are negatively related}
    \end{cases}$$
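The confidence and lift formulas above can be sketched directly from support, with probabilities estimated as fractions of transactions. The helper names `support`, `confidence`, and `lift` are assumptions for illustration, and the transactions are the ones from the example table in this post.

```python
# Toy transactions (same contents as the example table in this post).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diapers", "Beer", "Eggs"},
    {"Milk", "Diapers", "Beer", "Cola"},
    {"Bread", "Milk", "Diapers", "Beer"},
    {"Bread", "Milk", "Diapers", "Cola"},
]


def support(itemset):
    """Estimate P(itemset) as the fraction of transactions containing it."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(a, b):
    """confidence(A -> B) = P(A, B) / P(A)."""
    return support(set(a) | set(b)) / support(a)


def lift(a, b):
    """lift(A -> B) = P(A, B) / (P(A) * P(B))."""
    return support(set(a) | set(b)) / (support(a) * support(b))


rule = ({"Milk", "Diapers"}, {"Beer"})
print("confidence:", confidence(*rule))
print("lift:", lift(*rule))
```

Note that confidence depends on the direction of the rule, while lift gives the same value for A → B and B → A because its formula is symmetric in A and B.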

    Rule Generation

    Rule generation is the step whose objective is to extract all the high-confidence rules from the frequent itemsets found during frequent itemset generation. These rules are called strong rules.

    Extract all rules from the itemsets

    Problem

    The number of candidate rules increases exponentially as the number of items increases.
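To make the growth concrete: a single itemset with k items can be split into a non-empty antecedent and a non-empty consequent in $2^k - 2$ ways. The sketch below enumerates those splits; `candidate_rules` is a hypothetical helper written for this illustration, not a library function.

```python
from itertools import combinations


def candidate_rules(itemset):
    """Yield every rule X -> Y with X and Y non-empty, disjoint, and X u Y = itemset."""
    items = sorted(itemset)
    for r in range(1, len(items)):  # antecedent sizes 1 .. k-1
        for antecedent in combinations(items, r):
            consequent = tuple(i for i in items if i not in antecedent)
            yield antecedent, consequent


for x, y in candidate_rules({"Beer", "Diapers", "Milk"}):
    print(x, "->", y)

# Count the rules for itemsets of growing size: 2, 6, 14, 30, 62, ...
for k in range(2, 7):
    print(k, sum(1 for _ in candidate_rules(range(k))))
```

This is why practical algorithms prune candidates by confidence instead of scoring every possible rule.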

    Example

    TID Items
    1 {Bread, Milk}
    2 {Bread, Diapers, Beer, Eggs}
    3 {Milk, Diapers, Beer, Cola}
    4 {Bread, Milk, Diapers, Beer}
    5 {Bread, Milk, Diapers, Cola}

    {Beer, Diapers, Milk} Support = $\frac{\sigma(X \cup Y)}{N}$ = $\frac{2}{5}$

    {Milk, Diapers} -> {Beer} Confidence = $\frac{\sigma(X \cup Y)}{\sigma (X)}$ = $\frac{2}{3}$
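    As an additional worked check (not part of the original example), the lift of the same rule can be read off the table: {Milk, Diapers} appears in 3 of the 5 transactions, {Beer} appears in 3 of the 5, and all three items appear together in 2 of the 5.

    $$lift(\{Milk, Diapers\} \rightarrow \{Beer\})=\frac{P(A,B)}{P(A)\times P(B)}=\frac{2/5}{(3/5)\times(3/5)}=\frac{10}{9}\approx 1.11$$

    The value is slightly above 1, which suggests a weak positive relation between {Milk, Diapers} and {Beer} in this toy data.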

     

    Reference

    https://chih-ling-hsu.github.io/2017/03/25/Data-Mining-Association-Analysis

    https://livebook.manning.com/book/machine-learning-in-action/chapter-11/33

    https://www.youtube.com/watch?v=43gb7WK56Sk

    https://www-users.cse.umn.edu/~kumar001/dmbook/ch5_association_analysis.pdf
