School of Computer &
Information Sciences
ITS 836 Data Science and Big Data Analytics
ITS 836
Lecture 05 – HW05 Association Rules
• HW05 Exercise 1 – Review the Apriori algorithm and create slides:
https://www.hackerearth.com/blog/machine-learning/beginners-tutorial-apriori-algorithm-data-mining-r-implementation/
• HW05 Exercise 2: Grocery Dataset with R
• HW05 Exercise 3: Apply Apriori Rules to “marketbasket.csv”
• HW05 Exercise 4: “R for Data Science” Module 4
Exercise 1
• Summarize the Apriori algorithm
https://www.hackerearth.com/blog/machine-learning/beginners-tutorial-apriori-algorithm-data-mining-r-implementation/
• Support
• Confidence
• Lift
• How does the Apriori algorithm work?
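As a starting point for the summary, the three measures can be computed by hand in base R. This is a minimal sketch on a made-up set of five transactions; the data and the helper names `support`, `confidence`, and `lift` are assumptions for illustration and are not taken from the tutorial or from the arules package.

```r
# Toy transactions (hypothetical data, not from the Groceries dataset)
transactions <- list(
  c("bread", "milk"),
  c("bread", "butter"),
  c("bread", "milk", "butter"),
  c("milk"),
  c("bread", "milk")
)

# Support: fraction of transactions that contain every item in `items`
support <- function(items, tx = transactions) {
  mean(vapply(tx, function(tr) all(items %in% tr), logical(1)))
}

# Confidence of X -> Y: support of X and Y together, divided by support of X
confidence <- function(lhs, rhs, tx = transactions) {
  support(c(lhs, rhs), tx) / support(lhs, tx)
}

# Lift of X -> Y: confidence of X -> Y, divided by support of Y
lift <- function(lhs, rhs, tx = transactions) {
  confidence(lhs, rhs, tx) / support(rhs, tx)
}

supp_bm <- support(c("bread", "milk"))   # 3 of 5 transactions
conf_bm <- confidence("bread", "milk")   # 0.6 / 0.8
lift_bm <- lift("bread", "milk")         # 0.75 / 0.8
print(c(support = supp_bm, confidence = conf_bm, lift = lift_bm))
```

A lift below 1, as here, means bread and milk appear together slightly less often than they would if they were independent.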
Exercise 2: Grocery Store Transactions from textbook
Packages -> Install -> arules, arulesViz
# don't enter the next line; it appears on the console when installing from the menu
install.packages(c("arules", "arulesViz"))
library('arules')
library('arulesViz')
data(Groceries)
summary(Groceries) # indicates 9835 rows
Class of dataset Groceries is transactions, containing 3 slots:
1. transactionInfo # data frame with vectors whose length equals the number of transactions
2. itemInfo # data frame storing item labels
3. data # binary incidence matrix of labels in transactions
Groceries@itemInfo[1:10,] # first 10 item labels
# list the items contained in transactions 10 through 20
apply(Groceries@data[,10:20], 2, function(r) paste(Groceries@itemInfo[r,"labels"], collapse=", "))
Exercise 2 Grocery Store Transactions
Section – 5.5.2 Frequent Itemset Generation
To illustrate the Apriori algorithm, the code below runs each iteration separately.
Assume a minimum support threshold of 0.02 (0.02 * 9835 = 196.7, so an itemset must appear in at least 197 transactions); this yields 122 itemsets in total.
First, get itemsets of length 1:
itemsets <- apriori(Groceries, parameter=list(minlen=1, maxlen=1, support=0.02, target="frequent itemsets"))
summary(itemsets) # found 59 itemsets
inspect(head(sort(itemsets, by="support"), 10)) # lists top 10
Second, get itemsets of length 2:
itemsets <- apriori(Groceries, parameter=list(minlen=2, maxlen=2, support=0.02, target="frequent itemsets"))
summary(itemsets) # found 61 itemsets
inspect(head(sort(itemsets, by="support"), 10)) # lists top 10
Third, get itemsets of length 3:
itemsets <- apriori(Groceries, parameter=list(minlen=3, maxlen=3, support=0.02, target="frequent itemsets"))
summary(itemsets) # found 2 itemsets
inspect(sort(itemsets, by="support")) # lists both itemsets
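The level-by-level search can also be mimicked in base R on a toy dataset, which may help when summarizing how the algorithm works. This is a rough sketch, not the arules implementation: the transactions, the threshold `min_count`, and the helper `support_count` are all made up for illustration.

```r
# Toy transactions (hypothetical data, not from the Groceries dataset)
transactions <- list(
  c("bread", "milk"),
  c("bread", "butter"),
  c("bread", "milk", "butter"),
  c("milk"),
  c("bread", "milk")
)
min_count <- 2  # 2 of 5 transactions, i.e. a minimum support of 0.4

# Number of transactions that contain every item in `items`
support_count <- function(items) {
  sum(vapply(transactions, function(tr) all(items %in% tr), logical(1)))
}

# Level 1: frequent single items
items <- sort(unique(unlist(transactions)))
frequent <- Filter(function(s) support_count(s) >= min_count, as.list(items))
all_frequent <- frequent

# Level k: candidates are built only from items that are still frequent
k <- 2
while (length(frequent) > 0) {
  pool <- sort(unique(unlist(frequent)))
  if (length(pool) < k) break
  candidates <- combn(pool, k, simplify = FALSE)
  # Apriori pruning: every (k-1)-subset of a candidate must itself be frequent
  prev_keys <- vapply(frequent, paste, character(1), collapse = ",")
  candidates <- Filter(function(cand) {
    subsets <- combn(cand, k - 1, simplify = FALSE)
    all(vapply(subsets, paste, character(1), collapse = ",") %in% prev_keys)
  }, candidates)
  # Keep only the candidates that meet the minimum support count
  frequent <- Filter(function(s) support_count(s) >= min_count, candidates)
  all_frequent <- c(all_frequent, frequent)
  k <- k + 1
}

labels <- vapply(all_frequent, paste, character(1), collapse = ", ")
print(labels)
```

In this toy run the only 3-item candidate is discarded without counting it, because its subset {butter, milk} was not frequent at level 2. That pruning step is what keeps the search from enumerating every possible itemset.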
Exercise 2 Grocery Store Transactions
5.5.3 Rule Generation and Visualization
The Apriori algorithm will now generate rules.
Set the minimum support threshold to 0.001 (allows more rules, presumably for the scatterplot) and the minimum confidence threshold to 0.6 to generate 2,918 rules.
rules <- apriori(Groceries, parameter=list(support=0.001, confidence=0.6, target="rules"))
summary(rules) # finds 2918 rules
plot(rules) # displays scatterplot
The scatterplot shows that the highest lift occurs at a low support and a low confidence.
Exercise 2 Grocery Store Transactions
5.5.3 Rule Generation and Visualization
Get a scatterplot matrix to compare the support, confidence, and lift of the 2918 rules:
plot(rules@quality) # displays scatterplot matrix
Lift is proportional to confidence, with several linear groupings.
Note that Lift = Confidence/Support(Y), so when the support of Y remains the same, lift is proportional to confidence and the slope of the linear trend is the reciprocal of Support(Y).
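The relationship between lift and confidence follows directly from the standard definitions of the three measures, written here for a rule X -> Y:

```latex
\begin{align*}
\mathrm{Support}(X) &= \frac{\text{number of transactions containing } X}{\text{total number of transactions}} \\
\mathrm{Confidence}(X \rightarrow Y) &= \frac{\mathrm{Support}(X \cup Y)}{\mathrm{Support}(X)} \\
\mathrm{Lift}(X \rightarrow Y) &= \frac{\mathrm{Support}(X \cup Y)}{\mathrm{Support}(X)\,\mathrm{Support}(Y)}
  = \frac{\mathrm{Confidence}(X \rightarrow Y)}{\mathrm{Support}(Y)}
\end{align*}
```

All rules that share the same right-hand side Y therefore fall on one line through the origin with slope 1/Support(Y), which accounts for the linear groupings in the plot.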
Exercise 2 Grocery Store Transactions
5.5.3 Rule Generation and Visualization
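Compute 1/Support(Y), which is the slope:
slope <- sort(round(rules@quality$lift / rules@quality$confidence, 2))
unlist(lapply(split(slope, f=slope), length)) # number of rules for each slope value
Display the top 10 rules sorted by lift:
inspect(head(sort(rules, by="lift"), 10))
Rule {Instant food products, soda} -> {hamburger meat} has the highest lift of 19 (page 154).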
Exercise 2 Grocery Store Transactions
5.5.3 Rule Generation and Visualization
Visualize the top 5 rules with the highest lift.
highLiftRules <- head(sort(rules, by="lift"), 5)
plot(highLiftRules, method="graph", control=list(type="items"))
In the graph, the arrow always points from an item on the LHS to an item on the RHS.
For example, the arrows that connect ham, processed cheese, and white bread represent the rule {ham, processed cheese} -> {white bread}.
The size of a circle indicates support and its shade represents lift.
Exercise 3 – Apply Apriori Rules to “marketbasket.csv”
• Use the data
– apply Exercise 2 method to the data
• You can also use the following references
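One possible way to carry the Exercise 2 method over to a CSV file is sketched below. The layout of “marketbasket.csv” is not described here, so the sketch writes a small stand-in file with one transaction per line and comma-separated items; the file contents and the support and confidence thresholds are assumptions and will need adjusting for the real data. The arules steps run only if the package is installed.

```r
# Hypothetical stand-in for "marketbasket.csv": one transaction per line,
# items separated by commas. The real file's layout may differ.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("bread,milk",
             "bread,butter",
             "bread,milk,butter",
             "milk",
             "bread,milk"), csv_path)

if (requireNamespace("arules", quietly = TRUE)) {
  library(arules)
  # Read the file as transactions in "basket" format
  baskets <- read.transactions(csv_path, format = "basket", sep = ",")
  summary(baskets)
  # Thresholds chosen for the toy data only; minlen=2 skips rules with an empty LHS
  rules <- apriori(baskets, parameter = list(support = 0.4, confidence = 0.6,
                                             minlen = 2, target = "rules"))
  inspect(sort(rules, by = "lift"))
}
```

If the real file instead stores one item per row with a transaction ID column, `read.transactions` also accepts `format = "single"` together with a `cols` argument naming the ID and item columns.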
Exercise 4: Model & Graphics
28 Graphics
28.2.1
28.3.1
28.4.4
23 Model Basics
23.2.1
23.3.1
23.4.5
24 Model Building
24.2.3
24.3.5
25 Many Models
25.2.5
25.4.5
25.5.3
Questions?
“R for Data Science” 5 Modules
I Explore
II Wrangle
III Program
IV Model
V Communicate
R for Data Science, Garrett Grolemund & Hadley Wickham