Module 4: Data Mining


Attached Files:
Data MIning Assignment_with rubric (181.304 KB)

This assignment gives you practice using R for data mining. You will use R to classify and cluster a data set, showing how data mining methods can be applied to both tasks. Before beginning this assignment, review the learning resources for this module, especially "Introduction to Data Mining with R" from R, reviewing the steps taken to classify and cluster the iris data set in R.

The purpose of clustering is to form a new classification from numerical variables. It is therefore important to remove the original classification from the data set before clustering. For example, the Species variable must be removed from the iris data set because Species is a classification. You may then merge the original Species variable back in and compare the newly formed clusters against the original classification to see how they differ.

Complete the following steps and write a report recording your work, results, and analysis.
Install and load the *factoextra and **NbClust packages.
Select an appropriate data set in R or the MASS library and use the sample(), ctree() and predict() functions to build a decision tree and plot it. You may also use one of the data sets (usoccupations, uscars, uspopulation) attached to this module for this assignment (you may import directly or convert to CSV first).
Determine the appropriate number of clusters and produce a k-means clustering. Explain your findings.
Produce a density-based clustering with DBSCAN, or use logistic regression to construct a binary classifier, and explain your findings.
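
As an illustration of the decision tree step above, here is a minimal sketch using the built-in iris data set. It assumes the party package is installed (ctree() is also available from partykit). The 70/30 split, the seed, and the variable names are choices made for this example, not requirements of the assignment.

```r
# Assumption: the 'party' package is installed, e.g. install.packages("party").
library(party)

data(iris)
set.seed(1234)  # fix the seed so the random split is reproducible

# sample() draws a group label for every row (1 = training, 2 = test),
# sending roughly 70% of the rows to the training set.
group <- sample(2, nrow(iris), replace = TRUE, prob = c(0.7, 0.3))
train_data <- iris[group == 1, ]
test_data  <- iris[group == 2, ]

# ctree() fits a conditional inference tree that predicts Species
# from the four numeric measurements.
iris_tree <- ctree(
  Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
  data = train_data
)

# Plot the fitted tree.
plot(iris_tree)

# predict() classifies the held-out rows; the table compares the
# predicted classes with the true ones.
test_pred <- predict(iris_tree, newdata = test_data)
confusion <- table(Predicted = test_pred, Actual = test_data$Species)
print(confusion)

# Share of test rows classified correctly.
accuracy <- sum(diag(confusion)) / sum(confusion)
print(accuracy)
```

The same pattern should carry over to a MASS data set or one of the attached files by changing the data frame and the formula. The confusion table and accuracy give you something concrete to interpret in the report.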
*The factoextra package is used to determine the optimal number of clusters for a given clustering method and for data visualization.
**The NbClust package provides 30 indices for determining the relevant number of clusters and the best clustering scheme from the different results obtained by varying all combinations of number of clusters, distance measures, and clustering methods. It can simultaneously compute all the indices and determine the number of clusters in a single function call.

Report

Your assignment/project should have a good cover/title page and an introduction stating the goals of the project and the methods you use. It should follow APA format, contain at least 1000 words (excluding the title page and references page), and include a references page. In the body of your project, incorporate the R code and R output with an interpretation of your results. Be sure to show all the elements of the formal hypothesis test, including the null and alternative hypotheses, the critical values, the calculation of the test statistics, and the p-values. Finally, make sense of your results and draw proper conclusions that show your understanding of the course material and its application to the data set. Graphs, figures, charts, and tables are very useful visual aids for your readers. Do your best to give insight into the project and close with a good conclusion. Please use subtitles to make your assignment more reader friendly as well.
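
For the clustering steps, the following sketch shows one way the pieces described above might fit together on the iris data: drop Species, estimate the number of clusters, run k-means, then compare the clusters with the original classification. It assumes the NbClust and factoextra packages are installed. Standardizing the variables, searching k from 2 to 8, and taking the most frequently suggested k are choices made for this example only.

```r
# Assumption: both packages are installed, e.g.
# install.packages(c("NbClust", "factoextra")).
library(NbClust)
library(factoextra)

data(iris)

# Remove the original classification so clustering uses numeric variables only.
iris_numeric <- iris[, names(iris) != "Species"]

# Standardize the variables (a choice made for this sketch).
iris_scaled <- scale(iris_numeric)

set.seed(123)  # k-means starts from random centers

# factoextra: plot the total within-cluster sum of squares against k
# (the "elbow" method).
print(fviz_nbclust(iris_scaled, kmeans, method = "wss"))

# NbClust: compute all indices for k = 2..8 in a single call.
nb <- NbClust(iris_scaled, distance = "euclidean",
              min.nc = 2, max.nc = 8, method = "kmeans", index = "all")

# The first row of Best.nc holds the number of clusters suggested by each
# index. Keep suggestions inside the searched range and take the most common.
suggested <- nb$Best.nc[1, ]
suggested <- suggested[suggested >= 2]
votes <- table(suggested)
k <- as.integer(names(votes)[which.max(votes)])

# Run k-means with the chosen k.
km <- kmeans(iris_scaled, centers = k, nstart = 25)

# Merge the original Species back and cross-tabulate it against the clusters.
confusion <- table(Species = iris$Species, Cluster = km$cluster)
print(confusion)
```

Rows of the resulting table are the original species and columns are the new clusters, so a species spread over several columns marks where the clustering disagrees with the original classification.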
