Predictive Analytics For Dummies (Anasse Bari, 2013)


  • Content-based filtering: problems with user-supplied tags are credibility, sparsity, inconsistency
  • Measuring a recommender system: precision - the share of recommended items that are actually relevant; recall - the share of all relevant items that get recommended
  • Groups of customers: persuadables, sure things, lost causes, do not disturb
  • Structured data: organised, formally defined, easy to access and query, lower availability, efficient to analyse. Unstructured data: scattered and dispersed, free-form, hard to access and query, higher availability, additional preprocessing is needed
  • Attitudinal data: how the customer feels about something. Behavioural data: what the customer does, e.g. sales transactions. Demographic data: personal information
  • Data-driven analytics: no prior knowledge, broad use of data-mining tools, suited for large-scale data, open scope, needs verification of results, uncovers patterns and associations. User-driven analytics: in-depth domain knowledge, specific design for analysis and testing, can work on smaller datasets, limited scope, easier adoption of analysis results, may miss hidden patterns and associations
  • Flocking patterns (used for clustering): separation, alignment, cohesion
  • Fruit basket example: past patterns > categories > bias mode > percentage matching > confirm or deep search
  • Identifying groups in data: k-means clustering algorithm
  • Calculate similarity: Euclidean distance
  • Density-based algorithm: density-based spatial clustering of applications with noise (DBSCAN)
  • Data mining in association rules: Apriori algorithm
  • Data classification to predict the future: decision trees
    • Entropy decision formula
  • Data classification algorithm: support vector machine (SVM)
    • Neural networks - process past and current data to estimate future values
  • Data classification based on probabilistic analysis: Naive Bayes classification algorithm & Bayes’ theorem
  • Boost prediction accuracy: ensemble methods
  • Statistical model: Markov Model
    • Linear regression - statistical model to analyse and find the relationship between two variables
  • Underfitting: the model is too simple to detect the relationships in the data; overfitting: the model fits the noise in the training data and so has little predictive power on new data
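
The precision and recall measures from the recommender note above can be made concrete with a small sketch. This is not code from the book; the function name precision_recall and the sample item ids are made up for illustration, and it assumes a recommendation list and a set of items the user actually liked are both known.

```python
def precision_recall(recommended, relevant):
    """Return (precision, recall) for one user's recommendation list."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)  # recommended items that were relevant
    # Precision: fraction of what we recommended that was relevant.
    precision = hits / len(recommended) if recommended else 0.0
    # Recall: fraction of all relevant items that we managed to recommend.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: four items recommended, three items the user liked.
p, r = precision_recall(["a", "b", "c", "d"], ["b", "d", "e"])
print(p, r)
```

Here two of the four recommendations are relevant, and two of the three relevant items were found, so precision and recall pull in different directions as the list gets longer or shorter.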
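
The notes on k-means and Euclidean distance fit together: k-means assigns each point to the centroid that is closest by Euclidean distance. The sketch below shows only that distance and the assignment step, not the full algorithm; the helper names euclidean and nearest_centroid and the sample points are assumptions for illustration.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_centroid(point, centroids):
    """Index of the centroid closest to point (the k-means assignment step)."""
    return min(range(len(centroids)), key=lambda i: euclidean(point, centroids[i]))


# Hypothetical 2-D data.
d = euclidean((0, 0), (3, 4))
cluster = nearest_centroid((1, 1), [(0, 0), (10, 10)])
print(d, cluster)
```

A full k-means run would repeat this assignment for every point, move each centroid to the mean of its assigned points, and stop once assignments no longer change.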
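
The notes mention an entropy formula for decision trees without writing it out. Assuming the usual Shannon entropy, H = -sum(p_i * log2(p_i)) over the class proportions p_i, a tree picks the split that reduces entropy the most (the information gain). The sketch below uses that standard definition; the function names and the toy labels are illustrative, not taken from the book.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(parent, subsets):
    """Entropy of parent minus the size-weighted entropy of its subsets."""
    total = len(parent)
    weighted = sum(len(s) / total * entropy(s) for s in subsets)
    return entropy(parent) - weighted


# Hypothetical labels: an evenly mixed node, split into two pure children.
parent = ["yes", "yes", "no", "no"]
gain = information_gain(parent, [["yes", "yes"], ["no", "no"]])
print(entropy(parent), gain)
```

An evenly mixed two-class node has the maximum entropy of 1 bit, and a split that separates the classes perfectly recovers all of it as gain.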

