Course Outline
Introduction to Data Analysis and Big Data
- What Makes Big Data "Big"?
  - Velocity, Volume, Variety, Veracity (VVVV)
 
- Limits to Traditional Data Processing
- Distributed Processing
- Statistical Analysis
- Types of Machine Learning Analysis
- Data Visualization
 
Big Data Roles and Responsibilities
- Administrators
- Developers
- Data Analysts
 
Languages Used for Data Analysis
- R Language
  - Why R for Data Analysis?
  - Data manipulation, calculation and graphical display
 
- Python
  - Why Python for Data Analysis?
  - Manipulating, processing, cleaning, and crunching data
 
 
Approaches to Data Analysis
- Statistical Analysis
  - Time Series analysis
  - Forecasting with Correlation and Regression models
  - Inferential Statistics (estimating)
  - Descriptive Statistics in Big Data sets (e.g. calculating the mean)
 
- Machine Learning
  - Supervised vs unsupervised learning
  - Classification and clustering
  - Estimating cost of specific methods
  - Filtering
 
- Natural Language Processing
  - Processing text
  - Understanding the meaning of the text
  - Automatic text generation
  - Sentiment analysis / topic analysis
 
- Computer Vision
  - Acquiring, processing, analyzing, and understanding images
  - Reconstructing, interpreting and understanding 3D scenes
  - Using image data to make decisions
 
Big Data Infrastructure
- Data Storage
  - Relational databases (SQL)
    - MySQL
    - Postgres
    - Oracle
 
  - Non-relational databases (NoSQL)
    - Cassandra
    - MongoDB
    - Neo4j
 
  - Understanding the nuances
    - Hierarchical databases
    - Object-oriented databases
    - Document-oriented databases
    - Graph-oriented databases
    - Other
 
 
- Distributed Processing
  - Hadoop
    - HDFS as a distributed filesystem
    - MapReduce for distributed processing
 
  - Spark
    - All-in-one in-memory cluster computing framework for large-scale data processing
    - Structured streaming
    - Spark SQL
    - Machine Learning libraries: MLlib
    - Graph processing with GraphX
 
 
- Scalability
  - Public cloud
    - AWS, Google, Aliyun, etc.
 
  - Private cloud
    - OpenStack, Cloud Foundry, etc.
 
  - Auto-scalability
 
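
To give a feel for the MapReduce item listed under Distributed Processing above, the sketch below imitates the map, shuffle, and reduce phases for a word count in plain Python, inside a single process. It does not use Hadoop's API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample lines are hypothetical and chosen for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in sequence over an iterable of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input: two short lines of text.
counts = word_count(["big data is big", "data is everywhere"])
```

In a real cluster the map and reduce functions would run on many machines in parallel, with the framework handling the shuffle over the network; here everything runs sequentially to keep the idea visible.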
 
Choosing the Right Solution for the Problem
The Future of Big Data
Summary and Next Steps
Requirements
- A general understanding of math
- A general understanding of programming
- A general understanding of databases
 
Audience
- Developers / programmers
- IT consultants
 
Testimonials (7)
How big data works, data programs, and a greater knowledge of how our current world works using data.
Ozayr Hussain - Vodacom
Course - A Practical Introduction to Data Analysis and Big Data
The practical side of the training.
Patrick - Vodacom Pty Ltd
Course - A Practical Introduction to Data Analysis and Big Data
Interactive topics and the style used by the lecturer to simplify the topics for the students.
Miran Saeed - Sulaymaniyah Asayish Agency
Course - A Practical Introduction to Data Analysis and Big Data
The trainer and his ability to lecture.
Ibrahim Hamakarim - Sulaymaniyah Asayish Agency
Course - A Practical Introduction to Data Analysis and Big Data
Practical exercises
Joel Chigada - University of the Western Cape
Course - A Practical Introduction to Data Analysis and Big Data
R programming
Osden Jokonya - University of the Western Cape
Course - A Practical Introduction to Data Analysis and Big Data
Overall, the content was good.