For me, the Data Analysis course is something of a culmination of my online courses. Ever since the first one (Artificial Intelligence), it has been obvious that statistics and data analysis play a big part in many of the things I am interested in.
I do not consider myself particularly good at math (one of the reasons I abandoned university years ago), so I was also wary that the further I went into statistics, the sooner I would hit a math wall.
Luckily for me, this course is very pragmatic and hands-on, and gives you a reasonable set of tools (both technical and mental) that let you at least make some inroads into exploring diverse data sets in search of interesting patterns.
![Dat.An-2.FICO](DataAnalytics_files/dat.an-2.fico.jpg)
These two "homework" projects are evaluated by randomly picked students (and you are in turn asked to grade four other people's work) and contribute another 50% of the final score.
Everything is done in R. While the course provides resources and links to tutorials, it should not be considered a "gentle introduction to R". In fact, some students complained in the forums about feeling a bit overwhelmed by it. I managed to cope thanks to my programming background and having already used R in a previous course.
Final result: 89.8%. I am a bit saddened because I had hoped to get the "distinction" certificate, which requires at least 90%. I worked hard at this course (probably more than at any other so far, with the possible exception of the AI one).
I am still happy, though, because I got a real sense of accomplishment from this, and I feel confident enough to start looking for opportunities to practice what I learned and to learn more.
Syllabus:
- The structure of a data analysis (steps in the process, knowing when to quit, etc.)
- Types of data (census, designed studies, randomized trials)
- Types of data analysis questions (exploratory, inferential, predictive, etc.)
- How to write up a data analysis (compositional style, reproducibility, etc.)
- Obtaining data from the web (through downloads mostly)
- Loading data into R from different file types
- Plotting data for exploratory purposes (boxplots, scatterplots, etc.)
- Exploratory statistical models (clustering)
- Statistical models for inference (linear models, basic confidence intervals/hypothesis testing)
- Basic model checking (primarily visually)
- The prediction process
- Study design for prediction
- Cross-validation
- A couple of simple prediction models
- Basics of simulation for evaluating models
- Ways you can fool yourself and how to avoid them (confounding, multiple testing, etc.)