
Vrije University - Summer graduate programs
Summer Course in Data Analysis in R
Online Netherlands
DURATION
2 Weeks
LANGUAGES
English
PACE
Full time
APPLICATION DEADLINE
Request application deadline
EARLIEST START DATE
Jul 2025
TUITION FEES
EUR 1,360 / per course *
STUDY FORMAT
On-Campus
* Students, PhD candidates and employees of VU Amsterdam, Amsterdam UMC or an Aurora Network Partner: €765
Students and PhD candidates at partner universities of VU Amsterdam: €1035
Students and PhD candidates at non-partner universities of VU Amsterdam
Key Summary
Introduction
Data is everywhere, but extracting valuable insights from it requires analytical skills. The large number of active programmers creating R packages makes R suitable for a wide range of data analysis techniques, from basic hypothesis testing to generalized linear regression and multivariate analysis such as principal component analysis, factor analysis, and clustering. You will apply what you have learned right away in short exercises using R Markdown. You will be graded on an assignment in which you learn to deal with messy data and integrate the knowledge you gained in the exercises. The course is highly intensive, as it focuses both on interpreting statistics and on learning to program in R.
With the increasing use of programming languages in data analytics, now is the time to learn their ins and outs. This course focuses on understanding statistical models and analyzing the results whilst learning to work with R. As well as introducing the software to newcomers, it presents basic and more advanced statistics using an overarching framework of the generalized linear model.
Course Overview
- Course level: Advanced Bachelor's and Master's
- Coordinating lecturer: Dr. Meike Morren
- Forms of assessment: written assignment
- Contact hours: 45 hours
Admissions
Scholarships and Funding
Equal Access Scholarship
Application Procedure
Applications for the Equal Access Scholarship will open in February
Great that you are interested in applying for the Equal Access Scholarship. You can apply to the scholarship between 12 February and 1 April. Please be aware that it is only possible to select one course.
The results of the scholarship selection will be announced in May. Since we have a limited number of scholarships available for a large number of applicants, we suggest that, if possible, you complete your payment at the time of your course application to guarantee your place in the course. However, if you are not able to come without the scholarship, you can simply wait until the announcement. If you would like to come regardless of whether you are granted the scholarship, it is best to secure your place in the course by completing your payment via our regular application form. If the scholarship is granted to you, the tuition and accommodation fees will be reimbursed.
Deadline to submit your Equal Access Scholarship application: 31 March (23:59 CET).
Requirements
When you apply via the Equal Access Scholarship application form you will be requested to upload the following documents:
- Curriculum Vitae/Résumé (CV) stating your educational background.
- Professional Letter of Reference Including:
- His/her/their experience working with you (either in an academic, professional, or volunteer setting)
- His/her/their motivation for recommending you for the scholarship
- Complete contact information
- When filling out the scholarship form, we will ask the following questions*:
- Why are you interested in joining VU Amsterdam Summer School?
- What’s your motivation for selecting this course?
- How will you use the information you learn to make a positive impact in the future for both you and your community?
- Why do you deserve this scholarship?
Please stick to a maximum of 150 words per question.
Green Travel Grant
At VU Amsterdam Summer School we are also committed to VU's sustainability goals and we aim to reduce the environmental impact of mobility, and specifically, student travel. Therefore, we are thrilled to offer Green Travel Grants to encourage sustainable travel for students attending our summer school.
Where can I apply?
Once the courses have been confirmed to run in mid-May or June, we will send out a newsletter to our participants with a link where they can apply for either funding for train travel or funding for bus travel.
The application period will last two weeks, and we will select the winners via a lottery system. More information on the specific deadlines can be found in the newsletter we send out in May.
How does it work?
To receive the reimbursement, students need to submit their purchased travel tickets via email within two weeks after being selected as winners of the grant. Once the deadline to submit tickets has passed, the students will receive the reimbursement.
Curriculum
The first week is devoted to learning how to use R and regression analysis. We start with reading data into R, descriptive statistics, and visual representation of data, which is the first step for statistical analyses. We then introduce the linear regression model, a widely used model with two main purposes: modeling relationships among the variables and predicting future observations.
In the second week, we will extend the linear model to the generalized linear framework in order to analyze discrete dependent variables. The logit regression that you will work with is useful for understanding the remainder of the course: classification. You will learn how to reduce data dimensions using principal component analysis and cluster analysis, and how to use the learned methods for prediction.
Every day consists of short lectures with examples and exercises in which you apply what you have learned right away. The exercises and the assignment focus on coding in R and on applying and interpreting generalized linear regression models. After class, you are expected to work on an assignment in which you integrate what you have learned in the exercises during class. This assignment will be graded.
Week 1
Day 1: Introduction
We start by explaining the basics of the R environment and RStudio. You will learn how to work with the main data types in R: vector, factor, matrix, list, and data frame. You will learn how to create variables, select cases and variables, and use plots. Simple functions to calculate the mean and the standard deviation are introduced.
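As a rough illustration of these basics, the sketch below builds one object of each type, selects cases, and writes small mean and standard deviation functions. All values and the names `heights`, `groups`, `my_mean`, and `my_sd` are made up for illustration and are not taken from the course material.

```r
# Illustrative values only; none of these come from the course material.
heights <- c(1.62, 1.75, 1.80, 1.68)                    # numeric vector
groups  <- factor(c("a", "b", "a", "b"))                # factor with two levels
m       <- matrix(1:6, nrow = 2)                        # 2 x 3 integer matrix
info    <- list(name = "example", n = length(heights))  # list holding mixed types
df      <- data.frame(height = heights, group = groups) # data frame

# Select cases (rows) and variables (columns)
tall <- df[df$height > 1.70, ]      # rows where height exceeds 1.70
h_a  <- df$height[df$group == "a"]  # heights of group "a" only

# Hand-written versions of the mean and the sample standard deviation
my_mean <- function(x) sum(x) / length(x)
my_sd   <- function(x) sqrt(sum((x - my_mean(x))^2) / (length(x) - 1))

my_mean(heights)
my_sd(heights)
```

The hand-written functions should agree with the built-in `mean()` and `sd()`, which is a convenient way to check them.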
Day 2: Data & functions
You will read a data file into R, and you will learn how to compute descriptive statistics and frequencies in R. The functions discussed on the previous day will be applied to a survey dataset. Additionally, we discuss various loop commands that allow you to run complicated tasks on the entire dataset. We introduce vectorization as an alternative to loops: although a loop is often more intuitive, vectorized code is usually much faster in R. Throughout the course, we will practice these skills by writing functions for the t-test, linear regression, and the log-likelihood ratio test.
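The contrast between a loop and vectorization can be sketched as follows. The built-in `iris` data set is used here as a stand-in, since the course's survey data is not available; the variable names are illustrative.

```r
# Numeric columns of the built-in iris data set (stand-in for course data)
num <- iris[, 1:4]

# Loop version: visit each column in turn and store its mean
means_loop <- numeric(ncol(num))
names(means_loop) <- names(num)
for (j in seq_along(num)) {
  means_loop[j] <- mean(num[[j]])
}

# Vectorized version: one call computes all column means
means_vec <- colMeans(num)

means_loop
means_vec
```

Both versions give the same result; the vectorized call is shorter and pushes the iteration into compiled code.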
Day 3: Simple regression
We will discuss how the linear model is related to the t-test. You will learn how to interpret the results with one independent dummy or interval variable, and how you can test the assumptions of linear regression.
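One way to see the link between the t-test and the linear model is to fit both to the same data. The sketch below assumes the built-in `mtcars` data set as a stand-in and uses the equal-variance t-test, which is the version that matches a regression on a single dummy variable.

```r
# Compare fuel economy (mpg) between automatic (am = 0) and manual (am = 1) cars
tt  <- t.test(mpg ~ am, data = mtcars, var.equal = TRUE)  # pooled-variance t-test
fit <- lm(mpg ~ am, data = mtcars)                        # regression on the dummy
coefs <- summary(fit)$coefficients

slope <- coefs["am", "Estimate"]   # difference in group means (manual - automatic)
t_lm  <- coefs["am", "t value"]    # t statistic of the slope
p_lm  <- coefs["am", "Pr(>|t|)"]   # p-value of the slope

# The t statistics agree up to sign (t.test subtracts the groups the other way round)
c(t_test = unname(tt$statistic), lm = t_lm)
c(t_test = tt$p.value, lm = p_lm)
```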
Day 4: Assumptions of regression
You will learn how to interpret the results with one independent interval variable, and how you can test the assumptions of linear regression. These assumptions determine whether you can trust your estimates and standard errors. You will learn how to inspect them, and what the consequences of violating them are.
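A few common checks can be sketched in base R. This is only one possible selection of diagnostics, not necessarily the one used in the course, and `mtcars` again serves as stand-in data.

```r
fit <- lm(mpg ~ wt, data = mtcars)
res <- resid(fit)    # residuals
fv  <- fitted(fit)   # fitted values

# Normality of residuals: Shapiro-Wilk test (a small p-value suggests non-normality)
sw <- shapiro.test(res)

# Constant variance: a crude numeric check is whether the size of the
# residuals trends with the fitted values
spread_trend <- cor(abs(res), fv)

sw$p.value
spread_trend

# Graphical checks are usually inspected by eye, for example:
# plot(fit, which = 1)  # residuals versus fitted values
# plot(fit, which = 2)  # normal Q-Q plot of the residuals
```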
Day 5: Multiple regression
This day builds on Day 3, in which we covered simple regression. The multiple regression model adds the concept of ‘ceteris paribus’. We will also cover confounding and interaction effects, and when and how to use mean centering.
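The effect of mean centering in a model with an interaction can be illustrated with a short sketch. The data set (`mtcars`) and the choice of predictors are assumptions made for illustration only.

```r
d <- mtcars
d$wt_c <- d$wt - mean(d$wt)   # mean-centered weight
d$hp_c <- d$hp - mean(d$hp)   # mean-centered horsepower

fit_raw <- lm(mpg ~ wt * hp,     data = d)  # raw predictors with interaction
fit_cen <- lm(mpg ~ wt_c * hp_c, data = d)  # centered predictors with interaction

# The interaction coefficient and the fitted values do not change ...
coef(fit_raw)["wt:hp"]
coef(fit_cen)["wt_c:hp_c"]

# ... but the main effect of weight now refers to a car with average horsepower
coef(fit_cen)["wt_c"]
coef(fit_raw)["wt"] + coef(fit_raw)["wt:hp"] * mean(d$hp)
```

Centering does not change the model fit; it changes what the lower-order coefficients mean, which often makes them easier to interpret.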
Week 2
Day 6: Logistic regression
We will introduce logistic regression as part of the generalized linear framework. We will calculate the odds ratio and discuss how it is related to the chi-square test and logistic regression. Furthermore, we will discuss the log-likelihood ratio test to compare two or more models.
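The relation between the odds ratio and logistic regression, and the likelihood ratio test for nested models, can be sketched as follows. The example assumes the built-in `mtcars` data with two binary variables; it is an illustration, not the course's own exercise.

```r
# Outcome: manual transmission (am); predictor: engine shape (vs: 0 = V-shaped, 1 = straight)
tab <- table(mtcars$vs, mtcars$am)

# Odds ratio computed directly from the 2 x 2 table
or_table <- (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])

# The same odds ratio from a logistic regression with a single binary predictor
fit0 <- glm(am ~ 1,  data = mtcars, family = binomial)  # intercept-only model
fit1 <- glm(am ~ vs, data = mtcars, family = binomial)  # model with predictor
or_glm <- exp(coef(fit1)["vs"])

# Likelihood ratio test: the drop in deviance between the nested models
lrt     <- anova(fit0, fit1, test = "Chisq")
lr_stat <- fit0$deviance - fit1$deviance
lr_p    <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

or_table
or_glm
lrt
```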
Day 7: Classification / Linear Discriminant Analysis
Many datasets are multidimensional in nature and have no definite dependent or independent variable. You will learn how to apply linear and quadratic discriminant analysis as a data reduction technique, and how to plot the decision boundary.
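A minimal sketch of linear discriminant analysis is shown below. It assumes the `MASS` package (a recommended package shipped with standard R installations) and the built-in `iris` data; the course may use different tools or data.

```r
library(MASS)  # provides lda() and qda(); assumed to be installed

fit  <- lda(Species ~ ., data = iris)  # linear discriminant analysis
pred <- predict(fit, iris)             # classes, posterior probabilities, scores

# Confusion table and in-sample accuracy
conf <- table(observed = iris$Species, predicted = pred$class)
acc  <- mean(pred$class == iris$Species)

# Discriminant scores: four measurements reduced to two dimensions (LD1, LD2)
scores <- pred$x

conf
acc
head(scores)
```

Plotting `scores[, 1]` against `scores[, 2]`, colored by species, is one simple way to look at the separation between groups. Note that the accuracy here is computed on the training data and is therefore optimistic.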
Day 8: Cluster analysis
You will learn about similarity measures, how to read dendrogram plots, how to use the K-means algorithm for classification, and how to visualize clustered data in R.
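Both hierarchical clustering and K-means are available in base R. The sketch below uses the built-in `iris` measurements as stand-in data; the number of clusters (3), the linkage method, and the seed are arbitrary choices made for illustration.

```r
x <- scale(iris[, 1:4])   # standardize so all variables contribute equally

# Hierarchical clustering on Euclidean distances
d  <- dist(x)
hc <- hclust(d, method = "complete")
hc_groups <- cutree(hc, k = 3)      # cut the tree into three groups
# plot(hc) would draw the dendrogram

# K-means with several random starts; the seed makes the run reproducible
set.seed(1)
km <- kmeans(x, centers = 3, nstart = 20)

# Compare the clusters found with the known species labels
table(cluster = km$cluster, species = iris$Species)
```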
Day 9: Principal component analysis / Confirmatory factor analysis
Many scales in survey research are multiple-item scales. You will learn about the validity and reliability issues surrounding these scales and how to apply and interpret confirmatory factor analysis. We will discuss the factor loadings, the factor correlations, and the item variances. You will learn how to test group differences using multiple group analysis, and mediation analysis.
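The principal component part of this day can be sketched in base R with `prcomp()`, again using `iris` as stand-in data. Confirmatory factor analysis needs an add-on package and is not shown here.

```r
# Principal component analysis on standardized variables
pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# Share of the total variance captured by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cumsum(var_explained)

loadings <- pca$rotation  # weights of the original variables on each component
scores   <- pca$x         # observations expressed in component coordinates

round(loadings, 2)
head(scores[, 1:2])
```

The cumulative proportions help decide how many components to keep when reducing the data.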
Day 10: Recap and Tidy Work Methods
This final day will be used to:
- Recap the material
- Explain more about how to clean up your code and improve your workflow
- Work on your assignment
If you have any questions about your own research that have not been addressed during the course (feel free to use the Q&A for your own research questions), you are welcome to make an appointment to meet with us individually.
Program Outcome
By the end of this course, students will be able to:
- Evaluate the quality of quantitative data sources
- Choose the appropriate method for analysis, depending on the data source
- Conduct various statistical tests
- Analyse data using the generalized linear framework
- Apply their improved programming skills