Bivariate and multivariate analyses are statistical methods that help you investigate relationships between data samples. Bivariate analysis looks at two paired data sets, studying whether a relationship exists between them. Multivariate analysis uses two or more variables and analyzes which, if any, are correlated with a specific outcome. The goal in the latter case is to determine which variables predict the outcome and may therefore influence it, bearing in mind that a correlation alone does not prove that one variable causes another.
Bivariate analysis investigates the relationship between two paired data sets. The two data sets are paired because each pair of observations is taken from a single sample or individual, but each sample is independent. The data is analyzed, using tools such as correlation coefficients, t-tests and chi-squared tests, to see whether the two groups of data are related to each other. If the variables are quantitative, they are usually graphed on a scatterplot. Bivariate analysis also examines the strength of any correlation.
One example of bivariate analysis is a research team recording the ages of the husband and wife in each of a number of marriages. This data is paired because both ages come from the same marriage, but independent because one person's age doesn't cause another person's age. When the data is plotted, it shows a correlation: older husbands tend to have older wives. A second example is recording measurements of grip strength and arm strength from individuals. The data is paired because both measurements come from a single person, but independent because different muscles are used. When data from many individuals is plotted, it shows a correlation: people with higher grip strength tend to have higher arm strength.
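The strength of a correlation like the ones described above is commonly summarized with the Pearson correlation coefficient, which ranges from -1 to 1. The following is a minimal Python sketch of that calculation, not a full analysis. The function name pearson_r and the age values are hypothetical and chosen only for illustration; they are not data from any real study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two paired samples of equal length."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two paired samples of equal length (at least 2 pairs)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sample has no variation")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical paired ages, one pair per marriage (illustrative values only)
husband_ages = [25, 31, 38, 44, 52, 60]
wife_ages = [23, 30, 35, 45, 49, 58]

r = pearson_r(husband_ages, wife_ages)
print(f"r = {r:.3f}")  # a value close to 1 indicates a strong positive correlation
```

A value of r near 1 or -1 suggests a strong linear relationship, while a value near 0 suggests little or no linear relationship.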
Multivariate analysis examines several variables to see if one or more of them are predictive of a certain outcome. The predictive variables are considered independent variables, and the outcome is the dependent variable. The variables can be either continuous, meaning they can take any value within a range, or dichotomous, meaning they represent the answer to a yes or no question. Multiple regression analysis is the most common method used in multivariate analysis to find correlations between data sets, but many others, such as logistic regression and multivariate analysis of variance, are also used.
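To make the idea of multiple regression concrete, here is a minimal Python sketch that fits an outcome to one continuous predictor and one dichotomous predictor using ordinary least squares. The function name fit_ols and the data are hypothetical. The outcome values were constructed to follow y = 2 + 3*x1 - 1*x2 exactly, so the fit should recover those coefficients. Real data would contain noise, and in practice a statistics package would normally be used instead of hand-written code.

```python
def fit_ols(rows, outcomes):
    """Fit y = b0 + b1*x1 + ... + bk*xk by ordinary least squares.

    Solves the normal equations (X^T X) b = X^T y with Gaussian elimination.
    Returns the coefficients as [b0, b1, ..., bk].
    """
    # Prepend a column of ones so the first coefficient is the intercept
    X = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(X[0])
    # Build the augmented matrix [X^T X | X^T y]
    A = [
        [sum(obs[i] * obs[j] for obs in X) for j in range(p)]
        + [sum(obs[i] * y for obs, y in zip(X, outcomes))]
        for i in range(p)
    ]
    # Forward elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda idx: abs(A[idx][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique solution")
        A[col], A[pivot] = A[pivot], A[col]
        for row in range(col + 1, p):
            factor = A[row][col] / A[col][col]
            for c in range(col, p + 1):
                A[row][c] -= factor * A[col][c]
    # Back substitution
    coefs = [0.0] * p
    for i in range(p - 1, -1, -1):
        known = sum(A[i][j] * coefs[j] for j in range(i + 1, p))
        coefs[i] = (A[i][p] - known) / A[i][i]
    return coefs

# Hypothetical data: x1 is continuous, x2 is dichotomous (0 = no, 1 = yes)
predictors = [(1.0, 0), (2.0, 1), (3.0, 0), (4.0, 1), (5.0, 1), (6.0, 0)]
# Outcomes built from y = 2 + 3*x1 - 1*x2 with no noise (illustrative only)
outcome = [5.0, 7.0, 11.0, 13.0, 16.0, 20.0]

coefs = fit_ols(predictors, outcome)
print([round(b, 6) for b in coefs])  # intercept, x1 coefficient, x2 coefficient
```

Each fitted coefficient estimates how much the outcome changes when that predictor increases by one unit while the other predictors are held constant.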
Researchers used multivariate analysis in a 2009 Journal of Pediatrics study to investigate whether negative life events, family environment, family violence, media violence and depression are predictors of youth aggression and bullying. Negative life events, family environment, family violence, media violence and depression were the independent predictor variables. Aggression and bullying were the dependent outcome variables. Over 600 subjects, with an average age of 12, were given questionnaires that determined the predictor variables for each child. A survey was also given that determined the outcome variables for each child. Multiple regression equations and structural equation modeling were used to study the data set. Negative life events and depression were found to be the strongest predictors of youth aggression.