Vdem codebook

3/31/2023

Packages you will need:

library(tidyverse)
library(democracyData) # Freedom House data
library(kableExtra) # make pretty tables prettier

This blog will look at the augment() function from the broom package.

After we run a linear model, the augment() function gives us more information about how accurately our model can predict the dependent variable. It also gives us lots of information about how each observation impacts the model. With the augment() function, we can easily find outlier observations and observations with high leverage on the model.

For our model, we are going to use the “women in business and law” index as the dependent variable.

According to the World Bank, this index measures how laws and regulations affect women’s economic opportunity.

Overall scores are calculated by taking the average score of each index (Mobility, Workplace, Pay, Marriage, Parenthood, Entrepreneurship, Assets and Pension), with 100 representing the highest possible score.

On the right-hand side of the model, our independent variables will be child mortality, military spending by the government as a percentage of GDP, and Freedom House (democracy) scores.

First we download the World Bank data and summarise the variables across the years.

Click here to read more about the WDI package and downloading variables from the World Bank website.

Download democracy data with democracyData package in R

fh %

summarise(mean_fh = mean(fh_total, na.rm = TRUE)) %>%
mutate(cown = countrycode::countrycode(fh_country, "country.name", "cown")) %>%

We join both datasets together with the inner_join() function:

fh_summary %>%

select(-c(cown, iso2c, fh_country)) -> wdi_fh

Before we model the data, we can look at the correlation matrix with the corrplot package:

wdi_fh %>%
select(`Females in business` = mean_fem_bus,

`Military spending GDP` = mean_mil_gdp) %>%

Next, we run a simple OLS linear regression. We don’t want the country variables, so we omit them from the list of independent variables.

We can look at some preliminary diagnostic plots.

Click here to read more about the easystats package. I found it a bit tricky to download the first time.

Check model assumptions with easystats package in R

performance::check_model(fem_bus_lm)

The line is not flat at the beginning, so that is not ideal. We will look more into this with the variables we create with augment() a bit further down this blog post.

None of our variables have a VIF score above 5, so that is always nice to see!

From the broom package, we can use the augment() function to create a whole heap of new columns about the observations in the model.

And we can graph them out:

fem_bus_pred %>%
mutate(across(where(is.numeric), ~round(., 2))) %>%

mutate(fh_category = cut(mean_fh, breaks = 5,

geom_point(aes(color = fh_category), size = 4, alpha = 0.6) +

labs(x = '', y = '', title = "Fitted values versus actual values")

In addition to the predicted values generated by the model, other new columns that the augment function adds include:

.hat = this is a measure of the leverage of each observation.

.sigma = this is the estimate of the residual standard deviation if that observation is dropped from the model.

A further column gives an estimate of the influence of a data point. It shows how much actual influence the observation had on the model.

Looking at the .hat observations, we can examine the amount of leverage that each country has on the model.
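To make the augment() columns a bit more concrete, here is a minimal sketch on R's built-in mtcars data. This is not the model from this post: the names mtcars_tbl, toy_lm and toy_aug are made up for the illustration, and the 2 * p / n leverage cut-off is a common rule of thumb rather than anything the post prescribes.

```r
library(broom)  # augment()
library(dplyr)  # mutate(), arrange(), select()

# Assumption: mtcars stands in for the post's country-level data.
# Move the row names into a regular column so each car stays identifiable.
mtcars_tbl <- tibble::rownames_to_column(mtcars, "car")

# A toy OLS model with two predictors
toy_lm <- lm(mpg ~ wt + hp, data = mtcars_tbl)

# augment() returns the data plus one row of diagnostics per observation:
# .fitted, .resid, .hat, .sigma, .cooksd and .std.resid
toy_aug <- augment(toy_lm, data = mtcars_tbl)

# Rule-of-thumb leverage threshold (an assumption, not from the post):
# flag observations whose .hat exceeds 2 * p / n,
# where p is the number of coefficients and n the number of rows
p <- length(coef(toy_lm))
n <- nrow(toy_aug)

top_influence <- toy_aug %>%
  mutate(high_leverage = .hat > 2 * p / n) %>%
  arrange(desc(.cooksd)) %>%  # most influential observations first
  select(car, mpg, .fitted, .resid, .hat, .sigma, .cooksd, high_leverage) %>%
  head(5)

print(top_influence)
```

Sorting by .cooksd and flagging on .hat keeps the two ideas apart: an observation can have high leverage without moving the fit much, and the table makes it easy to see where the two overlap.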
