VSI_Analysis/vsi.Rmd


---
output:
  pdf_document:
    includes:
      in_header: ./components/styles.tex
    number_sections: TRUE
    latex_engine: xelatex
    fig_caption: yes
    fig_width: 4
    fig_height: 3
    keep_tex: TRUE
header-includes:
  - \usepackage{titling}
  - \setlength{\droptitle}{5em}
papersize: a4paper
fontsize: 11pt
mainfont: Arial
geometry: margin = 3cm
subparagraph: TRUE
graphics: yes
csl: ./components/apa.csl
link-citations: yes
params:
  mainFile: ""
  intFile: ""
  version: "1.0 beta"
---
```{r setup, include = FALSE, error=TRUE}
library(knitr)
#library(kableExtra)
knitr::opts_chunk$set(echo = TRUE, results = "hide", message = TRUE, dev = "cairo_pdf", warning = TRUE)
knitr::opts_chunk$set(fig.pos = 'H')
options(knitr.table.format = "latex", knitr.kable.NA = "")
Sys.setlocale("LC_ALL","English")
```
```{r setup2, include = FALSE, error = TRUE, warning = TRUE, message = TRUE}
library(here)
library(foreign)
library(dplyr)
library(ggplot2)
library(lubridate)
library(ggthemes) # theme_tufte works
```
```{r theme, include=FALSE, error=TRUE}
ESSred <- rgb(.91, .20, .32)
ESSgreen <- rgb(.14, .62, .51)
ESSblue <- rgb(0, .25, .48)
# now some adjacent and square colors (colortools has been removed from CRAN)
ESS_colors_extra <- c(rgb(.44,.2,.91),rgb(.2,.91,.79),rgb(.68,.91,.2),rgb(.91,.2,.68),rgb(.91,.44,.2))
ESSColors <- c(ESSred, ESS_colors_extra, ESSgreen, ESSblue)
themeESS <- theme_tufte(base_size = 9, base_family = "Calibri") +
  theme(axis.title = element_text(size = 9, face = "plain"),
        axis.text = element_text(size = 9),
        axis.line.x = element_line(),
        plot.title = element_blank(),
        legend.title = element_blank(),
        legend.text = element_text(size = 9),
        strip.text = element_text(size = 9, face = "bold"),
        legend.position = "none",
        legend.direction = "horizontal",
        legend.box = "vertical",
        legend.spacing = unit(0, "line"),
        legend.key.size = unit(.75, "line"))
linebreak <- "\\hspace{\\textwidth}"
```
\newpage
\FloatBarrier
\pagenumbering{gobble}
```{r child = "components/Titlepage.Rmd", error=T, warning=T, message=T}
```
\pagenumbering{arabic}
\newpage
\setcounter{tocdepth}{2}
# Introduction {-}
This tool checks the Virtual Surrounding Impression (VSI) data and reports suspicious cases. In essence, it conducts two analyses:

- an intra-case check, which reports cases that were most likely not interviewed at a single location (missing intra-case VSI match);
- an across-dataset check, which reports cases that were most likely taken at the same location by the same interviewer using the same machine (VSI match between cases).

The results should always be combined with other methods of detecting undesired interviewer behaviour.
**Data check**
```{r datacheck, echo=FALSE, results='asis', error=TRUE}
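# read the main data file (semicolon-separated, "." as decimal mark)
dataset <- read.csv2("demo_data/main.csv", dec = ".", stringsAsFactors = FALSE)
# TODO: add interviewer file, too!
# include VSI functions
source("components/VSI.R")
# exit if data is not OK (anything other than TRUE stops the report)
if (!isTRUE(check_data(dataset))) {
  knitr::knit_exit()
}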
```
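The helper `check_data()` is defined in `components/VSI.R`, which is not shown here. As a minimal sketch of what such a validation could look like, assuming it only needs to confirm that the data set is non-empty, that the columns used later in this report (`idno`, `VSI1`, `VSI2`, `VSI3`) are present, and that case identifiers are unique, a hypothetical `check_data_sketch()` might be:

```r
# Hypothetical stand-in for check_data(); the real function lives in
# components/VSI.R and may perform different or additional checks.
check_data_sketch <- function(dataset,
                              required = c("idno", "VSI1", "VSI2", "VSI3")) {
  if (!is.data.frame(dataset) || nrow(dataset) == 0) {
    return(FALSE)
  }
  missing_cols <- setdiff(required, names(dataset))
  if (length(missing_cols) > 0) {
    # Report the problem in the knitted document (chunk uses results = 'asis')
    cat("**Missing columns:** ", paste(missing_cols, collapse = ", "), "\n")
    return(FALSE)
  }
  # Assumption: case identifiers should be non-missing and unique
  !anyNA(dataset$idno) && !anyDuplicated(dataset$idno)
}
```

Returning a single `TRUE` or `FALSE` keeps the calling chunk simple: it can stop knitting as soon as the check fails.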
\newpage
# Case level analysis
This section checks for individual cases that seem to have been taken at multiple locations (a change of location during a single interview).
Please read the results with a grain of salt, as the VSI may sometimes change because of a weak signal. Multiple locations per interview are also legitimate when the interview was conducted over multiple sittings.
When WiFi is turned off, the VSI cannot detect any intra-case change.
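The functions `extract_ap()` and `match_within()` come from `components/VSI.R` and are not reproduced in this report. The sketch below illustrates the idea under explicit assumptions: each VSI value is a string of access point identifiers separated by semicolons (the real format may differ), and two measurements count as the same location when they share at least one access point. Empty measurements, as produced when WiFi is off, are ignored and therefore never flag a change. The names `extract_ap_sketch()` and `match_within_sketch()` are hypothetical.

```r
# Hypothetical parser: split a VSI string into a set of access point IDs.
# Assumption: IDs are separated by ";"; the real VSI format may differ.
extract_ap_sketch <- function(vsi) {
  if (is.null(vsi) || is.na(vsi) || !nzchar(trimws(vsi))) {
    return(character(0))
  }
  ids <- trimws(strsplit(as.character(vsi), ";", fixed = TRUE)[[1]])
  unique(ids[nzchar(ids)])
}

# Hypothetical intra-case check: TRUE when every pair of non-empty
# measurements shares at least one access point.
match_within_sketch <- function(t1, t2, t3) {
  measured <- Filter(length, list(t1, t2, t3))  # drop empty (WiFi off) sets
  if (length(measured) < 2) {
    return(TRUE)  # nothing to compare, so no change can be detected
  }
  pairs <- combn(length(measured), 2)
  all(apply(pairs, 2, function(p) {
    length(intersect(measured[[p[1]]], measured[[p[2]]])) > 0
  }))
}
```

Requiring only one shared access point is a deliberately lenient rule: it tolerates access points that drop in and out because of a weak signal, at the cost of missing short moves within range of the same network.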
```{r intra-case, echo=FALSE, results='asis', error=TRUE}
intra_case <- c()
extra_full_list <- list()
n_cases <- nrow(dataset)
for (current_row in seq_len(n_cases)) {
  # access points seen at the three measurement points of this case
  vsi_t1_exploded <- extract_ap(dataset[current_row, ]$VSI1)
  vsi_t2_exploded <- extract_ap(dataset[current_row, ]$VSI2)
  vsi_t3_exploded <- extract_ap(dataset[current_row, ]$VSI3)
  # checking within
  if (!match_within(vsi_t1_exploded, vsi_t2_exploded, vsi_t3_exploded)) {
    intra_case <- c(intra_case, dataset[current_row, ]$idno)
  }
  # compare against all later cases; the last row has none, and
  # (current_row + 1):n_cases would count backwards there (n + 1, n)
  if (current_row < n_cases) {
    for (compare_against in (current_row + 1):n_cases) {
      t1_oth <- extract_ap(dataset[compare_against, ]$VSI1)
      t2_oth <- extract_ap(dataset[compare_against, ]$VSI2)
      t3_oth <- extract_ap(dataset[compare_against, ]$VSI3)
      if (match_outside(vsi_t1_exploded, vsi_t2_exploded, vsi_t3_exploded, t1_oth, t2_oth, t3_oth)) {
        extra_full_list <- group_pairs(extra_full_list, c(dataset[compare_against, ]$idno, dataset[current_row, ]$idno))
      }
    }
  }
}
if (length(intra_case) == 0) {
  cat("**No cases were detected where the location changed during the interview.**")
} else {
  cat("**There are ", length(intra_case), " cases in which the location seems to have changed during the interview** \n \n \n")
  cat("**IDNOs:** ", paste(as.character(intra_case), collapse = ", "))
}
```
\newpage
# Across-dataset analysis
This section checks for cases that seem to share the same location.
Matches can also occur when WiFi is turned off, in which case all cases from a given interviewer appear to have the same location.
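The helper `group_pairs()` is defined in `components/VSI.R` and is not shown here. One plausible way to turn matching pairs of cases into location groups is to merge every new pair with any existing group that already contains one of its members. The sketch below does that; the name `group_pairs_sketch()` and the merging rule are assumptions, not necessarily what the original helper does.

```r
# Hypothetical stand-in for group_pairs(): add a matching pair of IDNOs to a
# list of groups, merging all groups that share a member with the pair.
group_pairs_sketch <- function(groups, pair) {
  # which existing groups contain at least one member of the pair?
  touches <- vapply(groups, function(g) any(pair %in% g), logical(1))
  # fuse those groups with the pair into one location group
  merged <- unique(c(unlist(groups[touches]), pair))
  # keep untouched groups as they are and append the fused group
  c(groups[!touches], list(merged))
}
```

Under this rule, grouping is transitive: if case A matches B and B matches C, all three end up in one location even when A and C do not match directly.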
```{r extra-case, echo=FALSE, results='asis', error=TRUE}
if (length(extra_full_list) == 0) {
  cat("**No groups of cases seem to have been taken at the same location.**")
} else {
  cat("**There are ", length(extra_full_list), " locations where multiple interviews seem to have been conducted** \n \n \nPlease see them grouped by location below: \n \n \n")
  for (location in seq_along(extra_full_list)) {
    cat(paste0("**Location ", location, ":** \n"))
    cat("\tIDNOs ", paste(as.character(extra_full_list[[location]]), collapse = ", "), " \n \n \n")
  }
}
```