VSI_Analysis/vsi.Rmd
MAY fc628cdf28 First release ready version.
Cleaned up the code, provided additional data checks and instructions on
how to fix errors.

Refactored the code so it now uses "common.R" and "VSI.R". By doing
this, all other "tools" by the WP8 team can be streamlined and made more
user friendly.
2023-01-17 10:13:25 +01:00


---
output:
  pdf_document:
    includes:
      in_header: ./components/styles.tex
    number_sections: TRUE
    latex_engine: xelatex
    fig_caption: yes
    fig_width: 4
    fig_height: 3
    keep_tex: TRUE
header-includes:
- \usepackage{titling}
- \setlength{\droptitle}{5em}
papersize: a4paper
fontsize: 11pt
mainfont: Arial
geometry: margin = 3cm
subparagraph: TRUE
graphics: yes
csl: ./components/apa.csl
link-citations: yes
params:
  mainFile: "demo_data/main.csv"
  intFile: "demo_data/inwer.csv"
  version: "1.0.230117"
---
```{r setup, include = FALSE, error=TRUE}
library(knitr)
library(here)
library(foreign)
library(dplyr)
library(ggplot2)
library(lubridate)
library(ggthemes) # theme_tufte works
source('components/VSI.R')
source('components/common.R')
common_defs()
source('components/template.R')
```
\newpage
\FloatBarrier
\pagenumbering{gobble}
```{r child = "components/Titlepage.Rmd", error=T, warning=T, message=T}
```
\pagenumbering{arabic}
\newpage
\setcounter{tocdepth}{2}
# Introduction {-}
This tool checks the Virtual Surrounding Impression (VSI) data and reports suspicious cases. In essence, it conducts two analyses:

- an intra-case check, which reports cases that most likely were not interviewed at a single location (missing intra-case VSI match);
- an across-dataset check, which reports cases that were most likely recorded at the same location by the same interviewer using the same machine (VSI match between cases).

The results should always be combined with other methods of detecting undesired interviewer behaviour.

## Support {-}

Support is available on the MyESS forums or via e-mail at may.dousak@fdv.uni-lj.si.

**Data check**
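
The data check reads the two files named under `params` in the YAML header. Below is a minimal sketch of re-running the report on other files. It assumes the document is saved as `vsi.Rmd` in the current working directory and that the `rmarkdown` package is installed; `render_vsi_report()` is a hypothetical wrapper and not part of the tool.

```r
# Hypothetical wrapper (not part of the tool): render the report with
# custom input files instead of the demo data named in the YAML header.
render_vsi_report <- function(main_file, int_file, input = "vsi.Rmd") {
  paths <- c(main_file, int_file)
  # Fail early with a readable message when an input file is missing
  missing_files <- paths[!file.exists(paths)]
  if (length(missing_files) > 0) {
    stop("File(s) not found: ", paste(missing_files, collapse = ", "))
  }
  # rmarkdown::render() overrides the YAML defaults through `params`
  rmarkdown::render(
    input,
    params = list(mainFile = main_file, intFile = int_file)
  )
}

# Example call (paths are placeholders):
# render_vsi_report("my_data/main.csv", "my_data/interviewer.csv")
```

The early `file.exists()` test only mirrors the spirit of the data check; the checks performed by `check_files()` itself may be stricter.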
```{r datacheck, echo=FALSE, results='asis', error=TRUE}
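# Check that both input files pass the basic checks
if (isFALSE(check_files(params$mainFile, params$intFile))) {
  cat(" \n**Please fix the above errors and re-run the script.**")
  knitr::knit_exit()
}

# Load the data; stop if loading failed (FALSE) or returned nothing.
# identical() avoids comparing a whole data frame against FALSE,
# which yields more than one value inside if().
dataset <- load_files(params$mainFile, params$intFile)
if (identical(dataset, FALSE) || length(dataset) == 0) {
  cat(" \n**Please fix the above errors and re-run the script.**")
  knitr::knit_exit()
}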
```
\pagenumbering{arabic}
\newpage
# Case level analysis
This section checks for individual cases that appear to have been recorded at multiple locations, i.e. the location changed during a single interview.

Please read the results with a grain of salt: the VSI can change because of a weak signal, and multiple locations per interview are legitimate when the interview was conducted over several sittings. When WiFi is turned off, the VSI cannot detect any intra-case change.
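
To illustrate the idea of an intra-case match, here is a minimal sketch. It is not the code in `components/VSI.R`: the `sketch_*` functions are hypothetical, the VSI value is assumed to be a comma-separated string of access-point identifiers, and two readings are assumed to match when they share at least one access point. The real format and matching rule may differ.

```r
# Hypothetical helpers for illustration only; the tool's own extract_ap()
# and match_within() may work differently.

# Assumption: a VSI reading is a comma-separated string of access points.
sketch_extract_ap <- function(vsi) {
  if (length(vsi) == 0 || is.na(vsi) || !nzchar(vsi)) {
    return(character(0))
  }
  parts <- trimws(strsplit(vsi, ",", fixed = TRUE)[[1]])
  parts[nzchar(parts)]
}

# Assumption: two readings match when they share at least one access point.
# Two empty readings (e.g. WiFi off) carry no evidence of a change.
sketch_overlap <- function(a, b) {
  if (length(a) == 0 && length(b) == 0) {
    return(TRUE)
  }
  length(intersect(a, b)) > 0
}

# A case is treated as consistent when consecutive readings overlap.
sketch_match_within <- function(t1, t2, t3) {
  sketch_overlap(t1, t2) && sketch_overlap(t2, t3)
}

same_place <- sketch_match_within(
  sketch_extract_ap("ap1,ap2"),
  sketch_extract_ap("ap2,ap3"),
  sketch_extract_ap("ap3,ap4")
)
moved <- sketch_match_within(
  sketch_extract_ap("ap1,ap2"),
  sketch_extract_ap("ap2,ap3"),
  sketch_extract_ap("ap9")
)
```

Under these assumptions `same_place` is `TRUE` and `moved` is `FALSE`; a case with three empty readings also counts as consistent, which is why WiFi being off hides any change.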
```{r intra-case, echo=FALSE, results='asis', error=TRUE}
intra_case <- c()
extra_full_list <- list()
n_cases <- nrow(dataset)

for (current_row in seq_len(n_cases)) {
  # Split the three VSI readings of the current case
  vsi_t1_exploded <- extract_ap(dataset[current_row, ]$VSI1)
  vsi_t2_exploded <- extract_ap(dataset[current_row, ]$VSI2)
  vsi_t3_exploded <- extract_ap(dataset[current_row, ]$VSI3)

  # Flag the case when its own readings do not match each other
  if (!match_within(vsi_t1_exploded, vsi_t2_exploded, vsi_t3_exploded)) {
    intra_case <- c(intra_case, dataset[current_row, ]$idno)
  }

  # Compare against every later case. Skip the last row, where
  # (current_row + 1):n_cases would count backwards and index past the data.
  if (current_row < n_cases) {
    for (compare_against in (current_row + 1):n_cases) {
      t1_oth <- extract_ap(dataset[compare_against, ]$VSI1)
      t2_oth <- extract_ap(dataset[compare_against, ]$VSI2)
      t3_oth <- extract_ap(dataset[compare_against, ]$VSI3)
      if (match_outside(vsi_t1_exploded, vsi_t2_exploded, vsi_t3_exploded, t1_oth, t2_oth, t3_oth)) {
        extra_full_list <- group_pairs(extra_full_list, c(dataset[compare_against, ]$idno, dataset[current_row, ]$idno))
      }
    }
  }
}

if (length(intra_case) == 0) {
  cat("**No cases where the location changed during the interview were detected.**")
} else {
  cat("**There are", length(intra_case), "cases in which the location seems to have changed during the interview** \n \n \n")
  cat("**IDNOs:**", paste(as.character(intra_case), collapse = ", "))
}
```
\pagenumbering{arabic}
\newpage
# Across-dataset analysis

This section checks for cases that appear to share the same location. This can also happen when WiFi is off, in which case all cases from a given interviewer appear to have the same location.
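
Matching pairs of cases are collected into groups, one group per presumed location. The following is a minimal sketch of such grouping, under the assumption that a pair joins (and merges) every existing group that already contains one of its IDs. `sketch_group_pairs()` is a hypothetical stand-in, and the tool's own `group_pairs()` may behave differently.

```r
# Hypothetical illustration of pair grouping; not the tool's group_pairs().
sketch_group_pairs <- function(groups, pair) {
  # Indices of existing groups that already contain either ID of the pair
  hits <- which(vapply(groups, function(g) any(pair %in% g), logical(1)))
  if (length(hits) == 0) {
    # Neither ID seen before: start a new group
    return(c(groups, list(unique(pair))))
  }
  # Merge the pair with every group it touches into a single group
  merged <- unique(c(unlist(groups[hits]), pair))
  c(groups[-hits], list(merged))
}

groups <- list()
groups <- sketch_group_pairs(groups, c(101, 102))
groups <- sketch_group_pairs(groups, c(201, 202))
groups <- sketch_group_pairs(groups, c(102, 103))
```

In this example the third pair shares ID 102 with the first group, so the result holds two groups: one with 201 and 202, and one with 101, 102 and 103.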
```{r extra-case, echo=FALSE, results='asis', error=TRUE}
if (length(extra_full_list) == 0) {
  cat("**No groups of cases appear to have been taken at the same location.**")
} else {
  cat("**There are", length(extra_full_list), "locations where multiple interviews seem to have been conducted** \n \n \nPlease see them grouped by location below: \n \n \n")
  # One entry per presumed location, listing the IDNOs grouped there
  for (location in seq_along(extra_full_list)) {
    cat(paste0("**Location ", location, ":** \n"))
    cat("\tIDNOs ", paste(as.character(extra_full_list[[location]]), collapse = ", "), " \n \n \n")
  }
}
```