Cleaned up the code, provided additional data checks and instructions on how to fix errors. Refactored the code so it now uses "common.R" and "VSI.R". By doing this, all other "tools" by the WP8 team can be streamlined and made more user friendly.
167 lines
4.7 KiB
Plaintext
---
output:
  pdf_document:
    includes:
      in_header: ./components/styles.tex
    number_sections: TRUE
    latex_engine: xelatex
    fig_caption: yes
    fig_width: 4
    fig_height: 3
    keep_tex: TRUE
header-includes:
  - \usepackage{titling}
  - \setlength{\droptitle}{5em}
papersize: a4paper
fontsize: 11pt
mainfont: Arial
geometry: margin = 3cm
subparagraph: TRUE
graphics: yes
csl: ./components/apa.csl
link-citations: yes
params:
  mainFile: "demo_data/main.csv"
  intFile: "demo_data/inwer.csv"
  version: "1.0.230117"

---


```{r setup, include = FALSE, error=TRUE}
library(knitr)
library(here)
library(foreign)
library(dplyr)
library(ggplot2)
library(lubridate)
library(ggthemes) # theme_tufte works

source('components/VSI.R')
source('components/common.R')

common_defs();

source('components/template.R')

```

\newpage
\FloatBarrier
\pagenumbering{gobble}


```{r child = "components/Titlepage.Rmd", error=TRUE, warning=TRUE, message=TRUE}
```

\pagenumbering{arabic}
\newpage
\setcounter{tocdepth}{2}

# Introduction {-}

This tool checks the Virtual Surrounding Impression (VSI) data and reports on suspicious cases. In essence, it conducts two analyses:

- an intra-case VSI match check, which reports cases that most likely were not interviewed at a single location (missing intra-case VSI match)
- an across-dataset VSI match check, which reports cases that were most likely taken at the same location by the same interviewer using the same machine

The results should always be combined with other methods of detecting undesired interviewer behaviour.

## Support {-}

Support is available on the MyESS forums or via e-mail at may.dousak@fdv.uni-lj.si.


**Data check**

```{r datacheck, echo=FALSE, results='asis', error=TRUE}

# check that both input files exist and are usable
if (!check_files(params$mainFile, params$intFile)) {
  cat(" \n**Please fix the above errors and re-run the script.**")
  knitr::knit_exit()
}


# load the data; stop when loading failed (FALSE) or returned nothing
dataset <- load_files(params$mainFile, params$intFile)
if (isFALSE(dataset) || length(dataset) == 0) {
  cat(" \n**Please fix the above errors and re-run the script.**")
  knitr::knit_exit()
}


```

\pagenumbering{arabic}
\newpage

# Case level analysis

This section checks for individual cases that seem to have been taken at multiple locations (a change of location during a single interview).
Please read the results with a grain of salt, as the VSI can sometimes change because of a weak signal. Multiple locations per interview are also possible when the interview was conducted over multiple sittings.
When WiFi is turned off, the VSI cannot detect any intra-case change.
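
The intra-case check can be pictured with a minimal sketch. It assumes that each VSI snapshot is a string of access point identifiers separated by `;`, which is an assumption and not the documented VSI format. `split_aps` and `snapshots_overlap` are hypothetical helpers, not the `extract_ap` and `match_within` functions used by this tool, whose actual rules are not shown here.

```r
# Hypothetical helper: split a raw VSI string into a set of access point IDs.
# The ";" separator is an assumption, not the documented VSI format.
split_aps <- function(vsi) {
  if (is.na(vsi) || !nzchar(vsi)) return(character(0))
  unique(trimws(strsplit(vsi, ";", fixed = TRUE)[[1]]))
}

# Hypothetical check: TRUE when every pair of consecutive non-empty
# snapshots shares at least one access point.
snapshots_overlap <- function(t1, t2, t3) {
  seen <- Filter(length, list(t1, t2, t3))  # drop empty snapshots (e.g. WiFi off)
  if (length(seen) < 2) return(TRUE)        # nothing to compare, so no change detected
  all(vapply(
    seq_len(length(seen) - 1),
    function(i) length(intersect(seen[[i]], seen[[i + 1]])) > 0,
    logical(1)
  ))
}

# Made-up example snapshots
same_place <- snapshots_overlap(split_aps("ap1;ap2"), split_aps("ap2;ap3"), split_aps("ap3"))
moved      <- snapshots_overlap(split_aps("ap1;ap2"), split_aps("ap9"), split_aps("ap9"))
wifi_off   <- snapshots_overlap(split_aps(""), split_aps(NA), split_aps(""))
```

In this sketch a case would be flagged only when two consecutive non-empty snapshots share no access point. Empty snapshots, as with WiFi turned off, are skipped, so no change is reported for them.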

```{r intra-case, echo=FALSE, results='asis', error=TRUE}

intra_case <- c()
extra_full_list <- list()
n_cases <- nrow(dataset)

for (current_row in seq_len(n_cases)) {
  vsi_t1_exploded <- extract_ap(dataset[current_row,]$VSI1)
  vsi_t2_exploded <- extract_ap(dataset[current_row,]$VSI2)
  vsi_t3_exploded <- extract_ap(dataset[current_row,]$VSI3)

  # checking within: flag the case when its own VSI snapshots do not match
  if (!match_within(vsi_t1_exploded, vsi_t2_exploded, vsi_t3_exploded)) {
    intra_case <- c(intra_case, dataset[current_row,]$idno)
  }

  # compare against all later cases; the sequence is empty for the last row,
  # whereas (current_row+1):nrow(dataset) would count down past the end of the data
  for (compare_against in seq_len(n_cases - current_row) + current_row) {

    t1_oth <- extract_ap(dataset[compare_against,]$VSI1)
    t2_oth <- extract_ap(dataset[compare_against,]$VSI2)
    t3_oth <- extract_ap(dataset[compare_against,]$VSI3)

    if (match_outside(vsi_t1_exploded, vsi_t2_exploded, vsi_t3_exploded, t1_oth, t2_oth, t3_oth)) {
      extra_full_list <- group_pairs(extra_full_list, c(dataset[compare_against,]$idno, dataset[current_row,]$idno))
    }
  }
}

if (length(intra_case) == 0) {
  cat("**No cases were detected where the location changed during the interview.**")
} else {
  cat("**There are ", length(intra_case), " cases in which the location seems to have changed during the interview** \n \n \n")
  cat("**IDNOs:**", paste(as.character(intra_case), collapse=", "))
}

```

\pagenumbering{arabic}
\newpage

# Across-dataset analysis

This section checks for cases that seem to share the same location.
This can also happen when WiFi is turned off, in which case all cases from a given interviewer appear to have the same location.
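
Matching pairs of cases can be folded into location groups along the following lines. This is a minimal sketch: `merge_pair` is a hypothetical stand-in for the tool's `group_pairs` helper, whose implementation is not shown here and may differ, and the IDs are made up.

```r
# Hypothetical helper: add a matched pair of IDs to a list of location groups.
merge_pair <- function(groups, pair) {
  # indices of existing groups that already contain one of the two IDs
  hits <- which(vapply(groups, function(g) any(pair %in% g), logical(1)))
  if (length(hits) == 0) {
    return(c(groups, list(pair)))                  # no overlap: start a new group
  }
  merged <- unique(c(unlist(groups[hits]), pair))  # fold all touched groups together
  c(groups[-hits], list(merged))
}

# Made-up matched pairs of case IDs
groups <- list()
for (pair in list(c(101, 102), c(201, 202), c(102, 103))) {
  groups <- merge_pair(groups, pair)
}
```

With these three example pairs, cases 101, 102 and 103 end up in one group and cases 201 and 202 in another, because the third pair shares case 102 with the first.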

```{r extra-case, echo=FALSE, results='asis', error=TRUE}

if (length(extra_full_list) == 0) {
  cat("**No groups of cases seem to have been taken at the same location.**")
} else {
  cat("**There are ", length(extra_full_list), " locations where multiple interviews seem to have been conducted** \n \n \nPlease see them grouped by location below: \n \n \n")

  # one entry per location group, listing the IDNOs that matched each other
  for (location in seq_along(extra_full_list)) {
    cat(paste0("**Location ", location, ":** \n"))
    cat("\tIDNOs ", paste(as.character(extra_full_list[[location]]), collapse=", "), " \n \n \n")
  }

}

```