# Virtual surrounding impression tool

This tool generates VSI hashes on the client computers.

*** Please read before using! ***
*** Developers can skip chapter 5 and check chapter 6! ***

This folder contains sample code for a "virtual surrounding impression" (VSI)
generator as a Python script (VSI.py).

# CONTENTS

1. License
2. Introduction
3. Principles of operation
4. Security, privacy concerns
5. How to use the provided code
   5.1 Supported environments and requirements
   5.2 Use on Linux clients
   5.3 Use on Mac clients
   5.4 Use on Windows clients
   5.5 Troubleshooting
6. How to implement your own solution (<-- programmers read this)
7. Authors

# 1. LICENSE

While this is clearly only a rough proof-of-concept demo code, you can use
it freely under the GNU GPL v3 license.

This program is free software: you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the Free
Software Foundation, either version 3 of the License, or (at your option)
any later version.

This program is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with
this program. If not, see <https://www.gnu.org/licenses/>.

# 2. INTRODUCTION

This is a sample app that generates a virtual surrounding impression and works
on all three major computer operating systems (Windows, Linux and macOS).

It doesn't require any parameters and returns clean text output that
can be piped directly into other programs.

If you have limited developer resources, you can call this script from
within your survey software and save the returned text as a string variable.

If you have programming resources, please read chapter 6. It's not much work
to implement it in your own solution.

If you choose to use this script, you can use PyInstaller to compile it into
an executable program that doesn't need Python installed on the clients.

Please read more about the actual principles of operation in the included
"How_VSI_works.pdf" PDF file.

****************************************************************************

# 3. PRINCIPLES OF OPERATION

For more details, refer to the included "How_VSI_works.pdf".

In essence, the script:

a) scans the surrounding WiFi APs and retrieves their BSSIDs and signal
   strengths
b) sorts them in descending order of signal strength
c) takes the BSSIDs and signal strengths of the first five APs, e.g.:
   12:34:56:78:90:ab 80%
   23:34:45:56:56:67 79%

d) salts each BSSID and each signal strength separately with the machine UUID
   and username. When fewer than five APs are visible, DO NOT SALT EMPTY
   BSSIDs.

e) makes an 8-character hash of each salted BSSID and signal strength
   SEPARATELY.

5 APs with two data points each (BSSID, power) generate 5*2 = 10 hashes of
8 characters each. 10*8 = 80, hence the 80-character VSI code.
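
Steps b) to e) above can be sketched in a few lines of Python. This is only a minimal illustration, not the code from the provided script: the function names, the sample salt values and the choice of CRC32 (via the standard `zlib` module) as the 8-character hash are assumptions made for this example. The salt is appended to each data point, as in the shell one-liners in chapter 5.

```python
import zlib


def hash8(text: str) -> str:
    """Return an 8-character hex digest (CRC32 is an assumption of this sketch)."""
    return format(zlib.crc32(text.encode("utf-8")), "08x")


def vsi_code(access_points, machine_uuid, username, limit=5):
    """Build a VSI code from (bssid, signal_percent) pairs."""
    salt = machine_uuid + username
    # b) strongest first, c) keep at most `limit` access points
    strongest = sorted(access_points, key=lambda ap: ap[1], reverse=True)[:limit]
    parts = []
    for bssid, signal in strongest:
        # d) + e) salt and hash the BSSID and the strength separately
        parts.append(hash8(bssid + salt))
        parts.append(hash8(str(signal) + salt))
    # Missing access points contribute nothing, so the code gets shorter
    return "".join(parts)


aps = [("23:34:45:56:56:67", 79), ("12:34:56:78:90:ab", 80)]
code = vsi_code(aps, "hypothetical-machine-uuid", "alice")
print(len(code))  # 2 APs * 2 data points * 8 characters = 32
```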

****************************************************************************

# 4. SECURITY, PRIVACY

The salt differs for each computer and each user (128-bit machine UUID plus
username), so the hashes should be safe from reverse lookups using rainbow
tables.

This script uses xxhash and CRC32, but you can use anything else, e.g.
xxhsum -H0, fletcher-32, adler32, or compute a SHA-1 and keep 8 characters of it.
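
As an illustration of the last alternative, a truncated SHA-1 could be produced with Python's standard `hashlib` module. The function name and the sample salt are hypothetical, and keeping the first 8 hex characters (rather than any other 8) is an assumption of this sketch.

```python
import hashlib


def sha1_hash8(text: str) -> str:
    """Return the first 8 hex characters of the SHA-1 digest of `text`."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()[:8]


# Salted BSSID in, 8-character hash out
print(sha1_hash8("12:34:56:78:90:ab" + "hypothetical-salt"))
```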

In practice, nobody is going to alter WiFi names to fake a different location,
as it is much easier to simply turn the WiFi off.

Collisions are possible, but not critical for the given usage. Please refer
to the included "How_VSI_works.pdf" for more; you can also read a word or two
about collisions here:
https://preshing.com/20110504/hash-collision-probabilities/

****************************************************************************

# 5. HOW TO USE THE PROVIDED CODE

Your survey software should call this procedure three times during a survey;
we suggest implementing it as a hidden string variable at fixed locations
in the survey (e.g. after the first block, the middle block and the last block).

We suggest doing it in a parallel thread, as scanning networks can take a
couple of seconds. If that is not possible, we suggest scheduling this
script to run every 10 minutes and storing the results in a temporary text
file, ideally in volatile memory (RAM), so the contents are removed upon
reboot or shutdown.

Then, simply read the text file contents from the survey.
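
If the survey software can run Python code, the parallel-thread option mentioned above might look roughly like the following sketch. `scan_vsi` is a hypothetical placeholder that only simulates the scan; in a real setup it would run the script or an equivalent scan and return its text output.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def scan_vsi() -> str:
    """Hypothetical stand-in for the real scan, which can take a few seconds."""
    time.sleep(0.1)   # simulates the WiFi scan delay
    return "0" * 80   # dummy 80-character VSI code


executor = ThreadPoolExecutor(max_workers=1)
pending = executor.submit(scan_vsi)  # start scanning; the survey keeps running

# ... the respondent keeps answering questions here ...

vsi = pending.result(timeout=30)     # collect the code at the fixed location
executor.shutdown()
print(len(vsi))
```

The result is only requested at the point where the hidden variable is filled, so the respondent does not have to wait for the scan itself.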

Command to schedule on Linux or Mac:
`python3 SampleLocator.py > /tmp/VSI.txt`

Command to schedule on Windows:
`python3 SampleLocator.py > %tmp%\VSI.txt`

** NOTE: when using temporary files, do NOT store hashes permanently in
nonvolatile memory. Do not store more than a single (the last) hash.
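
A minimal Python sketch of the "single, last hash" rule: every write overwrites the same file, so an older code never survives. The helper names are hypothetical, and the system temporary directory used here is an assumption; it is not guaranteed to be RAM-backed on every platform.

```python
import os
import tempfile

# Assumption: the system temp directory; it may or may not live in RAM.
VSI_FILE = os.path.join(tempfile.gettempdir(), "VSI.txt")


def store_latest(code: str, path: str = VSI_FILE) -> None:
    """Overwrite the file so that only the most recent code is ever kept."""
    with open(path, "w", encoding="ascii") as fh:  # mode "w" truncates first
        fh.write(code + "\n")


def read_latest(path: str = VSI_FILE) -> str:
    """Return the cached code, or an empty string if no scan has run yet."""
    try:
        with open(path, encoding="ascii") as fh:
            return fh.read().strip()
    except FileNotFoundError:
        return ""


store_latest("a" * 80)
store_latest("b" * 80)            # replaces the previous code
print(read_latest() == "b" * 80)  # True
```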

Reading the text into a survey (if the software supports system commands as
a variable input):

Linux, Mac:
`cat /tmp/VSI.txt`

Windows:
`type %tmp%\VSI.txt`

----------------------------------------------------------------------------

## 5.1 SUPPORTED ENVIRONMENTS AND REQUIREMENTS

This script works on:

- Windows (7+)
- Linux (nmcli)
- MacOS, OS X (2010+)

Required software:

If you don't compile it into a binary for the target platform (by using
PyInstaller), you must install Python 3 to interpret the script.

----------------------------------------------------------------------------

## 5.2 USE ON LINUX CLIENTS

1. Make sure you have Python 3 installed
2. Run the script

Hint: you can produce the same result without Python and without this script
by simply running this one-liner (install xxhash to make such hashes in the CLI):

`salt=$(cat /etc/machine-id)$(whoami) && for a in $(nmcli -f BSSID,SIGNAL device wifi list --rescan yes | awk -v s=$salt '{print $1 s "\n" $2 s}'); do xxhsum -H0 <(echo $a); done | cut -d ' ' -f 1 | tail -n +2 | head -n 10 | xargs echo | sed 's/ //g'`

----------------------------------------------------------------------------

## 5.3 USE ON MAC CLIENTS

1. Install Python 3 from the App Store
2. Allow the user to run airport -s (use sudo)
3. Run the script

Hint: you can produce the same result without Python and without this script
by simply running this one-liner (install xxhash to make hashes in the CLI):

`salt=$(ioreg -d2 -c IOPlatformExpertDevice | awk -F\" '/IOPlatformUUID/{print $(NF-1)}')$(whoami) && for a in $(sudo /System/Library/PrivateFrameworks/Apple80211.framework/Versions/A/Resources/airport -s | perl -nle 'm/(?<=\s)[0-9a-f]{2}(:[0-9a-f]{2}){5}\s+-?[[:digit:]]{2}/ and print "$&"' | sed '1!G;h;$!d' | awk -v s=$salt '{print $1 s "\n" $2 s}'); do xxhsum -H0 <(echo $a); done | cut -d ' ' -f 1 | tail -n +2 | head -n 10 | xargs echo | sed 's/ //g'`

----------------------------------------------------------------------------

## 5.4 USE ON WINDOWS CLIENTS

Install Python 3 from the MS Store (search for python, select version 3.x).

Windows cannot trigger a WiFi rescan from the command line without admin
privileges (it requires turning WiFi off and on), which makes the list of
networks unreliable.

Additionally, cmd doesn't have a sudo :-)

To resolve this, a precompiled "wifi.exe" from
https://github.com/changyuheng/winwifi is included in the "Windows tools"
subdirectory.

If it doesn't work for some reason, get it via pipx:

open the command line (WIN+R, type "cmd.exe") and execute a-c:
a) `python -m pip install --user pipx`
b) `python -m pipx install winwifi`
c) `python -m pipx ensurepath`

Then you're ready to go. Instead of ensuring the path (c), you can just copy
"wifi.exe" to the directory of this script.

----------------------------------------------------------------------------

## 5.5 TROUBLESHOOTING

PERMISSION DENIED on Linux or Mac

Allow the user to run "airport" (Mac) or "nmcli" (Linux) via sudo.


XXHSUM COMMAND NOT FOUND ERROR on Linux or Mac

Install xxhash or use a different algorithm (e.g. crc32).


WINDOWS: the strings are always short, although quite a few APs are
definitely visible?!

Have you installed winwifi?

****************************************************************************

# 6. HOW TO IMPLEMENT YOUR OWN SOLUTION

We suggest using existing command line tools to save time (e.g. nmcli,
airport or netsh). Please note that netsh *DOES NOT* rescan the network and
often displays just the single network currently in use.

Nevertheless, here's the procedure that works on all platforms:


a) obtain a unique machine ID and the username or user ID
You'll use this as a salt.

b) scan WiFi and obtain BSSID + POWER

Make sure you have the privileges to do so

c) take (up to) the 5 strongest access points and their corresponding power levels

d) salt + hash each data point (each BSSID and strength) separately
Do NOT salt+hash empty (nonexistent) networks

e) combine all (up to) 10 hashes
Use empty strings for nonexistent APs
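
Steps a) and b) could be sketched for Linux as follows. This is a hedged illustration, not the provided script: the function names are hypothetical, the salt sources mirror the shell one-liner in 5.2 (`/etc/machine-id` plus the username), and the parser assumes nmcli's terse mode (`-t`), in which the colons inside a BSSID are escaped with a backslash. Only the parser is exercised at the bottom, using a made-up sample string.

```python
import getpass
import subprocess


def linux_salt() -> str:
    """Step a): machine id plus username (Linux only, as in the 5.2 one-liner)."""
    with open("/etc/machine-id", encoding="ascii") as fh:
        return fh.read().strip() + getpass.getuser()


def parse_nmcli_terse(output: str):
    """Turn terse nmcli lines into (bssid, signal) pairs."""
    pairs = []
    for line in output.splitlines():
        if not line.strip():
            continue
        # The last unescaped colon separates the BSSID from the signal
        raw_bssid, _, signal = line.rpartition(":")
        pairs.append((raw_bssid.replace("\\:", ":"), int(signal)))
    return pairs


def scan_linux():
    """Step b): ask nmcli for a fresh scan in terse, script-friendly mode."""
    result = subprocess.run(
        ["nmcli", "-t", "-f", "BSSID,SIGNAL",
         "device", "wifi", "list", "--rescan", "yes"],
        capture_output=True, text=True, check=True,
    )
    return parse_nmcli_terse(result.stdout)


# Made-up sample of terse nmcli output, used instead of a live scan
sample = "12\\:34\\:56\\:78\\:90\\:AB:80\n23\\:34\\:45\\:56\\:56\\:67:79\n"
print(parse_nmcli_terse(sample))
```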

Additional hint:

Don't wait for the scanning to finish within the survey; read ch. 5 for
details on how to work around this (e.g. with crontab).

# 7. AUTHORS

Developed as a part of work package 8 of the European Social Survey 2021-23
Work Programme.

Members:

- May Doušak (UL)
- Joost Kappelhof (SCP)
- Roberto Briceno-Rosas (GESIS)

Programming, technical contact:

- May Doušak may.dousak@fdv.uni-lj.si