⛅ invi.sible.link

Project mandate as agreed with OTF
In April 2016 I submitted a funding proposal to the Information Controls Fellowship Program. Over the summer we iterated on several rounds of feedback, shaping the one-year project. Below you'll find the project framing, scope and roadmap.

Objectives

Repressive governments have abused “third party trackers” to exploit users or even third parties. For example, malware by FinFisher/HackingTeam was injected by exploiting unencrypted HTTP connections, and the so-called “Chinese Great Cannon” turned users, including international ones, who accessed a popular Chinese website into bots taking part in a DDoS attack. Considering the recent trend of exploiting unencrypted HTTP connections or the insecurity of some third party services included in ordinary web browsing, we need a new way to raise awareness. Additionally, we don’t yet know whether third party trackers deploy more invasive tracking technology against a subset of users. This investigation has never been done, and my goal is to research how tracking differs depending on the country of the user and how it changes over time.

The process

The analysis set will initially be the list of the most accessed websites per country and per category; the testing list will be opened up to community contributions at a later stage.

The host organisation, Coding Rights, is running an extensive anti-surveillance project in Latin America. I picked this NGO to support me on the communication/advocacy side. I don’t need technical support from this organisation; my limit is rather the outreach.

Outreach means explaining what the findings are, who bears which responsibilities, and which protection tools exist.

Connecting the dots between web tracking for advertising and surveillance is a tricky topic touching on different aspects. I’m paying attention to:

The research goals

The innovative aspect of my approach is the deep analysis of tracking techniques. JavaScript is delivered after certain transformations (uglify, minify), which makes it hard to analyze via static code analysis. Using the Thug sandbox, I will profile JavaScript trackers by behavior. The research will show the many shades and ways of tracking users. At this moment in history the attitude of the privacy/security community is quite “binary”, as reported in “It’s people vs abuse, not publishers vs adblockers”. As a user, you either enable JavaScript execution or you don’t: you block trackers via AdBlock, NoScript and others, or you permit them. Considering the potentially huge impact on society, through passive surveillance or malvertising campaigns, I want to add more elements to the debate.

The end goal of the research project is to provide a daily updated database on tracking technology, enabling researchers and web content managers to understand the security and privacy implications of their third party inclusions.

The mid-term goal is to engage the privacy-aware community to exert pressure on site owners that include highly invasive tracking technologies. Never before have the security and privacy implications of third party trackers been assessed in this way. This represents a new way to express critical and technical judgment on the choice of trackers.

The application under development is a pipeline that supports data collection, analysis, minimization and open data publication. Open data is a key value: I want to enable researchers and analysts around the world to understand the impact of tracking and tracking technologies. It is generally not so hard to understand the technology behind tracking; what is hard to communicate is the broader impact of this phenomenon.

This data collection will enable my research (and that of others, since every result will be open data) on tracking surveillance.

Having a community of supporters will be necessary to provide “local contextual knowledge” that I cannot possibly have for all the regions involved. This community will also provide lists of websites to test and run campaigns based on the results of my analysis.

Coding Rights will be the first NGO implementing this workflow, which fits into Coding Rights’ ongoing investigation of online surveillance practices in Mexico, Argentina and Brazil.

Milestones and dates

The project will include the following high-level stages of research and implementation:

  1. Make a list of sites, both technically selected and community selected. Keep the methodology flexible so the sources stay updated to reflect current events.
  2. Collect data on the sites in question from one (or more) network vantage points. One is enough, but more vantage points allow a more thorough and in-depth analysis.
  3. Extend the analysis by including more information about every website (what the JavaScript is doing, how invasive it is, which privacy-harmful behavior is present, how much it can be used for surveillance and by whom, where the data is stored, how much has changed since previous collections, whether the same content appears different from a different place in the world, which company is associated, if any, and which transport security is used or supported).
  4. Use the analysis to feed a daily updated visualization for researchers, based on parallel coordinates.
  5. Disseminate the results of the research and analysis with help from Coding Rights.
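
As an illustration of stages 2 and 3, the following is a minimal sketch of how third party inclusions and their transport security could be extracted from a collected page. It is not the project's pipeline: the names `InclusionParser` and `third_party_inclusions` and the `*.example` hosts are assumptions made up for this example, it only uses the Python standard library, and it treats any different hostname (subdomains included) as third party. Being static, it misses inclusions injected at runtime, which is where a dynamic analysis would be needed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class InclusionParser(HTMLParser):
    """Collect the URLs of resources a page pulls in via src attributes."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "img", "iframe"):
            src = dict(attrs).get("src")
            if src:
                # Resolve relative and protocol-relative URLs against the page.
                self.urls.append(urljoin(self.page_url, src))


def third_party_inclusions(page_url, html):
    """Return (host, uses_https) pairs for inclusions served by other hosts."""
    first_party = urlparse(page_url).hostname
    parser = InclusionParser(page_url)
    parser.feed(html)
    found = []
    for url in parser.urls:
        parsed = urlparse(url)
        # Simplification: any hostname other than the page's own is third party.
        if parsed.hostname and parsed.hostname != first_party:
            found.append((parsed.hostname, parsed.scheme == "https"))
    return found


# Hypothetical page content, made up for this example.
sample_html = """
<html><body>
<script src="/static/app.js"></script>
<script src="http://tracker.example/t.js"></script>
<img src="https://cdn.example/pixel.gif">
</body></html>
"""
inclusions = third_party_inclusions("https://news.example/", sample_html)
for host, secure in inclusions:
    print(host, "https" if secure else "UNENCRYPTED")
```

Inclusions flagged as unencrypted are the ones exposed to the kind of injection described in the objectives.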

Milestones

Firm list of technical activities

Every activity is meant to fit in one month of work.

  1. Improve browser emulation and JavaScript sandboxing by integrating the Honeynet Project's Thug. Technically, this allows us to get a list of all the JavaScript functions executed, going beyond static source code analysis.
  2. Add a data-sharing capability to every node, and look for differences in the tracking code.
  3. In-browser visualization of the results, usable to monitor trends or visually identify anomalies.
  4. Import the browser history of a person to map their exposure profile, and support community-driven input (through GitHub files). This approach allows a more personalized analysis that goes beyond looking at the Alexa top 500 sites for each country.
  5. Integrate the tool developed by Princeton University for tracker fingerprinting. This will provide an intermediate level of detail, still lower than that of the Thug code analysis.
  6. Research how to identify anomalies and tracking-related functionality based on the dynamic code analysis provided by activity 1.
  7. Research into the privacy implications and device fingerprinting used in tracking
  8. Support Latin American communities running the tool, interpolating their results
  9. Write a research report
  10. Work with Coding Rights on disseminating the results in Latin American communities
  11. Researcher visualization: the difference between this and activity 3 is the amount of detail provided
  12. Wrapping up the project and performing last touches and cleanups.
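
The comparison mentioned in activity 2 could look roughly like the sketch below, which diffs the scripts collected for the same page from two nodes. Everything in it is an assumption for illustration: the function names `fingerprint` and `diff_vantage_points`, the `{script_url: body}` data shape, and the sample data are hypothetical and not taken from the actual tool. Scripts are compared by SHA-256 hash, so nodes would only need to share hashes rather than full script bodies.

```python
import hashlib


def fingerprint(script_body):
    """Hash a script body so copies can be compared without storing them."""
    return hashlib.sha256(script_body.encode("utf-8")).hexdigest()


def diff_vantage_points(scripts_a, scripts_b):
    """Compare {script_url: body} maps collected from two vantage points.

    Returns URLs seen only in A, only in B, and URLs whose content differs.
    """
    only_a = sorted(set(scripts_a) - set(scripts_b))
    only_b = sorted(set(scripts_b) - set(scripts_a))
    changed = sorted(
        url
        for url in set(scripts_a) & set(scripts_b)
        if fingerprint(scripts_a[url]) != fingerprint(scripts_b[url])
    )
    return {"only_a": only_a, "only_b": only_b, "changed": changed}


# Made-up collections from two hypothetical vantage points.
from_brazil = {
    "http://tracker.example/t.js": "track(1);",
    "https://cdn.example/lib.js": "lib();",
}
from_germany = {
    "http://tracker.example/t.js": "track(1); fingerprintCanvas();",
    "https://cdn.example/lib.js": "lib();",
    "https://ads.example/a.js": "ads();",
}
report = diff_vantage_points(from_brazil, from_germany)
print(report)
```

The same function could compare two collection dates from a single node, which would cover the "how it changes over time" side of the research.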

Anticipated outputs and outcomes

The surveillance implications of third party trackers have still to be explained to a wider audience. The past year’s debate around “ad blocking” has shown a certain level of misunderstanding of the privacy and security implications of unsafe (non-HTTPS) third party inclusions.

Beyond the technical findings, the large-scale visualization will improve the understanding of script blocking, JavaScript integrity and transport security.

As a security outcome for activists and journalists, the outreach goals of this project are a better understanding of the attack vectors and pointers to countermeasures.

Note: TacticalTech, my former employer, has no role in this fellowship. I think the whole project would also benefit the earlier product Trackography, currently maintained by TacticalTech. Most of my results will be published as open data.

For this project, I will use another domain name to present the results.

Why is the selected host organization best suited to mentor your project?

Coding Rights has been doing research, advocacy and awareness raising on privacy and surveillance practices in Latin America, particularly through antivigilancia.org (with content available in Spanish and Portuguese). I will integrate my results with their communication. About my engagement, Coding Rights said: “We consider that the expanded version of Trackography could be a great tool for visually translating privacy rights into clear abuses in our daily transfer of data while simply browsing. And it particularly fits with our project on story telling entitled “Unveiling Surveillance Practices in Latin America”, which will be a platform/repository for investigations and storytelling experimentation on surveillance and privacy rights.”

What do you expect to be the primary outcome of this project for a general audience?

An improved understanding of tracking techniques: a worldwide assessment of the tracking systems that exist besides cookies.

Report the most abusive behavior and stimulate a technical, critical judgment of third party trackers. At the moment website owners carelessly choose what scripts to include on their sites; through this project we hope to raise global awareness of the significance of making an informed choice on the matter.

How will you collaborate with other researchers working in this field?

Using the 2014 report “Client identification mechanism” as a reference, I will confirm its findings, find variations, and provide numbers to support that document. Mostly I want to provide open data that lets other researchers run analyses over my data. In recent months interest in web tracking has increased, with tracking being described as “totally out of control”, and so has the debate on ad blocking and security.

Princeton University conducted research on 1 million websites, but from my previous experience I know that trackers change quite fast. Researchers shouldn’t use static data as a reference. The Princeton research doesn’t consider all the subtle ways trackers can fingerprint users. In my case, with the integration of Thug, I can provide a more detailed analysis reusable by other researchers in this field.

In theory, I’m operating in a field where at least four kinds of researchers may be interested:

  1. web technologies: technical analysis of web-driven malware, invasive scripting, invisible tracking mechanisms
  2. privacy activists: surveillance researchers and activists
  3. policy analysts: to assess whether Terms of Service, EULAs and international policies are aligned with the state of the art

With OONI project lead Arturo Filastò, I discussed the possibility of integrating with the Raspberry Pi network deployed by OONI. This could permit the use of many vantage points in different networks. This is a viable hypothesis, but it should be explored only if vantage points become essential to the comparative analysis.