Cloud extraction technology: the secret tech that lets government agencies collect masses of data from your apps

When government searches shift from the phone to the cloud: cloud extraction technology and ‘the future of mobile forensics’

Key findings
  • Law enforcement are increasingly using cloud analytics
  • This can be used to obtain vast quantities of your customers' data outside the normal legal frameworks for obtaining customer data in the course of criminal investigations e.g. via warrant to Amazon.
  • Emotion and facial recognition can be applied to customers' data
  • Cloud analytics software is being used without any transparency and in the absence of clear, accessible and effective legal frameworks
  • There is a risk of abuse and misuse of customer data and miscarriage of justice
Photo by Rahul Chakraborty on Unsplash


Table of Contents

Introduction

What is mobile phone extraction

What is cloud extraction

How does it work

What types of data can be obtained

Facial recognition and cloud extraction

Continual tracking

Conclusion

Recommendations

Mobile phones remain the most frequently used and most important digital source for law enforcement investigations. Yet it is not just what is physically stored on the phone that law enforcement are after, but what can be accessed from it, primarily data stored in the Cloud. 

Cellebrite, a prominent vendor of surveillance technology used to extract data from mobile phones, notes in its Annual Trend Survey that in approximately half of all investigations, cloud data ‘appears’ and that ‘[t]ypically, this data involves social media or application data that does not reside on the physical device.’

That it ‘does not reside on the physical device’ indicates that law enforcement is turning to ‘cloud extraction’: the forensic analysis of user data which is stored on third-party servers, typically used by device and application manufacturers to back up data.

Yet as law enforcement increasingly turns to cloud extraction to obtain data from apps, a YouGov poll revealed that in the UK 45.6% of people have not thought about where data created by apps on their phone is stored and 44.3% of people do not know or think that apps on their phone use cloud storage. 

As we spend more time on social media and messaging apps and store files with the likes of Dropbox and Google Drive, and as our phones become more secure, locked devices become harder to crack and file-based encryption becomes more widespread, cloud extraction is, as a prominent industry player says, “arguably the future of mobile forensics.”

“Private cloud-based data represents a virtual goldmine of potential evidence for forensic investigators.”

At Privacy International we have repeatedly raised concerns over risks of mobile phone extraction from a forensics perspective and highlighted the absence of effective privacy and security safeguards. Cloud extraction goes a step further, promising access to not just what is contained within the phone, but also to what is accessible from it.

Your phone, with all the data there for exploitation, becomes the key to unlock your online personal and professional life. 

In this context, cloud extraction technologies make for disturbing reading as we grasp how much is held in remote servers and accessible to even those with limited forensic skills who nonetheless are now able to acquire push button technologies that can ‘grab it all’.

Greater urgency is needed to address the risks that arise from such extraction, especially as we consider the addition of facial and emotion recognition to software which analyses the extracted data. 

There is a failure to inform the public about new surveillance technologies deployed by the state; an absence of clear, accessible legal frameworks; a lack of discernible action by governments and little to protect the public from data exploitation. The seeming wild west approach to highly sensitive data carries the risk of abuse, misuse and miscarriage of justice. It is a further disincentive to victims of serious offences to hand over their phones, particularly if we lack even basic information from law enforcement about what they are doing. 

Cloud extraction technologies are deployed with little transparency and in the context of very limited public understanding: this report brings together the results of Privacy International’s open source research, technical analyses and freedom of information requests to expose and address this emerging and urgent threat to people’s rights. 

What is mobile phone extraction

Mobile phone extraction tools are devices and software that allow the police to download data from mobile phones, including: 

  • Contacts
  • Call data – who we call, when, and for how long
  • Text messages
  • Stored files – photos, videos, audio files, documents, etc
  • App data – what apps we use and the data stored on them
  • Location information
  • Wi-fi network connections – which can reveal the locations of any place where we’ve connected to wi-fi, such as our workplace and properties we’ve visited. 

Mobile phone extraction entails the physical connection of the mobile device that is to be analysed and a device that extracts, analyses and presents the data contained on the phone. 

However, not only does it provide access to what is contained on the device itself, it can also be a gateway to the Cloud and to external sources of information. If logins, passwords and tokens are extracted from the examined device, they can be used as valid credentials to extract cloud-stored data.

What is cloud extraction

Cloud extraction (or cloud analytics) is the ability to access, extract, analyse and retain data stored in the Cloud, a term widely used by technology companies to refer to the remote storage of data from applications or devices, typically on a third party’s servers. Examples include Dropbox, Slack, Instagram, Twitter, Facebook, Google products such as My Activity, Uber and Hotmail. We explore the types of data that can be extracted in more detail below.

As cloud storage is increasingly used for social media, internet-connected devices and apps, cloud extraction opens the door to a huge amount of personal information. In reports on the explosion of cloud-based data, it is said that by 2025, 49 percent of data will be stored in public cloud environments. Cisco Global Cloud Index forecasts the growth of global data centre and cloud-based IP traffic and predicts an increase in use of public cloud data centers by 2021. 

“The lion’s share of data from mobile applications are stored within the cloud. With this being said, it should be understandable that there is a massive amount of user data available for collection.”

Cellebrite’s UFED Cloud Analyzer, for example, uses login credentials extracted from the device to pull a history of searches, visited pages, voice search recordings and translations from Google web history, and to view text searches conducted with Chrome and Safari on iOS devices backed up to iCloud.

Once the login credentials have been acquired, the tool allows its users to continue tracking the online behaviour of the device’s owner even when the investigator no longer has possession of the phone.

How does it work

There are a number of ways to access Cloud data “independent of the status or configuration of the mobile device”, which makes it attractive from a forensics perspective. The first involves applying known user credentials provided by an individual, i.e. when the individual voluntarily submits their login details. The second involves extracting data from a phone and then using the tokens found on that device, or on another device such as a laptop, where a user might have authentication tokens saved by a browser. The third involves collecting data in the public domain.

“When a user authenticates successfully to an app or cloud service, the service returns a token, which is used to enable the user to access the service without having to enter his or her username and password again. A token is like a pass, and it is used, for example, when you open your Gmail account and it logs you in without requiring any interaction from you. Most tokens have an expiry set at the time of authentication, which varies per app or cloud server. Some are good for a single session only, others for two weeks, some for 30 days, and some forever if the user uses the app on the same mobile device.”

The use of tokens avoids two factor authentication (2FA) being triggered by logging in, which would ordinarily inhibit access to data. 2FA, the process in which a user is prompted to confirm a code sent to an independent device, such as their mobile phone, is a key security feature. However, even if 2FA is triggered, Oxygen Forensics Cloud Extractor states it can notify the investigator and “several options are provided to bypass the additional steps.”
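
To make the role of tokens concrete, the following is a minimal sketch of a toy service, not any real provider's API or any vendor's tool. The class `ToyService`, its stored password and second-factor code, and the 30-day lifetime are all hypothetical and chosen only for illustration. It shows the two properties described above: the password and second factor are checked only at login, and afterwards whoever holds the token value can read the account's data until the token expires.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Optional


@dataclass
class Token:
    value: str
    username: str
    expires_at: Optional[datetime]  # None models a token that never expires


class ToyService:
    """Hypothetical cloud service used only to illustrate token behaviour."""

    def __init__(self, token_lifetime: Optional[timedelta]):
        self.token_lifetime = token_lifetime
        # Made-up credentials for a single illustrative account.
        self._passwords = {"alice": "correct horse"}
        self._second_factor = {"alice": "123456"}
        self._tokens: Dict[str, Token] = {}

    def login(self, username: str, password: str, code: str, now: datetime) -> Token:
        # Password and second factor are checked here, and only here.
        if self._passwords.get(username) != password:
            raise PermissionError("wrong password")
        if self._second_factor.get(username) != code:
            raise PermissionError("second factor failed")
        expires = None if self.token_lifetime is None else now + self.token_lifetime
        token = Token(secrets.token_hex(16), username, expires)
        self._tokens[token.value] = token
        return token

    def fetch_data(self, token_value: str, now: datetime) -> str:
        # Only the token is checked: no password, no second factor.
        token = self._tokens.get(token_value)
        if token is None:
            raise PermissionError("unknown token")
        if token.expires_at is not None and now >= token.expires_at:
            raise PermissionError("token expired")
        return f"private data for {token.username}"


start = datetime(2020, 1, 1)
service = ToyService(token_lifetime=timedelta(days=30))
token = service.login("alice", "correct horse", "123456", now=start)

# Anyone who copies token.value can make this call until the token expires.
print(service.fetch_data(token.value, now=start + timedelta(days=29)))
```

In this toy model a token issued with `token_lifetime=None` never expires, which corresponds to the "forever" case in the quotation above. Real services differ in how tokens are issued, bound to devices and revoked.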

Tools used to obtain tokens beyond the mobile

Elcomsoft’s GTEX tool can search a computer for authentication tokens 

“Passwordless authentication into Google Account is available if Google Chrome is installed on the user’s computer, and the user signed in to at least one Google service via the browser. The new Google Token Extractor (GTEX) tool automatically searches the user’s computer for authentication tokens saved by the Google Chrome browser. Once the user signs in to their Google Account in a browser session, these tokens enable seamless access to Google services without the need to re-enter the password.”

Cellebrite’s PC Cloud Collector 

“is an independent tool that creates tokens from a suspect’s PC using the cookies in the browsers and the applications that are installed on that PC.”

UFED Cloud Analyzer 7.6

“extends its password collector functionality to include passwords save on mobile web browsers. Examiners can now retrieve password logins from various sites using the password collector to collect the maximum amount of data about a suspect or victim. This is accomplished by leveraging a person’s login details which have been saved in their browser when they access their online accounts.”

Another similar tool, Oxygen Forensics’ KeyScout, finds passwords and tokens on a PC

“KeyScout installs a flash card and collects credentials from Windows PCs. The collected credentials can then be imported into Oxygen Forensic Cloud Extractor for immediate use.”

Forensics tools not only offer a simple way to access cloud-stored data, they provide more data than an individual can access using their own username and password. Elcomsoft, for example, argues that “even if proper authentication credentials are available [such as user name and password], access to evidence stored in the Cloud is not a given.” Elcomsoft compared the amount of data it could obtain using Elcomsoft Phone Breaker with what it could get without using forensic tools. It argues that its tool is not only simple and quick but can access more data from the Cloud than can be accessed even when the username and password are known.

Reports suggest that there are other ways to gain access to cloud-based accounts using tokens. In July 2019, the Financial Times reported that Pegasus, malware sold by NSO Group, can carry out cloud extraction by copying authentication keys from an infected phone, allowing a separate server to then impersonate the phone, including its location. NSO Group refuted the report.

Despite companies such as Amazon, Apple, Google and Microsoft commenting on the FT’s story on NSO Group, it is unclear what their position is in relation to cloud extraction technologies used by law enforcement. Google told the FT that it found “no evidence of access to Google accounts or systems” with respect to Pegasus. However, given the number of forensics companies openly promoting access to Google products, it must be aware that this is a significant issue for the security of its customers’ data. We have written to Google and other companies asking for their position on cloud extraction technologies. The reality is that in many cases their customers do not know that this technology exists and that it is being used against them in a vacuum of legal safeguards.

What types of data can be obtained?

Claims by surveillance companies regarding the types of data that can be accessed via cloud extraction are as impressive as they are concerning. Cellebrite’s Cloud Analyzer, for example, claims to “extract, preserve and analyze public domain and private social media data, instant messaging, file storage, web-pages and other cloud-based content using a forensically sound process”. This includes a whole suite of Google products, whose ‘History’ function alone enables:

“insights into the subject’s intentions and interests by pulling out the history of text searches, visited pages, voice search recordings and translations from Google web history and viewing text searches conducted with Chrome and Safari on iOS devices backed-up iCloud.” – Cellebrite

Forensic experts claim to be able to acquire undelivered messages, unanswered calls, information about messages deleted from private and group chats, profile pictures and status messages of the account owner and contacts, original messages embedded into the reply and broadcast messages. The data relates not only to the user of the services but their friends, family, colleagues and anyone the user interacts with. 

The below images show a comparison by Cellebrite of the amount of data you can extract from a phone compared to what you can extract from Cloud sources, showing significantly more from the latter in relation to social media, emails, file sharing, and location and search history. Notably, this includes “Minute by Minute location information, searches and visited websites” using Google’s time-stamped Location History and Google My Activity data and backups.

Oxygen Forensics, which developed the Oxygen Forensic Detective forensic analysis tool, has built in Oxygen Forensic Cloud Extractor to acquire “data from the most popular cloud services” including WhatsApp, iCloud, Google, Microsoft, Mi Cloud, Huawei, Samsung, E-Mail (IMAP) Servers and more. “Also various social media services are supported to include but limited to: Facebook, Twitter, Instagram, and many more.” It “...supports, at the time of writing, 54 different types of cloud services, ranging from file storage, to messengers, drones, health apps, and social media.”

Even if you use end to end encrypted messaging, if you back up your WhatsApp messages to the Cloud, they are accessible to law enforcement. 

Magnet Forensics also provides a cloud extraction service, AXIOM Cloud, which “supports approximately 25 cloud artefacts in nine parent services to include Apple Box, Dropbox, IMAP/POP, Facebook, Google, Instagram, Microsoft and Twitter. Each service is broken down into different subservices.”

Looking at the types of data that can be extracted in more detail, Cellebrite’s Product Updates for Cloud Analyzer show the increasing appetite for data from smart devices such as Alexa and Google Home. Cellebrite’s UFED Cloud Analyzer 7.2 “provides access to user requests including audio”. As Cellebrite notes:

“The Internet of Things (IoT) has created more ways to use data to make our lives easier, but it has also created more sources of digital intelligence for investigators to access in their criminal investigations.” – Cellebrite

Cellebrite is not the only mobile extraction company promoting access to data from home assistants. Oxygen Forensics views digital assistants as the new eye-witness with an estimated number of users of these devices projected to reach 1.8 billion by 2021:

“The valuable data extracted can contain a wealth of information to include: account and device details, contacts, user activity, incoming and outgoing messages, calendars, notifications, user created lists, created/installed skills, preferences, and more. One amazing feature in the software is the ability to extract the stored voice commands given to Alexa by the user. The users actual voice! The information extracted from Amazon will undoubtedly give tremendous insights into the user’s everyday activity, their contacts, shared messages, and valuable voice commands.”

“When an Alexa user utters the wake word to perform a skill a recording of the query is sent to the user’s Amazon cloud account. The user specific request is processed and a response is returned to the device. Investigators, armed with Oxygen Forensic Cloud Extractor, can extract Amazon Alexa data to include these valuable recordings of that actual utterance by the user.”

As the number of devices connected to the internet and thus storing data in the cloud continues to grow, cloud extraction not only reaches into people’s homes but also their bodies with access to data from health wearables. 

“Many of today’s users are into health wearables, from the Fitbit to the Apple Watch, which includes information such as heart rate, location, food intake, messaging and other valuable data that is often available only on the cloud service and not on the mobile device.”

Cellebrite can access Fitbit “user profile, logs, activities, goals, friends, heart rate, exercise track (speed, location, time etc.).”

Another source of data relates to travel and location. UFED Cloud Analyzer 7.3 accesses Google location data and Booking.com “user profile, purchase history, messages and searches”, while UFED Cloud Analyzer 7.6 supports extraction from the Uber app and can:

“gain passenger and driver profile data, pick-up and drop-off location logs, and the last 4 digits of a user’s credit card...retrieval of … credit card details that new users are required to fill in on their first login. As the passenger chooses their pickup location, desire destination, and available driver, each journey is well documented. Recorded routes are aggregated and then categorised by favourite destinations. The driver’s information includes the name and photo identification.” – Cellebrite

Given the popularity of Amazon and Facebook, these are obvious targets for cloud stored data. As of the fourth quarter of 2018, Facebook had 2.32 billion monthly active users. Amazon had 300 million users in 2017. An update for Cellebrite’s UFED Cloud Analyzer 7.5 includes “five brand new capabilities that enable access to activity logs, search histories, pages, user group data and IP address records [for Facebook].” The software can:

“... extract information from the stories and photos a suspect was tagged in to find new leads or new suspects. Additional data points include identification of connections made when liking a page or adding someone as a friend, as well as comments posted, articles read, videos seen, places visited and more.

For user data on groups and pages, UFED Cloud Analyzer 7.5 can also flag if a suspect is a member or administrator of a certain page or group.

This version can also surface the Facebook Log IP address records to allow you to identify a phone or computer’s location used to access an account.”

UFED Cloud Analyzer 7.5 “enables access to [Amazon’s] search history, purchase history and delivery addresses that can contribute vital digital evidence to an investigation.”

“In this version, you can also view the last 4 digits of a credit card registered on an Amazon account, including the billing and shipping addresses.”

“The buyers’ search history and wish list over time can indicate suspicious behaviour leading up to a crime.”

Cloud extraction technologies also access data from drones, such as UFED Cloud Analyzer 7.6 which added DJI Drone App and SkyPixel social network. This: 

“Allows examiners to access the app as well as the corresponding users account on the SkyPixel social network. User profile data and stored drone flight log data is retrievable and includes: date, distance, flight time, location, video and imagery. SkyPixel user profile can also assist examiners to verify if any collaboration was performed on specific videos as well as track tags, follows and more.”

As more and more companies rely on cloud storage for work-related activities, the data that can be obtained using tokens found on devices relates not just to a person’s private life but also to their work. For example:

“Cellebrite delivers access to shared files and instant messaging data from Slack, the popular communication tool of the business community.”

UFED Cloud Analyzer 7.9 also includes support for Snapchat and Instagram enhancements. This is relevant when we consider below the growing facial recognition capabilities built into analytics software that analyses data both extracted from mobile phones and obtained via cloud extraction.

“Snapchat is a global multimedia messaging app that enables users to share pictures and messages that are only available for a short time before they become inaccessible to their recipients. To date, Snapchat has 190 million daily active users worldwide and more than 400 million Snapchat stores are created per day. 

UFED Cloud Analyzer 7.9 introduces first-time support for the Snapchat application, with access using tokens retrieved from any Android device. With this version, you can retrieve backed up files, also known as Memories, and review direct message communications between contracts. Get access to the contact information of the account and password protected My Eyes Only files.”

“This version of UFED Cloud Analyzer introduces comprehensive support for the Instagram application. On top of already supported data sets in previous versions, you can now view responses to posts which include images and videos. You can also get access to all data associated with chat messages including sharing of post/story, likes, comments within a message.”

Facial recognition and cloud extraction

The analysis of data extracted from mobile phones and other devices using cloud extraction technologies increasingly includes the use of facial recognition capabilities. If we consider the volume of personal data that can be obtained from cloud-based sources such as Instagram, Google Photos and iCloud, which contain facial images, the ability to use facial recognition on masses of data is a big deal. That it is potentially being used on vast troves of cloud-stored data without any transparency and accountability is a serious concern.

In August 2017 Cellebrite introduced what it called “advanced machine learning technology” for its analytics platform, which can be used to analyse data extracted from the cloud and which included face recognition and matching.

From July 2019, Oxygen Forensics’ JetEngine module, which is built into the Oxygen Forensic Detective, provides the ability to categorise human faces. Not only does Oxygen provide the categorisation and matching of faces within extracted data, its facial analytics also categorises faces by gender, race and emotion.

Lee Reiber, Oxygen’s chief operating officer, said the tool can “search for a specific face in an evidence trove, or cluster images of the same person together. They can also filter faces by race or age group, and emotions such as ‘joy’ and ‘anger’.”

Continual tracking

Once you have a user’s credentials, not only can you obtain their cloud-based data, you can track them using their cloud-based accounts. For example, the capabilities of Cellebrite’s Cloud Analyzer include the ability, once you have an individual’s credentials, to:

“Track online behaviour. Analyse posts, likes, events and connections to better understand a suspect or victim’s interests, relationships, opinions and daily activities.”

This offers a very private insight into an individual’s life. The individual themselves will never know that someone has access to and may be using their cloud profile. 

The short- or long-term monitoring of activity, particularly without possession of the phone and outside of what is on the device, is highly intrusive, and presents yet another worrying aspect of cloud extraction capabilities.

Not only can you track and monitor behaviour, messages and location data at any time; with a person’s login credentials or the ability to access their cloud-based accounts, you may also be able to send messages, impersonate them, or send mail with illegal content to someone else.

Conclusion

There is an absence of information regarding the use of cloud extraction technologies, making it unclear how this is lawful and equally how individuals are safeguarded from abuse and misuse of their data. 

The volume of data that can be extracted from cloud services, the inclusion of facial recognition technology to analyse images, and the implications for the large number of people whose personal data will be obtained even when extracting cloud data related to just one individual make this a subject that deserves far greater transparency and accountability.

This is part of a dangerous trend among law enforcement agencies, and we want to ensure transparency and accountability globally with respect to the new forms of technology they use.

Recommendations 

A search of a person’s cloud-based data can be more invasive than a search of their home, not only for the quantity and detail of information but also the historical nature of legacy data and the future data that can continue to be analysed in the cloud. The state should not have unfettered access to the totality of someone’s life and the use of cloud extraction requires the strictest of protections. Therefore, Privacy International recommends that:

  • An immediate independent review be initiated into the use by law enforcement of cloud-analytics by relevant policing bodies and border control with consultations taken from the public, civil society and industry as well as government authorities. 
  • The police must have a warrant issued on the basis of reasonable suspicion by a judge before forensically examining any cloud-based data, or otherwise accessing any content or communications data stored therein. 
  • A clear legal basis must be in place to inspect, collect, store and analyse data from cloud-based services which provides for adequate safeguards to ensure intrusive powers are only used when necessary and proportionate. It must be considered whether such intrusive technology should only be used in serious crimes.  
  • Guidance aimed at the public regarding their rights and what such extractions involve must be published and provided to persons whose devices are to be analysed.
  • Individuals be informed that their cloud-based data has been extracted, analysed and retained.
  • Anyone who has their cloud-based data examined should have access to an effective remedy where any concerns regarding lawfulness can be raised. 
  • There must be independent oversight of law enforcement’s compliance with the lawful use of these powers.
  • Cyber security standards should be agreed and circulated, specifying how data must be stored, how long it is to be retained, when it must be deleted and who can access it. 
  • All authorities who use these powers must purchase relevant tools through procurement channels in the public domain and regularly update a register of what tools they have purchased, including details on what tools they have, the commercial manufacturer and expenditure amounts. 
  • Technical standards be created and followed to ensure there is a particular way of obtaining data that is repeatable and reproducible, to ensure verification and validation. This should be accompanied, for example, by a clearly documented process. 
  • Technical skill is required: this unprecedented amount of data brings the need for highly skilled forensic investigators. Consideration must be given to the risk of miscarriage of justice if raw data is misinterpreted or individuals cannot afford experts to review the data.
  • Testing, trialling and deployment of cloud extraction technologies must be accompanied by impact assessments, adequate safeguards and engagement with the public and civil society. 
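
As an illustration of what the recommendation on repeatable and reproducible technical standards could mean in practice, the sketch below records a SHA-256 hash for each extracted item at the time of extraction, so that a later reviewer can check whether any item has since been altered, added or removed. This is one common integrity technique and an assumption on our part, not a description of what any police force or vendor currently does; the function names and sample data are hypothetical.

```python
import hashlib
from typing import Dict, List


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(items: Dict[str, bytes]) -> Dict[str, str]:
    """Map each item name to the hash of its content at extraction time."""
    return {name: sha256_hex(content) for name, content in sorted(items.items())}


def changed_items(manifest: Dict[str, str], items: Dict[str, bytes]) -> List[str]:
    """List names that are missing, newly added, or differ from the manifest."""
    current = build_manifest(items)
    names = set(manifest) | set(current)
    return sorted(n for n in names if manifest.get(n) != current.get(n))


# Made-up sample data standing in for extracted files.
extracted = {"messages.json": b'[{"id": 1}]', "photo_001.jpg": b"\xff\xd8\xff"}
manifest = build_manifest(extracted)

# Simulate a later copy in which one file has been modified.
tampered = {**extracted, "messages.json": b'[{"id": 2}]'}

print(changed_items(manifest, extracted))  # []
print(changed_items(manifest, tampered))   # ['messages.json']
```

A manifest of this kind only shows that data has not changed since it was recorded. It says nothing about whether the extraction itself was lawful, necessary or proportionate, which is why it can only accompany, not replace, the legal safeguards above.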

 

Currently supported cloud services

Footnotes

[references in pdf below]