Forensic analysis to China’s cloud storage services
Long Chen, Qing Zhang
Institute of Computer Forensics, Chongqing University of Posts and Telecommunications,
Chongqing 400065, China
Nowadays, many users rely on cloud storage services to store or share their data. At the same time,
a growing number of cases involve storing illegal information or stealing a company’s confidential
data through cloud storage services. A study of digital forensic investigation of cloud storage
services is therefore necessary. This paper discusses the types of terrestrial artifacts that are
likely to remain on a client’s machine and analyzes the patterns these artifacts follow after the
cloud storage has been accessed. Finally, the paper proposes a method to investigate and analyze the
artifacts in order to reconstruct the events of the user’s activities.
Key words: cloud computing; cloud storage; digital forensics; user’s activities
In cloud environments, data and processing power can be shared and distributed across a single
datacenter or multiple datacenters spread over a specific geographical area or even the entire
globe. The complex structure and characteristics of the cloud pose huge challenges for computer
forensics. To adapt to these changes, computer forensics in cloud computing has become an important
topic of considerable theoretical and practical value. Current domestic and foreign research on
cloud forensics concentrates mainly on two aspects: 1) designing a scheme on the cloud server that
records user information, so that customers can obtain network, process, and access logs over a
read-only API on the server and the investigator can use this information to analyze the user’s
activities; 2) collecting suspicious data from the client’s machine and then analyzing the user’s
activities.
Shams Zawoad et al. [1] introduce Secure-Logging-as-a-Service, which stores virtual machines’ logs
and provides access to forensic investigators while ensuring the confidentiality of the cloud users.
Shams Zawoad et al. [2] also introduce the idea of building proofs of past data possession in the
context of a cloud storage service and discuss how such proofs can be used effectively in cloud
forensics. Li-ping Ding et al. [3] propose a forensics framework under an infrastructure-as-a-service
cloud model; their experiments show that the framework can obtain evidence data in a cloud platform
effectively and efficiently. Ting Sang et al. [4] propose an approach that uses a log model to build
a forensic-friendly system; with this model, information can be gathered quickly from cloud
computing for some kinds of forensic purposes.
Darren Quick [5] used Microsoft SkyDrive as a case study and identified the types of terrestrial
artifacts that are likely to remain on a client’s machine. Fabio Marturana [6] discussed technical
aspects of digital forensics in cloud computing environments and presented the results of a case
study about user-cloud interaction, aimed at assessing whether existing digital forensics techniques
are still applicable to cloud investigations. Jason S. Hale [7] discusses the digital artifacts left
behind after an Amazon Cloud Drive has been accessed or manipulated from a computer. Kim-Kwang
Raymond Choo [8] used three popular public cloud storage providers (Dropbox, Google Drive, and
Microsoft SkyDrive) as case studies to explore the process of collecting data from a cloud storage
account using a browser and of downloading files using client software. Darren Quick [9] used
Dropbox as a case study to determine the data remnants on a Windows 7 computer and an Apple iPhone
when a user employs a variety of methods to store, upload, and access data in the cloud.
Of the studies above, the first group records user information and stores the logs in the cloud
storage service, which requires the cloud storage provider to change its current design framework.
The second group obtains information from the client’s machine, but the foreign research focuses
only on specific cloud storage services and offers only a simple analysis of the data generated by
using them. It does not propose a process for collecting and analyzing these data, so its general
applicability is limited. In addition, a cloud forensic investigation always involves a huge amount
of suspicious data generated by the use of cloud storage services, and forensic investigators have
to spend a lot of time analyzing these data manually. In this paper, we use the 360 and Baidu cloud
storage services as case studies to discuss the types of terrestrial artifacts that are likely to
remain on a client’s machine and to analyze the patterns these artifacts follow after the cloud
storage has been accessed. We then propose a method to reconstruct the events of the user’s
activities by combining logs and historical data remnants. Finally, we develop an autopsy tool to
help forensic investigators complete some tasks
automatically. This tool can save the forensic analysis time, greatly improve the efficiency of the
2. Important factors in an investigation
Currently, users usually access cloud storage services through browsers or client applications. In
either case, a great deal of evidence remains on the user’s device. This section outlines the
elements that are prioritized for investigation among the data collected from browsers and clients,
and provides a rationale for their choice.
2.1 Log files of web browsers
Although browsers differ in kernel structure and in how they store traces of online activity, they
all record information in forms such as history, cookies and cache. The browser’s history record is
an important consideration, since the browser generates a large number of URL records. Analyzing the
history record can show that the user has used the browser to access the cloud service, but it is
not enough to reveal the details of the user’s operations. By analyzing the user’s browsing cookies,
the investigator can obtain much useful case-related information, such as access time, login name,
access frequency, operation events and the content of the files operated on.
The browser cache is crucial information for the forensic investigation. When the browser accesses
the cloud storage service, it is essentially calling network APIs: the client sends a request to the
cloud server, and the cloud server sends the corresponding response back to the client. The cache
files store these responses, including pictures, Flash, JS scripts, CSS files and some HTML files
from the sites visited. Analyzing the cache files can reveal the detailed operations the user
performed when using the browser to access cloud storage.
On Windows systems, Internet Explorer (IE) is the best-known web browser. Therefore, this paper
focuses only on the log files of Internet Explorer.
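As a rough illustration of the history analysis described above, the following sketch filters a list
of already extracted history URLs down to those that point at a cloud storage host. It assumes the
URLs have been exported from the browser by some other tool; the host list and the sample URLs are
illustrative assumptions, not part of the original study.

```python
from urllib.parse import urlsplit

# Assumed host suffixes for the two services studied in this paper;
# extend the tuple for other providers.
CLOUD_HOST_SUFFIXES = ("yunpan.360.cn", "baidupcs.com")


def is_cloud_storage_url(url: str) -> bool:
    """Return True if the URL host equals, or is a subdomain of, a known cloud host."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == s or host.endswith("." + s) for s in CLOUD_HOST_SUFFIXES)


def filter_history(urls):
    """Keep only the history entries that indicate cloud storage access."""
    return [u for u in urls if is_cloud_storage_url(u)]


# Hypothetical history export used only for demonstration.
history = [
    "http://www.example.com/index.html",
    "http://yunpan.360.cn/my",
    "http://www.baidupcs.com/file/abc",
]
cloud_hits = filter_history(history)
```

A match only shows that the service was visited; as noted above, the details of the operation have
to come from cookies and cache entries.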
2.2 Artifacts of client application in PC
Most cloud service providers offer a client application for accessing the cloud service. After the
client application is installed, many files are generated on the disk of the user’s machine, such as
log files, database files and configuration files. These files may contain many suspicious data.
Analyzing and mining these data yields a lot of valuable information for reconstructing the events
of the user’s activities, determining the possible event sequences and reconstructing the activity
scene. This helps the investigator understand what happened and how, and provides a foundation for
auditing the user’s behavior.
The log files contain much key information about what the user has requested from the cloud service
provider. When the user uploads, renames, downloads or deletes a file, some information is recorded
in the log files, from which we can reconstruct the timeline of the user’s activities in cloud
storage. The database files usually store information about folders and files on the PC, including
the file path, the file name, the file hash, the file size, and the creation and modification times.
The configuration files usually contain the account ID, the user name, the email address, etc.
2.3 Procedure for digital investigation of cloud storage
The investigator collects and analyzes data from the device that a user has used to access a cloud
storage service. The forensic investigation of cloud storage consists of the following five steps:
1. Analyze the registry on the user’s device to obtain information about the installed browsers and
cloud storage clients and their installation directories;
2. Collect the suspicious data (related to the evidence of the targets) of each browser and cloud
client stored on the user’s device. These data include the browser cache, history records, download
history, the web log of the cloud client, synchronization logs, database files and configuration
files;
3. Analyze and mine the suspicious data to extract the user’s activities, then standardize the
user’s activities in the following format: user’s activities = ;
4. Analyze and process the standardized user activity data. First, store these data in a dataset,
group similar data and delete repeated data. Second, sort the data in time order. Last, iterate over
the dataset, complete the missing information by forward reasoning and rebuild the events of the
user’s activities;
5. Obtain the events of the user’s activities according to the requirements, and analyze the
correlations and rules of the user’s activities across different times, different targets and
behavioral intentions. Then determine the possible event sequences and reconstruct the activity
scene. This helps the investigator understand what happened and how, and provides a foundation for
auditing the user’s behavior.
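Step 4 of the procedure above can be sketched as follows. This is a minimal illustration under
stated assumptions, not the tool developed for this paper: the Activity record and its fields
(time, action, file_hash, file_name, source) are hypothetical, and "forward reasoning" is
interpreted here simply as reusing the last file name seen for the same file hash.

```python
from dataclasses import dataclass, replace
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Activity:
    # Hypothetical standardized record; field names are assumptions.
    time: datetime
    action: str                      # e.g. "upload", "download", "delete"
    file_hash: str
    file_name: Optional[str] = None
    source: str = ""                 # e.g. "ie_cache", "sync.log"


def rebuild_events(records):
    # Delete repeated data: same time, action and file count as one event.
    unique = {}
    for r in records:
        key = (r.time, r.action, r.file_hash)
        kept = unique.get(key)
        # Prefer the copy that carries a file name.
        if kept is None or (kept.file_name is None and r.file_name is not None):
            unique[key] = r
    # Sort by time.
    ordered = sorted(unique.values(), key=lambda r: r.time)
    # Complete missing names from earlier events on the same hash.
    last_name = {}
    completed = []
    for r in ordered:
        if r.file_name is None and r.file_hash in last_name:
            r = replace(r, file_name=last_name[r.file_hash])
        elif r.file_name is not None:
            last_name[r.file_hash] = r.file_name
        completed.append(r)
    return completed


# Hypothetical records from two sources describing the same file.
records = [
    Activity(datetime(2014, 11, 1, 10, 20, 0), "download", "h1", None, "ie_cache"),
    Activity(datetime(2014, 11, 1, 10, 16, 38), "upload", "h1", None, "ie_cache"),
    Activity(datetime(2014, 11, 1, 10, 16, 38), "upload", "h1", "design.docx", "sync.log"),
]
events = rebuild_events(records)
```

In this example the two upload records collapse into one event, and the later download inherits the
file name learned from the upload.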
3. Artifacts of cloud storage services
In general, the client applications of cloud storage services use two methods to record the user’s
operations: databases and log files. The 360 cloud and the Baidu cloud are typical representatives
of the two approaches. Here we use these two popular public cloud storage services in China as case
studies to describe the artifacts left on Windows after a customer has used a cloud storage service.
3.1 360 cloud
Among the various domestic cloud storage services, the 360 cloud storage service is one of the best
known. It not only provides a large amount of free storage space, but also offers full functionality
and a good user experience. The 360 cloud storage service records the user’s activities in logs.
3.1.1 Web browser
When the user opens a file, a cache file with the file name intfn.js is created. The URL attribute
of the cache file begins with “http://pXX-X.yunpan.360.cn/intf.php?method=” and also contains extra
information. This extra information records the user’s activities in the form of key-value pairs, as
shown in Fig.1. The method field is the user’s action type. The fhash field is the hash of the file
on which the user acted. The fname field is the name of that file. The callback field is the time at
which the user performed the action.
Fig.1: The URL attribute of the cache file after the user opens a file
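The key-value pairs in such a URL can be extracted with a standard query-string parser. The sketch
below is only an illustration: the sample URL is hypothetical (the method, fhash, fname and callback
values are invented), and only the field names come from the description above.

```python
from urllib.parse import urlsplit, parse_qs


def parse_cache_url(url: str) -> dict:
    """Split the query string of a cached request URL into a flat key-value dict."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    # parse_qs returns a list per key; keep the first value of each.
    return {key: values[0] for key, values in query.items()}


# Hypothetical URL attribute of a cache entry; all values are made up.
sample = (
    "http://pXX-X.yunpan.360.cn/intf.php?method=open"
    "&fhash=0123abcd&fname=design%20v2.docx&callback=1414808198"
)
fields = parse_cache_url(sample)
```

Percent-encoded characters in values such as fname are decoded by the parser, so the recovered file
name can be compared directly with names found in the client logs.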
When the user uploads, renames, downloads or deletes a file, a cache file with the file name
webclickn.htm is created. The URL attribute of the cache file begins with
“http://s.360.cn/yunpan/webclick.html?u=http%3A%2F%2Fyunpan.360.cn%2Fmy”. Fig.2 shows the URL
attribute of the cache file after the user uploads a file to the cloud storage service with the IE
browser.
Fig.2: The URL attribute of the cache file after the user uploads a file to 360
3.1.2 Client software
Some folders and files are created when the client software is used on a Windows system. The
observed folder structure is listed in Table 1. Among these folders and files, history.dat,
filecache.db and sync.log contain important information.
Table 1: Important files and paths
%profile%Roaming360CloudW in2user ID history.dat
%profile%Roaming360CloudW in2user ID config.ini
%profile%Roaming360CloudW in2user ID filecache.db
%profile%Roaming360CloudW in2user ID history.dat
the client log
the local cache file information
the history of upload
the user information
the local cache file information
the history of upload
First, history.dat and filecache.db contain the same information: they record the history of the
user’s file uploads in different ways. Second, config.ini contains the user name, the account ID and
the user’s email address. Third, sync.log contains key information about the files the user has most
recently uploaded, edited, opened, downloaded and deleted. This file also contains authentication
information, the account ID, the IP address and the times at which the application started and ended.
When the user uploads a file, some information is recorded in sync.log, as shown in Fig.3. The
information includes the operation type, the file name, the file hash, the operation time, the
client’s IP address, etc.
2015-01-18 16:22:39.103 DLL18.104.22.1680 DEVUI 22.214.171.1240 os6.1 ie9 206cca6a7afe7048f4666fbda7646a3d
2015-01-18 16:22:39.103 SetUser 262965246, type 0
2015-01-18 16:22:39.107 SetDiskRoot D:360CloudUICache262965246
2015-01-18 16:22:39.710 resp user detail. ver 18683, node_count 11930, last_login_ip:126.96.36.199
2014-11-01 10:16:38.558 status 6(ok) -> 5(monitor)
2014-11-01 10:16:38.558 db Transaction Begin
2014-11-01 10:16:38.558 out_upload Queue:1 new est.docx
2014-11-01 10:16:38.862 upload192810392 begin: est.docx, size:10258, fhash
2014-11-01 10:16:38.862 req upload filesize=10258, est.docx
2014-11-01 10:16:39.016 upload192810392 have, new_ver:1, name: est.docx
2014-11-01 10:16:39.016 status 5(monitor) -> 6(ok)
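The excerpt above suggests that each log line consists of a millisecond timestamp followed by a
free-form message. Under that assumption, upload requests can be pulled out with two regular
expressions, as in the sketch below. The line layout is inferred only from this excerpt, so the
patterns are assumptions and may need adjusting for other client versions.

```python
import re
from datetime import datetime

# Assumed layout: "YYYY-MM-DD HH:MM:SS.mmm <message>".
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) (.*)$")
# Assumed message form for an upload request, taken from the excerpt.
UPLOAD_RE = re.compile(r"^req upload filesize=(\d+), (.+)$")


def parse_upload_requests(lines):
    """Return one dict per 'req upload' line, with time, size and file name."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines that do not start with a timestamp
        stamp = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
        up = UPLOAD_RE.match(m.group(2))
        if up:
            events.append({
                "time": stamp,
                "action": "upload",
                "size": int(up.group(1)),
                "name": up.group(2),
            })
    return events


# Lines copied from the excerpt above.
excerpt = [
    "2015-01-18 16:22:39.103 SetUser 262965246, type 0",
    "2014-11-01 10:16:38.558 db Transaction Begin",
    "2014-11-01 10:16:38.862 req upload filesize=10258, est.docx",
]
uploads = parse_upload_requests(excerpt)
```

Other operation types (rename, download, delete) would need their own patterns, derived from log
samples that contain them.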
3.2 Baidu cloud
Baidu cloud is another frequently used storage service. It also provides both browser and client
access to the cloud storage service. Unlike 360 cloud storage, it mainly uses a database to store
information.
3.2.1 Web browser
When the user opens a file, a cache file with the file name An.html is created. The URL attribute of
the cache file begins with “http://www.baidupcs.com/” and also contains some extra information. This
extra information likewise records the user’s activities in the form of key-value pairs, as shown in
Fig.4. The method field is the user’s action type. The md5 field is the MD5 hash of the file on
which the user acted. The time field is the time at which the user performed the action. The name of
the file cannot be obtained from the URL, but the file information can be queried from the
cache_file table of the client application using the md5 value.
Fig.4: The URL attribute of the cache file after the user uploads a file to Baidu
3.2.2 Client software
Whenever a user adds, edits or deletes a file, some information is stored in the database files. The
structure of the database file is shown in Fig.5. The SQLite database BaiduYunGuanjia.db includes
six important tables. The backup_file table records information about files backed up with the
client. The cache_file table records information about all files on the server. The download_file
table records information about files currently being downloaded. The download_history_file table
records information about files that have been downloaded. The upload_file table records information
about files currently being uploaded. The upload_history_file table records information about files
that have been uploaded. These tables contain key information such as the server_path, the file
name, the MD5 hash of the file, and the creation and modification times. We can reconstruct the
user’s activities from this information.
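The lookup of a file by its MD5 value in the cache_file table can be sketched with Python's sqlite3
module. The example builds a small in-memory stand-in for the client database; the column names
(server_path, filename, md5) are assumptions based on the fields listed above and may differ from
the real schema, and the row is invented.

```python
import sqlite3


def find_by_md5(conn, md5):
    """Return (server_path, filename) rows from cache_file that match the MD5 value."""
    cur = conn.execute(
        "SELECT server_path, filename FROM cache_file WHERE md5 = ?", (md5,)
    )
    return cur.fetchall()


# In-memory stand-in for BaiduYunGuanjia.db with an assumed, simplified schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache_file (server_path TEXT, filename TEXT, md5 TEXT)")
conn.execute(
    "INSERT INTO cache_file VALUES (?, ?, ?)",
    ("/docs/design.docx", "design.docx", "0123abcd"),  # hypothetical row
)
matches = find_by_md5(conn, "0123abcd")
```

When working on real evidence, the query should run against a working copy of the database opened
read-only, for example with sqlite3.connect("file:copy.db?mode=ro", uri=True), where copy.db is a
placeholder path.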
4. Case study of a cloud storage service
4.1 Case overview
Suppose an employee of an enterprise disclosed the company’s important design documents. According
to the investigation, it is likely that the employee used the 360 cloud storage service to copy and
steal the design documents. In addition, the employee may have changed the original file names and
deleted some documents in order to hide the traces after the crime.
The investigator first found that the 360 cloud storage client was installed on the employee’s PC.
Second, the investigator collected the suspicious data (related to the evidence of the crime) of
each browser and cloud client stored on the user’s device. Records of access to the 360 cloud
storage service were also found in the history record of the IE browser. The investigator obtained
the user’s activity information and the copied files by analyzing the cache files, and then obtained
further activity information from sync.log of the client software. Finally, the investigator
standardized the user’s activities and used the developed automated tool to rebuild the events of
the user’s activities, analyzed the correlations and rules of the user’s activities across different
times, different targets and behavioral intentions, and extracted the relationships among the user’s
activities. During a forensic investigation, our autopsy tool can be used to reconstruct the events
of the user’s activities. The result is output to a TXT file.
In this case, the forensic investigator can analyze the user’s data operations according to the
events of the user’s activities and then determine whether the user disclosed the company’s
confidential information. The investigator determined the possible event sequences from the
reconstructed events, traced every step of the processing of each file and then reproduced the crime
scene. This work helps the investigator understand what happened and how, so that the investigator
can judge whether the user has disclosed the file.
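Tracing every processing step of each file can be illustrated by grouping time-ordered events by
file hash. The sketch below is an assumption about how such a trace might be built, not a
description of the tool used in this case; the event tuples and file names are hypothetical, and it
assumes that a rename leaves the file hash unchanged.

```python
from collections import defaultdict


def trace_files(events):
    """Group (time, action, file_hash, file_name) tuples by file hash, in time order."""
    traces = defaultdict(list)
    # ISO-style time strings sort chronologically as plain strings.
    for time, action, file_hash, file_name in sorted(events):
        traces[file_hash].append((time, action, file_name))
    return dict(traces)


def format_report(traces):
    """Render the per-file traces as plain text, e.g. for a TXT report."""
    lines = []
    for file_hash, steps in traces.items():
        lines.append(f"file {file_hash}:")
        for time, action, name in steps:
            lines.append(f"  {time}  {action:<8} {name}")
    return "\n".join(lines)


# Hypothetical reconstructed events for two files.
events = [
    ("2014-11-01 10:20:05", "rename", "h1", "notes.docx"),
    ("2014-11-01 10:16:38", "upload", "h1", "design.docx"),
    ("2014-11-01 10:25:40", "delete", "h1", "notes.docx"),
    ("2014-11-01 10:18:00", "upload", "h2", "photo.jpg"),
]
traces = trace_files(events)
report = format_report(traces)
```

For the first file the trace reads upload, rename, delete, which is the kind of sequence the case
overview describes as an attempt to hide traces.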
This paper analyzes the traces left on the user’s device, together with their storage methods and
rules, after the user accesses a cloud storage service through a client application or a browser on
the Windows operating system. These traces and storage rules help the investigator extract complete
and reliable evidence quickly. The paper then presents a method to reconstruct the user’s
activities: it associates the different traces left by the user’s activities, rebuilds the user’s
access to the cloud storage service, analyzes the user’s data operations and provides clues for
further investigation and analysis. This paper uses the Baidu and 360 cloud storage services as case
studies, but the method is also applicable to other cloud storage services. This paper focuses
mainly on PC clients of cloud storage services; similar applications on mobile devices will be the
subject of our future work.
The research work was supported by National Social Science Foundation of China under Grant No.
14BFX156 and Natural Science Foundation of CQ CSTC of P.R. China(No. cstc2011jjA40031,
[1] Shams Zawoad, Amit Kumar Dutta, Ragib Hasan. SecLaaS: secure logging-as-a-service for cloud
forensics[C]// ASIA CCS’13: Proceedings of the 8th ACM SIGSAC Symposium on Information,
Computer and Communications Security. New York: ACM, 2013: 219-230.
[2] Shams Zawoad, Ragib Hasan. I have the proof: providing proofs of past data possession in cloud
forensics[C]// Cyber Security. Washington, DC: IEEE, 2012: 75-82.
[3] XIE Yalong, DING Liping, LIN Yuqi, et al. ICFF: a cloud forensics framework under the IaaS
model[J]. Journal on Communications, 2013, 34(5): 200-206.
[4] Ting Sang. A log based approach to make digital forensics easier on cloud computing[C]//
Intelligent System Design and Engineering Applications (ISDEA), 2013 Third International
Conference on. Hong Kong: IEEE, 2013: 91-94.
[5] Darren Quick, Kim-Kwang Raymond Choo. Digital droplets: Microsoft SkyDrive forensic data
remnants[J]. Future Generation Computer Systems, 2013, 29(6): 1378-1394.
[6] Fabio Marturana, Gianluigi Me, Simone Tacconi. A case study on digital forensics in the
cloud[C]// Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), 2012
International Conference on. Sanya: IEEE, 2012: 111-116.
[7] Jason S. Hale. Amazon cloud drive forensic analysis[J]. Digital Investigation, 2013, 10(3):
[8] Darren Quick, Kim-Kwang Raymond Choo. Forensic collection of cloud storage data: Does the
act of collection result in changes to the data or its metadata[J]. Digital Investigation, 2013, 10(3):
[9] Darren Quick, Kim-Kwang Raymond Choo. Dropbox analysis: data remnants on user machines[J].
Digital Investigation, 2013, 10(1): 3–18.