Malware Forensic Field Guides: Tool Box 

Chapter 5     File Identification and Profiling

Capturing File Appearance




 Name:  Shutter
 Page Reference: 260
 Author/Distributor:  Mario Kemper
 Available From:  http://shutter-project.org/
 Description:  An open source feature-rich graphical tool for screen captures. Shutter enables the user to capture a select area, window, entire desktop, and even a target web site. In addition to capture capabilities, Shutter has a built-in drawing feature and numerous plugins to manipulate the screen capture.

                           






Command-line Hashing Utilities



  Name:  md5deep
  Page Reference: 262
  Author/Distributor:  Jesse Kornblum
  Available From:  http://md5deep.sourceforge.net/
 Description:  A suite of utilities to compute the message digests (MD5, SHA-1, SHA-256, Tiger, or Whirlpool) of files. md5deep offers a number of powerful functions, including recursive hashing, hash comparison mode, time estimation, and piecewise hashing, among others.

 Helpful Switches:

Switch       Function
-p <size>    Piecewise mode
-r           Recursive mode
-z           Displays file size before hash
-m <file>    Enables matching mode
-x <file>    Enables negative matching mode
-w           Displays which known file generated a match
-n           Displays known hashes that did not match
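
The recursive and matching modes listed above can be approximated in a few lines of Python. This is a minimal sketch using only the standard hashlib and os modules, not md5deep's own implementation; the function names (md5_file, hash_tree, match_known, negative_match) are hypothetical and chosen for illustration.

```python
import hashlib
import os


def md5_file(path, chunk_size=65536):
    """Return the hex MD5 digest of one file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root):
    """Recursive mode (compare -r): map each file path under root to its MD5."""
    results = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            results[path] = md5_file(path)
    return results


def match_known(hashes, known):
    """Matching mode (compare -m): keep files whose hash is in `known`."""
    return {path: h for path, h in hashes.items() if h in known}


def negative_match(hashes, known):
    """Negative matching (compare -x): keep files whose hash is NOT in `known`."""
    return {path: h for path, h in hashes.items() if h not in known}
```

Here `known` is assumed to be a set of lowercase hex digests, for example loaded from a list of previously identified malicious files.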





GUI Hashing Utilities




 Name:  GUIMD5Sum (qtmd5summer)
 Page Reference: 262
 Author/Distributor: irfanhab
 Available From: http://qtmd5summer.sourceforge.net
 Description:  A graphical utility for calculating the MD5 hash value of directories, sub-directories, and individual files. GUIMD5sum offers a clean and simple interface for simple processing of multiple files.

                                   



Name:  Parano
Page Reference: 263
Author/Distributor: BerliOS
Available From: http://parano.berlios.de
Description:  GNOME utility for creating, verifying, and managing hash files. Currently, Parano supports MD5, SHA-1, and Simple File Verification (SFV).

                                   




File Similarity Indexing





Name:  SSDeep
Page Reference: 264
Author/Distributor: Jesse Kornblum
Available From: http://ssdeep.sourceforge.net/
Description:  A fuzzy hashing tool that computes a series of randomly sized checksums for a file, allowing file association between files that are similar in file content but not identical.
 Helpful Switches:

Switch   Function
-v       Verbose mode; displays file name as it is being processed
-p       Pretty matching mode; similar to -d but includes all matches
-r       Recursive mode
-d       Directory mode; compares all files in a directory
-s       Silent mode; all errors are suppressed
-b       Uses only the bare name of files; all path information omitted
-l       Uses relative paths for file names
-c       Prints output in CSV format
-t       Only displays matches above the given threshold
-m       Matches FILES against known hashes in file






 Name:  DeepToad
 Page Reference: 264
 Author/Distributor: Joxean Koret
 Available From:  http://code.google.com/p/deeptoad/
 Description: Inspired by ssdeep, DeepToad (written in Python) is a tool for clustering similar files using fuzzy hashing techniques. The tool's menu and usage are shown in the figure below.


       









File Visualization



Name:  Crypto Visualizer (part of the Crypto Implementation Analysis Toolkit)
 Page Reference: 304
Author/Distributor: Omar Herrera
Available From:  http://sourceforge.net/projects/ciat/
Description:   The Crypto Implementations Analysis Toolkit is a suite of tools for the detection and analysis of encrypted byte sequences in files. CryptoVisualizer displays the data contents of a target file in a graphical histogram, allowing the digital investigator to identify pattern or content anomalies.
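
The idea behind a byte histogram can be illustrated with a short Python sketch. It is not CryptoVisualizer's code; the function names are hypothetical, and the Shannon entropy figure is added here as a commonly used companion metric rather than something the tool is documented to report. Encrypted or compressed regions tend to show a flat histogram and entropy close to 8 bits per byte.

```python
import math
from collections import Counter


def byte_histogram(data):
    """Return a 256-entry list: how often each byte value occurs in data."""
    counts = Counter(data)
    return [counts.get(value, 0) for value in range(256)]


def shannon_entropy(data):
    """Shannon entropy in bits per byte (0.0 for empty input, at most 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    entropy = 0.0
    for count in byte_histogram(data):
        if count:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy
```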


                              







Name:  BinVis
Page Reference:  266
Author/Distributor:  Gregory Conti/Marius Ciepluch
Available From:  http://code.google.com/p/binvis/
Description:  BinVis is a binary file visualization framework that enables the digital investigator to view binary structures in unique ways. As of this writing, the tool does not natively install and run in Linux; WINE or CrossOver must first be installed on the analysis system. As shown in the figure below, BinVis provides eight distinct visualization modes that render alternative graphical perspectives on the target file's structure, data patterns, and contents. Particularly useful for analysis is the interconnectedness of the views; for example, if the digital investigator opens the byte plot display and the strings viewer, each region clicked in the byte plot viewer is automatically displayed in the strings viewer.

        


Mode             Function
Text             Displays file contents in a text and hexadecimal viewer interface
Byte Plot        Maps each byte in the file to a pixel in the display window
RGB Plot         Red, Green, Blue plot; 3 bytes per pixel
Bit Plot         Maps each bit in the file to a pixel in the display window
Attractor Plot   Visual plot display based upon chaos theory
Dot Plot         Displays detected sequences of repeated bytes contained within a file
Strings          Displays strings in a text view display
Byte Cloud       Visual cloud of bytes generated from file contents
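
To make the byte plot mode concrete, the following Python sketch maps each byte of a file to one grayscale pixel and emits a binary PGM image. It is an illustration of the mapping only, not BinVis code; the function name byte_plot_pgm, the default width of 256 pixels, and the choice of PGM as output format are all assumptions made for this example.

```python
def byte_plot_pgm(data, width=256):
    """Render data as a binary PGM (P5) image: one gray pixel per byte."""
    if width <= 0:
        raise ValueError("width must be positive")
    # Ceiling division gives the number of rows; always emit at least one row.
    height = max(1, -(-len(data) // width))
    # Pad the final row with zero bytes (black pixels).
    padded = bytes(data) + b"\x00" * (width * height - len(data))
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + padded
```

Writing the returned bytes to a file with a .pgm extension produces an image that most Linux image viewers can open; byte value 0x00 appears black and 0xFF appears white.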




 Hexadecimal Editors





Name:  Okteta
Page Reference:  268
Author/Distributor:  Okteta
Available From:  http://www.ohloh.net/p/okteta

Description: A robust GUI hex editor for analyzing raw data files. Multifunctional, Okteta has a number of valuable file analysis modules—such as checksum calculator, string extraction, structure analysis, decoding, and statistical tools—that can be viewed or minimized from the main interface.

     





Antivirus





Name: Avast
Page Reference:  272
Author/Distributor:  Avast
Available From:  http://files.avast.com/files/linux/avast4workstation-1.3.0-1.i586.rpm;
http://files.avast.com/files/linux/avast4workstation_1.3.0-2_i386.deb;
http://files.avast.com/files/linux/avast4workstation-1.3.0.tar.gz
Description:  A command-line and graphical anti-virus solution for on-access scans.
Helpful Switches:

Switch   Function
-a       Scan all files (default)
-c       Scan entire files
-d       Scan only the target directory, not its subdirectories





Name: Avira (for Linux)
Page Reference:  272
Author/Distributor:  Avira Antivirus
Available From:  http://dl1.avgate.net/down/unix/packages/antivir-workstation-pers.tar.gz

Description: A free command-line anti-virus solution that can perform on-demand and on-access scans.

Helpful Switches:

Switch               Function
--scan-mode=<mode>   Scans in three different selected modes:
                     "extlist" scans files based upon filename and extension;
                     "smart" detects which files to scan based upon name/content;
                     "all" scans all files regardless of name or content
-s                   Scans subdirectories
--scan-in-archive    Scans contents of archive files
-v                   Scan files completely (lowers false-positives/negatives)
-r1                  Log infections and warnings
-r2                  Log all scanned paths
-r3                  Log all scanned files

 


Name: AVG (for Linux)
Page Reference:   272
Author/Distributor: AVG
Available From: http://free.avg.com/us-en/download.prd-alf
Description: A free command-line anti-virus solution that can perform on-demand and on-access scans.

Helpful Switches:

Switch   Function
-T       Invokes a terminal user interface (TUI)
-d       Debug/verbose mode; up to 3 -d switches can be used to increase verbosity
-x       Exclude path from scan
-e       Scan files with a specific extension; multiple extension types can be targeted
-n       Excludes files with a specific extension; multiple file extensions can be excluded
-H       Uses heuristic scanning
-p       Scan for "potentially unwanted programs"
-i       Recognize hidden extensions
-a       Scan through archive files





Name:   Comodo (for Linux)
Page Reference:  272
Author/Distributor:  Comodo
Available From:  http://www.comodo.com/home/download/download.php?prod=antivirus-for-linux
Description:  A free command-line and graphical anti-virus solution.

                  






 Name:   ClamAV
 Page Reference:  272
 Author/Distributor:  The Clam Team
 Available From:  http://www.clamav.net/lang/en/
 Description:  A free command-line anti-virus solution that can perform on-demand and on-access scans. A GUI overlay, ClamTK is available through most Linux distribution package managers.
 Helpful Switches:

Switch                Function
-v                    Verbose mode
-i                    Only show infected files
-r                    Scan recursively
--detect-structured   Detect structured data, such as PII or financial information
-l <file>             Saves scan report to file (by default this saves in the /home/<user> directory)





 Name:   F-Prot (for Linux)
 Page Reference:  272
 Author/Distributor:  Commtouch
 Available From:   http://www.f-prot.com/products/home_use/linux/;
http://www.f-prot.com/download/home_user/download_fplinux.html
 Description:  A free command-line anti-virus solution that can perform on-demand and on-event scans.
 Helpful Switches:

Switch        Function
-f            Scan/follow symbolic links
-m            Mount (for each target path provided, remain on that file system)
-d <number>   Descend depth of scan to the provided number
-s <number>   Scan level (0-4, default is 2):
              0 = only heuristic scanning;
              1 = skip suspicious data files;
              2 = unknown or wrong file extensions will be emulated;
              3 = unknown binaries emulated;
              4 = mode for scanning large corpus of malware, no limits for emulation
-u <number>   Aggressiveness of heuristics used (0-4, default is 2)
-v            Verbose
-z <number>   Depth to scan into an archive file
--adware      Scan for adware in addition to malware





 Name:   BitDefender (for Linux)
 Page Reference:  272
 Author/Distributor:   Bitdefender
 Available From:  http://unices.bitdefender.com/downloads/
 Description:   A free command-line and GUI anti-virus solution that can perform on-demand and on-access scans.
  Helpful Switches:

Switch                Function
--no-recursive        Don't recursively scan into subdirectories
--follow-link         Scan symbolic links
--recursive-level=n   Set maximum depth of recursion for subdirectory scan
--ext[=ext1:ext2]     Scans only targeted extensions
--exclude-ext[=ext]   Excludes extensions
--verbose             Display debug information





 Name:   Panda (for Linux)
 Page Reference:  272
 Author/Distributor:  Panda Security
 Available From:  http://research.pandasecurity.com/free-commandline-scanner/
 Description:  A free command-line anti-virus solution that can perform on-demand and on-event scans.


Embedded Artifact Extraction




Strings




 Name:   Strings
 Page Reference:  272
 Author/Distributor:  GNU
 Available From:  GNU Binary Utilities (binutils); Native to Linux distributions
 Description:  Displays plain-text ASCII and UNICODE (contiguous) characters within a file.
 Helpful Switches:

Switch      Function
-a          Scan the entire file, not just the data section
-f          Displays the file name of the target file before each string
-<number>   String sequence is at least <number> characters in length; default is 4
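
The core behavior of string extraction can be sketched in Python with a regular expression. This is a simplified illustration, not the GNU implementation: it handles only contiguous printable ASCII (bytes 0x20 through 0x7e) and ignores wide-character encodings; the function name extract_strings is hypothetical, and its min_length parameter plays the role of the -<number> switch.

```python
import re


def extract_strings(data, min_length=4):
    """Return runs of printable ASCII at least min_length bytes long."""
    # Bytes 0x20-0x7e are the printable ASCII range (space through tilde).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]
```

For a suspect file, the bytes would typically be read with open(path, "rb").read() before being passed to the function.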





File Dependencies
 Name:  LDD
 Page Reference: 281
 Author/Distributor:  Roland McGrath and Ulrich Drepper
 Available From: Native to Linux distributions
 Description: Displays the shared libraries required by a target program/executable file. Standard usage:

$ ldd <target file>

 Helpful Switches:

Switch   Function
-d       Processes data relocations
-r       Processes data and function relocations
-u       Shows unused direct dependencies
-v       Verbose; prints all information





 Name:  ELF Library Viewer
 Page Reference:  283
 Author/Distributor: Michael Payne
 Available From:  http://www.purinchu.net/software/elflibviewer.php
 Description:  Graphical utility for displaying library dependencies of a target ELF file. Libraries are displayed in hierarchical order with respective file path prominently displayed in a separate field. A built-in search tool enables the digital investigator to quickly query and locate specific libraries; identified files are displayed in red text.

                      




Name:  Visual Dependency Walker
Page Reference:  283-284
Author/Distributor: Filippos Papadopoulos and David Sansome
Available From: http://freecode.com/projects/visual_ldd

Description: Graphical utility for displaying library dependencies of a target ELF file. Libraries are displayed in hierarchical order with respective file path prominently displayed in a separate field. A useful right-click menu offers the digital investigator the ability to quickly expand or collapse the dependency tree or save the tree to a text file.

                          






Name:   DepSpec
Page Reference:  284
Author/Distributor:  Kyle McFarland
Available From:  https://launchpad.net/depspec

Description: A dual-paned GUI for file dependency analysis that reveals imported libraries, exported libraries, and associated symbolic references. DepSpec can process both ELF and Windows PE files.

                            






Name: ELF Dependency Walker
Page Reference:  283-284
Author/Distributor: Peter Cheung
Available From: http://code.google.com/p/elf-dependency-walker/
Description: A feature-rich ELF file dependency analysis tool that provides alternative viewing options. In tree mode, suspect files are displayed in hierarchical tree order, with dependencies viewable by expanding the tree. Upon selecting a target file or dependency, the file structure is displayed in the right-hand viewing pane. Alternatively, the graph mode enables the digital investigator to render the file dependencies of a target file specimen into eight different graphical layouts for contrasting perspective. Graphical layouts can be saved as a Portable Network Graphics (.png) image file.


               




Extracting Symbolic and Debug References





 Name: NM
 Page Reference:  285
 Author/Distributor: GNU
 Available From:  GNU Binary Utilities (binutils); Native to Linux distributions
 Description:  Command-line utility that lists symbols in a target file.
 Helpful Switches:

Switch   Function
-a       Displays debugger-only symbols
-A       Displays the name of the input file before every symbol
-C       "Demangle" mode that decodes low-level symbol names into user-level names
-D       Display dynamic symbols instead of standard symbols
-g       Only display external symbols
-l       Use debugging information to locate a file name
-n       Sort symbols numerically by address





 Name: Gedit Symbol Browser Plugin
 Page Reference:  291
 Author/Distributor:  Micah Carrick
 Available From:  http://www.micahcarrick.com/gedit-symbol-browser-plugin.html
 Description:  A graphical symbol extraction and analysis tool that is leveraged as a plugin through gedit.

                          




File Metadata


Name:  Exiftool
Page Reference: 294
Author/Distributor:  Phil Harvey
Available From:    http://www.sno.phy.queensu.ca/~phil/exiftool/
Description:  A powerful command-line metadata extraction tool that can acquire meta information from ELF, PDF, and MS Office files, among other types of target malware files.
Helpful Switches:

Switch   Function
-q       Quiet processing
-r       Recursively process subdirectories
-s       Short output format
-S       Very short output format
-w EXT   Write console output to file




ELF File Analysis





 Name:  Binutils
 Page Reference:  276
 Author/Distributor:  GNU
 Available From:  www.gnu.org/software/binutils
 Description: A collection of binary tools for manipulating and analyzing object and archive files, including, among others, nm (lists symbols from object files), strings, readelf, and objdump.






 Name:  Elfutils
 Page Reference:  276
 Author/Distributor:  Ulrich Drepper
 Available From:  https://fedorahosted.org/elfutils/
 Description: A collection of utilities for working with ELF object files, including:

Utility      Function
eu-elfcmp    A tool for "diffing" or comparison of relevant parts of two target ELF files
eu-elflint   Checks target file compliance with gABI/psABI specifications
eu-nm        Lists symbols in a target file
eu-objdump   Displays information in object files
eu-readelf   Displays the structures and contents of ELF files
eu-size      Lists section sizes of target file
eu-strings   Displays plain-text ASCII and UNICODE (contiguous) characters within a file





 Name:  Greadelf
 Page Reference:  276
 Author/Distributor:  Ashok Das
 Available From:  https://code.google.com/p/greadelf/
 Description: GUI for the readelf and eu-readelf utilities that provides easy-to-navigate multi-pane views of ELF file structures and contents.

              



Name: ERESI Reverse Engineering Software Interface ("ERESI")
Page Reference:  308
Author/Distributor: Julien Vanegue and the ERESI team
Available From: http://www.eresi-project.org/
Description:  A framework of multi-architecture binary analysis tools geared toward reverse engineering and program manipulation. The framework includes the following tools: elsh, kernsh, e2dbg, etrace, evarista, kedbg. In addition to these programs ERESI contains numerous specialized libraries that can be used by ERESI and/or in third-party programs.




Name:  Readelf
 Page Reference:  277, 305, 308
 Author/Distributor:  GNU
 Available From:   www.gnu.org/software/binutils
 Description:  A command-line tool that displays the structure and contents of ELF files.

 Helpful Switches:

Switch       Function
-a           All
-h           Displays file header
-l           Displays program headers
-S           Displays section headers
-t           Displays section details
-e           Verbosely displays header details
-s           Displays symbols
--dyn-syms   Displays dynamic symbols
--notes      Displays notes
-V           Displays version information in file
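
For a sense of what the file header view reports, the following Python sketch decodes a handful of ELF header fields with the standard struct module. It is a rough illustration and covers far less than readelf; the function name parse_elf_header and the lookup tables are hypothetical, with field offsets and values taken from the generic ELF specification.

```python
import struct

ELF_CLASSES = {1: "ELF32", 2: "ELF64"}
ELF_DATA = {1: "little endian", 2: "big endian"}
ELF_TYPES = {0: "NONE", 1: "REL", 2: "EXEC", 3: "DYN", 4: "CORE"}


def parse_elf_header(data):
    """Decode a few ELF file header fields from the first bytes of a file."""
    if len(data) < 20 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data = data[4], data[5]
    if ei_data not in ELF_DATA:
        raise ValueError("unknown data encoding")
    byte_order = "<" if ei_data == 1 else ">"
    # e_type and e_machine are 16-bit fields right after the 16-byte e_ident.
    e_type, e_machine = struct.unpack_from(byte_order + "HH", data, 16)
    return {
        "class": ELF_CLASSES.get(ei_class, "unknown"),
        "data": ELF_DATA[ei_data],
        "type": ELF_TYPES.get(e_type, hex(e_type)),
        "machine": e_machine,
    }
```

Only the first 20 bytes of the target file are needed, so the header can be examined without reading the whole specimen into memory.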




 Malicious Document Analysis




Malicious Document Analysis: PDF Files


Name:  Origami
Page Reference:  341-344
Author/Distributor:  Guillaume Delugré, Frédéric Raynal (Contributor)
Available From:     http://esec-lab.sogeti.com/dotclear/index.php?pages/Origami;
http://code.google.com/p/origami-pdf/
Description:  Origami is a framework of tools written in Ruby designed to parse and analyze malicious PDF documents as well as to generate malicious PDF documents for research purposes. Origami contains a series of Ruby parsers, or core scripts (described in the table below); additional scripts; and Walker, a GTK GUI for examining suspect PDF files (depicted in the figure below).


                            
                       
Helpful Switches: 

Script             Function
pdfscan.rb         Parses the contents and structures of a target PDF file specimen
extractjs.rb       Extracts JavaScript from a target PDF file specimen
detectsig.rb       Detects malicious signatures in a target PDF file specimen
pdfclean.rb        Disables common malicious trigger functions
printmetadata.rb   Extracts file metadata from a target PDF file specimen





Name:  Jsunpack-n
Page Reference:  344
Author/Distributor:  Blake Hartstein
Available From:  https://code.google.com/p/jsunpack-n/; Jsunpack: http://jsunpack.jeek.org/dec/go
Description: Jsunpack-n, "a generic JavaScript unpacker," is a suite of tools written in Python designed to emulate browser functionality when navigating to URLs. Although a powerful tool for researchers to identify client-side browser vulnerabilities and exploits, Jsunpack-n is also a favorite tool of digital investigators to examine suspect PDF files and extract embedded JavaScript. In the figure below, the pdf.py script is used to extract JavaScript from a suspect PDF file specimen and write it to a separate file for further analysis.


 malwarelab@MalwareLab:~/jsunpack-n$ ./pdf.py /home/malwarelab/Desktop/merry_christmas\ UNZIPPED.pdf
processing /home/malwarelab/Desktop/merry_christmas UNZIPPED.pdf!!!
parsing /home/malwarelab/Desktop/merry_christmas UNZIPPED.pdf
failed to decompress object 26 0
Found JavaScript in 31 0 (3106 bytes)
     children []
     tags [['Filter', ''], ['FlateDecode', ''], ['Length', '1213']]
     indata = <</Filter[/FlateDecode]/Length 1213>>streamHVmOG8Yd)}$PpEZ)io^y=Ytp<?>5a~=<9<s'g7-]/ghhiIwwwhY
Wrote JavaScript (9085 bytes -- 5979 headers / 3106 code) to file /home/malwarelab/Desktop/merry_christmas UNZIPPED.pdf.out
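
The output above refers to FlateDecode streams, which are zlib-compressed. The following Python sketch shows the general idea of inflating such streams and checking for JavaScript name objects. It is not how Jsunpack-n's pdf.py works internally; the function names and the regular expression are assumptions for illustration, and the sketch ignores other filters, filter chains, object streams, and encryption.

```python
import re
import zlib

# Capture everything between the "stream" keyword and the next "endstream".
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
JS_MARKERS = (b"/JavaScript", b"/JS")


def inflate_streams(pdf_bytes):
    """Yield the decompressed body of every stream that zlib can inflate."""
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            # decompressobj tolerates the end-of-line bytes left after the data.
            yield zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            # Comparable to a "failed to decompress" case: skip and move on.
            continue


def has_javascript_marker(pdf_bytes):
    """True if the raw file mentions the /JS or /JavaScript name objects."""
    return any(marker in pdf_bytes for marker in JS_MARKERS)
```

Each inflated stream can then be reviewed manually or searched for script content; a marker hit alone only indicates that the file declares JavaScript, not that the script is malicious.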








 Name:  PDFMiner
 Page Reference:  347
 Author/Distributor: Yusuke Shinyama
 Available From:  http://www.unixuser.org/~euske/python/pdfminer/index.html
 Description: Python PDF parser and analyzer. PDFMiner consists of numerous Python scripts to examine the textual data inside of a PDF file, including pdf2txt.py (extracts text contents from a PDF file) and dumppdf.py (dumps the internal contents of a PDF file in pseudo-XML format).





 Name:  Peepdf
 Page Reference:  347
 Author/Distributor:  Jose Miguel Esparza
 Available From:  http://code.google.com/p/peepdf/
 Description:   Command-line based PDF parser and analyzer. Peepdf can be invoked from the command line and pointed toward a target file (shown in the figure below), or set into "interactive mode" (using the -i switch), creating a peepdf "PPDF shell" in which commands can be issued directly.







 Helpful Switches:


Command      Function
metadata     Displays target PDF file metadata, including Creation Date, Modified Date, Producer, Creator, Keywords, and Author, among other items
object       Displays the content of a target object after being decoded and decrypted
offsets      Displays the physical structure of the target document
open         Opens and parses the target file
rawobject    Displays the content of a target object without being decoded and decrypted
rawstream    Displays the content of a target stream without being decoded and decrypted
references   Displays the references in the object or to the object in a target file
search       Searches the target file for a specified string or hexadecimal string
stream       Displays the content of a target stream after being decoded and decrypted





 Name:  Malzilla
 Page Reference:  351
 Author/Distributor:   Boban Spasic aka bobby
 Available From:  http://malzilla.sourceforge.net/downloads.html
 Description:   Described by the developer as a malware hunting tool, Malzilla is commonly used by malicious code researchers to navigate to potentially malicious URLs in an effort to probe the contents for malicious code and related artifacts. However, Malzilla has a variety of valuable decoding and shellcode analysis features, making it an essential tool in the digital investigator’s arsenal for exploring malicious PDF files. As of this writing, the tool does not natively install and run in Linux; WINE or CrossOver must first be installed on the analysis system.









 Name:  Hachoir-urwid
 Page Reference:  357
 Author/Distributor:  Victor Stinner
 Available From:  https://bitbucket.org/haypo/hachoir/wiki/hachoir-urwid
 Description:  Based upon hachoir-parser, hachoir-urwid is a binary file exploration utility that can parse a myriad of file types, including OLE files.

              





Name:  Hachoir-wx
Page Reference:  271, 357
Author/Distributor:  Victor Stinner
Available From:  https://bitbucket.org/haypo/hachoir/wiki/hachoir-wx
Description:  A wxWidgets-based GUI for Hachoir that enables the digital investigator to parse binary files, including OLE files.

             





Name:  pyOLEScanner
Page Reference:  357
Author/Distributor:  Giuseppe ’Evilcry’ Bonfa
Available From:  https://github.com/Evilcry/PythonScripts
Description:  Python script for triaging OLE files for indicators of malice, including embedded executables, API references, shellcode, macros, and other artifacts.

 lab@MalwareLab:~/pyOLEScanner$ python pyOLEScanner.py Discussions.doc
+-------------------------------+
| OLE Scanner v. 1.2
| by Giuseppe 'Evilcry' Bonfa
+-------------------------------+

[-] OLE File Seems Valid

[+] Hash Informations

MD5: 2e0aafbf78c3459dfa5cb1d1d88e6bc3
SHA-1: 59b15f68f3b72dfea14e50878b31b87bee3019fa
[+] Scanning for Embedded OLE in Clean

Revealed presence of Embedded OLE

[+] Scanning for API presence in Clean

Revealed presence of WinExec at offset:0x703c
Revealed presence of ShellExecute at offset:0x70d4
Revealed presence of UrlDownloadToFile at offset:0x7046
Revealed presence of UrlDownloadToFile at offset:0x6f2a

==========================================

Warning File is Potentially INFECTED!!!!

[+] Scanning for Embedded Executables - Clean Case

('Embedded Executable discovered at offset :', '0x344e', '\n')

==========================================

Warning File is Potentially INFECTED!!!!

[+] Scanning for Shellcode Presence

FS:[00] Shellcode at offset:0x6137
NOP Slide:0x5c0a

==========================================

Warning File is Potentially INFECTED!!!!

[+] Scanning for MACROs
==========================================

No MACROs Revealed
('An Error Occurred:', 'columns MD5, SHA1 are not unique')
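
Two of the checks visible in the output above, validating the OLE signature and locating API name references, can be approximated in Python. This is a minimal sketch, not pyOLEScanner's code: the function names are hypothetical, the API list is limited to the three names that appear in the sample output, and the search is a plain case-insensitive byte match that will also hit harmless text.

```python
# Signature found at offset 0 of OLE compound (structured storage) files.
OLE_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")
SUSPICIOUS_APIS = (b"WinExec", b"ShellExecute", b"UrlDownloadToFile")


def looks_like_ole(data):
    """True if data begins with the OLE compound file signature."""
    return data.startswith(OLE_MAGIC)


def find_api_references(data, names=SUSPICIOUS_APIS):
    """Return (name, offset) pairs for each case-insensitive occurrence."""
    lowered = data.lower()  # lowercases ASCII letters only; offsets unchanged
    hits = []
    for name in names:
        needle = name.lower()
        start = lowered.find(needle)
        while start != -1:
            hits.append((name.decode("ascii"), start))
            start = lowered.find(needle, start + 1)
    return sorted(hits, key=lambda hit: hit[1])
```

Offsets returned this way can be cross-checked in a hexadecimal editor to confirm the surrounding context before drawing conclusions about the specimen.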



 




Name:  Beye (Binary Eye) (formerly known as Binary vIEWer, "BIEW")
Page Reference:  357
Author/Distributor:  Nickols Kurshev
Available From:   http://beye.sourceforge.net/
Description:  A terminal user interface based tool for parsing numerous binary file formats, including ELF and OLE files.

                       




Name:  Structured Storage Viewer
Page Reference:  357
Author/Distributor:  MiTec/Michal Mutl
Available From:  http://www.mitec.cz/ssv.html
Description:  GUI tool for analyzing and manipulating MS OLE Structured Storage files. As of this writing, the tool does not natively install and run in Linux; WINE or CrossOver must first be installed on the analysis system.