Guide to Malware Analysis
Components of malware analysis
- Basic static analysis
- Basic dynamic analysis
- Reverse engineering
Basic static analysis
- VirusTotal scan
- PEiD - detect packer
- UPX - UPX unpacker
- Dependency walker - list dynamically linked functions
- PEView - view PE files
- Resource Hacker - view resources embedded in a PE file (icons, dialogs, strings)
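A rough sketch of what the first triage steps look like in code: hash the sample (for a VirusTotal lookup) and read the PE section table to spot the `UPX0`/`UPX1` section names that UPX normally leaves behind. The function names and `build_toy_pe` are made up for illustration; a packer can rename its sections, so a clean result proves nothing.

```python
import hashlib
import struct


def sha256_hex(data):
    """SHA-256 of the sample; a hash lookup avoids uploading the file itself."""
    return hashlib.sha256(data).hexdigest()


def pe_section_names(data):
    """Return the section names of a PE image, or [] if parsing fails."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return []
    try:
        # e_lfanew (offset 0x3C in the DOS header) points at the "PE\0\0" signature.
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] != b"PE\0\0":
            return []
        coff = pe_offset + 4  # the 20-byte COFF file header follows the signature
        (section_count,) = struct.unpack_from("<H", data, coff + 2)
        (optional_size,) = struct.unpack_from("<H", data, coff + 16)
    except struct.error:
        return []
    table = coff + 20 + optional_size  # section table starts after the optional header
    names = []
    for i in range(section_count):
        start = table + 40 * i  # each section header is 40 bytes, name is the first 8
        raw = data[start:start + 8]
        if len(raw) < 8:
            break
        names.append(raw.rstrip(b"\0").decode("ascii", errors="replace"))
    return names


def looks_upx_packed(section_names):
    """Heuristic only: UPX usually names its sections UPX0, UPX1, ..."""
    return any(name.startswith("UPX") for name in section_names)


def build_toy_pe(section_names):
    """Hypothetical helper: a minimal header-only PE blob for trying the parser."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)
    coff = struct.pack("<HHIIIHH", 0x14C, len(section_names), 0, 0, 0, 0, 0)
    sections = b"".join(n.ljust(8, b"\0") + bytes(32) for n in section_names)
    return bytes(dos) + b"PE\0\0" + coff + sections


if __name__ == "__main__":
    sample = build_toy_pe([b"UPX0", b"UPX1", b".rsrc"])
    print(sha256_hex(sample))
    print(pe_section_names(sample), looks_upx_packed(pe_section_names(sample)))
```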
Evading antivirus signatures
from the "Antivirus Hacker's Handbook"
Some basic methods
- Binary padding
- Dividing malware into smaller pieces, then making sure each piece isn't detected
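Why binary padding works against the weakest kind of signature can be shown on harmless bytes. This sketch assumes two simplified signature types (a whole-file hash and a byte pattern); both matcher functions are hypothetical, not how any particular antivirus product works.

```python
import hashlib


def sha256_hex(data):
    return hashlib.sha256(data).hexdigest()


def matches_hash_signature(data, known_hashes):
    """Whole-file hash signature: any change to the file changes the hash."""
    return sha256_hex(data) in known_hashes


def matches_byte_signature(data, pattern):
    """Byte-pattern signature: still matches as long as the pattern survives."""
    return pattern in data


original = b"harmless stand-in for a sample"
padded = original + b"\x00" * 16  # binary padding: extra bytes appended at the end

known_hashes = {sha256_hex(original)}
pattern = b"stand-in"

if __name__ == "__main__":
    print(matches_hash_signature(original, known_hashes))  # hash of the known file
    print(matches_hash_signature(padded, known_hashes))    # padded copy has a new hash
    print(matches_byte_signature(padded, pattern))         # pattern is still present
```

The takeaway for a defender is that hash-only detection is brittle, which is why byte-pattern and behavioral signatures exist.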
A systematic method
- Where are the malware definition files?
- What is their file format?
- How is the signature encoded in malware definition files?
- What are some edge cases the signature doesn't consider?
- Check your work using VirusTotal
These are words used to describe malware samples. A malware sample may fit more than one of these categories!
Worms vs viruses
- Worms can spread on their own
- For example, a worm may port scan other machines on a subnet and exploit an SSH vulnerability
- Viruses require user action
- For example, an "ILOVEYOU"-style email (with an infected Office file attached) that urges you to forward it to your contacts
from SourceFire Chalk Talks
Why do people pay botmasters?
- installing unwanted applications
- click fraud
- stealing information
- mass identity theft
- spread new malware
- The top use of botnets is spam
- One botnet sent 25K emails per day per host
- Even if emails from 1 host are blocked, can send emails from other hosts
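Back-of-the-envelope arithmetic for the spam numbers above. The 25K per host per day figure is from the notes; the botnet size and the number of blocked hosts are made-up inputs.

```python
EMAILS_PER_HOST_PER_DAY = 25_000  # figure quoted in the notes


def daily_spam_volume(hosts, blocked_hosts=0):
    """Emails per day from the hosts that are not blocked."""
    active_hosts = max(hosts - blocked_hosts, 0)
    return active_hosts * EMAILS_PER_HOST_PER_DAY


if __name__ == "__main__":
    # Hypothetical botnet of 10,000 hosts, then the same botnet with 1,000 hosts blocked.
    print(daily_spam_volume(10_000))
    print(daily_spam_volume(10_000, blocked_hosts=1_000))
```

Blocking 10% of the hosts only removes 10% of the volume, which is why per-host blocking does not stop the campaign.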
DDoS attacks
- attacker doesn't like the website (eg, Krebs on Security)
- extortion (eg, threatening to shut down an online gambling site before the Super Bowl)
Unwanted applications - "pay per install"
- fake antivirus software that tells user they must pay to remove malware
Click fraud
- Idea: usually, an advertiser gives an ad to an affiliate
- Affiliate places ads on Internet
- When someone clicks on the ad, advertiser pays affiliate
- Botnets can simulate clicks on ads a bad affiliate places
- So advertiser has to pay this affiliate a lot of money
- Note - botnet renter and affiliate must collude/be same group
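The money flow in click fraud is simple multiplication. A sketch with made-up numbers (bot count, clicks per bot, and cost per click are all assumptions; amounts are kept in integer cents to avoid floating-point rounding):

```python
def fraudulent_payout_cents(bots, clicks_per_bot_per_day, cost_per_click_cents, days=1):
    """What the advertiser pays the colluding affiliate for simulated clicks."""
    return bots * clicks_per_bot_per_day * cost_per_click_cents * days


if __name__ == "__main__":
    # Hypothetical: 1,000 bots, 5 clicks each per day, 10 cents per click, 30 days.
    cents = fraudulent_payout_cents(1_000, 5, 10, days=30)
    print(f"${cents // 100:,}")
```

Keeping the clicks per bot low is what makes each bot look like an ordinary user.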
Stealing information
- can extract customer data, corporate credentials, etc
- or passwords, credit card numbers, bank credentials
- to harvest passwords, can sniff packets or keylog
- sell in underground economy
Overall, you can do mass identity theft
- send phishing emails from people's email addresses
- host phishing sites on infected machines
Spreading new malware
- if you want to hack company X, can buy a bot inside company X
- Russia can install NotPetya on all the bots in Ukraine
- then, the malware can spread exponentially on its own
- power in cyberspace = # of hosts you control
Popular comms channels
- Peer to peer (P2P)
- Covert channel
from SourceFire Chalk Talks
What is a rootkit?
- Malware designed to be persistent and hard to detect
- Remember -- a piece of malware can be both a rootkit and a backdoor, for instance
from the "Rootkit Arsenal" book
- Flat memory model
- Segmented memory
- Each process is assigned pages of memory, which can be on disk or in RAM
- Process thinks it has a linear, contiguous address space and owns all of memory
- The MMU converts process's virtual addresses into physical addresses
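A toy model of what the MMU does, assuming a single-level page table and 4 KiB pages (real x86 paging uses multi-level tables; the `page_table` layout and `PageFault` class here are invented for illustration).

```python
PAGE_SIZE = 4096  # 4 KiB pages, the common x86 default


class PageFault(Exception):
    """Raised when the page is unmapped or currently swapped out to disk."""


def translate(page_table, virtual_address):
    """Map a virtual address to a physical one: page number -> frame, keep the offset."""
    page_number, offset = divmod(virtual_address, PAGE_SIZE)
    entry = page_table.get(page_number)
    if entry is None:
        raise PageFault(f"page {page_number} is not mapped")
    frame_number, in_ram = entry
    if not in_ram:
        raise PageFault(f"page {page_number} is on disk")
    return frame_number * PAGE_SIZE + offset


# Hypothetical per-process table: virtual page -> (physical frame, present in RAM?)
page_table = {0: (7, True), 1: (3, True), 2: (None, False)}

if __name__ == "__main__":
    print(hex(translate(page_table, 0x1234)))  # page 1, offset 0x234 -> frame 3
```

The process only ever sees the virtual addresses, which is why it "thinks" memory is contiguous even though pages 0 and 1 live in unrelated frames.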
The x86 (IA-32) processor has 3 modes of operation:
- Real mode - 16 bit
- Protected mode - Run OS, like Windows 7
- System management mode - run CPU firmware code (like emergency CPU shutdown)
BIOS and boot code run in real mode
- OS loads interrupt vector table (IVT) into memory
- IVT stores the memory addresses of the interrupt handlers for each interrupt
- When a keypress happens (for example), it triggers a hardware interrupt. The CPU looks up the "key pressed" interrupt in the IVT and executes the code at that memory address
- Types of interrupts - hardware, software, exceptions
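A sketch of the IVT lookup, modelling the real-mode table as a byte array: 256 entries of 4 bytes each, stored as a 16-bit offset followed by a 16-bit segment. The handler address used below is an illustrative value, and `set_handler` is a hypothetical helper.

```python
import struct

IVT_ENTRIES = 256
IVT_ENTRY_SIZE = 4  # 16-bit offset, then 16-bit segment (little-endian)


def set_handler(ivt, interrupt_number, segment, offset):
    """Write a segment:offset handler pointer into the table."""
    struct.pack_into("<HH", ivt, interrupt_number * IVT_ENTRY_SIZE, offset, segment)


def handler_address(ivt, interrupt_number):
    """Look up the handler and convert segment:offset to a linear address."""
    offset, segment = struct.unpack_from("<HH", ivt, interrupt_number * IVT_ENTRY_SIZE)
    return segment * 16 + offset


if __name__ == "__main__":
    ivt = bytearray(IVT_ENTRIES * IVT_ENTRY_SIZE)
    KEYBOARD_INTERRUPT = 0x09  # conventional keyboard vector on PC-compatibles in real mode
    set_handler(ivt, KEYBOARD_INTERRUPT, segment=0xF000, offset=0xE987)  # illustrative address
    print(hex(handler_address(ivt, KEYBOARD_INTERRUPT)))
```

Because the table is just writable memory in real mode, overwriting one entry is enough to redirect an interrupt, which is the same idea as the hooking technique described later.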
x86 CPUs have 4 "rings" or levels of privilege:
- Ring 0 = Kernel
- Ring 1, 2 = Device drivers (in practice, Windows and Linux use only rings 0 and 3)
- Ring 3 = Apps
Apps use syscalls to run code in kernel mode.
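On x86 the current ring is the low two bits of the CS segment selector, so it can be read off with a mask. A minimal sketch; the selector values are illustrative examples, not taken from the notes.

```python
def privilege_level(cs_selector):
    """Current privilege level (ring) = low two bits of the CS selector."""
    return cs_selector & 0b11


if __name__ == "__main__":
    USER_CS = 0x33    # illustrative user-mode selector
    KERNEL_CS = 0x10  # illustrative kernel-mode selector
    print(privilege_level(USER_CS), privilege_level(KERNEL_CS))
```

A syscall is the controlled transition from the first value to the second and back.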
3 main rootkit techniques
- Modify code directly. Eg, modify code of grep, ls, etc
- Hooking - modify where the program looks for code. Specifically, modify an Import Address Table (IAT)
- Modify data structures (kernel mode rootkits only). For example, modify EPROCESS linked list to hide a process
An import address table (IAT) is loaded in memory for each module (EXE or DLL) in a process. It stores the addresses of all the functions that module imports from other DLLs.
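A toy model of IAT hooking, using a dict of callables in place of a table of function addresses. Everything here (`ListFiles`, the file names, the helper functions) is invented for illustration; a real IAT lives inside a loaded PE module. The last function shows the matching detection idea: compare the table against the addresses you expect.

```python
def real_list_files():
    return ["notes.txt", "rootkit.sys", "report.doc"]


def call_import(table, name):
    """The program never calls the function directly; it goes through the table."""
    return table[name]()


def install_hook(table, name, make_replacement):
    """Swap the table entry; the caller's code is left untouched."""
    original = table[name]
    table[name] = make_replacement(original)
    return original


def hide_rootkit_files(original):
    def hooked():
        # Call the real function, then filter its result before returning it.
        return [f for f in original() if not f.startswith("rootkit")]
    return hooked


def find_hooked_entries(table, expected):
    """Detection sketch: entries that no longer point at the expected function."""
    return [name for name, func in table.items() if expected.get(name) is not func]


if __name__ == "__main__":
    expected = {"ListFiles": real_list_files}
    iat = dict(expected)  # toy import table: imported name -> "address"
    print(call_import(iat, "ListFiles"))
    install_hook(iat, "ListFiles", hide_rootkit_files)
    print(call_import(iat, "ListFiles"))
    print(find_hooked_entries(iat, expected))
```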
- Deleting Windows event logs
- A shell script that deletes an executable on disk after the executable runs (to avoid analysis)
- Hiding data to exfiltrate in slack space on disk
- Armoring - rootkit decrypts itself at runtime
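Armoring is easiest to see with a toy example: the data is stored encoded, so a strings-style search of the file finds nothing, and it only becomes readable after the decode step runs. This sketch assumes single-byte XOR, which is a stand-in for whatever scheme a real sample uses; the key and the string are made up.

```python
def xor_bytes(data, key):
    """Single-byte XOR; applying it twice with the same key restores the input."""
    return bytes(b ^ key for b in data)


KEY = 0x5A  # hypothetical key
plaintext = b"connect to example.invalid"
stored = xor_bytes(plaintext, KEY)  # what an analyst would see on disk

if __name__ == "__main__":
    print(b"example" in stored)                  # static string search on the stored bytes
    print(xor_bytes(stored, KEY) == plaintext)   # decoding at runtime recovers the data
```

This is why dynamic analysis (letting the sample decode itself in a sandbox) often gets further than static analysis on armored samples.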
Waste analyst's time by creating a lot of false positives
- Say a rootkit needs to modify a DLL--just modify every DLL on the system!
- Easy to detect, but hard to tell what the rootkit's target DLL is
Data source elimination
- Don't leave a trace at all!