Vault 7: Projects

This publication series is about specific projects related to the Vault 7 main publication.
SECRET//20350112
15.4 (S) Generic Filter (GF) Search Algorithm Details
(S) This section describes the email and chat search algorithms used by the GF.
15.4.1 (S) Email Search
(S) This section describes the email search algorithm used by the GF (see 5.2.3.5).
(S) A “hooking” mechanism is used for gaining read/write access to all network packets
passing through the device. Target emails are then found by searching hooked network
packets for the ‘@’ character. Because many webmail protocols use URL encoding, the
Generic Filter (GF) first URL decodes packets meeting port and protocol criteria to a
temporary buffer (for example, the @ sign may be URL encoded as ‘%40’, so decoding it will
return it to an @ sign in the temporary buffer). The GF then searches the temporary
buffer forward for an @ sign. If it finds an @ sign, it searches backwards to find a
terminating username character. In practice, two searches are performed: one
looking for the first invalid RFC 822 character and another looking for the first
invalid “webmail” character (or the start of the buffer). Most popular webmail services
allow only numbers, letters, dots, dashes, and underscores. Similar logic is used to search
forward from the @ sign to determine the domain. These searches result in an RFC 822
email and a “webmail” email for each @ sign in a packet. The GF then marks each of the
emails valid if the username is at least 2 characters in length and the domain is at least 3
characters in length (this eliminates a lot of unnecessary computation of hashes when, for
example, a binary file with many @ signs but no email addresses is downloaded). The GF
then computes the MD5 hash of the valid emails, and compares those hashes to each
target email address hash in the email target list of the Mission. The GF then resumes
searching forward from the current @ sign for the next @ sign, and the process is
repeated for each ‘@’ sign found.
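The scan described above can be illustrated with a minimal Python sketch that operates on a plain byte buffer. It is not the GF implementation. The names `WEBMAIL_CHARS`, `candidates`, and `matching_addresses` are hypothetical, only the “webmail” character set is shown (an RFC 822 pass would supply a different `allowed` set), and the digest is assumed to be taken over the address bytes exactly as found, since the text does not say whether any normalization is applied.

```python
import hashlib
import string
from urllib.parse import unquote_to_bytes

# Assumed "webmail" character set: letters, numbers, dots, dashes, underscores.
WEBMAIL_CHARS = frozenset((string.ascii_letters + string.digits + ".-_").encode())

MIN_USER_LEN = 2    # username must be at least 2 characters
MIN_DOMAIN_LEN = 3  # domain must be at least 3 characters


def candidates(buf: bytes, allowed=WEBMAIL_CHARS):
    """Yield (username, domain) for each '@' in an already URL-decoded buffer."""
    pos = buf.find(b"@")
    while pos != -1:
        # Search backwards to the first invalid character or the buffer start.
        start = pos
        while start > 0 and buf[start - 1] in allowed:
            start -= 1
        # Search forwards to the first invalid character or the buffer end.
        end = pos + 1
        while end < len(buf) and buf[end] in allowed:
            end += 1
        user, domain = buf[start:pos], buf[pos + 1:end]
        # The length check avoids hashing obvious non-addresses.
        if len(user) >= MIN_USER_LEN and len(domain) >= MIN_DOMAIN_LEN:
            yield user, domain
        # Continue with the next '@' after the current one.
        pos = buf.find(b"@", pos + 1)


def matching_addresses(payload: bytes, known_digests: set) -> list:
    """URL-decode the payload, then return addresses whose MD5 is in known_digests."""
    decoded = unquote_to_bytes(payload)  # '%40' becomes '@'
    hits = []
    for user, domain in candidates(decoded):
        address = user + b"@" + domain
        if hashlib.md5(address).hexdigest() in known_digests:
            hits.append(address.decode("ascii"))
    return hits
```

For example, with a digest set containing the MD5 of `alice.b@example.com`, the payload `b"to=alice.b%40example.com&x=1"` decodes to a buffer in which the backward search stops at `=` and the forward search stops at `&`, so the candidate address is hashed and matched.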
(S) Because network traffic is “packetized”, an email could span two packets. The GF
handles this case by buffering a (URL decoded) portion (nominally 32 bytes) of the
previous packet. The GF “prepends” this buffer to the current packet buffer and performs
the search as described before.
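The carry-over buffering can be sketched as follows. This is an illustration under stated assumptions rather than the GF code: the class name `TailBuffer` and its `window` method are hypothetical, the 32-byte default mirrors the nominal size given above, and the retained tail is assumed to come from the most recent packet only.

```python
class TailBuffer:
    """Keeps the last tail_len URL-decoded bytes of the previous packet."""

    def __init__(self, tail_len: int = 32):
        self.tail_len = tail_len
        self.tail = b""

    def window(self, decoded: bytes) -> bytes:
        """Return the saved tail prepended to the current decoded packet."""
        combined = self.tail + decoded
        # Save the end of this packet for the next call
        # (assumption: only the current packet contributes to the tail).
        self.tail = decoded[max(0, len(decoded) - self.tail_len):]
        return combined


# Usage: an address split across two packets appears whole in the second window.
buf = TailBuffer()
first = buf.window(b"from=alice.b@exam")
second = buf.window(b"ple.com&x=1")
print(second)  # the combined buffer contains the complete address
```

One consequence of this design, assuming the tail is searched again with each new packet, is that an address lying entirely inside the retained tail would be examined twice.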
15.4.2 (S) Chat Search
(S) This section describes the chat search algorithms used by the GF (see 5.2.3.5).
Currently supported chat clients include Yahoo Messenger (YM) and America
Online Instant Messenger (AIM) (as of October 2010, maktoob is part of Yahoo! and the
maktoob chat service is no longer available). Many AIM and YM events, including login,
sending and receiving chat messages, and logout, are detected. Only MC login events are
detected. GoogleTalk and MSN Messenger chat users are also supported, but because
these clients use email addresses to identify users, the Email Search algorithm of 15.4.1
detects them.
(S) A “hooking” mechanism is used for gaining read/write access to all network packets
passing through the device. Target chat users are then found by passing all network