Network Data Source | pcaps, snort IDS alerts |
Network Data Labeled | Some ground truth provided |
Host Data Source | Apache web server logs, Splunk logs |
Host Data Labeled | No |
Overall Setting | Enterprise IT |
OS Types | Windows Server 2008, FreeBSD, Fedora |
Number of Machines | ~12 |
Total Runtime | ~4 days |
Year of Collection | 2009 |
Attack Categories | n/a |
Benign Activity | Synthetic, via scripts |
Packed Size | 12 GB |
Unpacked Size | 15.3 GB |
Download Link | n/a |
Overview
The Cyber Defense Exercise Capture-The-Flag (CDX-CTF) dataset originated from the idea of instrumenting network warfare competitions to generate new, labeled datasets suitable for evaluation of intrusion detection systems. The dataset is named after the competition it was generated in, which was held by the U.S. Military Academy (USMA) and other military colleges in cooperation with the NSA. The authors note that their method of data collection only allows for rather coarse labeling, and suggest methods for alleviating this problem in the future.
Note: At the time of writing, this dataset can no longer be found at its old location. The USMA homepage is still linked below, but I was unable to find the dataset in question anywhere. Should you find a working URL, please open a pull request.
Environment
Each team set up a “standard” enterprise network with Windows and Linux machines and additionally had to integrate three untrusted workstations provided by the organizers. A detailed network infrastructure diagram is linked below.
Activity
The competition ran for four days, with eight teams competing. Each team was tasked with defending its network (and services) against attacks performed by an NSA red team, while scripts generated background SMTP and HTTP noise. Further information regarding
Contained Data
This dataset focuses on traffic to and from the USMA network. Note that all networks operated in isolation from the internet. Captured data consists of:
- Snort IDS alert logs (entire exercise)
- External DNS named service logs (entire exercise)
- External DNS message log (entire exercise)
- Apache web server access log (final day of exercise)
- Apache web server error log (final day of exercise)
- Splunk log server aggregate log (final day of exercise)
In addition to this, packet capture data (in the form of tcpdumps) from three different sensors is available and intended to be correlated using the previous log data as ground truth (this is the “coarse labeling”). These traces make up the majority of the dataset.
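One way to picture the coarse labeling is as a time-window join between alert records and packet records. The following is a minimal sketch under stated assumptions, not the authors' procedure: the `Event` record, the `label_packets` function, and the one-second window are all hypothetical, and timestamps are assumed to be already converted to seconds on a common clock.

```python
import bisect
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical minimal record shared by alerts and packets."""
    ts: float  # seconds on a common clock (assumption)
    src: str   # source IP address
    dst: str   # destination IP address


def label_packets(packets, alerts, window=1.0):
    """Return one bool per packet.

    A packet is marked True if at least one alert between the same two
    hosts (in either direction) lies within `window` seconds of it.
    """
    # Index alert timestamps by the unordered pair of endpoints.
    by_pair = {}
    for alert in alerts:
        by_pair.setdefault(frozenset((alert.src, alert.dst)), []).append(alert.ts)
    for times in by_pair.values():
        times.sort()

    labels = []
    for pkt in packets:
        times = by_pair.get(frozenset((pkt.src, pkt.dst)), [])
        # First alert at or after the start of the window around the packet.
        i = bisect.bisect_left(times, pkt.ts - window)
        labels.append(i < len(times) and times[i] <= pkt.ts + window)
    return labels


alerts = [Event(100.0, "154.241.88.201", "3.75.190.181")]
packets = [
    Event(100.4, "3.75.190.181", "154.241.88.201"),  # same hosts, inside window
    Event(105.0, "154.241.88.201", "3.75.190.181"),  # same hosts, too late
    Event(100.1, "154.241.88.201", "31.154.241.4"),  # different host pair
]
labels = label_packets(packets, alerts)
print(labels)
```

Matching on host pairs and a time window is deliberately loose: it marks benign packets that happen to share endpoints with an alert, which is one reason such labels can only be coarse.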
Papers
Links
Data Examples
Snippet of Snort IDS alert logs
[**] [129:4:1] TCP Timestamp is outside of PAWS window [**]
[Priority: 3]
11/09-16:13:15.434301 154.241.88.201:80 -> 3.75.190.181:53659
TCP TTL:63 TOS:0x0 ID:7687 IpLen:20 DgmLen:455 DF
***AP*** Seq: 0xC882770 Ack: 0x50C1D0B9 Win: 0x5B TcpLen: 32
TCP Options (3) => NOP NOP TS: 75053430 25206943
[**] [123:8:1] (spp_frag3) Fragmentation overlap [**]
[Priority: 3]
11/09-16:14:20.915899 31.154.241.4 -> 3.75.190.181
UDP TTL:126 TOS:0x0 ID:21644 IpLen:20 DgmLen:618
Frag Offset: 0x00B9 Frag Size: 0x0256
[**] [123:8:1] (spp_frag3) Fragmentation overlap [**]
[Priority: 3]
11/09-16:15:20.617389 31.154.241.4 -> 3.75.190.181
UDP TTL:126 TOS:0x0 ID:21672 IpLen:20 DgmLen:618
Frag Offset: 0x00B9 Frag Size: 0x0256
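The alert entries above follow a multi-line layout: a header with `[generator:signature:revision]` and the message, a priority line, and a line with timestamp and endpoints. A minimal parsing sketch, assuming every entry in the file has this shape (the function name `parse_snort_alerts` and the dictionary keys are hypothetical):

```python
import re

HEADER = re.compile(r"^\[\*\*\] \[(\d+):(\d+):(\d+)\] (.*?) \[\*\*\]$")
PRIORITY = re.compile(r"^\[Priority: (\d+)\]$")
FLOW = re.compile(
    r"^(\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d+) "
    r"(\d+\.\d+\.\d+\.\d+)(?::(\d+))? -> "
    r"(\d+\.\d+\.\d+\.\d+)(?::(\d+))?$"
)


def parse_snort_alerts(text):
    """Parse alert entries into a list of dicts, one per alert."""
    alerts = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        m = HEADER.match(line)
        if m:
            # A header line starts a new alert entry.
            current = {"gid": int(m[1]), "sid": int(m[2]),
                       "rev": int(m[3]), "msg": m[4]}
            alerts.append(current)
            continue
        if current is None:
            continue  # ignore anything before the first header
        m = PRIORITY.match(line)
        if m:
            current["priority"] = int(m[1])
            continue
        m = FLOW.match(line)
        if m:
            # The timestamp carries no year, so it is kept as a string.
            current.update(
                timestamp=m[1],
                src=m[2], src_port=int(m[3]) if m[3] else None,
                dst=m[4], dst_port=int(m[5]) if m[5] else None,
            )
        # Remaining lines (TTL, flags, options) are skipped in this sketch.
    return alerts


SAMPLE = """
[**] [129:4:1] TCP Timestamp is outside of PAWS window [**]
[Priority: 3]
11/09-16:13:15.434301 154.241.88.201:80 -> 3.75.190.181:53659
TCP TTL:63 TOS:0x0 ID:7687 IpLen:20 DgmLen:455 DF
[**] [123:8:1] (spp_frag3) Fragmentation overlap [**]
[Priority: 3]
11/09-16:14:20.915899 31.154.241.4 -> 3.75.190.181
UDP TTL:126 TOS:0x0 ID:21644 IpLen:20 DgmLen:618
"""

alerts = parse_snort_alerts(SAMPLE)
for alert in alerts:
    print(alert["timestamp"], alert["src"], "->", alert["dst"], alert["msg"])
```

Ports are optional in the endpoint pattern because the fragmentation alerts list bare IP addresses.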
Snippet of external DNS named service logs
10-Nov-2011 06:35:01.482 queries: info: client 7.204.241.161#54116: query: 20.10.1.10.in-addr.arpa IN PTR +
10-Nov-2011 06:35:01.483 queries: info: client 7.204.241.161#60924: query: www1.hq.bluenet IN A +
10-Nov-2011 06:35:20.909 queries: info: client 7.204.241.161#55292: query: 203.60.1.10.in-addr.arpa IN PTR +
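Each query log line above carries a timestamp, the client address and port, and the queried name, class, and record type. A minimal sketch for extracting those fields, assuming all lines match the three shown here (the pattern, the function name `parse_query_line`, and the field names are hypothetical):

```python
import re
from collections import Counter
from datetime import datetime

QUERY = re.compile(
    r"^(?P<ts>\d{2}-\w{3}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3}) queries: info: "
    r"client (?P<client>[\d.]+)#(?P<port>\d+): query: "
    r"(?P<name>\S+) (?P<cls>\S+) (?P<type>\S+) (?P<flags>\S+)$"
)


def parse_query_line(line):
    """Return a dict for one query log line, or None if it does not match."""
    m = QUERY.match(line.strip())
    if m is None:
        return None
    return {
        # %b expects English month abbreviations (C/English locale assumed).
        "timestamp": datetime.strptime(m["ts"], "%d-%b-%Y %H:%M:%S.%f"),
        "client": m["client"],
        "port": int(m["port"]),
        "name": m["name"],
        "class": m["cls"],
        "type": m["type"],
        "flags": m["flags"],
    }


SAMPLE = [
    "10-Nov-2011 06:35:01.482 queries: info: client 7.204.241.161#54116: query: 20.10.1.10.in-addr.arpa IN PTR +",
    "10-Nov-2011 06:35:01.483 queries: info: client 7.204.241.161#60924: query: www1.hq.bluenet IN A +",
    "10-Nov-2011 06:35:20.909 queries: info: client 7.204.241.161#55292: query: 203.60.1.10.in-addr.arpa IN PTR +",
]

records = [r for r in map(parse_query_line, SAMPLE) if r is not None]
type_counts = Counter(r["type"] for r in records)
print(type_counts)
```

Counting by record type is just one example of a summary; the same records could be grouped by client or by queried name.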
Snippet of external DNS message logs
Nov 9 18:13:04 ns1 root: /etc/rc.d/sysctl: WARNING: sysctl net.inet.rfc1321 does not exist.
Nov 9 18:13:20 ns1 login: ROOT LOGIN (root) ON ttyv0
Nov 10 03:04:01 ns1 kernel: TCP: [127.0.0.1]:60173 to [127.0.0.1]:25 tcpflags 0x2<SYN>; tcp_input: Connection attempt to closed port
Snippet of Apache web server access logs
Nov 11 09:56:11 www logger: 10.2.23.147 - - [11/Nov/2011:09:56:11 -0500] "GET /admin/%7F%AC%CA~%F4@%AC%03%EE%94;R%95%A6%13)%F9%E64%AA%DBt%EA%93%FEW%1F8%09%22?%E1%E9%9B%A9W%9B%B9S%F0%D0C%C1%EB%F7-%15%1E%8D%87%CFb%BD6%AB)%EC%01%A1a%8C%99%8B-%17]%7F%89m%937%F5%86jm[%E8%179%F6%07%CAVAj%7D~%C7%BC%9C%CC;%1E%9B%BA%A3%DCf%DC%0EZz(%EA%C0%C7%BF%C5%B9%03t%EE%CB%82%7FR,1%F5J%B3i%07j3S%82?F%C1%CB HTTP/1.0" 302 654
Nov 11 09:56:12 www logger: 10.2.23.147 - - [11/Nov/2011:09:56:12 -0500] "GET /admin/%FC%D5%02@%1FN%80%E1%BFK%D8%98mX%07%BC%92%B2%F7%87%1E%11U HTTP/1.0" 302 272
Nov 11 09:56:13 www logger: 10.2.23.147 - - [11/Nov/2011:09:56:13 -0500] "GET /admin/s%18ld%AB%B1%BE%C4%90%F5%E1%90dq%B7%E58N%BF%96%F4%A0%7C%A58%7B[%BAJ8%DE%F1%9E%5C%83*%AB%B4%D9%BC%CA HTTP/1.0" 302 316
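The access log lines above are Apache request entries wrapped in a syslog prefix (`Nov 11 ... www logger:`), and the requested paths consist largely of percent-encoded bytes. A minimal sketch that splits one line into fields and measures how much of the decoded path falls outside printable ASCII; the pattern, the function names, and the idea of using this ratio are assumptions for illustration, not part of the dataset's documentation:

```python
import re
from urllib.parse import unquote_to_bytes

ACCESS = re.compile(
    r'^\w{3} +\d+ [\d:]{8} (?P<host>\S+) logger: '
    r'(?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>.*) (?P<proto>HTTP/[\d.]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)


def parse_access_line(line):
    """Return a dict of fields for one access log line, or None."""
    m = ACCESS.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    rec["status"] = int(rec["status"])
    rec["size"] = None if rec["size"] == "-" else int(rec["size"])
    return rec


def non_printable_ratio(path):
    """Fraction of bytes in the decoded path outside printable ASCII."""
    raw = unquote_to_bytes(path)
    if not raw:
        return 0.0
    return sum(1 for b in raw if b < 0x20 or b > 0x7E) / len(raw)


LINE = (
    'Nov 11 09:56:12 www logger: 10.2.23.147 - - '
    '[11/Nov/2011:09:56:12 -0500] '
    '"GET /admin/%FC%D5%02@%1FN%80%E1%BFK%D8%98mX%07%BC%92%B2%F7%87%1E%11U '
    'HTTP/1.0" 302 272'
)

record = parse_access_line(LINE)
ratio = non_printable_ratio(record["path"])
print(record["client"], record["status"], round(ratio, 2))
```

The path group is matched greedily up to the final ` HTTP/x.y"` so that brackets, question marks, and other punctuation inside the request do not break the split.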
Snippet of Apache web server error logs
Nov 11 10:18:10 www httpd[4978]: [error] [client 10.2.23.173] client denied by server configuration: /var/www/html/admin/\xa0\xedo\xbeq\xa0=C\xa0%\x06\x85uQS\x8a\x9d\xf5\r\xeb\xf707WwD\x16]6wm\x89\xf4\xec"\xe4\xafgy&\xd2\xb2Mq\x88\xda\x19\x0f7\xb6\xd20N\n3t\xcc\x05\x84&P\x01\x8c\x8c%\xbc\x9e\xda\xefZ$*\x8b|\xc2
Nov 11 10:18:12 www httpd[4885]: [error] [client 10.2.23.173] client denied by server configuration: /var/www/html/admin/\xea\x1d\x98c\x95\xcb&\x8b'\x13\xcf\x86\xf1f\xd4\xa8\xf5\xba&\xc4\x89\x96\x95\x9f\x1c\x1c\r\x05\x0f\xa4\xb0S\x15<\xae\xf3\xeb\x9c\x85\n\xcb\xa0\x9b\x97\x93\x86\\\xb9K5\xf9\x9f\xc7\x1a\xff2El\xb3\xdeL\xac\x90\x03\xafc\xdb\x04\x0e\x80D\xa6\x8a\xcdn\x97\xf2\xfaj\xe8\xe8\x82e9\xf34\xd2\x19\xc0\xa0\xb2\xd1\xdc\xe2\xff\xbd\xd7\xd2\x7f\xe6\xf6*\x12A%\xb2\x04@\n\xc9y\xda\x15\x9e-'n\x11\xcc\x04\xaec\xaa\xdc\xb7a
Nov 11 10:18:18 www httpd[5016]: [error] [client 10.2.23.209] client denied by server configuration: /var/www/html/admin/yj\x9d\xce\xf3i\xe0\x05\x9c\xb4\x9e\b.\xab\xc2\xb3\xfa}\xf4\x12\x92\x82\x17\x98\xd1\x8a\xe8);\\\x85\xdb!y\xd2&n\xa7I\x8ft\x04F\xba!'\x94\x05\xdcHPt\xc6H!\xa9p\x8b`\xab\xef\xa2\xee\x816\xbd\x07\xc2\\8\xa2\xf1"\xccbe\xa4\xc2\x9fp4y\xc6\v\x82&\xcaT\x99\xddm\xcc.\xd4\x1d[\xa1roN\xb3P\xc8\xd3\x14\xcc\x91xU\x0f\x9b\xb42y\xe4\xee\xb23\r\n\xe26\xb7l6\x1c~\xb3\x1c\xde\x1d\xcc\xa8~\x07x\xa5&\x87\x93\b0\xd7\r\xc1\xd9\x0e\x0f\xfd\xc8
Snippet of Splunk log server aggregate logs (csv)
source,host,sourcetype,_raw,_fullcharcount,_fulllinecount,_segs
[...]
udp:514,31.154.241.2,syslog,"Nov 11 13:28:00 31.154.241.2 Nov 11 13:28:51 Alpha.usma.bluenet MSWinEventLog 1 Application 4365 Fri Nov 11 13:28:51 2011 8 crypt32 Unknown User N/A Error ALPHA None Failed auto update retrieval of third-party root list sequence number from: <http://www.download.windowsupdate.com/msdownload/update/v3/static/trustedr/en/authrootseq.txt> with error: A connection with the server could not be established 96",,,
udp:514,180.242.137.181,syslog,Nov 11 13:25:05 180.242.137.181 Nov 11 13:25:00 /usr/sbin/cron[5115]: (root) CMD (/usr/libexec/atrun),,,
/var/log/cron,wrestlingcleats.usma.bluenet,too_small,2011-11-11T13:25:00.159668-05:00 wrestlingcleats /usr/sbin/cron[2004]: (root) CMD (/usr/libexec/atrun),,,
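Because the Splunk export is plain CSV with a header row, Python's standard `csv` module is enough to read it. A minimal sketch, assuming the header shown above and using two of the sample rows; the function name `read_splunk_export` is hypothetical:

```python
import csv
import io
from collections import Counter


def read_splunk_export(handle):
    """Read the CSV export into a list of dicts keyed by the header row."""
    return list(csv.DictReader(handle))


SAMPLE = (
    "source,host,sourcetype,_raw,_fullcharcount,_fulllinecount,_segs\n"
    "udp:514,180.242.137.181,syslog,Nov 11 13:25:05 180.242.137.181 "
    "Nov 11 13:25:00 /usr/sbin/cron[5115]: (root) CMD (/usr/libexec/atrun),,,\n"
    "/var/log/cron,wrestlingcleats.usma.bluenet,too_small,"
    "2011-11-11T13:25:00.159668-05:00 wrestlingcleats /usr/sbin/cron[2004]: "
    "(root) CMD (/usr/libexec/atrun),,,\n"
)

rows = read_splunk_export(io.StringIO(SAMPLE))
per_sourcetype = Counter(row["sourcetype"] for row in rows)
for row in rows:
    print(row["host"], row["sourcetype"], row["_raw"][:40])
```

When reading from disk, opening the file with `open(path, newline="")` lets the `csv` module handle quoted `_raw` fields that span several lines.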