All Datasets

Name	Network/Host Data	Year	Times Recently Cited¹	TL;DR	Setting	OS Type	Labeled?²	Data Type/Source	Packed Size	Unpacked Size
CasinoLimit	Both	2025	0	Syslogs and NetFlows collected from 114 individual CTF attempts, labeled with MITRE ATT&CK techniques. Does not feature benign behavior	Enterprise IT	Linux	🟩	NetFlows, Syslog, auditd	3,6 GB	54,4 GB
DEDALE	Both	2025	0	Labeled host and network logs collected from a testbed simulating a company network with 55 machines under attack by an APT, with a total runtime of four weeks	Enterprise IT	Windows, Linux	🟩	NetFlows, pcaps, Windows events, Linux events	-	65 GB
AIT Alert Dataset	Both	2023	15	Alerts generated from the AIT log dataset, including labels. Only caveat is the lack of Windows machines	Enterprise IT	Linux	🟩	Wazuh, Suricata and AMiner alerts	96 MB	2,9 GB
FLNET2023	Network	2023	10	Large dataset generated with CORE emulator based on ISP-like network topology. Features a variety of attack types distributed across 40 routers	ISP-like	Undisclosed	🟩	pcaps, Custom network features	-	176 GB
OTFR Security Datasets - LSASS Campaign	Both	2023	-	Very small simulation focusing on exploiting Windows’ LSASS.exe. Lacking documentation, no labels and no user behavior	Single OS	Windows	🟥	pcaps, Windows events, Zeek logs	423 MB	1 GB
UNR-IDD	Network	2023	24	Mininet-based SDN dataset focusing on balanced class representation and port statistics (versus purely flow-based features)	Miscellaneous	Linux	🟩	Port statistics, selected flow features	-	5,3 MB
AIT Log Dataset	Both	2022	35	Huge variety of labeled logs collected from multiple simulation runs of an enterprise network under attack. With user emulation, but only Linux machines	Enterprise IT	Linux	🟩	pcaps, Suricata alerts, misc. logs (Apache, auth, dns, vpn, audit, suricata, syslog)	130 GB	206 GB
CLUE-LDS	Host	2022	5	Database of real user behavior without known attacks, for evaluation of methods detecting shifts in user behavior	Subsystem	Undisclosed	🟥	Custom event logs	640 MB	14,9 GB
EVTX to MITRE ATT&CK	Host	2022	-	Small dataset providing various events corresponding to certain MITRE ATT&CK tactics/techniques	Single OS	Windows	🟩	Windows events	<1 GB	<1 GB
OD-IDS2022	Network	2022	22	30 days of traffic from two servers under attack. Large variety of attacks, but extremely lacking documentation and access has to be requested manually	Enterprise IT	Windows, Linux	🟩	NetFlows	-	-
OTFR Security Datasets - Atomic	Both	2019-2022	-	Various small datasets, each corresponding to a specific MITRE ATT&CK tactic/technique. Lacks user simulation / underlying scenario and does not provide explicit labels	Single OS	Windows, Linux, Cloud	🟨	pcaps, Windows events, auditd logs, AWS CloudTrail logs	125 MB	-
PWNJUTSU	Both	2022	10	Rich collection of complex attacks executed by various red team participants each acting in a small network, but not labeled	Miscellaneous	Windows, Linux	🟥	pcaps, Windows events, Sysmon, auditd, various logs (Apache, auth, dns, ssh, etc.)	82 GB	-
UWF-ZeekData22	Network	2022	29	Traffic collected from a university’s wargaming course. Covers all MITRE ATT&CK tactics, though the overwhelming majority is simple recon and attacks are poorly documented	Enterprise IT	Windows, Linux	🟩	pcaps, Zeek logs	-	209 GB
I-Sec-IDS	Network	2021	1	Small collection of NetFlows containing trivial DoS and scan attacks targeting a single host, does not feature user behavior	Single OS	Windows	🟩	NetFlows	66 MB	-
NF-UQ-NIDS	Network	2021	299	Combination of four distinct network datasets using a newly proposed set of standardized features	Miscellaneous	Windows, Linux, MacOS	🟩	Custom NetFlows	2 GB	14,8 GB
OTFR Security Datasets - Log4Shell	Both	2021	-	Very small simulation focusing on the Log4j vulnerability. Lacking documentation, no explicit labels and no user behavior	Single OS	Linux	🟨	pcaps, Ubuntu events	<1 MB	1 MB
OTFR Security Datasets - SimuLand Golden SAML	Host	2021	-	Barely a dataset, only contains very few traces for some specific events. At most usable to test specific Windows detection rules.	Enterprise IT	Windows	🟩	Windows Events	-	<1 MB
SOCBED Example Dataset	Both	2021	25	Generated using the SOCBED framework, demonstrating reproducible dataset creation, though current attacks are on the basic side	Enterprise IT	Windows, Linux	🟥	Windows events, Linux events, packetbeat	78 MB	1,3 GB
Unraveled	Both	2021	43	Large dataset with intricate labeling, though the focus seems to be on network flows. Mapping will be annoying.	Enterprise IT	Windows, Linux	🟩	pcaps, misc. logs (syslog, audit, auth, Snort)	-	22 GB
DAPT 2020	Both	2020	75	Focuses on attacks mimicking those of an APT group, executed in a rather small environment	Enterprise IT	Undisclosed	🟩	NetFlows, misc. logs (DNS, syslog, auditd, apache, auth, various services)	460 MB	-
OpTC	Both	2020	-	Huge amount of data and interesting attacks, but possibly hard to use due to uncommon event format and requiring semi-manual labeling	Enterprise IT	Windows	🟨	Custom event logs, Zeek events	-	1 TB
OTFR Security Datasets - APT 29	Both	2020	-	Replication of APT29 evaluation developed by MITRE. Well made and documented, but without labels or user behavior	Enterprise IT	Windows, Linux	🟥	pcaps, Windows events, Zeek events	126 MB	2 GB
SR-BH 2020	Network	2020	31	Multi-label dataset assigning a variety of MITRE CAPEC classifications to requests collected from a small honeypot	Single OS	Undisclosed	🟩	Custom Network Features	-	436 MB
CICDDoS2019	Network	2019	820	Dataset focusing on various DDoS attacks, covering a broad range of categories. Includes benign behavior, but only for Pcaps, not NetFlows	Enterprise IT	Windows, Linux	🟩	Pcaps, NetFlows, Windows events, Ubuntu events	24,4 GB	-
DARPA TC5	Host	2019	-	Custom event logs from network under attack from APT groups, designed to facilitate provenance tracking	Undisclosed	Undisclosed	🟨	Custom event logs	-	-
IDEA Dataset	Network	2019	-	One week of anonymized IDS alerts collected from three large organizations, in a normalized format (an extension of IDMEF)	Enterprise IT	Undisclosed	🟥	NEMEA, Suricata, TippingPoint, and other alerts (normalized & anonymized)	1 GB	7 GB
LID-DS 2019	Host	2019	22	Contains system calls + associated data/metadata for a variety of Linux exploits, includes normal behavior	Single OS	Linux	🟨	Sequences of syscalls with extended information	13 GB	-
OTFR Security Datasets - APT 3	Host	2019	-	Replication of APT3 evaluation developed by MITRE. Lacking documentation, no labels and no user behavior	Enterprise IT	Windows, Linux	🟥	Windows events	30 MB	855 MB
ASNM Datasets	Network	2009-2018	4	Specialized features extracted from instances of remote buffer overflow attacks for the purpose of anomaly-based detection	Miscellaneous	Windows, Linux	🟩	Custom NetFlows	21 MB	95 GB
AWSCTD	Host	2018	20	Syscalls collected from ~10k malware samples running on Windows 7, no user emulation	Single OS	Windows	🟩	Sequences of syscall numbers	10 MB	558 MB
CSE-CIC-IDS2018	Both	2018	3317	Simulation of large enterprise IT (450 machines) with user emulation and various attacks, includes host and network logs, but only the latter are labeled	Enterprise IT	Windows, Linux, MacOS	🟩	pcaps, NetFlows, Windows events, Ubuntu events	220 GB	-
DARPA TC3	Host	2018	-	Custom event logs from network under attack, designed to facilitate provenance tracking	Undisclosed	Undisclosed	🟨	Custom event logs	115 GB	-
NGIDS-DS	Both	2018	4	Enterprise network undergoing a variety of attacks using IXIA PerfectStorm hardware. Seems to lack host user behavior, does not provide raw host logs	Enterprise IT	Linux	🟩	pcaps, custom host features	941 MB	13,4 GB
Biblio-US17	Network	2017	1	Large number of web requests collected over 6.5 months from a production server, but heavily anonymized and only select features available	Enterprise IT	Undisclosed	🟩	HTTP requests (select features)	1,1 GB	6 GB
CIC DoS	Network	2017	134	Dataset focusing on different DoS attacks targeting the application layer (instead of network layer), but no longer available	Enterprise IT	Linux	🟩	Network traffic (unknown format)	-	4,6 GB
CIC-IDS2017	Network	2017	3317	Simulation of medium-sized company network under attack, focuses solely on network traffic	Enterprise IT	Windows, Linux	🟩	pcaps, NetFlows, custom network features	48,4 GB	50 GB
Unified Host and Network Data Set	Both	2017	69	Selection of network and host events collected from operational environment, but without any attacks	Enterprise IT	Windows, Linux	🟥	NetFlows, Windows events	-	-
UGR’16	Network	2016	139	Network flows collected from real network over a long period of time, with some attack traffic injected	Enterprise IT	Undisclosed	🟩	NetFlows	236 GB	-
AWID	Network	2015	285	Traffic features collected from a home Wi-Fi network using WEP, targeted by an attacker exploiting various weaknesses of this security mechanism	Home IT	Windows, Linux, iOS	🟩	Custom network features	11,7 GB	-
Comprehensive, Multi-Source Cyber-Security Events	Both	2015	76	Various events from production network with red team activity, but extremely limited information per event	Enterprise IT	Windows, Linux	🟩	Custom event logs (auth, proc, network flows, dns, redteam)	12 GB	-
Kyoto Honeypot	Network	2006-2015	144	Collection of features derived from attack traffic targeting honeypots over the span of 9 years	Miscellaneous	Windows, Unix, MacOS	🟩	Custom network features	20 GB	-
UNSW-NB15	Network	2015	2511	Custom network undergoing a variety of attacks using IXIA PerfectStorm hardware. Mostly geared towards anomaly-based NIDS	Undisclosed	Undisclosed	🟩	pcaps, custom network features	>100 GB	-
ADFA-WD	Host	2014	45	Mostly intended for anomaly-based approaches leveraging library calls, explores the interesting concept of stealthy shellcode	Single OS	Windows	🟨	Sequences of dll calls, Windows events (dll calls only)	403 MB	13,6 GB
ISCX Botnet 2014	Network	2004-2014	117	A combination of several network traffic datasets with the goal of creating a diverse and realistic botnet dataset	Enterprise IT	Undisclosed	🟩	pcaps	13,8 GB	-
Skopik 2014	Host	2014	22	Focus on realistically emulating user behavior, does not include attacks	Enterprise IT	Linux	🟥	misc. logs (Apache, database, mail server, bug tracker app)	-	-
Twente 2014	Both	2014	20	Anonymized network flows and host logs from real network, but only those related to ssh authentication, focusing on detecting related brute force attacks	Enterprise IT	Undisclosed	🟩	NetFlows	2,42 GB	5,8 GB
User-Computer Associations in Time	Host	2014	6	Large number of authentication events over a period of 9 months, but with very little detail and without any attacks	Enterprise IT	Undisclosed	🟥	Custom auth event logs	2,3 GB	-
ADFA-LD	Host	2013	146	Purely intended for anomaly-based approaches, provides only syscall numbers	Single OS	Linux	🟩	Sequences of syscall numbers	2 MB	17 MB
CIDD	Network	2012	21	Spin on the DARPA’98 dataset, correlating user behavior over different systems/environments for behavior-based IDSs	Military IT	Unix	🟩	Sequences of user “audits”	-	22 GB
ISCX IDS 2012	Network	2012	601	Focus on realistic traffic generation in a company network, combined with some basic attacks	Enterprise IT	Windows, Linux	🟩	pcaps	84 GB	87 GB
TUIDS	Network	2012	46	Dataset focusing on DoS attacks, but very poorly documented	Enterprise IT	Undisclosed	🟩	pcaps, NetFlows	-	-
VAST Challenge 2012	Network	2012	6	Originated from a challenge about data analytics, focus on a large network being the victim of a botnet	Enterprise IT	Undisclosed	🟨	Snort alerts, firewall logs	186 MB	2,9 GB
CTU 13	Network	2011	466	Collection of various botnet behavior combined with loads of background traffic, but very limited feature space	Enterprise IT	Windows, Undisclosed	🟩	pcaps, NetFlows, Bro logs	-	697 GB
VAST Challenge 2011	Both	2011	-	Originated from a challenge about data analytics, focus on network but also contains host logs. Labeling is a bit lacking	Enterprise IT	Windows	🟨	pcaps, Windows events, misc. logs (firewall, Snort, Nessus)	940 MB	9,3 GB
ISOT Botnet	Network	2004-2010	76	An amalgamation of several individual datasets, two containing malicious botnet traffic, and five datasets consisting of benign traffic	Enterprise IT	Undisclosed	🟩	pcaps	3 GB	10,6 GB
CDX CTF 2009	Both	2009	32	Dataset captured from a CTF event, generally intended to provide methods for reliably generating labeled datasets from such events	Enterprise IT	Windows, Linux	🟨	pcaps, Snort IDS alerts, Apache logs, Splunk logs	12 GB	15,3 GB
NSL-KDD	Network	2009	2247	An improvement of the original KDD’99 dataset, but still outdated at its core	Military IT	Unix	🟩	Connection records	6 MB	19 MB
Twente 2009	Network	2009	38	Intricately labeled network flows + alerts collected from a single honeypot over the span of 6 days	Single OS	Linux	🟩	NetFlows	303 MB	1,9 GB
gureKDDCup	Network	2008	12	An extension of the KDDCup 1999 dataset, adding additional information about payloads to each connection record	Military IT	Unix	🟩	Connection records with payload information	10 GB	-
KDD Cup 1999	Network	1999	-	Network connection events derived from simulated U.S. Air Force network under attack. No longer appropriate to use for multiple reasons	Military IT	Unix	🟩	Connection records	18 MB	743 MB
DARPA’98 Intrusion Detection Program	Both	1998	140	Simulation of a small U.S. Air Force network under attack. No longer appropriate to use for multiple reasons	Military IT	Unix	🟨	tcpdumps, host audit logs, file system dumps	5 GB	-

Legend

¹ “Times Recently Cited” counts any time the underlying publication of a given dataset has been referenced by other publications in the last five years. This data is sourced from the Semantic Scholar API and automatically updated whenever the website is re-deployed. Some datasets are not backed by a publication and thus do not show a number here. Last updated: 2026-01-23 11:08:26 UTC
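
The five-year window described in the footnote above can be illustrated with a small sketch. It is not the site's actual code: the function name `count_recent_citations`, the inclusive cutoff, and the assumption that each citing publication is reduced to its publication year (with `None` for an unknown year) are all hypothetical choices made for illustration.

```python
from datetime import date


def count_recent_citations(citing_years, today=None, window_years=5):
    """Count citing publications published within the last `window_years` years.

    `citing_years` is an iterable of publication years; None marks an unknown year.
    The window is treated as inclusive of the cutoff year (an assumption).
    """
    current_year = (today or date.today()).year
    cutoff = current_year - window_years
    return sum(
        1
        for year in citing_years
        if year is not None and cutoff <= year <= current_year
    )


# Hypothetical years of citing publications for one dataset.
example_years = [2015, 2019, 2021, 2023, None, 2026]
recent = count_recent_citations(example_years, today=date(2026, 1, 23))
print(recent)  # 2021, 2023 and 2026 fall inside the window
```

Entries without a known year are skipped rather than counted, which keeps the number conservative; whether the site does the same is not stated.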

² Labeling:

🟩: Direct; provides explicit labels on at least a portion of the contained data
🟨: Indirect; provides some form of ground truth that allows for manual or automatic labeling (e.g., periods of attack)
🟥: No labels; does not provide any form of explicit labels or information that would allow for their creation
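
For readers who want to filter the table programmatically, the following is a minimal sketch that parses one tab-separated row and applies the legend above. The helper names (`parse_size`, `parse_row`), the short label names, and the choice of decimal units (1 GB = 10^9 bytes) are assumptions for illustration; sizes given as bounds such as `<1 GB` are reduced to the stated number.

```python
import re

# Column order as given in the table header.
COLUMNS = [
    "name", "scope", "year", "cited", "tldr", "setting",
    "os_type", "labeled", "data_type", "packed_size", "unpacked_size",
]

# Legend markers mapped to short, hypothetical label names.
LABEL_MARKERS = {"🟩": "direct", "🟨": "indirect", "🟥": "none"}

# Decimal units are an assumption; the table does not say which convention it uses.
UNIT_BYTES = {"MB": 10**6, "GB": 10**9, "TB": 10**12}


def parse_size(text):
    """Convert a size such as '3,6 GB' or '<1 MB' to bytes; return None for '-'."""
    match = re.fullmatch(r"[<>]?\s*(\d+(?:[.,]\d+)?)\s*(MB|GB|TB)", text.strip())
    if match is None:
        return None
    number = float(match.group(1).replace(",", "."))  # table uses decimal commas
    return number * UNIT_BYTES[match.group(2)]


def parse_row(line):
    """Split one tab-separated table row into a dict with normalized fields."""
    row = dict(zip(COLUMNS, line.rstrip("\n").split("\t")))
    row["labeled"] = LABEL_MARKERS.get(row["labeled"], "unknown")
    row["packed_size"] = parse_size(row["packed_size"])
    row["unpacked_size"] = parse_size(row["unpacked_size"])
    return row


example = parse_row(
    "ADFA-LD\tHost\t2013\t146\tPurely intended for anomaly-based approaches, "
    "provides only syscall numbers\tSingle OS\tLinux\t🟩\t"
    "Sequences of syscall numbers\t2 MB\t17 MB"
)
print(example["name"], example["labeled"], example["unpacked_size"])
```

Rows parsed this way can then be filtered, for example keeping only entries whose `labeled` value is `"direct"` and whose `unpacked_size` is known.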