CSV Download

Download here

This .csv file is generated automatically by parsing all existing dataset entries. It can be used to sort and filter the data in a spreadsheet program or to generate statistics and plots. The file is semicolon-delimited and contains the following fields for each dataset:

Field Name              Description
Name                    Name of the dataset
Network Data            Does this dataset feature network-based data (Yes/No)
Host Data               Does this dataset feature host-based data (Yes/No)
Start Year              Year in which data collection started
End Year                Year in which data collection ended (usually the same as Start Year, but not always)
Setting                 Setting of the underlying scenario (Single OS/Enterprise IT/Military IT/Subsystem/Miscellaneous/Undisclosed)
OS Type                 OS families that were part of the underlying scenario (Windows/Linux/Unix/MacOS/Undisclosed)
Network Data Source     Source of network data (e.g., pcaps or NetFlows)
Network Data Labeled    If and how labels for network data are available
Host Data Source        Source of host data (e.g., Windows events or ssh auth logs)
Host Data Labeled       If and how labels for host data are available
Attack Categories       Types of attacks in the underlying scenario
Benign activity         How benign activity (aka “normal behavior”) was generated in the underlying scenario
Packed Size in MB       Size of the entire dataset when packed, in MB
Unpacked Size in MB     Size of the entire dataset when unpacked, in MB
Times Recently Cited    Number of times the underlying publication of the dataset was cited in the last five years, sourced from the S2 API

Note: Missing values are indicated by a single hyphen (-).
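
For readers who prefer a script to a spreadsheet, the following is a minimal sketch of how the file could be read with Python's standard csv module. It assumes that the first row of the file holds the field names listed above and that size values are plain integers; neither is confirmed here. The inline sample uses made-up placeholder rows and only a subset of the columns, so that the snippet runs without the real file.

```python
import csv
import io

# Hypothetical excerpt with made-up values; the real file has more
# columns and rows. A header row is assumed.
SAMPLE = """Name;Network Data;Host Data;Start Year;End Year;Packed Size in MB
Example-A;Yes;No;2017;2017;1200
Example-B;Yes;Yes;2019;2020;-
"""


def load_datasets(handle):
    """Read semicolon-delimited rows and map the '-' placeholder to None."""
    reader = csv.DictReader(handle, delimiter=";")
    return [
        {field: (None if value == "-" else value) for field, value in row.items()}
        for row in reader
    ]


rows = load_datasets(io.StringIO(SAMPLE))

# Filter: names of datasets that feature host-based data.
with_host_data = [row["Name"] for row in rows if row["Host Data"] == "Yes"]

# Statistic input: packed sizes, skipping rows where the value is missing.
known_sizes = [
    int(row["Packed Size in MB"])
    for row in rows
    if row["Packed Size in MB"] is not None
]

print(with_host_data)
print(known_sizes)
```

To use this with the downloaded file, open it with `open(path, newline="", encoding="utf-8")` and pass the handle to `load_datasets`. Both `path` and the UTF-8 encoding are assumptions to adjust for the actual download.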