CICIDS2017 Baseline Models - Evaluation Report
Generated: 2025-10-13 18:51:02
Mission: OPERATION ML-BASELINE
Agent: HOLLOWED_EYES
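Executive Summary
This report evaluates three baseline machine learning models (Random Forest, XGBoost, and Decision Tree) trained on the CICIDS2017 intrusion detection dataset for binary classification (BENIGN vs. ATTACK). Each confusion matrix below sums to 42,002 evaluation samples: 8,862 BENIGN and 33,140 ATTACK.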
Model Performance Comparison
| Model | Accuracy | Precision | Recall | F1-Score | FP Rate | Inference Time (per sample) |
|---|---|---|---|---|---|---|
| Random Forest | 99.28% | 99.29% | 99.28% | 99.28% | 0.25% | 0.0008ms |
| XGBoost | 99.21% | 99.23% | 99.21% | 99.21% | 0.09% | 0.0003ms |
| Decision Tree | 99.10% | 99.13% | 99.10% | 99.11% | 0.24% | 0.0002ms |
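The last column of the table is an average latency per sample. The report does not say how it was measured, so the following is only a minimal sketch of one common approach: time a batched prediction call and divide by the batch size. The helper `mean_latency_ms` and the stand-in `dummy_predict` are hypothetical names used for illustration, not part of the evaluated pipeline.

```python
import time


def mean_latency_ms(predict, samples, repeats=5):
    """Return the best-of-`repeats` mean latency per sample, in milliseconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        predict(samples)  # one batched call over all samples
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best * 1000.0 / len(samples)


# Hypothetical stand-in for a fitted model's predict method (assumption).
def dummy_predict(rows):
    return [1 if sum(row) > 1.0 else 0 for row in rows]


samples = [[0.1 * i, 0.2] for i in range(10_000)]
latency = mean_latency_ms(dummy_predict, samples)
print(f"{latency:.4f} ms/sample")
```

Because the per-sample figure is obtained by dividing a batch time, it reflects throughput rather than the latency of scoring a single flow in isolation, which is typically higher.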
Detailed Model Results
Random Forest
Classification Metrics:
- Accuracy: 99.28%
- Precision: 99.29%
- Recall: 99.28%
- F1-Score: 99.28%
- False Positive Rate: 0.25%
Performance Characteristics:
- Training Time: 2.57s
- Average Inference Time: 0.0008ms/sample
- Model Size: 2.93MB
Confusion Matrix:
- True Negatives (BENIGN correctly identified): 8,840
- False Positives (BENIGN incorrectly flagged as ATTACK): 22
- False Negatives (ATTACK missed): 282
- True Positives (ATTACK correctly detected): 32,858
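The headline metrics can be recomputed from these four counts. The sketch below does so for Random Forest, treating ATTACK as the positive class. It also assumes that the reported precision is a support-weighted average over both classes; the report does not state the averaging mode, but that assumption is the one that matches the reported figure.

```python
# Random Forest confusion-matrix counts from the report (ATTACK is the positive class).
tn, fp, fn, tp = 8840, 22, 282, 32858

total = tn + fp + fn + tp
accuracy = (tp + tn) / total
fpr = fp / (fp + tn)  # share of BENIGN samples flagged as ATTACK

# Per-class precision, then the support-weighted average (assumed averaging mode).
precision_attack = tp / (tp + fp)
precision_benign = tn / (tn + fn)
support_attack = tp + fn
support_benign = tn + fp
weighted_precision = (
    precision_attack * support_attack + precision_benign * support_benign
) / total

print(f"accuracy            {accuracy:.2%}")
print(f"false positive rate {fpr:.2%}")
print(f"weighted precision  {weighted_precision:.2%}")
```

Rounded to two decimals, these come out to 99.28%, 0.25%, and 99.29%, in line with the values listed above. Precision for the ATTACK class alone would be higher (about 99.93%), which is why the averaging mode matters when comparing against other reports.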
XGBoost
Classification Metrics:
- Accuracy: 99.21%
- Precision: 99.23%
- Recall: 99.21%
- F1-Score: 99.21%
- False Positive Rate: 0.09%
Performance Characteristics:
- Training Time: 0.79s
- Average Inference Time: 0.0003ms/sample
- Model Size: 0.18MB
Confusion Matrix:
- True Negatives (BENIGN correctly identified): 8,854
- False Positives (BENIGN incorrectly flagged as ATTACK): 8
- False Negatives (ATTACK missed): 325
- True Positives (ATTACK correctly detected): 32,815
Decision Tree
Classification Metrics:
- Accuracy: 99.10%
- Precision: 99.13%
- Recall: 99.10%
- F1-Score: 99.11%
- False Positive Rate: 0.24%
Performance Characteristics:
- Training Time: 5.22s
- Average Inference Time: 0.0002ms/sample
- Model Size: 0.03MB
Confusion Matrix:
- True Negatives (BENIGN correctly identified): 8,841
- False Positives (BENIGN incorrectly flagged as ATTACK): 21
- False Negatives (ATTACK missed): 355
- True Positives (ATTACK correctly detected): 32,785
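The three confusion matrices also allow a side-by-side look at the two error types, which the aggregate scores blur together. The sketch below computes the false positive rate and the false negative rate (share of ATTACK samples missed) for each model from the counts above. The false negative rate is not reported elsewhere in this document; it is derived here purely from the listed counts.

```python
# Confusion-matrix counts from the report: (TN, FP, FN, TP), ATTACK positive.
counts = {
    "Random Forest": (8840, 22, 282, 32858),
    "XGBoost": (8854, 8, 325, 32815),
    "Decision Tree": (8841, 21, 355, 32785),
}

rates = {}
for name, (tn, fp, fn, tp) in counts.items():
    rates[name] = {
        "fpr": fp / (fp + tn),  # share of BENIGN samples flagged as ATTACK
        "fnr": fn / (fn + tp),  # share of ATTACK samples missed
    }

for name, r in rates.items():
    print(f"{name:<14} FPR {r['fpr']:.2%}  FNR {r['fnr']:.2%}")

# Model with the fewest missed attacks, and with the fewest false alarms.
fewest_missed = min(rates, key=lambda n: rates[n]["fnr"])
fewest_false_alarms = min(rates, key=lambda n: rates[n]["fpr"])
print(fewest_missed, fewest_false_alarms)
```

On these counts Random Forest misses the fewest attacks while XGBoost raises the fewest false alarms, so which model is preferable depends on which error is more costly in deployment.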
Best Model Recommendation
Highest Accuracy: Random Forest (99.28%)
Best F1-Score: Random Forest (99.28%)
Fastest Inference: Decision Tree (0.0002ms/sample)
Recommendation for Production:
All three models (Random Forest, XGBoost, and Decision Tree) meet all performance targets.