Architecture Patterns and SOLID Principles
Dexray Insight has undergone significant architectural refactoring to implement SOLID principles and modern design patterns. This document describes the architectural improvements, design patterns used, and the benefits they provide.
SOLID Principles Implementation
The framework now strictly adheres to SOLID principles throughout its architecture:
Single Responsibility Principle (SRP)
Before: Massive methods with multiple responsibilities
analyze_apk() method: 544 lines handling everything from setup to result aggregation
_assess_crypto_keys_exposure() method: 942 lines handling string collection, pattern detection, and result formatting
_create_full_results() method: 211 lines handling all result mapping and object creation
After: Focused methods with single responsibilities
# AnalysisEngine refactored into focused methods
def analyze_apk(self, apk_path: str, ...) -> FullAnalysisResults:
    """Orchestrate analysis workflow (82 lines)"""
    context = self._setup_analysis_context(apk_path, androguard_obj, timestamp)
    tool_results = self._execute_external_tools(context)
    module_results = self._execute_analysis_modules(context, requested_modules)
    security_results = self._perform_security_assessment(context, module_results)
    return self._create_full_results(module_results, tool_results, security_results, context)
Benefits:
Each method has a clear, single purpose
Easier to test individual responsibilities
Improved maintainability and debugging
Better code readability and understanding
Open/Closed Principle (OCP)
Implementation: Strategy Pattern for extensible secret detection
# New strategies can be added without modifying existing code
class CustomDetectionStrategy:
    def detect_secrets(self, strings_with_location):
        # Custom detection logic
        pass

# Usage in SensitiveDataAssessment
def _assess_crypto_keys_exposure(self, analysis_results):
    pattern_detector = PatternDetectionStrategy(self.detection_patterns, self.logger)
    # Could be replaced with CustomDetectionStrategy without changing this method
    detected_secrets = pattern_detector.detect_secrets(enhanced_strings)
Benefits:
New detection strategies can be added without modifying existing detection logic
Different strategies can be swapped based on configuration or requirements
Extensible architecture supports future enhancements
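The idea of swapping strategies based on configuration can be sketched with a small registry. This is a minimal illustration and not Dexray Insight's actual code: `STRATEGY_REGISTRY`, `register_strategy`, `build_strategy`, `KeywordDetectionStrategy`, and the `detection_strategy` config key are all hypothetical names introduced for the example.

```python
from typing import Any, Callable, Dict, List

# Hypothetical registry mapping a configuration name to a strategy class.
STRATEGY_REGISTRY: Dict[str, Callable[[], Any]] = {}


def register_strategy(name: str):
    """Class decorator that records a strategy under a configuration name."""
    def decorator(cls):
        STRATEGY_REGISTRY[name] = cls
        return cls
    return decorator


@register_strategy("keyword")
class KeywordDetectionStrategy:
    """Toy strategy: flags strings that contain the word 'password'."""

    def detect_secrets(self, strings_with_location: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        return [s for s in strings_with_location if "password" in s["value"].lower()]


def build_strategy(config: Dict[str, Any]):
    """Instantiate the strategy named in the configuration."""
    name = config.get("detection_strategy", "keyword")
    return STRATEGY_REGISTRY[name]()


strategy = build_strategy({"detection_strategy": "keyword"})
hits = strategy.detect_secrets([
    {"value": "db_password=hunter2", "location": "config.xml"},
    {"value": "hello world", "location": "strings.xml"},
])
```

Adding another strategy under this scheme means registering a new class; `build_strategy` and its callers stay untouched, which is the property the Open/Closed Principle asks for.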
Liskov Substitution Principle (LSP)
Implementation: All strategy classes implement consistent interfaces
# All detection strategies can be substituted for each other
from abc import ABC, abstractmethod
from typing import Any, Dict, List

class BaseDetectionStrategy(ABC):
    @abstractmethod
    def detect_secrets(self, strings_with_location) -> List[Dict[str, Any]]:
        pass

class PatternDetectionStrategy(BaseDetectionStrategy):
    def detect_secrets(self, strings_with_location) -> List[Dict[str, Any]]:
        # Pattern-based detection implementation
        ...

class MLDetectionStrategy(BaseDetectionStrategy):  # Future extension
    def detect_secrets(self, strings_with_location) -> List[Dict[str, Any]]:
        # Machine learning-based detection implementation
        ...
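Substitutability can be checked by writing a caller that relies only on the base-class contract. The sketch below assumes the `BaseDetectionStrategy` interface shown above; `PrefixDetectionStrategy`, `NullDetectionStrategy`, and `count_secrets` are hypothetical names used only for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class BaseDetectionStrategy(ABC):
    @abstractmethod
    def detect_secrets(self, strings_with_location: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        ...


class PrefixDetectionStrategy(BaseDetectionStrategy):
    """Hypothetical strategy: flags values that start with 'sk_'."""

    def detect_secrets(self, strings_with_location):
        return [dict(s, type="prefix") for s in strings_with_location
                if s["value"].startswith("sk_")]


class NullDetectionStrategy(BaseDetectionStrategy):
    """Hypothetical strategy that never reports anything."""

    def detect_secrets(self, strings_with_location):
        return []


def count_secrets(strategy: BaseDetectionStrategy, strings: List[Dict[str, Any]]) -> int:
    # The caller only assumes "returns a list of dicts", so any subclass works.
    return len(strategy.detect_secrets(strings))


sample = [{"value": "sk_test_12345", "location": "test.java"}]
counts = [count_secrets(s, sample) for s in (PrefixDetectionStrategy(), NullDetectionStrategy())]
```

Both strategies return the same type and raise no new exceptions, so `count_secrets` behaves correctly with either one.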
Interface Segregation Principle (ISP)
Implementation: Focused interfaces for specific responsibilities
# Separate interfaces for different aspects
class StringCollector(ABC):
    @abstractmethod
    def collect_strings(self, analysis_results) -> List[Dict[str, Any]]:
        pass

class SecretDetector(ABC):
    @abstractmethod
    def detect_secrets(self, strings) -> List[Dict[str, Any]]:
        pass

class ResultClassifier(ABC):
    @abstractmethod
    def classify_by_severity(self, secrets) -> Dict[str, Any]:
        pass
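One consequence of the segregated interfaces is that a class can implement only the role it needs, and a client can depend on only that role. A minimal sketch, assuming the interfaces above; `CountingClassifier` and `summarize` are hypothetical names.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class SecretDetector(ABC):
    @abstractmethod
    def detect_secrets(self, strings: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        ...


class ResultClassifier(ABC):
    @abstractmethod
    def classify_by_severity(self, secrets: List[Dict[str, Any]]) -> Dict[str, Any]:
        ...


class CountingClassifier(ResultClassifier):
    """Hypothetical classifier: implements classification and nothing else."""

    def classify_by_severity(self, secrets):
        counts: Dict[str, int] = {}
        for secret in secrets:
            counts[secret["severity"]] = counts.get(secret["severity"], 0) + 1
        return counts


def summarize(classifier: ResultClassifier, secrets: List[Dict[str, Any]]) -> Dict[str, Any]:
    # This client never needs collection or detection methods.
    return classifier.classify_by_severity(secrets)


summary = summarize(
    CountingClassifier(),
    [{"severity": "HIGH"}, {"severity": "HIGH"}, {"severity": "LOW"}],
)
```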
Dependency Inversion Principle (DIP)
Implementation: Dependencies on abstractions, not concrete implementations
class SensitiveDataAssessment:
    def __init__(self, config: Dict[str, Any]):
        # Logger and detection-pattern setup omitted here
        # Work is delegated to strategy objects rather than inlined logic
        self.string_collector = StringCollectionStrategy(self.logger)
        self.deep_analyzer = DeepAnalysisStrategy(self.logger)
        self.pattern_detector = PatternDetectionStrategy(self.detection_patterns, self.logger)
        # The strategies are still constructed here; injecting them as
        # dependencies would improve testability
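The injection mentioned in the last comment could look roughly like the following. This is a sketch under stated assumptions, not the current `SensitiveDataAssessment` constructor: `InjectableAssessment`, `SecretDetectorProtocol`, `DefaultDetector`, `FakeDetector`, and the `assess` method are hypothetical.

```python
import logging
from typing import Any, Dict, List, Optional, Protocol


class SecretDetectorProtocol(Protocol):
    """Abstraction the assessment depends on (hypothetical)."""

    def detect_secrets(self, strings: List[Dict[str, Any]]) -> List[Dict[str, Any]]: ...


class DefaultDetector:
    """Stand-in for the real default strategy; reports nothing."""

    def detect_secrets(self, strings):
        return []


class InjectableAssessment:
    """Hypothetical variant of the assessment that accepts its detector."""

    def __init__(self, config: Dict[str, Any],
                 detector: Optional[SecretDetectorProtocol] = None,
                 logger: Optional[logging.Logger] = None):
        self.config = config
        self.logger = logger or logging.getLogger(__name__)
        # Construct a default only when the caller injects nothing.
        self.detector = detector if detector is not None else DefaultDetector()

    def assess(self, strings: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        return self.detector.detect_secrets(strings)


class FakeDetector:
    """Test double that reports every string as a secret."""

    def detect_secrets(self, strings):
        return list(strings)


assessment = InjectableAssessment({}, detector=FakeDetector())
found = assessment.assess([{"value": "abc"}])
```

With this shape, tests can pass a fake detector while production code relies on the default, and the assessment class never names a concrete strategy in its type hints.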
Strategy Pattern Implementation
The secret detection system has been refactored using the Strategy Pattern to separate concerns and improve maintainability.
Strategy Pattern Overview
The Strategy Pattern allows selecting algorithms at runtime and makes the code more flexible and testable.
# Strategy Pattern workflow in secret detection
def _assess_crypto_keys_exposure(self, analysis_results: Dict[str, Any]) -> List[SecurityFinding]:
    # Strategy 1: String Collection
    string_collector = StringCollectionStrategy(self.logger)
    all_strings = string_collector.collect_strings(analysis_results)

    # Strategy 2: Deep Analysis Enhancement
    deep_analyzer = DeepAnalysisStrategy(self.logger)
    enhanced_strings = deep_analyzer.extract_deep_strings(analysis_results, all_strings)

    # Strategy 3: Pattern Detection
    pattern_detector = PatternDetectionStrategy(self.detection_patterns, self.logger)
    detected_secrets = pattern_detector.detect_secrets(enhanced_strings)

    # Strategy 4: Result Classification
    result_classifier = ResultClassificationStrategy()
    classified_results = result_classifier.classify_by_severity(detected_secrets)

    # Strategy 5: Finding Generation
    finding_generator = FindingGenerationStrategy(self.owasp_category)
    return finding_generator.generate_security_findings(classified_results)
StringCollectionStrategy
Responsibility: Collect strings from various analysis sources with location metadata
class StringCollectionStrategy:
    def collect_strings(self, analysis_results: Dict[str, Any]) -> List[Dict[str, Any]]:
        """
        Systematically extract strings from multiple sources:
        - String analysis module results
        - Android properties and system configuration
        - Raw strings from DEX analysis

        Returns list of dictionaries with 'value', 'location', 'file_path', 'line_number'
        """
Key Features:
Handles multiple string sources (analysis results, Android properties, raw strings)
Adds location metadata for traceability
Graceful handling of missing or malformed data
Supports both object-based and dictionary-based string analysis results
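The features listed above can be sketched as a single function. This is an illustrative approximation, not the framework's implementation: `collect_strings_sketch`, the `string_analysis` key, and the `urls`/`emails` field names are assumptions; only the four output keys come from the docstring above.

```python
from types import SimpleNamespace
from typing import Any, Dict, List


def collect_strings_sketch(analysis_results: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten string-analysis output into entries with location metadata."""
    collected: List[Dict[str, Any]] = []
    result = analysis_results.get("string_analysis")
    if result is None:
        return collected  # missing module result: nothing to collect

    # Accept both dictionary-based and object-based results.
    groups = result if isinstance(result, dict) else getattr(result, "__dict__", {})

    for source, values in groups.items():
        if not isinstance(values, (list, tuple)):
            continue  # skip malformed or empty fields instead of failing
        for value in values:
            if isinstance(value, str) and value:
                collected.append({
                    "value": value,
                    "location": f"string_analysis.{source}",
                    "file_path": None,
                    "line_number": None,
                })
    return collected


dict_based = {"string_analysis": {"urls": ["https://a.example"], "emails": None}}
object_based = {"string_analysis": SimpleNamespace(urls=["https://a.example"])}

from_dict = collect_strings_sketch(dict_based)
from_object = collect_strings_sketch(object_based)
```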
DeepAnalysisStrategy
Responsibility: Extract additional strings from deep analysis artifacts (XML, Smali, DEX)
class DeepAnalysisStrategy:
    def extract_deep_strings(self, analysis_results: Dict[str, Any],
                             existing_strings: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        """
        Enhance string collection with deep analysis sources:
        - DEX object string extraction using Androguard
        - XML resource file string extraction
        - Smali code string extraction

        Only operates in 'deep' analysis mode for performance
        """
Analysis Modes:
DEEP mode: Full string extraction from DEX, XML, and Smali sources
FAST mode: Returns existing strings unchanged (performance optimization)
Benefits:
Significantly increased string coverage for secret detection
Performance-aware operation based on analysis mode
Comprehensive error handling and logging
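The mode gating and per-source error handling described above might be structured as follows. This is a sketch only: the real method takes `analysis_results` rather than a `mode` argument, the extractor callables stand in for the Androguard, XML, and Smali extraction, and the de-duplication step is an assumption of the example.

```python
from typing import Any, Callable, Dict, Iterable, List


def extract_deep_strings_sketch(
    mode: str,
    existing_strings: List[Dict[str, Any]],
    extractors: Dict[str, Callable[[], Iterable[str]]],
) -> List[Dict[str, Any]]:
    """Append strings from extra sources, but only in 'deep' mode."""
    if mode != "deep":
        # Fast mode: return the input unchanged.
        return existing_strings

    seen = {entry["value"] for entry in existing_strings}
    enhanced = list(existing_strings)
    for source, extractor in extractors.items():
        try:
            values = list(extractor())
        except Exception:
            # A failing source is skipped so the others still contribute.
            continue
        for value in values:
            if value not in seen:  # de-duplication is an assumption of this sketch
                seen.add(value)
                enhanced.append({"value": value, "location": source,
                                 "file_path": None, "line_number": None})
    return enhanced


def broken_source():
    raise RuntimeError("resource could not be parsed")


existing = [{"value": "a", "location": "strings", "file_path": None, "line_number": None}]
sources = {"dex": lambda: ["a", "b"], "xml": broken_source}

fast_result = extract_deep_strings_sketch("fast", existing, sources)
deep_result = extract_deep_strings_sketch("deep", existing, sources)
```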
PatternDetectionStrategy
Responsibility: Apply 54 different secret detection patterns to collected strings
class PatternDetectionStrategy:
    def detect_secrets(self, strings_with_location: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        """
        Apply comprehensive pattern matching for secret detection:
        - 11 CRITICAL patterns (private keys, AWS credentials, etc.)
        - 22 HIGH patterns (API keys, JWT tokens, service credentials)
        - 13 MEDIUM patterns (database URIs, SSH keys, etc.)
        - 8 LOW patterns (S3 URLs, high-entropy strings, etc.)
        """
Detection Categories:
CRITICAL: Private keys, AWS credentials, GitHub tokens
HIGH: API keys, JWT tokens, service-specific credentials
MEDIUM: Database connection strings, SSH public keys
LOW: Service URLs, base64 strings, high-entropy data
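To make the severity-tagged matching concrete, here is a minimal regex-based sketch. The three patterns are illustrative stand-ins and are not taken from the framework's 54 patterns; `EXAMPLE_PATTERNS`, `detect_secrets_sketch`, and the `type` key in the output are assumptions.

```python
import re
from typing import Any, Dict, List

# Illustrative patterns only; NOT the framework's actual pattern set.
EXAMPLE_PATTERNS = {
    "private_key": {
        "severity": "CRITICAL",
        "regex": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
    },
    "aws_access_key_id": {
        "severity": "CRITICAL",
        "regex": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    },
    "jwt": {
        "severity": "HIGH",
        "regex": re.compile(r"eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
    },
}


def detect_secrets_sketch(strings_with_location: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return one detection per (string, matching pattern) pair."""
    detections = []
    for entry in strings_with_location:
        value = entry.get("value")
        if not isinstance(value, str):
            continue  # ignore malformed entries
        for name, spec in EXAMPLE_PATTERNS.items():
            if spec["regex"].search(value):
                detections.append({
                    "type": name,
                    "severity": spec["severity"],
                    "value": value,
                    "location": entry.get("location"),
                })
    return detections


detections = detect_secrets_sketch([
    {"value": "aws_key=AKIAIOSFODNN7EXAMPLE", "location": "Config.java"},
    {"value": "token=eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxIn0.c2ln", "location": "Api.java"},
    {"value": "just a normal string", "location": "strings.xml"},
])
```

Keeping the severity next to each pattern means the detector itself does no classification; that work is left to the next stage.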
ResultClassificationStrategy
Responsibility: Organize detected secrets by severity and prepare output formats
class ResultClassificationStrategy:
    def classify_by_severity(self, detected_secrets: List[Dict[str, Any]]) -> Dict[str, Any]:
        """
        Create two output formats:
        - Terminal display format with emojis and location info
        - Structured evidence entries for JSON export and detailed analysis
        """
Output Structure:
findings: Terminal-friendly display strings with emojis
secrets: Structured evidence entries with full metadata
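A possible shape for the two outputs is sketched below. Only the top-level `findings` and `secrets` keys come from the text above; grouping each by severity, the display-string layout, and every emoji other than 🔴 for CRITICAL are assumptions made for the example.

```python
from typing import Any, Dict, List

SEVERITY_ORDER = ("CRITICAL", "HIGH", "MEDIUM", "LOW")
# Only the CRITICAL marker is documented; the other emojis are assumed.
SEVERITY_EMOJI = {"CRITICAL": "🔴", "HIGH": "🟠", "MEDIUM": "🟡", "LOW": "🔵"}


def classify_by_severity_sketch(detected_secrets: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Group detections by severity in two parallel formats."""
    findings: Dict[str, List[str]] = {sev: [] for sev in SEVERITY_ORDER}
    secrets: Dict[str, List[Dict[str, Any]]] = {sev: [] for sev in SEVERITY_ORDER}

    for item in detected_secrets:
        severity = item.get("severity", "LOW")
        if severity not in findings:
            severity = "LOW"  # unknown severities fall back to the lowest bucket
        # Terminal-friendly line.
        findings[severity].append(
            f"{SEVERITY_EMOJI[severity]} [{severity}] {item.get('type', 'unknown')} "
            f"at {item.get('location', 'unknown')}"
        )
        # Structured entry for JSON export (copied so callers can mutate safely).
        secrets[severity].append(dict(item))

    return {"findings": findings, "secrets": secrets}


classified = classify_by_severity_sketch([
    {"type": "jwt", "severity": "HIGH", "value": "x", "location": "Api.java"},
])
```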
FindingGenerationStrategy
Responsibility: Generate final SecurityFinding objects with remediation guidance
class FindingGenerationStrategy:
    def generate_security_findings(self, classified_results: Dict[str, Any]) -> List[SecurityFinding]:
        """
        Create SecurityFinding objects with:
        - Secret-finder style messaging with emojis
        - Comprehensive remediation steps
        - Evidence limited to prevent overwhelming output
        - Severity-appropriate recommendations
        """
Finding Features:
Secret-finder style titles: “🔴 CRITICAL: 2 Hard-coded Secrets Found”
Detailed remediation steps: 3-5 actionable steps per finding
Evidence limitation: 10-20 items max to prevent information overload
OWASP categorization: Proper mapping to A02:2021-Cryptographic Failures
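The title format and evidence cap can be illustrated with a small stand-in for `SecurityFinding`. This is a sketch under assumptions: `FindingSketch` and its fields, `generate_finding_sketch`, the cap of 10, and the remediation wording are hypothetical; the title format and OWASP category string follow the text above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

MAX_EVIDENCE_ITEMS = 10  # the text gives a 10-20 item cap; 10 is chosen here
SEVERITY_EMOJI = {"CRITICAL": "🔴"}  # only the CRITICAL marker appears in the text


@dataclass
class FindingSketch:
    """Hypothetical stand-in for the framework's SecurityFinding."""
    category: str
    severity: str
    title: str
    evidence: List[str] = field(default_factory=list)
    recommendations: List[str] = field(default_factory=list)


def generate_finding_sketch(
    severity: str,
    evidence: List[str],
    category: str = "A02:2021-Cryptographic Failures",
) -> Optional[FindingSketch]:
    """Build one finding for a severity level, or None if nothing was found."""
    if not evidence:
        return None
    emoji = SEVERITY_EMOJI.get(severity, "")
    title = f"{emoji} {severity}: {len(evidence)} Hard-coded Secrets Found".strip()
    return FindingSketch(
        category=category,
        severity=severity,
        title=title,
        evidence=evidence[:MAX_EVIDENCE_ITEMS],  # cap evidence, keep the full count in the title
        recommendations=[
            "Remove the hard-coded secrets from the application",
            "Rotate any credential that has shipped in a build",
            "Load secrets at runtime from a secure store",
        ],
    )


finding = generate_finding_sketch("CRITICAL", [f"secret {i}" for i in range(25)])
```

The title reports the full count while the evidence list is truncated, so readers still see how many secrets were found.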
Refactored AnalysisEngine Architecture
The AnalysisEngine has been refactored from monolithic methods to a clean, focused architecture.
Result Building Architecture
Before: Single massive method handling all result creation
After: Focused builder methods with clear responsibilities
def _create_full_results(self, module_results, tool_results, security_results, context):
    """Orchestrate result creation using focused builder methods (32 lines)"""
    apk_overview = self._build_apk_overview(module_results)
    in_depth_analysis = self._build_in_depth_analysis(module_results, context)
    apkid_results, kavanoz_results = self._build_tool_results(tool_results)

    # Assemble final results object
    full_results = FullAnalysisResults()
    # ... populate results
    return full_results
Builder Methods:
_build_apk_overview(module_results)
Responsibility: Create APK overview object from module results
Size: 26 lines (was part of 211-line method)
Features: Fallback to manifest analysis if APK overview failed
_build_in_depth_analysis(module_results, context)
Responsibility: Create in-depth analysis object using mapping methods
Size: 15 lines
Delegates to: 7 specialized mapping methods
_build_tool_results(tool_results)
Responsibility: Create external tool result objects
Size: 22 lines
Handles: APKID and Kavanoz results with success/failure handling
Mapping Architecture
Specialized mapping methods handle specific result types:
# Each mapping method has a single responsibility
def _map_manifest_results(self, in_depth_analysis, module_results):
    """Map manifest analysis results to in-depth analysis structure"""

def _map_permission_results(self, in_depth_analysis, module_results):
    """Map permission analysis results to in-depth analysis structure"""

def _map_string_results(self, in_depth_analysis, module_results, context):
    """Map string analysis results with fallback support"""

def _map_library_results(self, in_depth_analysis, module_results):
    """Map library detection results to in-depth analysis structure"""
String Analysis with Fallback:
def _map_string_results(self, in_depth_analysis, module_results, context):
    """Handle string results with built-in fallback logic"""
    string_result = module_results.get('string_analysis')

    if string_result and string_result.status.value == 'success':
        self._apply_successful_string_results(in_depth_analysis, string_result)
    else:
        # Resilient fallback using legacy string extraction
        self._apply_string_analysis_fallback(in_depth_analysis, context)
Benefits of New Architecture
Maintainability Improvements
Before:
Methods with 200+ lines were difficult to understand and modify
Mixed responsibilities made changes risky
Testing required complex setup for entire workflows
After:
Focused methods (5-25 lines) are easy to understand and modify
Single responsibilities make changes safer and more predictable
Individual methods can be tested in isolation
# Easy to test individual responsibilities
def test_string_collection_strategy():
    strategy = StringCollectionStrategy(mock_logger)
    result = strategy.collect_strings(mock_analysis_results)
    assert len(result) > 0
    assert all('value' in item for item in result)
Performance Improvements
Parallel Execution: Smaller methods enable better parallelization
# Methods can be executed in parallel when dependencies allow
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor() as executor:
    apk_future = executor.submit(self._build_apk_overview, module_results)
    tool_future = executor.submit(self._build_tool_results, tool_results)
    apk_overview = apk_future.result()
    apkid_results, kavanoz_results = tool_future.result()
Strategy Pattern Benefits: Different strategies can be optimized independently
if analysis_mode == 'fast':
    # Fast strategy for basic analysis
    pattern_detector = FastPatternDetectionStrategy(basic_patterns, logger)
else:
    # Comprehensive strategy for deep analysis
    pattern_detector = PatternDetectionStrategy(all_patterns, logger)
Extensibility Improvements
New Strategies: Easy to add new detection strategies
# Add machine learning-based detection without changing existing code
class MLSecretDetectionStrategy:
    def detect_secrets(self, strings_with_location):
        return self.ml_model.predict_secrets(strings_with_location)
New Result Builders: Easy to add new result types
# Add new result builder for custom analysis types
def _build_custom_results(self, module_results):
    """Build custom analysis results"""
    custom_result = module_results.get('custom_analysis')
    if custom_result and custom_result.status.value == 'success':
        return CustomResults(data=custom_result.findings)
    return CustomResults()
Testing Improvements
Unit Testing: Individual methods can be tested in isolation
class TestStringCollectionStrategy:
    def test_collect_strings_from_string_analysis(self):
        # Test specific responsibility without complex setup
        strategy = StringCollectionStrategy(mock_logger)
        result = strategy.collect_strings(mock_analysis_results)
        # Focused assertions on single responsibility
Integration Testing: Strategy coordination can be tested separately
class TestSecretDetectionWorkflow:
    def test_complete_strategy_workflow_integration(self):
        # Test strategy coordination without implementation details
        assessment = SensitiveDataAssessment(config)
        findings = assessment._assess_crypto_keys_exposure(mock_results)
        assert isinstance(findings, list)
Error Handling Improvements
Isolated Failures: Problems in one strategy don’t affect others
def _assess_crypto_keys_exposure(self, analysis_results):
    try:
        all_strings = self.string_collector.collect_strings(analysis_results)
    except Exception as e:
        self.logger.error(f"String collection failed: {e}")
        all_strings = []  # Continue with empty strings

    try:
        enhanced_strings = self.deep_analyzer.extract_deep_strings(analysis_results, all_strings)
    except Exception as e:
        self.logger.error(f"Deep analysis failed: {e}")
        enhanced_strings = all_strings  # Fall back to basic strings
Graceful Degradation: System continues to work even if some components fail
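The repeated try/except blocks can be factored into one helper so that every stage degrades the same way. A minimal sketch; `run_stage` is a hypothetical helper and not part of the framework.

```python
import logging
from typing import Callable, TypeVar

T = TypeVar("T")


def run_stage(name: str, stage: Callable[[], T], fallback: T, logger: logging.Logger) -> T:
    """Run one pipeline stage; log and return the fallback if it raises."""
    try:
        return stage()
    except Exception as exc:
        logger.error("%s failed: %s", name, exc)
        return fallback


def failing_deep_analysis():
    raise ValueError("corrupt DEX file")


logger = logging.getLogger("degradation_sketch")
logger.addHandler(logging.NullHandler())  # keep the example quiet
logger.propagate = False

all_strings = run_stage("String collection", lambda: ["basic"], [], logger)
# The deep stage fails, so the pipeline continues with the basic strings.
enhanced_strings = run_stage("Deep analysis", failing_deep_analysis, all_strings, logger)
```

Each stage names its own fallback value, which keeps the degradation policy visible at the call site.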
Migration Guide
For developers working with the refactored code:
Accessing Refactored Methods
Old approach (calling massive methods directly):
Direct access to monolithic methods was discouraged
New approach (using focused public interfaces):
# AnalysisEngine public interface remains the same
engine = AnalysisEngine(config)
results = engine.analyze_apk(apk_path) # Same as before
# Internal methods are now focused and testable
# (but still internal - use public interface)
Working with Strategy Pattern
For security assessment customization:
# Custom strategy implementation
class CustomDetectionStrategy(PatternDetectionStrategy):
    def detect_secrets(self, strings_with_location):
        # Custom detection logic
        custom_secrets = self._apply_custom_patterns(strings_with_location)
        base_secrets = super().detect_secrets(strings_with_location)
        return custom_secrets + base_secrets

# Use in configuration
assessment = SensitiveDataAssessment(config)
# Could be extended to accept strategy injection
Testing Patterns
New testing patterns for focused methods:
# Test individual strategies
def test_pattern_detection_strategy():
    patterns = load_test_patterns()
    strategy = PatternDetectionStrategy(patterns, mock_logger)

    test_strings = [
        {'value': 'sk_test_12345', 'location': 'test.java', 'file_path': None, 'line_number': None}
    ]

    results = strategy.detect_secrets(test_strings)
    assert len(results) == 1
    assert results[0]['severity'] == 'HIGH'
Integration testing for strategy coordination:
# Test complete workflow
def test_security_assessment_integration():
    config = load_test_config()
    assessment = SensitiveDataAssessment(config)

    mock_results = create_mock_analysis_results()
    findings = assessment._assess_crypto_keys_exposure(mock_results)

    assert isinstance(findings, list)
    # Test workflow coordination without testing implementation details
This architectural refactoring provides a solid foundation for future enhancements. It maintains backward compatibility through the unchanged public interface and improves code quality by applying each of the SOLID principles.