Architecture Patterns and SOLID Principles

Dexray Insight has undergone significant architectural refactoring to implement SOLID principles and modern design patterns. This document describes the architectural improvements, design patterns used, and the benefits they provide.

SOLID Principles Implementation

The framework now strictly adheres to SOLID principles throughout its architecture:

Single Responsibility Principle (SRP)

Before: Massive methods with multiple responsibilities

analyze_apk() method: 544 lines handling everything from setup to result aggregation
_assess_crypto_keys_exposure() method: 942 lines handling string collection, pattern detection, and result formatting
_create_full_results() method: 211 lines handling all result mapping and object creation

After: Focused methods with single responsibilities

# AnalysisEngine refactored into focused methods
def analyze_apk(self, apk_path: str, ...) -> FullAnalysisResults:
    """Orchestrate analysis workflow (82 lines)"""
    context = self._setup_analysis_context(apk_path, androguard_obj, timestamp)
    tool_results = self._execute_external_tools(context)
    module_results = self._execute_analysis_modules(context, requested_modules)
    security_results = self._perform_security_assessment(context, module_results)
    return self._create_full_results(module_results, tool_results, security_results, context)

Benefits:

Each method has a clear, single purpose
Easier to test individual responsibilities
Improved maintainability and debugging
Better code readability and understanding

Open/Closed Principle (OCP)

Implementation: Strategy Pattern for extensible secret detection

# New strategies can be added without modifying existing code
class CustomDetectionStrategy:
    def detect_secrets(self, strings_with_location):
        # Custom detection logic
        pass

# Usage in SensitiveDataAssessment
def _assess_crypto_keys_exposure(self, analysis_results):
    pattern_detector = PatternDetectionStrategy(self.detection_patterns, self.logger)
    # Could be replaced with CustomDetectionStrategy without changing this method
    detected_secrets = pattern_detector.detect_secrets(enhanced_strings)

Benefits:

New detection strategies can be added without modifying existing detection logic
Different strategies can be swapped based on configuration or requirements
Extensible architecture supports future enhancements

Liskov Substitution Principle (LSP)

Implementation: All strategy classes implement consistent interfaces

# All detection strategies can be substituted for each other
class BaseDetectionStrategy(ABC):
    @abstractmethod
    def detect_secrets(self, strings_with_location) -> List[Dict[str, Any]]:
        pass

class PatternDetectionStrategy(BaseDetectionStrategy):
    def detect_secrets(self, strings_with_location) -> List[Dict[str, Any]]:
        # Pattern-based detection implementation

class MLDetectionStrategy(BaseDetectionStrategy):  # Future extension
    def detect_secrets(self, strings_with_location) -> List[Dict[str, Any]]:
        # Machine learning-based detection implementation

Interface Segregation Principle (ISP)

Implementation: Focused interfaces for specific responsibilities

# Separate interfaces for different aspects
class StringCollector(ABC):
    @abstractmethod
    def collect_strings(self, analysis_results) -> List[Dict[str, Any]]:
        pass

class SecretDetector(ABC):
    @abstractmethod
    def detect_secrets(self, strings) -> List[Dict[str, Any]]:
        pass

class ResultClassifier(ABC):
    @abstractmethod
    def classify_by_severity(self, secrets) -> Dict[str, Any]:
        pass

Dependency Inversion Principle (DIP)

Implementation: Dependencies on abstractions, not concrete implementations

class SensitiveDataAssessment:
    def __init__(self, config: Dict[str, Any]):
        # Depends on abstractions (strategies), not concrete implementations
        self.string_collector = StringCollectionStrategy(self.logger)
        self.deep_analyzer = DeepAnalysisStrategy(self.logger)
        self.pattern_detector = PatternDetectionStrategy(self.detection_patterns, self.logger)
        # These could be injected as dependencies for better testability

Strategy Pattern Implementation

The secret detection system has been refactored using the Strategy Pattern to separate concerns and improve maintainability.

Strategy Pattern Overview

The Strategy Pattern allows selecting algorithms at runtime and makes the code more flexible and testable.

# Strategy Pattern workflow in secret detection
def _assess_crypto_keys_exposure(self, analysis_results: Dict[str, Any]) -> List[SecurityFinding]:
    # Strategy 1: String Collection
    string_collector = StringCollectionStrategy(self.logger)
    all_strings = string_collector.collect_strings(analysis_results)

    # Strategy 2: Deep Analysis Enhancement
    deep_analyzer = DeepAnalysisStrategy(self.logger)
    enhanced_strings = deep_analyzer.extract_deep_strings(analysis_results, all_strings)

    # Strategy 3: Pattern Detection
    pattern_detector = PatternDetectionStrategy(self.detection_patterns, self.logger)
    detected_secrets = pattern_detector.detect_secrets(enhanced_strings)

    # Strategy 4: Result Classification
    result_classifier = ResultClassificationStrategy()
    classified_results = result_classifier.classify_by_severity(detected_secrets)

    # Strategy 5: Finding Generation
    finding_generator = FindingGenerationStrategy(self.owasp_category)
    return finding_generator.generate_security_findings(classified_results)

StringCollectionStrategy

Responsibility: Collect strings from various analysis sources with location metadata

class StringCollectionStrategy:
    def collect_strings(self, analysis_results: Dict[str, Any]) -> List[Dict[str, Any]]:
        """
        Systematically extract strings from multiple sources:
        - String analysis module results
        - Android properties and system configuration
        - Raw strings from DEX analysis

        Returns list of dictionaries with 'value', 'location', 'file_path', 'line_number'
        """

Key Features:

Handles multiple string sources (analysis results, Android properties, raw strings)
Adds location metadata for traceability
Graceful handling of missing or malformed data
Supports both object-based and dictionary-based string analysis results

DeepAnalysisStrategy

Responsibility: Extract additional strings from deep analysis artifacts (XML, Smali, DEX)

class DeepAnalysisStrategy:
    def extract_deep_strings(self, analysis_results: Dict[str, Any],
                            existing_strings: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        """
        Enhance string collection with deep analysis sources:
        - DEX object string extraction using Androguard
        - XML resource file string extraction
        - Smali code string extraction

        Only operates in 'deep' analysis mode for performance
        """

Analysis Modes:

DEEP mode: Full string extraction from DEX, XML, and Smali sources
FAST mode: Returns existing strings unchanged (performance optimization)

Benefits:

Significantly increased string coverage for secret detection
Performance-aware operation based on analysis mode
Comprehensive error handling and logging

PatternDetectionStrategy

Responsibility: Apply 54 different secret detection patterns to collected strings

class PatternDetectionStrategy:
    def detect_secrets(self, strings_with_location: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        """
        Apply comprehensive pattern matching for secret detection:
        - 11 CRITICAL patterns (private keys, AWS credentials, etc.)
        - 22 HIGH patterns (API keys, JWT tokens, service credentials)
        - 13 MEDIUM patterns (database URIs, SSH keys, etc.)
        - 8 LOW patterns (S3 URLs, high-entropy strings, etc.)
        """

Detection Categories:

CRITICAL: Private keys, AWS credentials, GitHub tokens
HIGH: API keys, JWT tokens, service-specific credentials
MEDIUM: Database connection strings, SSH public keys
LOW: Service URLs, base64 strings, high-entropy data

ResultClassificationStrategy

Responsibility: Organize detected secrets by severity and prepare output formats

class ResultClassificationStrategy:
    def classify_by_severity(self, detected_secrets: List[Dict[str, Any]]) -> Dict[str, Any]:
        """
        Create two output formats:
        - Terminal display format with emojis and location info
        - Structured evidence entries for JSON export and detailed analysis
        """

Output Structure:

findings: Terminal-friendly display strings with emojis
secrets: Structured evidence entries with full metadata

FindingGenerationStrategy

Responsibility: Generate final SecurityFinding objects with remediation guidance

class FindingGenerationStrategy:
    def generate_security_findings(self, classified_results: Dict[str, Any]) -> List[SecurityFinding]:
        """
        Create SecurityFinding objects with:
        - Secret-finder style messaging with emojis
        - Comprehensive remediation steps
        - Evidence limited to prevent overwhelming output
        - Severity-appropriate recommendations
        """

Finding Features:

Secret-finder style titles: “🔴 CRITICAL: 2 Hard-coded Secrets Found”
Detailed remediation steps: 3-5 actionable steps per finding
Evidence limitation: 10-20 items max to prevent information overload
OWASP categorization: Proper mapping to A02:2021-Cryptographic Failures

Refactored AnalysisEngine Architecture

The AnalysisEngine has been refactored from monolithic methods to a clean, focused architecture.

Result Building Architecture

Before: Single massive method handling all result creation

After: Focused builder methods with clear responsibilities

def _create_full_results(self, module_results, tool_results, security_results, context):
    """Orchestrate result creation using focused builder methods (32 lines)"""
    apk_overview = self._build_apk_overview(module_results)
    in_depth_analysis = self._build_in_depth_analysis(module_results, context)
    apkid_results, kavanoz_results = self._build_tool_results(tool_results)

    # Assemble final results object
    full_results = FullAnalysisResults()
    # ... populate results
    return full_results

Builder Methods:

`_build_apk_overview(module_results)`

Responsibility: Create APK overview object from module results
Size: 26 lines (was part of 211-line method)
Features: Fallback to manifest analysis if APK overview failed

`_build_in_depth_analysis(module_results, context)`

Responsibility: Create in-depth analysis object using mapping methods
Size: 15 lines
Delegates to: 7 specialized mapping methods

`_build_tool_results(tool_results)`

Responsibility: Create external tool result objects
Size: 22 lines
Handles: APKID and Kavanoz results with success/failure handling

Mapping Architecture

Specialized mapping methods handle specific result types:

# Each mapping method has a single responsibility
def _map_manifest_results(self, in_depth_analysis, module_results):
    """Map manifest analysis results to in-depth analysis structure"""

def _map_permission_results(self, in_depth_analysis, module_results):
    """Map permission analysis results to in-depth analysis structure"""

def _map_string_results(self, in_depth_analysis, module_results, context):
    """Map string analysis results with fallback support"""

def _map_library_results(self, in_depth_analysis, module_results):
    """Map library detection results to in-depth analysis structure"""

String Analysis with Fallback:

def _map_string_results(self, in_depth_analysis, module_results, context):
    """Handle string results with built-in fallback logic"""
    string_result = module_results.get('string_analysis')

    if string_result and string_result.status.value == 'success':
        self._apply_successful_string_results(in_depth_analysis, string_result)
    else:
        # Resilient fallback using legacy string extraction
        self._apply_string_analysis_fallback(in_depth_analysis, context)

Benefits of New Architecture

Maintainability Improvements

Before: - Methods with 200+ lines were difficult to understand and modify - Mixed responsibilities made changes risky - Testing required complex setup for entire workflows

After: - Focused methods (5-25 lines) are easy to understand and modify - Single responsibilities make changes safer and more predictable - Individual methods can be tested in isolation

# Easy to test individual responsibilities
def test_string_collection_strategy():
    strategy = StringCollectionStrategy(mock_logger)
    result = strategy.collect_strings(mock_analysis_results)
    assert len(result) > 0
    assert all('value' in item for item in result)

Performance Improvements

Parallel Execution: Smaller methods enable better parallelization

# Methods can be executed in parallel when dependencies allow
with ThreadPoolExecutor() as executor:
    apk_future = executor.submit(self._build_apk_overview, module_results)
    tool_future = executor.submit(self._build_tool_results, tool_results)

    apk_overview = apk_future.result()
    apkid_results, kavanoz_results = tool_future.result()

Strategy Pattern Benefits: Different strategies can be optimized independently

# Fast strategy for basic analysis
if analysis_mode == 'fast':
    pattern_detector = FastPatternDetectionStrategy(basic_patterns, logger)
# Comprehensive strategy for deep analysis
else:
    pattern_detector = PatternDetectionStrategy(all_patterns, logger)

Extensibility Improvements

New Strategies: Easy to add new detection strategies

# Add machine learning-based detection without changing existing code
class MLSecretDetectionStrategy:
    def detect_secrets(self, strings_with_location):
        return self.ml_model.predict_secrets(strings_with_location)

New Result Builders: Easy to add new result types

# Add new result builder for custom analysis types
def _build_custom_results(self, module_results):
    """Build custom analysis results"""
    custom_result = module_results.get('custom_analysis')
    if custom_result and custom_result.status.value == 'success':
        return CustomResults(data=custom_result.findings)
    return CustomResults()

Testing Improvements

Unit Testing: Individual methods can be tested in isolation

class TestStringCollectionStrategy:
    def test_collect_strings_from_string_analysis(self):
        # Test specific responsibility without complex setup
        strategy = StringCollectionStrategy(mock_logger)
        result = strategy.collect_strings(mock_analysis_results)
        # Focused assertions on single responsibility

Integration Testing: Strategy coordination can be tested separately

class TestSecretDetectionWorkflow:
    def test_complete_strategy_workflow_integration(self):
        # Test strategy coordination without implementation details
        assessment = SensitiveDataAssessment(config)
        findings = assessment._assess_crypto_keys_exposure(mock_results)
        assert isinstance(findings, list)

Error Handling Improvements

Isolated Failures: Problems in one strategy don’t affect others

def _assess_crypto_keys_exposure(self, analysis_results):
    try:
        all_strings = string_collector.collect_strings(analysis_results)
    except Exception as e:
        self.logger.error(f"String collection failed: {e}")
        all_strings = []  # Continue with empty strings

    try:
        enhanced_strings = deep_analyzer.extract_deep_strings(analysis_results, all_strings)
    except Exception as e:
        self.logger.error(f"Deep analysis failed: {e}")
        enhanced_strings = all_strings  # Fall back to basic strings

Graceful Degradation: System continues to work even if some components fail

Migration Guide

For developers working with the refactored code:

Accessing Refactored Methods

Old approach (calling massive methods directly): - Direct access to monolithic methods was discouraged

New approach (using focused public interfaces):

# AnalysisEngine public interface remains the same
engine = AnalysisEngine(config)
results = engine.analyze_apk(apk_path)  # Same as before

# Internal methods are now focused and testable
# (but still internal - use public interface)

Working with Strategy Pattern

For security assessment customization:

# Custom strategy implementation
class CustomDetectionStrategy(PatternDetectionStrategy):
    def detect_secrets(self, strings_with_location):
        # Custom detection logic
        custom_secrets = self._apply_custom_patterns(strings_with_location)
        base_secrets = super().detect_secrets(strings_with_location)
        return custom_secrets + base_secrets

# Use in configuration
assessment = SensitiveDataAssessment(config)
# Could be extended to accept strategy injection

Testing Patterns

New testing patterns for focused methods:

# Test individual strategies
def test_pattern_detection_strategy():
    patterns = load_test_patterns()
    strategy = PatternDetectionStrategy(patterns, mock_logger)

    test_strings = [
        {'value': 'sk_test_12345', 'location': 'test.java', 'file_path': None, 'line_number': None}
    ]

    results = strategy.detect_secrets(test_strings)
    assert len(results) == 1
    assert results[0]['severity'] == 'HIGH'

Integration testing for strategy coordination:

# Test complete workflow
def test_security_assessment_integration():
    config = load_test_config()
    assessment = SensitiveDataAssessment(config)

    mock_results = create_mock_analysis_results()
    findings = assessment._assess_crypto_keys_exposure(mock_results)

    assert isinstance(findings, list)
    # Test workflow coordination without testing implementation details

This architectural refactoring provides a solid foundation for future enhancements while maintaining backward compatibility and improving code quality across all SOLID principles.

Architecture Patterns and SOLID Principles

SOLID Principles Implementation

Single Responsibility Principle (SRP)

Open/Closed Principle (OCP)

Liskov Substitution Principle (LSP)

Interface Segregation Principle (ISP)

Dependency Inversion Principle (DIP)

Strategy Pattern Implementation

Strategy Pattern Overview

StringCollectionStrategy

DeepAnalysisStrategy

PatternDetectionStrategy

ResultClassificationStrategy

FindingGenerationStrategy

Refactored AnalysisEngine Architecture

Result Building Architecture

_build_apk_overview(module_results)

_build_in_depth_analysis(module_results, context)

_build_tool_results(tool_results)

Mapping Architecture

Benefits of New Architecture

Maintainability Improvements

Performance Improvements

Extensibility Improvements

Testing Improvements

Error Handling Improvements

Migration Guide

Accessing Refactored Methods

Working with Strategy Pattern

Testing Patterns

`_build_apk_overview(module_results)`

`_build_in_depth_analysis(module_results, context)`

`_build_tool_results(tool_results)`