Architecture Patterns and SOLID Principles

Dexray Insight has undergone significant architectural refactoring to implement SOLID principles and modern design patterns. This document describes the architectural improvements, design patterns used, and the benefits they provide.

SOLID Principles Implementation

The refactoring applies each SOLID principle to a specific part of the framework:

Single Responsibility Principle (SRP)

Before: Massive methods with multiple responsibilities

  • analyze_apk() method: 544 lines handling everything from setup to result aggregation

  • _assess_crypto_keys_exposure() method: 942 lines handling string collection, pattern detection, and result formatting

  • _create_full_results() method: 211 lines handling all result mapping and object creation

After: Focused methods with single responsibilities

# AnalysisEngine refactored into focused methods
def analyze_apk(self, apk_path: str, ...) -> FullAnalysisResults:
    """Orchestrate analysis workflow (82 lines)"""
    context = self._setup_analysis_context(apk_path, androguard_obj, timestamp)
    tool_results = self._execute_external_tools(context)
    module_results = self._execute_analysis_modules(context, requested_modules)
    security_results = self._perform_security_assessment(context, module_results)
    return self._create_full_results(module_results, tool_results, security_results, context)

Benefits:

  • Each method has a clear, single purpose

  • Easier to test individual responsibilities

  • Improved maintainability and debugging

  • Better code readability

Open/Closed Principle (OCP)

Implementation: Strategy Pattern for extensible secret detection

# New strategies can be added without modifying existing code
class CustomDetectionStrategy:
    def detect_secrets(self, strings_with_location):
        # Custom detection logic
        pass

# Usage in SensitiveDataAssessment
def _assess_crypto_keys_exposure(self, analysis_results):
    pattern_detector = PatternDetectionStrategy(self.detection_patterns, self.logger)
    # Swapping in CustomDetectionStrategy changes only the line above; the call below stays the same
    detected_secrets = pattern_detector.detect_secrets(enhanced_strings)

Benefits:

  • New detection strategies can be added without modifying existing detection logic

  • Different strategies can be swapped based on configuration or requirements

  • Extensible architecture supports future enhancements
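One way to get the configuration-based swapping described above is a small registry keyed by strategy name. The sketch below is illustrative only: STRATEGY_REGISTRY, register_strategy, build_detector, the "detection_strategy" config key, and both toy strategy classes are hypothetical names, not part of Dexray Insight.

```python
from typing import Any, Callable, Dict, List

# Hypothetical registry: maps a configuration name to a strategy class.
STRATEGY_REGISTRY: Dict[str, Callable[[], Any]] = {}


def register_strategy(name: str):
    """Class decorator that adds a strategy to the registry under `name`."""
    def decorator(cls):
        STRATEGY_REGISTRY[name] = cls
        return cls
    return decorator


@register_strategy("prefix")
class PrefixDetectionStrategy:
    """Toy strategy: flags strings that start with a known key prefix."""
    def detect_secrets(self, strings_with_location: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        return [s for s in strings_with_location if s["value"].startswith("sk_")]


@register_strategy("length")
class LengthDetectionStrategy:
    """Toy strategy added later; no existing class or caller had to change."""
    def detect_secrets(self, strings_with_location: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        return [s for s in strings_with_location if len(s["value"]) >= 32]


def build_detector(config: Dict[str, Any]):
    """Pick a strategy from configuration; the caller never names a class."""
    return STRATEGY_REGISTRY[config.get("detection_strategy", "prefix")]()


strings = [
    {"value": "sk_test_12345", "location": "test.java"},
    {"value": "hello", "location": "strings.xml"},
]
detector = build_detector({"detection_strategy": "prefix"})
hits = detector.detect_secrets(strings)
```

With a registry like this, adding a strategy means adding one decorated class; the code that calls detect_secrets() stays untouched.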

Liskov Substitution Principle (LSP)

Implementation: All strategy classes implement consistent interfaces

# All detection strategies can be substituted for each other
from abc import ABC, abstractmethod
from typing import Any, Dict, List

class BaseDetectionStrategy(ABC):
    @abstractmethod
    def detect_secrets(self, strings_with_location) -> List[Dict[str, Any]]:
        pass

class PatternDetectionStrategy(BaseDetectionStrategy):
    def detect_secrets(self, strings_with_location) -> List[Dict[str, Any]]:
        # Pattern-based detection implementation
        ...

class MLDetectionStrategy(BaseDetectionStrategy):  # Future extension
    def detect_secrets(self, strings_with_location) -> List[Dict[str, Any]]:
        # Machine learning-based detection implementation
        ...
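To make the substitution property concrete, the following sketch passes two different strategies to the same caller, which relies only on the base-class contract. KeywordStrategy, NullStrategy, and count_secrets are hypothetical names used for illustration; BaseDetectionStrategy is redeclared so the example runs on its own.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class BaseDetectionStrategy(ABC):
    @abstractmethod
    def detect_secrets(self, strings_with_location: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        ...


class KeywordStrategy(BaseDetectionStrategy):
    """Hypothetical strategy: matches a fixed keyword."""
    def detect_secrets(self, strings_with_location):
        return [s for s in strings_with_location if "password" in s["value"].lower()]


class NullStrategy(BaseDetectionStrategy):
    """Hypothetical strategy: finds nothing, but still honours the contract."""
    def detect_secrets(self, strings_with_location):
        return []


def count_secrets(strategy: BaseDetectionStrategy, strings: List[Dict[str, Any]]) -> int:
    # Relies only on the base-class contract: a list always comes back.
    return len(strategy.detect_secrets(strings))


sample = [{"value": "db_password=hunter2"}, {"value": "app_name"}]
counts = [count_secrets(s, sample) for s in (KeywordStrategy(), NullStrategy())]
```

The caller works with either strategy because both return a list, as the abstract method promises.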

Interface Segregation Principle (ISP)

Implementation: Focused interfaces for specific responsibilities

# Separate interfaces for different aspects
class StringCollector(ABC):
    @abstractmethod
    def collect_strings(self, analysis_results) -> List[Dict[str, Any]]:
        pass

class SecretDetector(ABC):
    @abstractmethod
    def detect_secrets(self, strings) -> List[Dict[str, Any]]:
        pass

class ResultClassifier(ABC):
    @abstractmethod
    def classify_by_severity(self, secrets) -> Dict[str, Any]:
        pass
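As a rough illustration of why the split matters, the sketch below implements two of the interfaces in separate classes, so neither has to stub out methods it does not need. TokenDetector and SeverityBuckets are hypothetical classes, and the interfaces are redeclared so the example is self-contained.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class SecretDetector(ABC):
    @abstractmethod
    def detect_secrets(self, strings: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        ...


class ResultClassifier(ABC):
    @abstractmethod
    def classify_by_severity(self, secrets: List[Dict[str, Any]]) -> Dict[str, Any]:
        ...


class TokenDetector(SecretDetector):
    """Hypothetical detector: implements only the detection interface."""
    def detect_secrets(self, strings):
        return [dict(s, severity="HIGH") for s in strings if "token" in s["value"]]


class SeverityBuckets(ResultClassifier):
    """Hypothetical classifier: knows nothing about how secrets were found."""
    def classify_by_severity(self, secrets):
        buckets: Dict[str, Any] = {}
        for secret in secrets:
            buckets.setdefault(secret["severity"], []).append(secret["value"])
        return buckets


detected = TokenDetector().detect_secrets([{"value": "api_token=abc"}, {"value": "title"}])
buckets = SeverityBuckets().classify_by_severity(detected)
```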

Dependency Inversion Principle (DIP)

Implementation: The assessment delegates to strategy objects instead of inline logic. The strategies are still constructed inside __init__, so full inversion would require injecting them.

class SensitiveDataAssessment:
    def __init__(self, config: Dict[str, Any]):
        # Work is delegated to strategy objects; they are built here for now
        self.string_collector = StringCollectionStrategy(self.logger)
        self.deep_analyzer = DeepAnalysisStrategy(self.logger)
        self.pattern_detector = PatternDetectionStrategy(self.detection_patterns, self.logger)
        # Injecting these as constructor arguments would improve testability
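The injection mentioned in the code comment could look roughly like the sketch below. It is not the current SensitiveDataAssessment constructor: InjectableAssessment, DefaultDetector, FakeDetector, the detector parameter, and the assess method are all hypothetical, and the SecretDetector protocol is an assumed shape for the abstraction.

```python
import logging
from typing import Any, Dict, List, Optional, Protocol


class SecretDetector(Protocol):
    """Assumed abstraction: anything with a detect_secrets() method."""
    def detect_secrets(self, strings: List[Dict[str, Any]]) -> List[Dict[str, Any]]: ...


class DefaultDetector:
    """Hypothetical stand-in for the production detection strategy."""
    def detect_secrets(self, strings):
        return [s for s in strings if s["value"].startswith("AKIA")]


class InjectableAssessment:
    """Hypothetical variant of SensitiveDataAssessment that accepts a detector."""
    def __init__(self, config: Dict[str, Any], detector: Optional[SecretDetector] = None):
        self.logger = logging.getLogger(__name__)
        self.config = config
        # Fall back to the default only when the caller supplies nothing.
        self.detector = detector if detector is not None else DefaultDetector()

    def assess(self, strings: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        return self.detector.detect_secrets(strings)


class FakeDetector:
    """Test double that returns a canned result."""
    def detect_secrets(self, strings):
        return [{"value": "fake-secret"}]


default_hits = InjectableAssessment({}).assess([{"value": "AKIAEXAMPLE"}, {"value": "x"}])
fake_hits = InjectableAssessment({}, detector=FakeDetector()).assess([])
```

Because the class only calls detect_secrets(), a test can pass in a double without patching any module-level names.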

Strategy Pattern Implementation

The secret detection system has been refactored using the Strategy Pattern to separate concerns and improve maintainability.

Strategy Pattern Overview

The Strategy Pattern allows selecting algorithms at runtime and makes the code more flexible and testable.

# Strategy Pattern workflow in secret detection
def _assess_crypto_keys_exposure(self, analysis_results: Dict[str, Any]) -> List[SecurityFinding]:
    # Strategy 1: String Collection
    string_collector = StringCollectionStrategy(self.logger)
    all_strings = string_collector.collect_strings(analysis_results)

    # Strategy 2: Deep Analysis Enhancement
    deep_analyzer = DeepAnalysisStrategy(self.logger)
    enhanced_strings = deep_analyzer.extract_deep_strings(analysis_results, all_strings)

    # Strategy 3: Pattern Detection
    pattern_detector = PatternDetectionStrategy(self.detection_patterns, self.logger)
    detected_secrets = pattern_detector.detect_secrets(enhanced_strings)

    # Strategy 4: Result Classification
    result_classifier = ResultClassificationStrategy()
    classified_results = result_classifier.classify_by_severity(detected_secrets)

    # Strategy 5: Finding Generation
    finding_generator = FindingGenerationStrategy(self.owasp_category)
    return finding_generator.generate_security_findings(classified_results)

StringCollectionStrategy

Responsibility: Collect strings from various analysis sources with location metadata

class StringCollectionStrategy:
    def collect_strings(self, analysis_results: Dict[str, Any]) -> List[Dict[str, Any]]:
        """
        Systematically extract strings from multiple sources:
        - String analysis module results
        - Android properties and system configuration
        - Raw strings from DEX analysis

        Returns list of dictionaries with 'value', 'location', 'file_path', 'line_number'
        """

Key Features:

  • Handles multiple string sources (analysis results, Android properties, raw strings)

  • Adds location metadata for traceability

  • Graceful handling of missing or malformed data

  • Supports both object-based and dictionary-based string analysis results
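The features listed above can be sketched as a single function. This is a simplified illustration, not the project's implementation: the field names ("urls", "emails", "domains") and the "string_analysis.<field>" location format are assumptions.

```python
from types import SimpleNamespace
from typing import Any, Dict, List


def collect_strings(analysis_results: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Gather strings from a 'string_analysis' entry, tagging each with its source."""
    collected: List[Dict[str, Any]] = []
    result = analysis_results.get("string_analysis")
    if result is None:
        return collected  # missing module result: nothing to collect

    for field in ("urls", "emails", "domains"):  # hypothetical field names
        # Accept both dictionary-based and object-based results.
        values = result.get(field) if isinstance(result, dict) else getattr(result, field, None)
        if not isinstance(values, (list, tuple)):
            continue  # skip malformed or absent fields
        for value in values:
            if isinstance(value, str) and value:
                collected.append({
                    "value": value,
                    "location": f"string_analysis.{field}",
                    "file_path": None,
                    "line_number": None,
                })
    return collected


from_dict = collect_strings({"string_analysis": {"urls": ["https://a.example"], "emails": None}})
from_obj = collect_strings({"string_analysis": SimpleNamespace(domains=["b.example", 42])})
from_missing = collect_strings({})
```

Malformed entries (a None field, a non-string value) are skipped rather than raising, which mirrors the graceful handling described in the list.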

DeepAnalysisStrategy

Responsibility: Extract additional strings from deep analysis artifacts (XML, Smali, DEX)

class DeepAnalysisStrategy:
    def extract_deep_strings(self, analysis_results: Dict[str, Any],
                            existing_strings: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        """
        Enhance string collection with deep analysis sources:
        - DEX object string extraction using Androguard
        - XML resource file string extraction
        - Smali code string extraction

        Only operates in 'deep' analysis mode for performance
        """

Analysis Modes:

  • DEEP mode: Full string extraction from DEX, XML, and Smali sources

  • FAST mode: Returns existing strings unchanged (performance optimization)

Benefits:

  • Significantly increased string coverage for secret detection

  • Performance-aware operation based on analysis mode

  • Comprehensive error handling and logging
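A minimal sketch of the mode gate and per-source error handling follows. The mode argument and the extractors mapping are hypothetical parameters introduced for the example; the real method takes only the analysis results and the existing strings.

```python
from typing import Any, Callable, Dict, Iterable, List

Extractor = Callable[[Dict[str, Any]], Iterable[str]]


def extract_deep_strings(analysis_results: Dict[str, Any],
                         existing_strings: List[Dict[str, Any]],
                         mode: str,
                         extractors: Dict[str, Extractor]) -> List[Dict[str, Any]]:
    if mode != "deep":
        return existing_strings  # fast mode: hand back the input unchanged

    enhanced = list(existing_strings)
    seen = {entry["value"] for entry in existing_strings}
    for source, extractor in extractors.items():
        try:
            values = list(extractor(analysis_results))
        except Exception:
            continue  # a failing source must not abort the remaining ones
        for value in values:
            if value not in seen:
                seen.add(value)
                enhanced.append({"value": value, "location": source})
    return enhanced


base = [{"value": "a", "location": "basic"}]
extractors = {
    "xml": lambda results: ["a", "b"],
    "smali": lambda results: results["missing"],  # raises KeyError
    "dex": lambda results: ["c"],
}
fast = extract_deep_strings({}, base, "fast", extractors)
deep = extract_deep_strings({}, base, "deep", extractors)
```

In this sketch the failing "smali" extractor is skipped and the "dex" extractor still contributes, while fast mode returns the original list object.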

PatternDetectionStrategy

Responsibility: Apply 54 different secret detection patterns to collected strings

class PatternDetectionStrategy:
    def detect_secrets(self, strings_with_location: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        """
        Apply comprehensive pattern matching for secret detection:
        - 11 CRITICAL patterns (private keys, AWS credentials, etc.)
        - 22 HIGH patterns (API keys, JWT tokens, service credentials)
        - 13 MEDIUM patterns (database URIs, SSH keys, etc.)
        - 8 LOW patterns (S3 URLs, high-entropy strings, etc.)
        """

Detection Categories:

  • CRITICAL: Private keys, AWS credentials, GitHub tokens

  • HIGH: API keys, JWT tokens, service-specific credentials

  • MEDIUM: Database connection strings, SSH public keys

  • LOW: Service URLs, base64 strings, high-entropy data
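Severity-grouped pattern matching of this kind can be sketched with the standard re module. The three regular expressions below are simplified illustrations chosen for the example; they are not the project's actual 54-pattern set, and the "type" key on each result is an assumed field name.

```python
import re
from typing import Any, Dict, List

# Illustrative patterns only, grouped by severity.
PATTERNS = {
    "CRITICAL": {"private_key": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----")},
    "HIGH": {"jwt": re.compile(r"\beyJ[\w-]+\.[\w-]+\.[\w-]+")},
    "MEDIUM": {"db_uri": re.compile(r"\b(?:postgres|mysql|mongodb)://\S+")},
}


def detect_secrets(strings_with_location: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    detected = []
    for entry in strings_with_location:
        for severity, named_patterns in PATTERNS.items():
            for name, pattern in named_patterns.items():
                if pattern.search(entry["value"]):
                    # Keep the original location metadata and add the match details.
                    detected.append({**entry, "type": name, "severity": severity})
    return detected


samples = [
    {"value": "-----BEGIN RSA PRIVATE KEY-----", "location": "assets/key.pem"},
    {"value": "hello world", "location": "strings.xml"},
    {"value": "mongodb://user:pw@host/db", "location": "Config.java"},
]
results = detect_secrets(samples)
```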

ResultClassificationStrategy

Responsibility: Organize detected secrets by severity and prepare output formats

class ResultClassificationStrategy:
    def classify_by_severity(self, detected_secrets: List[Dict[str, Any]]) -> Dict[str, Any]:
        """
        Create two output formats:
        - Terminal display format with emojis and location info
        - Structured evidence entries for JSON export and detailed analysis
        """

Output Structure:

  • findings: Terminal-friendly display strings with emojis

  • secrets: Structured evidence entries with full metadata
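The two output formats might be produced along these lines. The nesting by severity, the display-string layout, and every icon except the red circle for CRITICAL are assumptions made for this sketch.

```python
from typing import Any, Dict, List

SEVERITIES = ("CRITICAL", "HIGH", "MEDIUM", "LOW")
# Only the red circle for CRITICAL appears in this document; the rest are assumed.
ICONS = {"CRITICAL": "🔴", "HIGH": "🟠", "MEDIUM": "🟡", "LOW": "🔵"}


def classify_by_severity(detected_secrets: List[Dict[str, Any]]) -> Dict[str, Any]:
    classified: Dict[str, Any] = {
        severity: {"findings": [], "secrets": []} for severity in SEVERITIES
    }
    for secret in detected_secrets:
        severity = secret.get("severity")
        if severity not in classified:
            continue  # ignore entries with an unknown or missing severity
        location = secret.get("location", "unknown location")
        # Terminal-friendly display string.
        classified[severity]["findings"].append(
            f"{ICONS[severity]} [{severity}] {secret.get('type', 'secret')} in {location}"
        )
        # Structured evidence entry with the full metadata.
        classified[severity]["secrets"].append(dict(secret))
    return classified


classified = classify_by_severity([
    {"value": "AKIAEXAMPLE", "type": "aws_key", "severity": "CRITICAL", "location": "Config.java"},
    {"value": "eyJ...", "type": "jwt", "severity": "HIGH", "location": "strings.xml"},
    {"value": "??", "severity": "UNKNOWN"},
])
```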

FindingGenerationStrategy

Responsibility: Generate final SecurityFinding objects with remediation guidance

class FindingGenerationStrategy:
    def generate_security_findings(self, classified_results: Dict[str, Any]) -> List[SecurityFinding]:
        """
        Create SecurityFinding objects with:
        - Secret-finder style messaging with emojis
        - Comprehensive remediation steps
        - Evidence limited to prevent overwhelming output
        - Severity-appropriate recommendations
        """

Finding Features:

  • Secret-finder style titles: “🔴 CRITICAL: 2 Hard-coded Secrets Found”

  • Detailed remediation steps: 3-5 actionable steps per finding

  • Evidence limitation: 10-20 items max to prevent information overload

  • OWASP categorization: Proper mapping to A02:2021-Cryptographic Failures
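The title format and evidence cap described above can be illustrated as follows. Finding is a hypothetical stand-in for the project's SecurityFinding class, build_critical_finding and max_evidence are invented for the example, and the remediation texts are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Finding:
    """Hypothetical stand-in for the project's SecurityFinding class."""
    title: str
    category: str
    evidence: List[str] = field(default_factory=list)
    remediation: List[str] = field(default_factory=list)


def build_critical_finding(display_strings: List[str], max_evidence: int = 10) -> Finding:
    return Finding(
        title=f"🔴 CRITICAL: {len(display_strings)} Hard-coded Secrets Found",
        category="A02:2021-Cryptographic Failures",
        # Cap the evidence so a noisy APK does not flood the report.
        evidence=display_strings[:max_evidence],
        remediation=[
            "Remove the hard-coded secrets from the application code",
            "Rotate every exposed credential",
            "Load secrets at runtime from a secure store",
        ],
    )


finding = build_critical_finding([f"secret #{i}" for i in range(25)])
```

The title reports the total count while the evidence list is truncated, so the reader sees the scale of the problem without the full dump.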

Refactored AnalysisEngine Architecture

The AnalysisEngine has been refactored from monolithic methods to a clean, focused architecture.

Result Building Architecture

Before: Single massive method handling all result creation

After: Focused builder methods with clear responsibilities

def _create_full_results(self, module_results, tool_results, security_results, context):
    """Orchestrate result creation using focused builder methods (32 lines)"""
    apk_overview = self._build_apk_overview(module_results)
    in_depth_analysis = self._build_in_depth_analysis(module_results, context)
    apkid_results, kavanoz_results = self._build_tool_results(tool_results)

    # Assemble final results object
    full_results = FullAnalysisResults()
    # ... populate results
    return full_results

Builder Methods:

_build_apk_overview(module_results)

  • Responsibility: Create APK overview object from module results

  • Size: 26 lines (was part of 211-line method)

  • Features: Fallback to manifest analysis if APK overview failed

_build_in_depth_analysis(module_results, context)

  • Responsibility: Create in-depth analysis object using mapping methods

  • Size: 15 lines

  • Delegates to: 7 specialized mapping methods

_build_tool_results(tool_results)

  • Responsibility: Create external tool result objects

  • Size: 22 lines

  • Handles: APKID and Kavanoz results with success/failure handling

Mapping Architecture

Specialized mapping methods handle specific result types:

# Each mapping method has a single responsibility
def _map_manifest_results(self, in_depth_analysis, module_results):
    """Map manifest analysis results to in-depth analysis structure"""

def _map_permission_results(self, in_depth_analysis, module_results):
    """Map permission analysis results to in-depth analysis structure"""

def _map_string_results(self, in_depth_analysis, module_results, context):
    """Map string analysis results with fallback support"""

def _map_library_results(self, in_depth_analysis, module_results):
    """Map library detection results to in-depth analysis structure"""

String Analysis with Fallback:

def _map_string_results(self, in_depth_analysis, module_results, context):
    """Handle string results with built-in fallback logic"""
    string_result = module_results.get('string_analysis')

    if string_result and string_result.status.value == 'success':
        self._apply_successful_string_results(in_depth_analysis, string_result)
    else:
        # Resilient fallback using legacy string extraction
        self._apply_string_analysis_fallback(in_depth_analysis, context)
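The fallback branch can be exercised in isolation with a small stand-alone version of the same logic. Status, map_string_results, the legacy_extract callable, and the "strings_source" key are hypothetical; the source only shows that a result exposes status.value.

```python
from enum import Enum
from types import SimpleNamespace
from typing import Any, Callable, Dict, List


class Status(Enum):
    """Hypothetical status enum; the source only shows `.status.value == 'success'`."""
    SUCCESS = "success"
    FAILURE = "failure"


def map_string_results(in_depth: Dict[str, Any], module_results: Dict[str, Any],
                       legacy_extract: Callable[[], List[str]]) -> Dict[str, Any]:
    result = module_results.get("string_analysis")
    if result is not None and result.status.value == "success":
        in_depth["strings"] = list(result.strings)
        in_depth["strings_source"] = "module"
    else:
        # Module missing or failed: fall back to the legacy extractor.
        in_depth["strings"] = legacy_extract()
        in_depth["strings_source"] = "fallback"
    return in_depth


ok = SimpleNamespace(status=Status.SUCCESS, strings=["https://a.example"])
failed = SimpleNamespace(status=Status.FAILURE, strings=[])
legacy = lambda: ["legacy-string"]

from_module = map_string_results({}, {"string_analysis": ok}, legacy)
from_failure = map_string_results({}, {"string_analysis": failed}, legacy)
from_missing = map_string_results({}, {}, legacy)
```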

Benefits of New Architecture

Maintainability Improvements

Before:

  • Methods with 200+ lines were difficult to understand and modify

  • Mixed responsibilities made changes risky

  • Testing required complex setup for entire workflows

After:

  • Focused methods (typically 5-25 lines) are easy to understand and modify

  • Single responsibilities make changes safer and more predictable

  • Individual methods can be tested in isolation

# Easy to test individual responsibilities
def test_string_collection_strategy():
    strategy = StringCollectionStrategy(mock_logger)
    result = strategy.collect_strings(mock_analysis_results)
    assert len(result) > 0
    assert all('value' in item for item in result)

Performance Improvements

Parallel Execution: Builder methods that do not depend on each other's output can run concurrently

# Methods can be executed in parallel when dependencies allow
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor() as executor:
    apk_future = executor.submit(self._build_apk_overview, module_results)
    tool_future = executor.submit(self._build_tool_results, tool_results)

    apk_overview = apk_future.result()
    apkid_results, kavanoz_results = tool_future.result()

Strategy Pattern Benefits: Different strategies can be optimized independently

# Fast strategy for basic analysis
if analysis_mode == 'fast':
    pattern_detector = FastPatternDetectionStrategy(basic_patterns, logger)
# Comprehensive strategy for deep analysis
else:
    pattern_detector = PatternDetectionStrategy(all_patterns, logger)

Extensibility Improvements

New Strategies: Easy to add new detection strategies

# Add machine learning-based detection without changing existing code
class MLSecretDetectionStrategy:
    def detect_secrets(self, strings_with_location):
        return self.ml_model.predict_secrets(strings_with_location)

New Result Builders: Easy to add new result types

# Add new result builder for custom analysis types
def _build_custom_results(self, module_results):
    """Build custom analysis results"""
    custom_result = module_results.get('custom_analysis')
    if custom_result and custom_result.status.value == 'success':
        return CustomResults(data=custom_result.findings)
    return CustomResults()

Testing Improvements

Unit Testing: Individual methods can be tested in isolation

class TestStringCollectionStrategy:
    def test_collect_strings_from_string_analysis(self):
        # Test specific responsibility without complex setup
        strategy = StringCollectionStrategy(mock_logger)
        result = strategy.collect_strings(mock_analysis_results)
        # Focused assertions on single responsibility

Integration Testing: Strategy coordination can be tested separately

class TestSecretDetectionWorkflow:
    def test_complete_strategy_workflow_integration(self):
        # Test strategy coordination without implementation details
        assessment = SensitiveDataAssessment(config)
        findings = assessment._assess_crypto_keys_exposure(mock_results)
        assert isinstance(findings, list)

Error Handling Improvements

Isolated Failures: Problems in one strategy don’t affect others

def _assess_crypto_keys_exposure(self, analysis_results):
    try:
        all_strings = self.string_collector.collect_strings(analysis_results)
    except Exception as e:
        self.logger.error(f"String collection failed: {e}")
        all_strings = []  # Continue with empty strings

    try:
        enhanced_strings = self.deep_analyzer.extract_deep_strings(analysis_results, all_strings)
    except Exception as e:
        self.logger.error(f"Deep analysis failed: {e}")
        enhanced_strings = all_strings  # Fall back to basic strings

Graceful Degradation: System continues to work even if some components fail
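The repeated try/except blocks can be factored into one helper, sketched below. run_stage, the logger name, and the two example stages are hypothetical and only illustrate the degradation behaviour.

```python
import logging
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger("dexray.sketch")  # hypothetical logger name


def run_stage(name: str, stage: Callable[[], T], fallback: T) -> T:
    """Run one pipeline stage; on any error, log it and return the fallback."""
    try:
        return stage()
    except Exception as exc:
        logger.error("%s failed: %s", name, exc)
        return fallback


def broken_deep_analysis():
    raise RuntimeError("DEX parsing error")


all_strings = run_stage("String collection", lambda: [{"value": "a"}], [])
enhanced_strings = run_stage("Deep analysis", broken_deep_analysis, all_strings)
```

Here the deep analysis stage fails, so the pipeline continues with the basic strings collected by the first stage.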

Migration Guide

For developers working with the refactored code:

Accessing Refactored Methods

Old approach (calling massive methods directly):

  • Direct access to monolithic methods was discouraged

New approach (using focused public interfaces):

# AnalysisEngine public interface remains the same
engine = AnalysisEngine(config)
results = engine.analyze_apk(apk_path)  # Same as before

# Internal methods are now focused and testable
# (but still internal - use public interface)

Working with Strategy Pattern

For security assessment customization:

# Custom strategy implementation
class CustomDetectionStrategy(PatternDetectionStrategy):
    def detect_secrets(self, strings_with_location):
        # Custom detection logic
        custom_secrets = self._apply_custom_patterns(strings_with_location)
        base_secrets = super().detect_secrets(strings_with_location)
        return custom_secrets + base_secrets

# SensitiveDataAssessment builds its own strategies today
assessment = SensitiveDataAssessment(config)
# Using CustomDetectionStrategy would require extending it to accept strategy injection

Testing Patterns

New testing patterns for focused methods:

# Test individual strategies
def test_pattern_detection_strategy():
    patterns = load_test_patterns()
    strategy = PatternDetectionStrategy(patterns, mock_logger)

    test_strings = [
        {'value': 'sk_test_12345', 'location': 'test.java', 'file_path': None, 'line_number': None}
    ]

    results = strategy.detect_secrets(test_strings)
    assert len(results) == 1
    assert results[0]['severity'] == 'HIGH'

Integration testing for strategy coordination:

# Test complete workflow
def test_security_assessment_integration():
    config = load_test_config()
    assessment = SensitiveDataAssessment(config)

    mock_results = create_mock_analysis_results()
    findings = assessment._assess_crypto_keys_exposure(mock_results)

    assert isinstance(findings, list)
    # Test workflow coordination without testing implementation details

The refactoring keeps the public AnalysisEngine interface backward compatible while making the internals easier to test, extend, and debug, which gives future enhancements a stable base.