
System Architecture

Version: 0.9.8.1

Overview of the CreativeDynamics library architecture. The library is a flexible, data-source-agnostic toolkit for time-series analysis, using rough path signatures to detect patterns and changes.

A Python library providing tools for analysing time-series data, primarily for creative fatigue detection in marketing. It is usable via Python script imports, a command-line interface (CLI), and a FastAPI-based HTTP API; the CLI is recommended for most use cases.

Core principle: operates on pandas DataFrames. Users load data from any source (CSV, databases, etc.) and apply the library’s analytical capabilities.

The library is structured into a core analytical engine and a set of interaction interfaces.

Core Components

  1. analyzer.py

    • Heart of the library containing main analysis logic
    • Primary function analyze_all_items takes dictionary of pandas DataFrames and orchestrates entire analysis pipeline: change-point detection, trend classification, impact metrics calculation, plotting, report generation
    • Data-source agnostic; data loading and preparation happen before calling library
  2. signature_calculator.py

    • Specialised module for computing rough path signatures
    • Uses roughpy library for efficient and accurate signature calculations
    • Provides compute_signature() function used by analyzer.py
  3. contracts/

    • Canonical naming contracts for shared concepts
    • Defines standard metric identifiers (for example ctr, cpc) and column identifiers (for example day)
    • Provides normalisation helpers to reduce casing and naming drift across interfaces
  4. api/main.py and api/adapter.py

    • Implements the FastAPI-based HTTP API
    • v1 analysis endpoints call the same core pipeline as the CLI/library via the API adapter
    • Adapter performs input normalisation (records + column mapping) before calling the core pipeline
  5. _version.py & __init__.py

    • Standard Python files for package initialisation and version tracking
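As an illustration of the kind of normalisation helper contracts/ provides, the sketch below maps raw metric names onto the canonical identifiers mentioned above (ctr, cpc). The function name and alias table are hypothetical; the library's actual helpers may differ.

```python
# Canonical metric identifiers from the contracts (illustrative subset).
CANONICAL_METRICS = {"ctr", "cpc"}

# Hypothetical alias table mapping common naming variants to canonical ids.
ALIASES = {
    "click_through_rate": "ctr",
    "clickthroughrate": "ctr",
    "cost_per_click": "cpc",
}

def normalise_metric(name: str) -> str:
    """Reduce casing and naming drift: map a raw metric name to its canonical id."""
    key = name.strip().lower().replace("-", "_").replace(" ", "_")
    if key in CANONICAL_METRICS:
        return key
    if key in ALIASES:
        return ALIASES[key]
    raise ValueError(f"Unknown metric: {name!r}")
```

Helpers like this let the CLI, API adapter, and direct library callers agree on one spelling per concept.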
Interaction Interfaces

  1. Direct library use (e.g., run_analysis.py)

    • The most flexible way to use the library, though the CLI is now recommended for most use cases

    • User script responsible for:

      1. Loading data from any source into pandas DataFrame
      2. Performing necessary preprocessing (cleaning, filtering)
      3. Structuring data into library format (dictionary of DataFrames, where each key is item to be analysed)
      4. Calling creativedynamics.analyze_all_items to run analysis
  2. cli.py (top-level script)

    • Provides command-line interface for running predefined analysis pipeline
    • Recommended approach for most analyses, offering standardised, repeatable workflows with YAML configuration
    • Handles column mapping and data preparation automatically

    CLI entry point: python -m creativedynamics.cli.main
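The four direct-use steps above can be sketched as follows. The column names and the keyword arguments to analyze_all_items are assumptions for illustration; the real function signature may differ, so the final call is left as a comment.

```python
import pandas as pd

# Step 1: load data from any source into a pandas DataFrame.
# Here a small hypothetical long-format table, one row per ad per day.
df = pd.DataFrame({
    "Day": ["2024-01-01", "2024-01-02", "2024-01-01", "2024-01-02"],
    "Ad_ID": ["a1", "a1", "a2", "a2"],
    "CTR": [0.031, 0.028, 0.040, 0.041],
})

# Step 2: preprocessing — standardise column names to lowercase and parse dates.
df.columns = [c.lower() for c in df.columns]
df["day"] = pd.to_datetime(df["day"])

# Step 3: structure the data into the library's format — a dictionary of
# DataFrames, where each key is the item to be analysed (here the ad id).
items = {
    ad_id: g.sort_values("day").reset_index(drop=True)
    for ad_id, g in df.groupby("ad_id")
}

# Step 4 (sketch; exact parameters of analyze_all_items are not shown here):
# import creativedynamics
# results = creativedynamics.analyze_all_items(items, metrics=["ctr"])
```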

Supporting Components

  1. tests/: unit and integration tests
  2. documentation/: Sphinx documentation source files
  3. examples/: example scripts demonstrating library usage
  4. data/: sample data files (CSVs, DuckDB databases)
  5. Configuration files: pyproject.toml for packaging and dependency management, development configuration files

Analysis Workflow

The primary workflow is accessible via the CLI (recommended) or direct library use:

  1. Load & prepare data

    • Process starts with data loading via CLI or user script
    • Data loaded from the source (e.g., a CSV file) into a pandas DataFrame
    • All column names standardised to lowercase immediately after CSV ingestion
    • DataFrame cleaned, filtered, and preprocessed as needed
    • Data grouped by items to be analysed (e.g., by ad_id) and converted into dictionary of DataFrames
  2. Analyse all items (analyzer.analyze_all_items)

    • Core library function processes prepared data dictionary with analysis parameters (metrics, window size, etc.)

    • Function iterates through each item’s time series and performs analysis:

      • Detect change points: Uses sliding window and path signatures (roughpy) to find significant change points
      • Classify trends: Determines trend (“improving”, “worsening”) for segments between change points
      • Calculate impact metrics: Computes actual_overspend_gbp (financial) and engagement_gap_clicks (operational) based on performance benchmarks, with correlation risk context (metrics are not combined)
  3. Generate outputs

    • Plots (analyzer.plot_item_analysis): For each item, generates plot visualising time series, change points, and trends. Saved as image files
    • Reports (analyzer.generate_summary_report): Aggregated results used to create detailed summary reports in HTML and CSV format
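The sliding-window change-point detection in step 2 can be sketched as below. This is a simplified stand-in, not the library's implementation: it computes the level-2 truncated signature of each window directly with numpy (the library uses roughpy for this), and the function names and threshold logic are hypothetical.

```python
import numpy as np

def level2_signature(path: np.ndarray) -> np.ndarray:
    """Level-2 truncated signature of a piecewise-linear path in R^d.

    Returns the d level-1 terms (total increments) concatenated with the
    d*d level-2 iterated-integral terms, computed segment by segment.
    """
    inc = np.diff(path, axis=0)        # increment of each linear segment
    start = path[:-1] - path[0]        # segment start points, based at the origin
    s1 = inc.sum(axis=0)               # level-1: total increment per coordinate
    # level-2 term S[i, j] = sum_k (start[k, i] * inc[k, j] + 0.5 * inc[k, i] * inc[k, j])
    s2 = start.T @ inc + 0.5 * (inc.T @ inc)
    return np.concatenate([s1, s2.ravel()])

def detect_change_points(values, window=5, threshold=1.0):
    """Flag approximate indices where the windowed signature jumps."""
    t = np.linspace(0.0, 1.0, len(values))
    path = np.column_stack([t, np.asarray(values, dtype=float)])  # time-augmented path
    sigs = [level2_signature(path[i:i + window])
            for i in range(len(values) - window + 1)]
    dists = [np.linalg.norm(sigs[i + 1] - sigs[i]) for i in range(len(sigs) - 1)]
    return [i + window for i, d in enumerate(dists) if d > threshold]
```

Augmenting the series with a time coordinate before taking signatures is a standard trick: it keeps the signature sensitive to the ordering and speed of changes, not just the set of values visited.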
Technology Stack

  1. Core language: Python 3.10+

  2. Key libraries:

    • pandas: data manipulation and time series handling
    • numpy: numerical operations
    • roughpy: core rough path signature calculations
    • matplotlib, seaborn: plot generation
    • scikit-learn, scipy: statistical and machine learning utilities
  3. Optional/supporting tech:

    • duckdb: efficient local data storage (not required for analysis)
    • fastapi: HTTP API
  4. Documentation: Sphinx with myst-parser for Markdown support