
System Architecture

Version: 0.9.8.1

Overview of the CreativeDynamics library architecture. The library is a flexible, data-source-agnostic toolkit for time-series analysis, using rough path signatures to detect patterns and changes.

A Python library providing tools for analysing time-series data, primarily for creative fatigue detection in marketing. It is usable via Python script imports, a command-line interface (CLI), and a FastAPI-based HTTP API; the CLI is recommended for most use cases.

Core principle: operates on pandas DataFrames. Users load data from any source (CSV, databases, etc.) and apply the library’s analytical capabilities.

The library is structured into a core analytical engine and a set of interaction interfaces.

Core Components

  1. analyzer.py

    • Heart of the library containing main analysis logic
    • Primary function analyze_all_items takes dictionary of pandas DataFrames and orchestrates entire analysis pipeline: change-point detection, trend classification, impact metrics calculation, plotting, report generation
    • Data-source agnostic; data loading and preparation happen before calling library
  2. signature_calculator.py

    • Specialised module for computing rough path signatures
    • Uses roughpy library for efficient and accurate signature calculations
    • Provides compute_signature() function used by analyzer.py
  3. contracts/

    • Canonical naming contracts for shared concepts
    • Defines standard metric identifiers (for example ctr, cpc) and column identifiers (for example day)
    • Provides normalisation helpers to reduce casing and naming drift across interfaces
  4. api/main.py and api/adapter.py

    • Implements the FastAPI-based HTTP API
    • v1 analysis endpoints call the same core pipeline as the CLI/library via the API adapter
    • Adapter performs input normalisation (records + column mapping) before calling the core pipeline
  5. _version.py & __init__.py

    • Standard Python files for package initialisation and version tracking
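As an illustration of the kind of normalisation helper contracts/ provides, the sketch below maps raw metric names onto the canonical identifiers mentioned above (ctr, cpc). The function name and alias table are hypothetical; the library's actual helpers may differ.

```python
# Canonical metric identifiers from the contracts (illustrative subset).
CANONICAL_METRICS = {"ctr", "cpc"}

# Hypothetical alias table mapping common naming variants to canonical ids.
ALIASES = {
    "click_through_rate": "ctr",
    "clickthroughrate": "ctr",
    "cost_per_click": "cpc",
}

def normalise_metric(name: str) -> str:
    """Reduce casing and naming drift: map a raw metric name to its canonical id."""
    key = name.strip().lower().replace("-", "_").replace(" ", "_")
    if key in CANONICAL_METRICS:
        return key
    if key in ALIASES:
        return ALIASES[key]
    raise ValueError(f"Unknown metric: {name!r}")
```

Helpers like this let the CLI, API adapter, and direct library callers agree on one spelling per concept.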
Interaction Interfaces

  1. Direct library use (e.g., run_analysis.py)

    • The most flexible way to use the library, though the CLI is now recommended for most use cases

    • User script responsible for:

      1. Loading data from any source into pandas DataFrame
      2. Performing necessary preprocessing (cleaning, filtering)
      3. Structuring data into library format (dictionary of DataFrames, where each key is item to be analysed)
      4. Calling creativedynamics.analyze_all_items to run analysis
  2. cli.py (top-level script)

    • Provides command-line interface for running predefined analysis pipeline
    • Recommended approach for most analyses, offering standardised, repeatable workflows with YAML configuration
    • Handles column mapping and data preparation automatically

    CLI entry point: python -m creativedynamics.cli.main
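The four direct-use steps above can be sketched as follows. The column names and the keyword arguments to analyze_all_items are assumptions for illustration; the real function signature may differ, so the final call is left as a comment.

```python
import pandas as pd

# Step 1: load data from any source into a pandas DataFrame.
# Here a small hypothetical long-format table, one row per ad per day.
df = pd.DataFrame({
    "Day": ["2024-01-01", "2024-01-02", "2024-01-01", "2024-01-02"],
    "Ad_ID": ["a1", "a1", "a2", "a2"],
    "CTR": [0.031, 0.028, 0.040, 0.041],
})

# Step 2: preprocessing — standardise column names to lowercase and parse dates.
df.columns = [c.lower() for c in df.columns]
df["day"] = pd.to_datetime(df["day"])

# Step 3: structure the data into the library's format — a dictionary of
# DataFrames, where each key is the item to be analysed (here the ad id).
items = {
    ad_id: g.sort_values("day").reset_index(drop=True)
    for ad_id, g in df.groupby("ad_id")
}

# Step 4 (sketch; exact parameters of analyze_all_items are not shown here):
# import creativedynamics
# results = creativedynamics.analyze_all_items(items, metrics=["ctr"])
```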

Supporting Components

  1. tests/: unit and integration tests
  2. documentation/: Sphinx documentation source files
  3. examples/: example scripts demonstrating library usage
  4. data/: sample data files (CSVs, DuckDB databases)
  5. Configuration files: pyproject.toml for packaging and dependency management, development configuration files

Analysis Workflow

The primary workflow is accessible via the CLI (recommended) or direct library use:

  1. Load & prepare data

    • Process starts with data loading via CLI or user script
    • Data loaded from the source (e.g., a CSV file) into a pandas DataFrame
    • All column names standardised to lowercase immediately after CSV ingestion
    • DataFrame cleaned, filtered, and preprocessed as needed
    • Data grouped by items to be analysed (e.g., by ad_id) and converted into dictionary of DataFrames
  2. Analyse all items (analyzer.analyze_all_items)

    • Core library function processes prepared data dictionary with analysis parameters (metrics, window size, etc.)

    • Function iterates through each item’s time series and performs analysis:

      • Detect change points: Uses sliding window and path signatures (roughpy) to find significant change points
      • Classify trends: Determines trend (“improving”, “worsening”) for segments between change points
      • Calculate impact metrics: Computes actual_overspend_gbp (financial) and engagement_gap_clicks (operational) based on performance benchmarks, with correlation risk context (metrics are not combined)
  3. Generate outputs

    • Plots (analyzer.plot_item_analysis): For each item, generates plot visualising time series, change points, and trends. Saved as image files
    • Reports (analyzer.generate_summary_report): Aggregated results used to create detailed summary reports in HTML and CSV format
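The sliding-window change-point detection in step 2 can be sketched as below. This is a simplified stand-in, not the library's implementation: it computes the level-2 truncated signature of each window directly with numpy (the library uses roughpy for this), and the function names and threshold logic are hypothetical.

```python
import numpy as np

def level2_signature(path: np.ndarray) -> np.ndarray:
    """Level-2 truncated signature of a piecewise-linear path in R^d.

    Returns the d level-1 terms (total increments) concatenated with the
    d*d level-2 iterated-integral terms, computed segment by segment.
    """
    inc = np.diff(path, axis=0)        # increment of each linear segment
    start = path[:-1] - path[0]        # segment start points, based at the origin
    s1 = inc.sum(axis=0)               # level-1: total increment per coordinate
    # level-2 term S[i, j] = sum_k (start[k, i] * inc[k, j] + 0.5 * inc[k, i] * inc[k, j])
    s2 = start.T @ inc + 0.5 * (inc.T @ inc)
    return np.concatenate([s1, s2.ravel()])

def detect_change_points(values, window=5, threshold=1.0):
    """Flag approximate indices where the windowed signature jumps."""
    t = np.linspace(0.0, 1.0, len(values))
    path = np.column_stack([t, np.asarray(values, dtype=float)])  # time-augmented path
    sigs = [level2_signature(path[i:i + window])
            for i in range(len(values) - window + 1)]
    dists = [np.linalg.norm(sigs[i + 1] - sigs[i]) for i in range(len(sigs) - 1)]
    return [i + window for i, d in enumerate(dists) if d > threshold]
```

Augmenting the series with a time coordinate before taking signatures is a standard trick: it keeps the signature sensitive to the ordering and speed of changes, not just the set of values visited.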
Technology Stack

  1. Core language: Python 3.10+

  2. Key libraries:

    • pandas: data manipulation and time series handling
    • numpy: numerical operations
    • roughpy: core rough path signature calculations
    • matplotlib, seaborn: plot generation
    • scikit-learn, scipy: statistical and machine learning utilities
  3. Optional/supporting tech:

    • duckdb: efficient local data storage (not required for analysis)
    • fastapi: HTTP API
  4. Documentation: Sphinx with myst-parser for Markdown support