Methodology
CreativeDynamics Library v0.9.8.1
The library employs techniques from Rough Path Theory to analyse time-series data. Core concept: calculating mathematical signatures of a data path and measuring distance between signatures over time for change-point detection.
Rough path signatures
A rough path signature is a mathematical object that captures the geometric features of a path (time series) through a hierarchy of Lie increments. It provides a rich, non-linear summary of path evolution and serves as a powerful feature extraction tool.
Key properties:
- Robust to re-parameterisation: Depends on geometric shape, not traversal speed
- Faithful representation: Under mild conditions, uniquely determines the path up to tree-like equivalences
- Universal approximators: Linear functionals of the truncated signature can approximate any continuous function on a compact set of paths to arbitrary accuracy
Rough path essentials (formal)
This section summarises the minimum rough path notation needed to understand the library. It follows the notation and framing used in the accompanying paper (arXiv-2509.09758v3/main.tex).
Let $X : [0, T] \to \mathbb{R}^d$ be a continuous path of bounded variation. In our application, we typically use $d = 2$ and embed a time series as a path $X_t = (t, y_t)$, then normalise time and metric values to $[0, 1]$ over each analysis window.
Signature (iterated integrals)
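The signature of $X$ over $[s, t]$ is the sequence of iterated integrals
$$S(X)_{s,t} = \left(1,\ S^{(1)}_{s,t},\ S^{(2)}_{s,t},\ \dots\right)$$
where, for $k \ge 1$,
$$S^{(k)}_{s,t} = \int_{s < u_1 < \cdots < u_k < t} dX_{u_1} \otimes \cdots \otimes dX_{u_k} \in (\mathbb{R}^d)^{\otimes k}.$$
In practice we truncate at depth $N$ to obtain a finite-dimensional feature vector.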
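To make the iterated integrals concrete, the following sketch computes the first two signature levels of a piecewise-linear path in plain Python. It is an illustration only: the function name `signature_depth2` is hypothetical, the depth is limited to 2 for readability, and the library itself delegates this computation to roughpy.

```python
def signature_depth2(points):
    """Levels 1 and 2 of the signature of the piecewise-linear path through `points`.

    `points` is a sequence of equal-length coordinate tuples. Hypothetical helper
    for illustration; not part of the library's API.
    """
    d = len(points[0])
    level1 = [0.0] * d                       # total increment so far
    level2 = [[0.0] * d for _ in range(d)]   # second-order iterated integrals
    for prev, curr in zip(points, points[1:]):
        dx = [c - p for p, c in zip(prev, curr)]
        for i in range(d):
            for j in range(d):
                # Cross term with everything before this segment,
                # plus the segment's own contribution dx_i * dx_j / 2.
                level2[i][j] += level1[i] * dx[j] + 0.5 * dx[i] * dx[j]
        for i in range(d):
            level1[i] += dx[i]
    return level1, level2


# An L-shaped path: one unit along the first axis, then one unit along the second.
level1, level2 = signature_depth2([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)])
# level1 -> [1.0, 1.0]
# level2 -> [[0.5, 1.0], [0.0, 0.5]]
```

The off-diagonal entries differ (1.0 versus 0.0) because the path moves along the first axis before the second; reversing the order of the two moves would swap them. This order sensitivity is what the second level adds beyond the plain increment.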
Concatenation (Chen identity)
If $s \le u \le t$, the signature satisfies a multiplicative property (Chen’s identity)
$$S(X)_{s,t} = S(X)_{s,u} \otimes S(X)_{u,t},$$
which is the algebraic reason signatures are useful for analysing local changes along a path.
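Written out for the first two levels, with $S^{(k)}_{s,t}$ denoting the level-$k$ term of the signature over $[s, t]$, the identity takes the following form. This is the standard expansion, given here as an illustration rather than quoted from the library or the paper:

```latex
\begin{aligned}
S^{(1)}_{s,t} &= S^{(1)}_{s,u} + S^{(1)}_{u,t}, \\
S^{(2)}_{s,t} &= S^{(2)}_{s,u} + S^{(1)}_{s,u} \otimes S^{(1)}_{u,t} + S^{(2)}_{u,t}.
\end{aligned}
```

The first line says increments add. The cross term in the second line couples what happened before $u$ with what happens after it, so the signature of a longer window can be assembled from the signatures of its pieces.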
Log-signatures (Lie increments)
The library computes log-signatures (Lie increments) for efficiency. Conceptually, this amounts to working in the free Lie algebra while preserving the geometric information of the truncated signature.
Mathematical foundation:
The library implements signatures using Lie increments rather than tensor products, for computational efficiency whilst maintaining mathematical rigour. For a $d$-dimensional path $X$, the log-signature is an element of the free Lie algebra that captures the same information as the signature truncated at depth $N$.
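The efficiency gain can be illustrated by counting coordinates. The sketch below compares the number of terms in a truncated signature with the number in the corresponding log-signature, using Witt's formula for the dimensions of the free Lie algebra. The function names are hypothetical and the code does not call the library; it only counts dimensions.

```python
def mobius(n):
    """Möbius function, computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # squared prime factor
            result = -result
        p += 1
    if n > 1:
        result = -result          # one remaining prime factor
    return result


def signature_dim(d, depth):
    """Number of terms in levels 1..depth of the signature of a d-dimensional path."""
    return sum(d ** k for k in range(1, depth + 1))


def log_signature_dim(d, depth):
    """Number of terms in levels 1..depth of the log-signature (Witt's formula)."""
    total = 0
    for k in range(1, depth + 1):
        divisors = [j for j in range(1, k + 1) if k % j == 0]
        total += sum(mobius(j) * d ** (k // j) for j in divisors) // k
    return total


# For the 2D paths and default depth of 4 described in this document:
print(signature_dim(2, 4), log_signature_dim(2, 4))  # 30 8
```

For a two-dimensional path at depth 4, the log-signature has 8 coordinates where the full truncated signature has 30.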
Implementation details:
- Uses the roughpy library with Lie increment computation
- Paths normalised to the [0,1] interval before signature computation
- Signature depth controls geometric detail level (default depth=4)
- Computational complexity: O(T·d²) for fixed window size w
Signature calculation
Uses the roughpy library for path signature calculation:
- Accuracy: Well-tested library providing correct signature computations
- Efficiency: Optimised C++ backend for performance
- Standardisation: Standard, community-accepted tool
Primary module: creativedynamics.core.signature_calculator.
Path construction and normalisation
A specific normalisation procedure ensures numerical stability and consistent signature computation:
- Two-dimensional path construction: For each metric, constructs 2D path X(t) = (t_norm, y_norm) where:
  - t_norm ∈ [0,1]: normalised time coordinate
  - y_norm ∈ [0,1]: normalised metric value
- Normalisation procedure:
  t_norm = (t - t_min) / (t_max - t_min)
  y_norm = (y - y_min) / (y_max - y_min + ε)
  where ε = 10^-8 prevents division by zero for constant metrics.
- Signature parameters:
  - Depth: Controls Lie increments level (default=4)
  - Window Size (w): Consecutive data points per window (default=7)
  - Sliding step: Windows slide by one time point for detailed analysis
Normalisation ensures signatures from different time periods and metrics are comparable, essential for distance-based change point detection.
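The normalisation formulas above can be sketched in a few lines of plain Python. The function name `build_normalised_path` and the explicit error for a zero time span are assumptions made for this example; the library's own implementation lives in creativedynamics.core.signature_calculator and may differ in detail.

```python
EPSILON = 1e-8  # guards against division by zero for constant metrics


def build_normalised_path(times, values, eps=EPSILON):
    """Return the 2D path [(t_norm, y_norm), ...] for one analysis window.

    Hypothetical helper for illustration; not the library's API.
    """
    t_min, t_max = min(times), max(times)
    y_min, y_max = min(values), max(values)
    if t_max == t_min:
        # The time formula has no epsilon, so a zero time span is undefined.
        raise ValueError("window must span more than one time point")
    t_span = t_max - t_min
    y_span = y_max - y_min + eps
    return [((t - t_min) / t_span, (y - y_min) / y_span)
            for t, y in zip(times, values)]


# A constant metric maps to y_norm = 0 everywhere instead of raising an error.
flat = build_normalised_path([0, 1, 2, 3], [5.0, 5.0, 5.0, 5.0])
```

Because ε is added to the denominator, the largest y_norm in a non-constant window falls very slightly below 1 rather than reaching it exactly.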
Signature distance and change point detection
Sliding window approach detects changes in time series patterns:
- Window-based signature computation: For time series of length T, computes signatures for overlapping windows of size w.
- Distance calculation: Euclidean distance between consecutive window signatures:
  d_t = ||S_t - S_{t-1}||_2
  where S_t is the signature of window t.
- Statistical thresholding: Change points detected when distance exceeds:
  threshold = μ_d + k·σ_d
  where μ_d and σ_d are mean and standard deviation of all distances, k is threshold multiplier (default k=1.5).
- Computational efficiency: Overall complexity O(T·d²) for fixed window size w, efficient for real-time analysis.
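The sliding-window procedure can be sketched end to end as follows. Several things here are assumptions made for the example: a depth-2 signature computed in plain Python stands in for the library's roughpy log-signature at depth 4, σ_d is taken as the population standard deviation, each flagged index refers to the later of the two windows being compared, and all function names are hypothetical.

```python
import math
import statistics


def window_features(window, eps=1e-8):
    """Flattened depth-2 signature of one window, normalised to the unit square."""
    n = len(window)
    y_min, y_max = min(window), max(window)
    path = [(i / (n - 1), (y - y_min) / (y_max - y_min + eps))
            for i, y in enumerate(window)]
    level1 = [0.0, 0.0]
    level2 = [[0.0, 0.0], [0.0, 0.0]]
    for (t0, y0), (t1, y1) in zip(path, path[1:]):
        dx = (t1 - t0, y1 - y0)
        for i in range(2):
            for j in range(2):
                level2[i][j] += level1[i] * dx[j] + 0.5 * dx[i] * dx[j]
        level1[0] += dx[0]
        level1[1] += dx[1]
    return level1 + level2[0] + level2[1]


def detect_change_points(series, window_size=7, k=1.5):
    """Return (distances, threshold, flagged window indices)."""
    if len(series) < window_size + 1:
        raise ValueError("need at least two windows to compute a distance")
    features = [window_features(series[i:i + window_size])
                for i in range(len(series) - window_size + 1)]
    # d_t = ||S_t - S_{t-1}||_2 for consecutive windows
    distances = [math.dist(a, b) for a, b in zip(features, features[1:])]
    threshold = statistics.fmean(distances) + k * statistics.pstdev(distances)
    flagged = [i + 1 for i, d in enumerate(distances) if d > threshold]
    return distances, threshold, flagged


# A level shift halfway through a series of 20 points.
series = [1.0] * 10 + [5.0] * 10
distances, threshold, flagged = detect_change_points(series)
```

With 20 points and a window of 7 there are 14 windows and therefore 13 distances. Distances between windows that lie entirely on one side of the shift are zero, because each window is normalised independently and the flat windows all produce the same features.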
Applications of signatures in the library
The primary built-in application, implemented in the creativedynamics.core.analyzer module, is change-point detection.
Four-phase analysis process
The analysis pipeline runs in four phases:
- Phase 1: Change point detection
  - Computes sliding window signatures across time series
  - Calculates signature distances between consecutive windows
  - Identifies statistically significant change points using adaptive thresholding
  - Output: List of change points segmenting time series
- Phase 2: Segment analysis
  - Divides time series into segments based on detected change points
  - Computes segment statistics (mean, variance, trend)
  - Classifies segment trends as “Stable”, “Improving”, or “Declining”
  - Output: Characterised segments with trend classifications
- Phase 3: Benchmark calculation
  - Identifies longest stable or improving segment
  - Computes benchmark values from optimal performance periods
  - Validates benchmark reliability based on segment duration
  - Output: Benchmark values for impact calculation
- Phase 4: Impact quantification
  - Calculates impact during declining periods
  - Quantifies actual_overspend_gbp (financial inefficiency) and engagement_gap_clicks (operational impact)
  - Provides correlation risk context; metrics are reported separately and not combined
  - Output: Operational and financial impact of performance degradation (reported separately)
Implemented in creativedynamics.core.analyzer module with configurable parameters for each phase.
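As a rough illustration of Phases 2 and 3, the sketch below splits a series at given change points, labels each segment by the sign of a least-squares slope, and takes the mean of the longest non-declining segment as the benchmark. The slope-based rule, the `tolerance` value, the `higher_is_better` flag, and all function names are assumptions made for this example; the source does not specify how the analyzer module classifies trends or derives benchmarks.

```python
import statistics


def slope(values):
    """Least-squares slope of values against their index."""
    n = len(values)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = statistics.fmean(values)
    num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def analyse_segments(series, change_points, tolerance=0.05, higher_is_better=True):
    """Phase 2 sketch: segment statistics and a trend label per segment."""
    bounds = [0, *sorted(change_points), len(series)]
    segments = []
    for start, end in zip(bounds, bounds[1:]):
        values = series[start:end]
        if not values:
            continue
        s = slope(values)
        if abs(s) <= tolerance:          # hypothetical stability tolerance
            trend = "Stable"
        elif (s > 0) == higher_is_better:
            trend = "Improving"
        else:
            trend = "Declining"
        segments.append({
            "start": start,
            "end": end,
            "mean": statistics.fmean(values),
            "variance": statistics.pvariance(values),
            "trend": trend,
        })
    return segments


def benchmark_from(segments):
    """Phase 3 sketch: mean of the longest stable or improving segment."""
    candidates = [s for s in segments if s["trend"] != "Declining"]
    if not candidates:
        return None
    longest = max(candidates, key=lambda s: s["end"] - s["start"])
    return longest["mean"]


segments = analyse_segments([10, 10, 10, 10, 10, 9, 8, 7, 6, 5], change_points=[5])
benchmark = benchmark_from(segments)  # mean of the first (stable) segment: 10.0
```

For a cost-type metric such as cpc, where lower values are better, the same sketch would be called with `higher_is_better=False`.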
Visual representation
Visual reports for change-point analysis include:
- Upper chart: Original time-series metric(s)
- Lower chart: Calculated signature distances over time with significance threshold line and vertical markers for detected change points
Theoretical properties and advantages
Signature-based approach provides theoretical guarantees and practical advantages:
Theoretical properties:
- Consistency: Change point detection is statistically consistent under mild conditions
- Convergence: Signature distances converge to true pattern distance as window size increases
- Invariance: Detection invariant to monotonic time transformations
Practical advantages:
- Early detection: Captures subtle pattern changes before manifesting in aggregate metrics
- Non-linearity: Naturally handles non-linear dynamics and complex interactions
- Robustness: Resistant to outliers due to integral-based computation
- Interpretability: Signature distances have clear geometric interpretation
Performance characteristics:
- Precision-recall trade-off: Controlled by threshold multiplier k
- Default settings: k=1.5 provides balanced precision (~0.7) and recall (~0.6)
- Computational efficiency: Linear in time series length for fixed window size
Multi-dimensionality
Multi-dimensionality enters the method in two places:
- Path dimensionality: Input data is often multi-dimensional (e.g., time, metric A, metric B)
- Signature dimensionality: Signature is a high-dimensional vector (or tensor), where each term captures different aspects of path geometry
This multi-dimensional approach allows a more detailed characterisation of time series than methods that analyse each metric in isolation or consider only simple trends.
Data preparation and column naming conventions
The library standardises all column names to lowercase throughout the processing pipeline, simplifying the codebase by eliminating case-sensitivity issues and reducing complexity:
- Column names transformed to lowercase immediately after CSV ingestion
- Only lowercase column names used throughout entire analysis pipeline
- Standard names include day, link_clicks, amount_spent_gbp, impressions, cpc, and ctr
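The lowercase convention can be illustrated with the standard library's csv module. This is a minimal sketch under stated assumptions: the function name `read_rows_lowercase` is hypothetical, surrounding whitespace is also trimmed, and the library's own ingestion code may work differently.

```python
import csv
import io


def read_rows_lowercase(handle):
    """Read CSV rows as dicts whose keys are lowercased column names."""
    reader = csv.DictReader(handle)
    if reader.fieldnames is None:   # empty input: no header row
        return []
    # Lowercase the header once, immediately after ingestion.
    reader.fieldnames = [name.strip().lower() for name in reader.fieldnames]
    return list(reader)


sample = io.StringIO("Day,Link_Clicks,CTR\n2024-01-01,12,0.5\n")
rows = read_rows_lowercase(sample)
# rows -> [{'day': '2024-01-01', 'link_clicks': '12', 'ctr': '0.5'}]
```

Normalising the header at read time means every later step can refer to link_clicks or ctr without checking for alternative capitalisations.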
Library entry points
Two primary entry points for analysis:
- CLI entry point (cli.py): Recommended for production use. Uses YAML configuration files with nested column mapping structure for flexible and maintainable configuration.
- Script entry point (run_analysis.py): Alternative entry point using flat JSON mapping files. Maintained for backward compatibility but may be deprecated in future versions.
For new implementations, CLI entry point with YAML configuration is the standard approach.