
Disentangling Time Series Representations via Contrastive Independence-of-Support on l-Variational Inference

Khalid Oublal*
Institut Polytechnique de Paris
David Benhaiem
OneTech TotalEnergies & DS-AI
Saïd Ladjal
Institut Polytechnique de Paris, Télécom Paris LTCI/S2A
Emmanuel Le Borgne
OneTech TotalEnergies & DS-AI
François Roueff
Institut Polytechnique de Paris, Télécom Paris LTCI/S2A

tl;dr: This work disentangles the factors underlying time series data, enabling users to better understand and optimize their electricity consumption. It handles real-world correlations between factors and introduces TDS (Time Disentangling Score), a metric that reliably measures disentanglement quality.

News

January '24 The paper was accepted at ICLR 2024 🚀🚀!
December '23 A shorter workshop version of the paper was accepted for spotlight 🥇 presentation at the NeurIPS 2023 Workshop on Unifying Representations in Neural Models.

Abstract

Learning disentangled representations is crucial for time series, offering benefits such as feature derivation and improved interpretability, thereby enhancing task performance. We focus on disentangled representation learning for home appliance electricity usage, enabling users to understand and optimize their consumption for a reduced carbon footprint. Our approach frames the problem as disentangling each attribute's role in total consumption (e.g., dishwasher, fridge, etc.). Unlike existing methods that assume attribute independence, we account for real-world correlations between time series attributes, such as dishwashers and washing machines running more often during winter. To tackle this, we employ weakly supervised contrastive disentanglement, which lets representations generalize across diverse correlated scenarios and to new households. Our method uses l-variational inference layers with self-attention, effectively addressing temporal dependencies across the bottom-up and top-down networks. We find that DIoSC (Disentanglement and Independence-of-Support via Contrastive Learning) improves the reconstruction of electricity consumption for individual appliances. We also introduce TDS (Time Disentangling Score) to gauge disentanglement quality. TDS reliably reflects disentanglement performance, making it a valuable metric for evaluating time series representations.

Overview: We frame the problem as disentangling each appliance's contribution to a household's total electricity consumption, while accounting for real-world correlations between attributes (e.g., dishwashers and washing machines operating together in winter) rather than assuming attribute independence. We assume the observations are generated by an (unknown) injective generative model \(g\) that maps unobservable latent variables from a hypersphere to observations on another manifold. Under these assumptions, the feature encoder \(f\) implicitly learns to invert the ground-truth generative process \(g\) up to a linear transformation, i.e., \(f = \mathbf{A} g^{-1}\) with an orthogonal matrix \(\mathbf{A}\), provided \(f\) minimizes the InfoNCE objective.
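As a sanity check on this identifiability claim, one can fit a linear map from learned latents to ground-truth latents and measure the fit: if \(f = \mathbf{A} g^{-1}\) holds, a linear regression should explain the ground truth almost perfectly. Below is a minimal sketch using scikit-learn; the function name and data shapes are illustrative assumptions, not part of the paper's released code.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def linear_identifiability_score(z_learned: np.ndarray, z_true: np.ndarray) -> float:
    """R^2 of the best linear map from learned to ground-truth latents.

    z_learned, z_true: (n_samples, dim) arrays. If the encoder inverts the
    generative process up to an orthogonal matrix A, this score approaches 1.
    """
    reg = LinearRegression().fit(z_learned, z_true)
    return reg.score(z_learned, z_true)
```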

Contributions

  1. We establish a theoretical connection between the InfoNCE family of objectives, commonly used in self-supervised learning, and nonlinear ICA. We show that training with InfoNCE inverts the data-generating process if certain statistical assumptions on that process hold (see the sketch after this list).
  2. We empirically verify our predictions when the assumed theoretical conditions are fulfilled. In addition, we show successful inversion of the data-generating process even when the theoretical assumptions are partially violated.
  3. We build on top of the CLEVR rendering pipeline (Johnson et al., 2017) to generate a more visually complex disentanglement benchmark, called 3DIdent, that contains hallmarks of natural environments (shadows, different lighting conditions, a 3D object, etc.). We demonstrate that a contrastive loss derived from our theoretical framework can identify the ground-truth factors of such complex, high-resolution images.
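To make the InfoNCE objective referenced in contribution 1 concrete, here is a minimal PyTorch sketch for a batch of anchor/positive embedding pairs; the temperature value and batch construction are illustrative assumptions rather than the paper's exact setup.

```python
import torch
import torch.nn.functional as F

def info_nce(z: torch.Tensor, z_pos: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """InfoNCE for a batch of anchor/positive embedding pairs.

    z, z_pos: (batch, dim) embeddings of two views of the same sample.
    All other in-batch samples serve as negatives.
    """
    # Normalize so similarities are cosine similarities (embeddings on a hypersphere).
    z = F.normalize(z, dim=-1)
    z_pos = F.normalize(z_pos, dim=-1)
    # (batch, batch) similarity matrix; the diagonal holds the positive pairs.
    logits = z @ z_pos.t() / tau
    targets = torch.arange(z.size(0), device=z.device)
    # Cross-entropy pulls each anchor toward its positive and away from negatives.
    return F.cross_entropy(logits, targets)
```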

Theory

To obtain disentangled, invariant, and aligned latent representations, we use a contrastive objective that ensures each latent component \(z_m\) influences only its corresponding output in the decoder. Specifically, disentanglement is achieved when each ground-truth variable \(y_m\) aligns one-to-one with \(z_m\), even with limited labels in settings such as NILM (Non-Intrusive Load Monitoring).

We build on weakly supervised contrastive learning, using an objective modified from Zbontar et al. (2021). This objective enforces two core components:

  • Latent-Invariant: Minimizes overlap between \(z_m\) and its negatives \(z_m^{-}\), reducing redundant information.
  • Latent-Alignment: Encourages similarity between \(z_m\) and its augmented view \(z_m^{+}\), ensuring that variations align with changes in the ground-truth attributes.

We empirically relax the independence assumptions (Assumption 4.1) by enforcing independence of support (IoS) within mini-batches, ensuring that augmented latents \(Z_{:,m}^+\) stay close to their originals \(Z_{:,m}\) and far from negatives \(Z_{:,m}^-\), without requiring support over the full Cartesian product.

The final objective combines these components: \[ \mathcal{L}_{\text{DIoSC}} = \underbrace{\eta \sum_m \sum_{V} D(z_m, z_m^{-})^2}_{\text{Latent-Invariant}} + \underbrace{\sum_m \sum_{U} \left(1 - D(z_m, z_m^{+})\right)^2}_{\text{Latent-Alignment}}, \] where \(D(\cdot, \cdot)\) denotes cosine similarity, \(V\) and \(U\) index the negative and positive pairs in the mini-batch, and we set \(\eta = 1\).
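For concreteness, here is a minimal PyTorch sketch of this objective over a mini-batch; the tensor layout (batch, M factors, d dims per factor) and the way positives and negatives are paired are illustrative assumptions, not the released implementation.

```python
import torch
import torch.nn.functional as F

def diosc_loss(Z: torch.Tensor, Z_pos: torch.Tensor, Z_neg: torch.Tensor,
               eta: float = 1.0) -> torch.Tensor:
    """Contrastive independence-of-support objective (sketch).

    Z, Z_pos, Z_neg: (batch, M, d) per-factor latents for the anchor,
    its augmented view, and a negative view, respectively.
    """
    # D(., .) is cosine similarity, computed factor-wise over the last dimension.
    sim_pos = F.cosine_similarity(Z, Z_pos, dim=-1)   # (batch, M)
    sim_neg = F.cosine_similarity(Z, Z_neg, dim=-1)   # (batch, M)
    # Latent-Invariant: push each z_m toward zero similarity with its negatives.
    invariance = (sim_neg ** 2).sum(dim=1).mean()
    # Latent-Alignment: pull each z_m toward its augmented view z_m^+.
    alignment = ((1.0 - sim_pos) ** 2).sum(dim=1).mean()
    # Independence of support is enforced only within the mini-batch:
    # positives stay close to their anchors and negatives are repelled,
    # without requiring the full Cartesian-product support.
    return eta * invariance + alignment
```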

Dataset

We conducted experiments on three public datasets:

  • UK-DALE (Kelly & Knottenbelt, 2015)
  • REDD (Kolter & Johnson, 2011)
  • REFIT (Murray et al., 2017)

These datasets provide power measurements from multiple homes.

We focus on five appliances:

  • Washing Machine
  • Oven
  • Dishwasher
  • Clothes Dryer
  • Fridge

We performed cross-tests on different dataset scenarios, each with varying sample sizes:

  • Scenario A: training on REFIT, testing on UK-DALE — 18.3k training samples, 3.5k test samples, time window T = 256, frequency 60 Hz
  • Scenario B: training on UK-DALE, testing on REFIT — 13.3k samples
  • Scenario C: training on REFIT, testing on REDD — 9.3k samples

The same augmentation pipeline is applied in all scenarios. For training and testing under correlated attributes, we use the corresponding sampling scheme.
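As an illustration of the preprocessing implied by Scenario A (time window T = 256), a minimal NumPy sketch of the windowing step is shown below; the stride and per-window standardization are assumed choices, not necessarily the paper's exact pipeline.

```python
import numpy as np

def make_windows(signal: np.ndarray, T: int = 256, stride: int = 128) -> np.ndarray:
    """Slice a 1-D power signal into overlapping windows of length T.

    Assumes len(signal) >= T; returns an array of shape (n_windows, T).
    """
    n = (len(signal) - T) // stride + 1
    windows = np.stack([signal[i * stride : i * stride + T] for i in range(n)])
    # Per-window standardization (an illustrative choice).
    mean = windows.mean(axis=1, keepdims=True)
    std = windows.std(axis=1, keepdims=True) + 1e-8
    return (windows - mean) / std
```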

Acknowledgements & Funding

This work was granted access to the HPC resources of IDRIS under allocation AD011014921 made by GENCI (Grand Equipement National de Calcul Intensif). Part of this work was funded by the TotalEnergies Individual Fellowship through OneTech.

BibTeX

If you find our analysis helpful, please cite our paper:

@inproceedings{oublal2024disentangling,
  author    = {Oublal, Khalid and Ladjal, Sa{\"i}d and Benhaiem, David and Le Borgne, Emmanuel and Roueff, Fran{\c{c}}ois},
  title     = {Disentangling Time Series Representations via Contrastive Independence-of-Support on l-Variational Inference},
  booktitle = {The Twelfth International Conference on Learning Representations},
  year      = {2024},
  url       = {https://openreview.net/forum?id=iI7hZSczxE}
}