Thesis Topics and Hiwis

❇️ Last updated: 2025-06-15 13:16:02 CEST

Below are the topics we currently offer; the list is updated regularly. Every application requires a cover letter, your curriculum vitae, your latest grades, and proof of university enrollment.

Thesis/Internship/Hiwi I

🎓 Thesis Pitch: Learning Discrete Temporal Patterns for Time Series Forecasting

🔍 Background 

Traditional deep learning models for time series (e.g., LSTM, Transformer) often struggle with noisy, redundant, or high-dimensional input signals. Inspired by advances in sequence modeling, this project explores a novel intermediate representation to improve forecasting performance and interpretability.

💡 Core Idea 

The thesis investigates a two-stage approach where time series data are first discretized into a learned symbolic form, followed by a sequence model trained on this compact representation. This abstraction allows the model to focus on recurring temporal motifs rather than raw data.
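To make the two-stage idea concrete, here is a deliberately tiny sketch in plain Python. It is not the method the thesis will develop: fixed equal-frequency bins stand in for a learned discretization, and a bigram transition table stands in for the sequence model. All function names are hypothetical and chosen for illustration only.

```python
from bisect import bisect_right
from collections import Counter, defaultdict


def fit_bins(values, n_symbols):
    """Equal-frequency bin edges (a fixed stand-in for a learned codebook)."""
    ordered = sorted(values)
    return [ordered[len(ordered) * k // n_symbols] for k in range(1, n_symbols)]


def discretize(values, edges):
    """Stage 1: map each real value to an integer symbol."""
    return [bisect_right(edges, v) for v in values]


def fit_bigram(symbols):
    """Stage 2: count symbol-to-symbol transitions (a toy sequence model)."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(symbols, symbols[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, symbol):
    """Most frequent successor of `symbol`, or None if it was never seen."""
    if symbol not in counts:
        return None
    return counts[symbol].most_common(1)[0][0]


# A series that alternates between a low and a high regime.
series = [0.1, 0.9, 0.2, 1.1, 0.0, 1.0, 0.15, 0.95]
edges = fit_bins(series, n_symbols=2)
symbols = discretize(series, edges)
model = fit_bigram(symbols)
```

On this toy series the symbolic sequence alternates between the two symbols, so the bigram model predicts the high symbol after a low one and vice versa. A real pipeline would replace both stages with learned components and evaluate the forecast in the original value space.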

🔑 Why It’s Exciting

🧠 New representation: Extract and operate on high-level temporal units.
🎯 Modular & extensible: Encourages transfer learning and hybrid architectures.
🤖 Real-world impact: Applicable to scenarios with noise, missing data, or limited labels.
📊 Evaluation: Compare against existing state-of-the-art on standard forecasting benchmarks.

📚 Learning Outcomes

Implement unsupervised sequence compression techniques for time series.
Apply sequence models on symbolic or latent representations.
Conduct rigorous benchmarking and performance analysis.
Investigate interpretability and robustness in challenging environments.

🧪 Stretch Goals (Optional)

Study latent attention patterns and temporal abstraction.
Experiment with self-supervised objectives for time series.
Apply the model in domains such as energy, finance, or scientific sensor data.

Duration: 6 months

📩 How to Apply

Send a short email with your background and motivation to: nicholas.tanjerome[at]kit.edu

Thesis/Internship/Hiwi II

🎓 Thesis Pitch: Building a Scalable Causal Inference System for High-Dimensional Time Series

🔍 Background

In complex systems—such as smart buildings, climate sensors, or particle physics experiments like KATRIN—understanding which variables influence others over time is critical. This is the domain of causal inference. Traditional methods like Granger causality or PCMCI struggle to scale to hundreds or thousands of sensors, especially under noise, missing data, or time-lagged interactions. This thesis tackles the challenge of scaling causal inference to large sensor networks, combining ideas from graph learning, deep learning, and sparsity-aware modeling.

🎯 Core Objectives

The student will develop a scalable pipeline for causal discovery in multivariate time series, with a focus on:

✅ Dimensionality Reduction & Preprocessing
- Select top candidate variables using correlation, anomaly detection, or mutual information.
- Automatically isolate time windows where causal activity is high (e.g., transitions).
✅ Causal Model Core
- Apply scalable causal inference algorithms like:
  - Neural Relational Inference (NRI)
  - PCMCI+ or GVAR-based approaches
  - Transformer-based attention for causal structure learning
- Evaluate performance across synthetic and real-world datasets.
✅ Scalability Design
- Modular pipeline that can handle 100–1000+ time series.
- Techniques like distributed computation, memory-efficient batching, and pruning.
✅ Interactive Visualization (Optional)
- Use network graph tools or immersive UIs (e.g., WebGL, Unity) to display causal structures.
- Highlight actionable or uncertain causal relationships.
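As an illustration of the preprocessing step, the sketch below ranks sensors by their strongest absolute lagged Pearson correlation with a target series and keeps the top k. This is only one possible prefilter, assumed here for illustration. Correlation does not establish causation, so the output is a candidate list for a causal method such as those named above, not a causal graph. The function names and the `max_lag` parameter are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def max_lagged_corr(source, target, max_lag):
    """Largest |correlation| between source[t - lag] and target[t], lag = 1..max_lag."""
    best = 0.0
    for lag in range(1, max_lag + 1):
        best = max(best, abs(pearson(source[:-lag], target[lag:])))
    return best


def top_candidates(sensors, target_name, k, max_lag=3):
    """Names of the k sensors with the strongest lagged correlation to the target."""
    target = sensors[target_name]
    scores = {
        name: max_lagged_corr(series, target, max_lag)
        for name, series in sensors.items()
        if name != target_name
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


driver = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 2.0, 7.0, 1.0, 4.0]
sensors = {
    "driver": driver,
    "target": [0.0] + driver[:-1],  # the target copies the driver one step later
    "unrelated": [5.0, 5.0, 4.0, 5.0, 5.0, 4.0, 5.0, 5.0, 4.0, 5.0],
    "flat": [2.0] * 10,
}
selected = top_candidates(sensors, "target", k=1)
```

Here `selected` contains only the driver sensor, because the target is the driver delayed by one step. At the scale of hundreds or thousands of sensors, the same ranking would need a vectorized or distributed implementation.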

🧠 Learning Outcomes

Deep understanding of time-series causality methods and their assumptions.
Practical experience building efficient and modular ML systems.
Skills in evaluating both structural accuracy (causal graphs) and forecasting benefit.
Insight into scientific applications of causality in real experiments or smart environments.

📊 Evaluation Datasets

Synthetic benchmarks: e.g., VAR, simulated sensor networks, or physical diffusion models (I have a room temperature model that would fit nicely here).
Real-world data: e.g., electricity, KATRIN tritium sensors, temperature in building rooms, finance.
- Ultimately, we would like to deploy to all KATRIN datasets (but this is ambitious).
- Start with smaller, more controlled datasets, such as the KATRIN Tank Temperature Distributions.
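A synthetic VAR benchmark is useful because the true causal graph is known by construction. The following sketch simulates a first-order VAR process and reads the ground-truth lag-1 edges off the coefficient matrix. The coefficient values, noise level, and function names are assumptions made for this example, not a prescribed benchmark.

```python
import random


def simulate_var1(coeffs, n_steps, noise_std=0.1, seed=0):
    """Simulate x_t = A x_{t-1} + Gaussian noise, with A given as a list of rows."""
    rng = random.Random(seed)
    n_vars = len(coeffs)
    state = [0.0] * n_vars
    history = []
    for _ in range(n_steps):
        # The comprehension reads the previous state before it is reassigned.
        state = [
            sum(coeffs[i][j] * state[j] for j in range(n_vars))
            + rng.gauss(0.0, noise_std)
            for i in range(n_vars)
        ]
        history.append(state)
    return history


def true_edges(coeffs):
    """Ground-truth edges (j, i), meaning j drives i, for non-zero off-diagonal A[i][j]."""
    return {
        (j, i)
        for i, row in enumerate(coeffs)
        for j, a in enumerate(row)
        if a != 0 and i != j
    }


# A chain: variable 0 drives 1, and 1 drives 2; each variable also depends on itself.
A = [
    [0.5, 0.0, 0.0],
    [0.4, 0.5, 0.0],
    [0.0, 0.4, 0.5],
]
data = simulate_var1(A, n_steps=200)
edges = true_edges(A)  # the chain 0 -> 1 and 1 -> 2
```

A causal discovery method can then be scored by comparing its recovered edges against `edges`, for example with precision and recall over the edge set.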

💡 Stretch Goals

Integrate uncertainty estimation into causal graph output.
Implement a prototype for real-time causal monitoring (sliding window updates).
Interface with retrieval-augmented memory or graph databases for causal query answering.

Duration: 6 months

📩 How to Apply

Send a short email with your background and motivation to: nicholas.tanjerome[at]kit.edu

Thesis/Internship/Hiwi III

🎓 Thesis Pitch: Hybrid Quantum Anomaly Detection for Scientific Sensor Data using Qiskit and CUDA-Q

🔍 Background

Quantum computing is rapidly gaining traction as a transformative approach to accelerate machine learning tasks—especially in complex, high-dimensional, and noisy domains. In large-scale physics experiments like KATRIN, thousands of sensors continuously generate time-series data streams, where detecting anomalies (such as instabilities or faults) is mission-critical. Yet, classical methods often struggle to capture rare and nonlinear behaviors under such conditions.

💡 Core Idea

This project explores hybrid quantum-classical anomaly detection by designing Variational Quantum Circuits (VQCs) that learn to encode and identify rare patterns in time-series sensor data. These models will be implemented and tested using two state-of-the-art quantum software frameworks:

Qiskit (IBM) for its accessibility and educational ecosystem
CUDA-Q (NVIDIA) for its performance-focused, GPU-accelerated simulations

The goal is to benchmark these hybrid models against classical baselines like Isolation Forests and Autoencoders, and to assess their potential in terms of accuracy, scalability, and interpretability—on both synthetic and real-world datasets (e.g., from KATRIN).

🧪 What You’ll Work On

Depending on your interests and strengths, you can focus on one or more of the following areas:

🧠 Quantum Circuit Design: Build and optimize VQCs for anomaly detection tasks
🔗 Hybrid Integration: Combine quantum modules with classical preprocessing/postprocessing pipelines
⚙️ Framework Comparison: Analyze and compare Qiskit vs. CUDA-Q in terms of usability, speed, and extensibility
📊 Benchmarking: Evaluate on real or simulated sensor data, comparing with traditional ML models
🎓 (Optional): Create educational demos or teaching materials for hybrid ML/quantum workflows

🛠️ Requirements

Solid Python programming skills
Familiarity with machine learning concepts
Basic knowledge of quantum computing or motivation to learn (e.g., Qiskit tutorials)
Curiosity about scientific computing and hybrid AI systems

⏳ Duration & Collaboration

This thesis is designed for 6 months and encourages interdisciplinary collaboration across quantum computing, machine learning, and sensor-based scientific research.

📩 How to Apply

Send a short email with your background and motivation to: nicholas.tanjerome[at]kit.edu

Hiwi

🚀 Research Assistant / Thesis Opportunity

Help Extend a Published Forecasting Paper with Lightweight Innovations

📍 Institute for Data Processing and Electronics (IPE), KIT

We are looking for a motivated Master’s student to help extend a recently published paper on forecasting tritium source stability in a large-scale neutrino experiment. The original study benchmarked various deep learning models (e.g., LSTM, NHITS, TSMixer, Chronos-LLM) on real-world time series from the KATRIN experiment.

Now, we aim to refine the approach with lightweight yet impactful innovations that can form the basis of a follow-up publication.

💡 What You’ll Work On

You’ll help explore one (or more) of the following:

✅ Custom Loss Functions: Tailor loss functions to better capture transition dynamics or long-term equilibrium behavior.
✅ Hybrid Forecasting Models: Combine classical methods (e.g. Kalman filter, piecewise regression) with deep models.
✅ Forecasting with Uncertainty: Add Monte Carlo dropout, quantile loss, or ensemble methods for confidence estimation.
✅ Low-resource Adaptation: Explore model robustness under reduced data, noisy samples, or sensor dropout.
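Of the options above, the quantile loss is the easiest to show in a few lines. The sketch below implements the standard pinball loss for a single quantile level in plain Python. In practice it would be written as a PyTorch loss over tensors, and the function name here is only illustrative.

```python
def pinball_loss(y_true, y_pred, quantile):
    """Mean pinball (quantile) loss for one quantile level in (0, 1)."""
    if not 0.0 < quantile < 1.0:
        raise ValueError("quantile must lie strictly between 0 and 1")
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction (diff > 0) is weighted by `quantile`,
        # over-prediction (diff < 0) by `1 - quantile`.
        total += max(quantile * diff, (quantile - 1.0) * diff)
    return total / len(y_true)


y_true = [10.0, 14.0, 11.0]
y_pred = [11.0, 11.0, 11.0]
loss_high = pinball_loss(y_true, y_pred, quantile=0.9)
loss_low = pinball_loss(y_true, y_pred, quantile=0.1)
```

With these numbers the forecast under-predicts the second value by 3, so the loss at the 0.9 quantile is larger than at the 0.1 quantile. Training one output per quantile level gives a simple prediction interval without changing the model architecture.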

🛠️ What You’ll Gain

Experience with state-of-the-art time series forecasting frameworks
Hands-on with real scientific data from a high-profile physics experiment
Potential to co-author a peer-reviewed paper
Flexible scope: suitable for a thesis, HiWi position, or research project

🎓 What We Expect

Python & PyTorch skills
Understanding of machine learning and time series
Interest in scientific applications of AI
Independent and collaborative mindset

📩 How to Apply

Send a short email with your background and motivation to: nicholas.tanjerome[at]kit.edu

Use the subject line: Tritium Forecasting Extension – Student Interest

Publications