Thesis Topics and Hiwis
❇️ Last updated: 2025-06-15 13:16:02 CEST
Here are several topics that we currently offer (this list is updated regularly). For every application, we require a cover letter, your curriculum vitae, your latest grades, and proof of university enrollment.
Thesis/Internship/Hiwi I
🎓 Thesis Pitch: Learning Discrete Temporal Patterns for Time Series Forecasting
🔍 Background
Traditional deep learning models for time series (e.g., LSTM, Transformer) often struggle with noisy, redundant, or high-dimensional input signals. Inspired by advances in sequence modeling, this project explores a novel intermediate representation to improve forecasting performance and interpretability.
💡 Core Idea
The thesis investigates a two-stage approach where time series data are first discretized into a learned symbolic form, followed by a sequence model trained on this compact representation. This abstraction allows the model to focus on recurring temporal motifs rather than raw data.
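The two-stage idea can be illustrated with a small sketch. It is not the method the thesis will develop: the learned symbolic encoder is replaced here by fixed equal-frequency binning of window means, and the sequence model by a simple bigram (first-order transition) table. The function names `discretize`, `fit_bigram`, and `predict_next`, the window size, and the alphabet size are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def discretize(series, window=4, n_symbols=3):
    """Map a series to symbols: window means, then equal-frequency bins."""
    # Stage 1a: piecewise aggregation (mean of each non-overlapping window).
    means = [
        sum(series[i:i + window]) / window
        for i in range(0, len(series) - window + 1, window)
    ]
    # Stage 1b: bin edges taken from the empirical quantiles of the window means.
    ranked = sorted(means)
    edges = [ranked[len(ranked) * k // n_symbols] for k in range(1, n_symbols)]
    # A symbol is the number of edges that the window mean reaches or exceeds.
    return [sum(m >= e for e in edges) for m in means]


def fit_bigram(symbols):
    """Stage 2 stand-in: count symbol-to-symbol transitions."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(symbols, symbols[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, symbol):
    """Return the most frequent successor of `symbol`, or None if unseen."""
    if symbol not in counts:
        return None
    return counts[symbol].most_common(1)[0][0]


# Toy signal: a sine wave with a period of 32 samples, 8 periods long.
series = [math.sin(2 * math.pi * t / 32) for t in range(256)]
symbols = discretize(series, window=4, n_symbols=3)
model = fit_bigram(symbols)
next_symbol = predict_next(model, symbols[-1])
```

In the actual project, both stages would presumably be learned (for example a trained quantizer followed by an LSTM or Transformer over the symbol sequence); the sketch only shows how the two stages hand data to each other.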
🔑 Why It’s Exciting
- 🧠 New representation: Extract and operate on high-level temporal units.
- 🎯 Modular & extensible: Encourages transfer learning and hybrid architectures.
- 🤖 Real-world impact: Applicable to scenarios with noise, missing data, or limited labels.
- 📊 Evaluation: Compare against existing state-of-the-art on standard forecasting benchmarks.
📚 Learning Outcomes
- Implement unsupervised sequence compression techniques for time series.
- Apply sequence models on symbolic or latent representations.
- Conduct rigorous benchmarking and performance analysis.
- Investigate interpretability and robustness in challenging environments.
🧪 Stretch Goals (Optional)
- Study latent attention patterns and temporal abstraction.
- Experiment with self-supervised objectives for time series.
- Apply the model in domains such as energy, finance, or scientific sensor data.
Duration: 6 months
📩 How to Apply
Send a short email with your background and motivation to: nicholas.tanjerome[at]kit.edu
Thesis/Internship/Hiwi II
🎓 Thesis Pitch: Building a Scalable Causal Inference System for High-Dimensional Time Series
🔍 Background
In complex systems—such as smart buildings, climate sensors, or particle physics experiments like KATRIN—understanding which variables influence others over time is critical. This is the domain of causal inference. Traditional methods like Granger causality or PCMCI struggle to scale to hundreds or thousands of sensors, especially under noise, missing data, or time-lagged interactions. This thesis tackles the challenge of scaling causal inference to large sensor networks, combining ideas from graph learning, deep learning, and sparsity-aware modeling.
🎯 Core Objectives
The student will develop a scalable pipeline for causal discovery in multivariate time series, with a focus on:
- ✅ Dimensionality Reduction & Preprocessing
- Select top candidate variables using correlation, anomaly detection, or mutual information.
- Automatically isolate time windows where causal activity is high (e.g., transitions).
- ✅ Causal Model Core
- Apply scalable causal inference algorithms like:
- Neural Relational Inference (NRI)
- PCMCI+ or GVAR-based approaches
- Transformer-based attention for causal structure learning
- Evaluate performance across synthetic and real-world datasets.
- ✅ Scalability Design
- Modular pipeline that can handle 100–1000+ time series.
- Techniques like distributed computation, memory-efficient batching, and pruning.
- ✅ Interactive Visualization (Optional)
- Use network graph tools or immersive UIs (e.g., WebGL, Unity) to display causal structures.
- Highlight actionable or uncertain causal relationships.
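As a rough illustration of the preprocessing objective above, the following sketch ranks candidate sensors by their strongest lagged correlation with a target series. Correlation is only a cheap screen for shortlisting variables, not a causal test; the shortlisted variables would still go through a method such as PCMCI+ or NRI. All names (`pearson`, `best_lagged_corr`, `screen`) and the toy data are assumptions made for this example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def best_lagged_corr(source, target, max_lag):
    """Largest |corr(source[t - lag], target[t])| over lag = 1..max_lag."""
    scores = {
        lag: abs(pearson(source[:-lag], target[lag:]))
        for lag in range(1, max_lag + 1)
    }
    lag = max(scores, key=scores.get)
    return lag, scores[lag]


def screen(candidates, target, max_lag=5, top_k=2):
    """Rank candidate series by their best lagged correlation with target."""
    scored = []
    for name, series in candidates.items():
        lag, score = best_lagged_corr(series, target, max_lag)
        scored.append((name, lag, score))
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:top_k]


# Toy data: `target` follows sensor "a" with a delay of 3 steps plus noise.
rng = random.Random(0)
n = 500
a = [rng.gauss(0, 1) for _ in range(n)]
b = [rng.gauss(0, 1) for _ in range(n)]
c = [rng.gauss(0, 1) for _ in range(n)]
target = [0.0] * 3 + [a[t - 3] + 0.1 * rng.gauss(0, 1) for t in range(3, n)]

top = screen({"a": a, "b": b, "c": c}, target, max_lag=5, top_k=1)
```

Each entry of `top` is a `(name, lag, score)` tuple. For hundreds or thousands of sensors, the same screen would likely be vectorized or distributed rather than run in plain Python loops.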
🧠 Learning Outcomes
- Deep understanding of time-series causality methods and their assumptions.
- Practical experience building efficient and modular ML systems.
- Skills in evaluating both structural accuracy (causal graphs) and forecasting benefit.
- Insight into scientific applications of causality in real experiments or smart environments.
📊 Evaluation Datasets
- Synthetic benchmarks: e.g., VAR, simulated sensor networks, or physical diffusion models (an existing temperature-room model would fit well here).
- Real-world data: e.g., electricity, KATRIN tritium sensors, temperature in building rooms, finance.
- Ultimately, we would like to deploy to all KATRIN datasets, although this is ambitious.
- Start with smaller, more controlled datasets such as the KATRIN Tank Temperature Distributions.
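A synthetic VAR benchmark is useful because the true causal graph is known by construction. The sketch below simulates a small VAR(1) process and reads the ground-truth lag-1 edges off the coefficient matrix. The matrix `A`, the noise level, and the helper names `simulate_var1` and `true_edges` are hypothetical and chosen only for illustration.

```python
import random


def simulate_var1(coeffs, n_steps, noise_std=0.1, seed=0):
    """Simulate x[t] = A x[t-1] + noise for a small coefficient matrix A."""
    rng = random.Random(seed)
    dim = len(coeffs)
    x = [0.0] * dim
    history = []
    for _ in range(n_steps):
        x = [
            sum(coeffs[i][j] * x[j] for j in range(dim)) + rng.gauss(0, noise_std)
            for i in range(dim)
        ]
        history.append(x)
    return history


def true_edges(coeffs):
    """Ground-truth lag-1 edges (j -> i) wherever A[i][j] is non-zero, i != j."""
    dim = len(coeffs)
    return {
        (j, i)
        for i in range(dim)
        for j in range(dim)
        if i != j and coeffs[i][j] != 0
    }


# Hypothetical 3-variable system: variable 0 drives 1, and 1 drives 2.
A = [
    [0.5, 0.0, 0.0],
    [0.4, 0.5, 0.0],
    [0.0, 0.4, 0.5],
]
data = simulate_var1(A, n_steps=1000)
edges = true_edges(A)
```

A causal discovery method can then be scored by comparing its recovered edge set against `edges`, for example with precision and recall.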
💡 Stretch Goals
- Integrate uncertainty estimation into causal graph output.
- Implement a prototype for real-time causal monitoring (sliding window updates).
- Interface with retrieval-augmented memory or graph databases for causal query answering.
Duration: 6 months
📩 How to Apply
Send a short email with your background and motivation to: nicholas.tanjerome[at]kit.edu
Thesis/Internship/Hiwi III
🎓 Thesis Pitch: Hybrid Quantum Anomaly Detection for Scientific Sensor Data using Qiskit and CUDA-Q
🔍 Background
Quantum computing is rapidly gaining traction as a transformative approach to accelerate machine learning tasks—especially in complex, high-dimensional, and noisy domains. In large-scale physics experiments like KATRIN, thousands of sensors continuously generate time-series data streams, where detecting anomalies (such as instabilities or faults) is mission-critical. Yet, classical methods often struggle to capture rare and nonlinear behaviors under such conditions.
💡 Core Idea
This project explores hybrid quantum-classical anomaly detection by designing Variational Quantum Circuits (VQCs) that learn to encode and identify rare patterns in time-series sensor data. These models will be implemented and tested using two state-of-the-art quantum software frameworks:
- Qiskit (IBM) for its accessibility and educational ecosystem
- CUDA-Q (NVIDIA) for its performance-focused, GPU-accelerated simulations
The goal is to benchmark these hybrid models against classical baselines like Isolation Forests and Autoencoders, and to assess their potential in terms of accuracy, scalability, and interpretability—on both synthetic and real-world datasets (e.g., from KATRIN).
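To give a feel for what a variational circuit does, here is a deliberately tiny sketch simulated in plain Python. It uses neither Qiskit nor CUDA-Q and is not the architecture the thesis will use. It assumes a single qubit, angle encoding of one feature with an RY rotation, one trainable RY rotation, and an anomaly score equal to the probability of measuring |1>. The gradient uses the parameter-shift rule; all function names and the toy data are hypothetical.

```python
import math


def ry(angle, state):
    """Apply a single-qubit RY(angle) rotation to a real 2-amplitude state."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    a0, a1 = state
    return (c * a0 - s * a1, s * a0 + c * a1)


def expect_z(x, theta):
    """<Z> after encoding feature x and applying the trainable rotation theta."""
    state = ry(theta, ry(x, (1.0, 0.0)))
    return state[0] ** 2 - state[1] ** 2


def anomaly_score(x, theta):
    """Probability of measuring |1>: low for inputs the circuit maps back to |0>."""
    return (1.0 - expect_z(x, theta)) / 2.0


def train(normal_data, theta=0.0, lr=0.5, epochs=100):
    """Gradient descent on the mean anomaly score of normal data."""
    for _ in range(epochs):
        grad = 0.0
        for x in normal_data:
            # Parameter-shift rule for d<Z>/d(theta).
            shifted_up = expect_z(x, theta + math.pi / 2)
            shifted_down = expect_z(x, theta - math.pi / 2)
            dz = (shifted_up - shifted_down) / 2
            # The score is (1 - <Z>) / 2, so its gradient is -dz / 2.
            grad += -0.5 * dz
        theta -= lr * grad / len(normal_data)
    return theta


# Hypothetical features, already scaled to rotation angles in radians.
normal = [0.9, 1.0, 1.1, 0.95, 1.05]
theta = train(normal)
score_normal = anomaly_score(1.0, theta)
score_outlier = anomaly_score(2.5, theta)
```

After training, inputs close to the normal data receive a score near zero, while an input far from it scores noticeably higher. A realistic model would use several qubits, entangling gates, and one of the two frameworks named above.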
🧪 What You’ll Work On
Depending on your interests and strengths, you can focus on one or more of the following areas:
- 🧠 Quantum Circuit Design: Build and optimize VQCs for anomaly detection tasks
- 🔗 Hybrid Integration: Combine quantum modules with classical preprocessing/postprocessing pipelines
- ⚙️ Framework Comparison: Analyze and compare Qiskit vs. CUDA-Q in terms of usability, speed, and extensibility
- 📊 Benchmarking: Evaluate on real or simulated sensor data, comparing with traditional ML models
- 🎓 (Optional): Create educational demos or teaching materials for hybrid ML/quantum workflows
🛠️ Requirements
- Solid Python programming skills
- Familiarity with machine learning concepts
- Basic knowledge of quantum computing or motivation to learn (e.g., Qiskit tutorials)
- Curiosity about scientific computing and hybrid AI systems
⏳ Duration & Collaboration
This thesis is designed for 6 months and encourages interdisciplinary collaboration across quantum computing, machine learning, and sensor-based scientific research.
📩 How to Apply
Send a short email with your background and motivation to: nicholas.tanjerome[at]kit.edu
Hiwi
🚀 Research Assistant / Thesis Opportunity
Help Extend a Published Forecasting Paper with Lightweight Innovations
📍 Institute for Data Processing and Electronics (IPE), KIT
We are looking for a motivated Master’s student to help extend a recently published paper on forecasting tritium source stability in a large-scale neutrino experiment. The original study benchmarked various deep learning models (e.g., LSTM, NHITS, TSMixer, Chronos-LLM) on real-world time series from the KATRIN experiment.
Now, we aim to refine the approach with lightweight yet impactful innovations that can form the basis of a follow-up publication.
💡 What You’ll Work On
You’ll help explore one (or more) of the following:
- ✅ Custom Loss Functions: Tailor loss functions to better capture transition dynamics or long-term equilibrium behavior.
- ✅ Hybrid Forecasting Models: Combine classical methods (e.g. Kalman filter, piecewise regression) with deep models.
- ✅ Forecasting with Uncertainty: Add Monte Carlo dropout, quantile loss, or ensemble methods for confidence estimation.
- ✅ Low-resource Adaptation: Explore model robustness under reduced data, noisy samples, or sensor dropout.
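As one concrete example from the list above, the quantile (pinball) loss can be written in a few lines. This is a plain-Python sketch for clarity, with a hypothetical function name and toy numbers; in a PyTorch training loop the same formula would be applied to tensors.

```python
def pinball_loss(y_true, y_pred, quantile):
    """Mean pinball (quantile) loss for a single quantile level in (0, 1)."""
    if not 0.0 < quantile < 1.0:
        raise ValueError("quantile must lie strictly between 0 and 1")
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by `quantile`, over-prediction by 1 - quantile.
        total += quantile * diff if diff >= 0 else (quantile - 1.0) * diff
    return total / len(y_true)


# Hypothetical targets and forecasts for the 0.9 quantile.
y_true = [1.0, 2.0, 3.0, 4.0]
too_low = [0.5, 1.5, 2.5, 3.5]    # every forecast under-predicts by 0.5
too_high = [1.5, 2.5, 3.5, 4.5]   # every forecast over-predicts by 0.5

loss_low = pinball_loss(y_true, too_low, quantile=0.9)
loss_high = pinball_loss(y_true, too_high, quantile=0.9)
```

For the 0.9 quantile, under-prediction is penalized nine times as heavily as over-prediction, which pushes the forecast toward an upper bound. Training one output per quantile level (say 0.1, 0.5, 0.9) yields a simple prediction interval.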
🛠️ What You’ll Gain
- Experience with state-of-the-art time series forecasting frameworks
- Hands-on work with real scientific data from a high-profile physics experiment
- Potential to co-author a peer-reviewed paper
- Flexible scope: suitable for a thesis, HiWi position, or research project
🎓 What We Expect
- Python & PyTorch skills
- Understanding of machine learning and time series
- Interest in scientific applications of AI
- Independent and collaborative mindset
📩 How to Apply
Send a short email with your background and motivation to: nicholas.tanjerome[at]kit.edu
Subject: Tritium Forecasting Extension – Student Interest