KathDB

An Explainable Multimodal Database System with Human-AI Collaboration

About KathDB

Modern data-intensive applications increasingly rely on multimodal data such as text, images, and videos, yet traditional database systems are limited to structured tables, while recent AI-powered systems often sacrifice explainability and semantic guarantees.

KathDB is an explainable multimodal database management system that bridges this gap. It combines the relational model and cost-based query optimization of traditional DBMSs with the reasoning capabilities of foundation models, enabling users to query multimodal data using natural language while retaining structured semantics.

KathDB introduces a unified relational semantic layer over text, images, and video, together with a function-as-operator execution model that compiles queries into modular, versioned functions. This design enables fine-grained lineage tracking, cost-based optimization, and rich explanations that trace query results back to their underlying data and transformations.
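The function-as-operator idea can be illustrated with a minimal sketch: a query becomes a pipeline of named, versioned functions, and each execution step is recorded so results can be traced back through the operators that produced them. All names here (`Operator`, `run_pipeline`, the example operators) are illustrative assumptions, not KathDB's actual API:

```python
from dataclasses import dataclass, field
from typing import Any, Callable

# Hypothetical sketch only -- not KathDB's real interface.
@dataclass
class Operator:
    name: str                          # operator identifier
    version: str                       # version tag for reproducibility
    fn: Callable[[list], list]         # the function implementing the operator

@dataclass
class Result:
    rows: list
    lineage: list = field(default_factory=list)  # (name, version) per applied step

def run_pipeline(rows: list, operators: list) -> Result:
    """Apply each operator in order, recording lineage as we go."""
    result = Result(rows=rows)
    for op in operators:
        result.rows = op.fn(result.rows)
        result.lineage.append((op.name, op.version))
    return result

# Example: a filter followed by a projection over simple dict "rows".
ops = [
    Operator("filter_cats", "v1", lambda rs: [r for r in rs if r["label"] == "cat"]),
    Operator("project_id", "v1", lambda rs: [{"id": r["id"]} for r in rs]),
]
data = [{"id": 1, "label": "cat"}, {"id": 2, "label": "dog"}]
out = run_pipeline(data, ops)
# out.rows    -> [{"id": 1}]
# out.lineage -> [("filter_cats", "v1"), ("project_id", "v1")]
```

Because every step is a versioned function, an explanation channel can replay or inspect exactly which operators, at which versions, transformed which inputs into the final answer.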

Unlike black-box LLM-based systems, KathDB keeps users in the loop through interactive clarification, debugging, and explanation channels, allowing users to iteratively refine queries and understand results across modalities.

KathDB: Explainable Multimodal Database Management System with Human-AI Collaboration. Guorui Xiao, Enhao Zhang, Nicole Sullivan, Will Hansen, Magdalena Balazinska. CIDR, 2026. arXiv preprint.

People

Questions?

Please contact Guorui Xiao.

Acknowledgments

This work was supported in part by the National Science Foundation through award 2211133.