Magdalena Balazinska
- CSE 584
- 206-616-1069
- magda '@' cs.washington.edu
- Data Science, Data Management & Visualization
- Areas of interest: Database management systems, AI and data management, multimodal data management, cloud computing, big-data analytics, image and video analytics.
Research
My interests are broadly in the fields of databases and distributed systems. My current work focuses on data management for multimodal data (video, images, text, and relational) as well as interactions between AI and data management. My past projects included work on cloud computing, big data processing, stream processing, and much more.
Current Projects:
- KathDB: Data management for multimodal data. We are building a new data management system and developing techniques that leverage LLMs and other approaches to manage multimodal data, which includes videos, images, text, and relational data.
- Data management and AI (website forthcoming): We are redesigning data management systems for the new world of AI agents and AI methods.
- VOCAL: Data management for video data. We are building new data management systems and techniques for video data.
Past Projects:
- VisualWorldDB (scroll down past VOCAL): Video data storage, benchmarking, and AR/VR video data management.
- Several small projects on data management and ML including NeuralArtifactDB, CENTS, Querying DNN weights, DeepQuery.
- Mosaic: Open world data management and analytics system.
- Myria: Big data management as a cloud service. In this project, we are building a new big data management system as a cloud service and are studying the various associated technical challenges.
- Data Eco$y$tem: Data Management, Data Policies, and Pricing in the Cloud.
- CQMS: Collaborative query management.
- Nuage: Data management in the cloud (first project).
- SciDB: Array data management. The UW-local part of SciDB is described here.
- RFID Ecosystem: Experimenting with a pervasive RFID-based infrastructure.
- Lahar: Markovian Stream Processing.
- Moirae: Exploiting history in monitoring applications.
- PEEX: Probabilistic Event EXtractor for RFID data.
- FlowDB: Using relational databases in network forensic analysis.
- StreamClean: Cleaning sensor data.
- HomeViews: Helping home users organize and share their data.
- Distributed stream processing with Borealis and Medusa.
- Study of user mobility patterns and network utilization in a corporate WLAN.
- Twine: scalable resource discovery system for pervasive computing environments.
- Infranet: Internet censorship circumvention system.