The VisualWorld Video Data Management Project

Our ability to collect and reason about video data at scale is revolutionizing how we interact with the world. To explore these ideas, we introduce VisualWorld, a collection of video data management projects ongoing in the University of Washington database group. VisualWorld projects explore video data management from a number of perspectives, including new approaches to VR and AR (LightDB and Visual Cloud), low-level video data storage especially in the context of machine learning (TASM and VFS), and evaluation of the performance and scalability of video data management systems (Visual Road).


VisualWorldDB

VisualWorldDB is a vision and an initial architecture for a new type of database management system optimized for multi-video applications. VisualWorldDB ingests video data from many perspectives and makes it queryable as a single multidimensional visual object. It incorporates new techniques for optimizing, executing, and storing multi-perspective video data. Our preliminary results suggest that this approach allows for faster queries and lower storage costs, improving the state of the art for applications that operate over this type of video data.



LightDB

LightDB is a database management system (DBMS) designed to efficiently ingest, store, and deliver virtual reality (VR) content at scale. LightDB currently targets both live and prerecorded spherical panoramic (a.k.a. 360°) and light field VR videos. It persists content as a multidimensional field that includes both spatiotemporal and angular (i.e., orientation) dimensions. Content delivered through LightDB achieves higher throughput, requires less bandwidth, and scales to many concurrent connections.

Details Paper Code

Tile-Aware Storage Management (TASM)

The tile-aware storage manager (TASM) uses a feature in modern video codecs called "tiles" to enable spatial random access into encoded videos. TASM significantly improves the performance of machine and deep learning queries over videos when the workload is known, and can incrementally adapt the physical layout of videos to improve performance even when the workload is not known. Layouts picked by TASM speed up individual queries by an average of 51% and up to 94% while maintaining good quality.
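To illustrate why tiles enable spatial random access, the following minimal sketch computes which tiles of a uniform grid overlap a region of interest, such as an object's bounding box. It is not TASM's API or layout algorithm: the function names `tiles_for_box` and `decoded_fraction`, the uniform grid, and the pixel-coordinate convention are all assumptions made for illustration. Real codecs also constrain tile boundaries in ways this sketch ignores.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels; x1 and y1 are exclusive


def tiles_for_box(frame_w: int, frame_h: int, cols: int, rows: int,
                  box: Box) -> List[Tuple[int, int]]:
    """Return the (row, col) indices of uniform-grid tiles that overlap `box`.

    Hypothetical helper: assumes the frame is split into `rows` x `cols`
    equally sized tiles, which is a simplification of real tile layouts.
    """
    x0, y0, x1, y1 = box
    if not (0 <= x0 < x1 <= frame_w and 0 <= y0 < y1 <= frame_h):
        raise ValueError("box must be non-empty and lie inside the frame")

    tile_w = frame_w / cols
    tile_h = frame_h / rows

    # The last pixel covered by the box is (x1 - 1, y1 - 1).
    first_col, last_col = int(x0 // tile_w), int((x1 - 1) // tile_w)
    first_row, last_row = int(y0 // tile_h), int((y1 - 1) // tile_h)

    return [(r, c)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]


def decoded_fraction(cols: int, rows: int, tiles: List[Tuple[int, int]]) -> float:
    """Fraction of the frame's tiles that must be decoded to cover the box."""
    return len(tiles) / (cols * rows)


if __name__ == "__main__":
    # A 1920x1080 frame split into 2 rows and 4 columns of tiles.
    needed = tiles_for_box(1920, 1080, cols=4, rows=2, box=(500, 100, 900, 400))
    print(needed, decoded_fraction(4, 2, needed))
```

In this sketch, a query that only needs the pixels inside the box decodes the overlapping tiles and skips the rest of the frame. How well this works depends on how closely the tile layout matches the regions that queries request.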

Paper Code


VFS

VFS is a new file system designed to decouple high-level video operations such as machine learning and computer vision from the low-level details required to store and efficiently retrieve video data. Using VFS, users read and write video data as they would with an ordinary file system, and VFS transparently and automatically arranges the data on disk in an efficient, granular format, caches frequently retrieved regions in the most useful formats, and eliminates redundancies found in videos captured from multiple cameras with overlapping fields of view.

Visual Road

To accelerate innovation in video data management system (VDBMS) research, we designed and built Visual Road, a benchmark that evaluates the performance of these systems. Visual Road comes with a dataset generator and a suite of benchmark queries over cameras positioned within a simulated metropolitan environment. Visual Road's video data is automatically generated with a high degree of realism, and annotated using a modern simulation and visualization engine. This allows for VDBMS performance evaluation while scaling up the size of the input data.

Details Paper Code

Visual Cloud

Visual Cloud persists virtual reality (VR) content as a multidimensional array that utilizes both dense (e.g., space and time) and sparse (e.g., bit-rate) dimensions. It uses orientation prediction to reduce data transfer by degrading out-of-view portions of the video. Content delivered through Visual Cloud requires up to 60% less bandwidth than existing methods and scales to many concurrent connections.
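The idea of degrading out-of-view portions of a video can be sketched as follows. This is an illustrative example only, not Visual Cloud's implementation: it assumes a 360° video split into equal-width vertical tiles by yaw angle, a single predicted yaw for the viewer, and two hypothetical quality labels, `"high"` and `"low"`. The names `angular_distance` and `assign_quality` are invented for this sketch.

```python
from typing import List


def angular_distance(a: float, b: float) -> float:
    """Smallest absolute difference between two yaw angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def assign_quality(num_tiles: int, predicted_yaw: float,
                   fov: float = 90.0) -> List[str]:
    """Label each yaw tile "high" if it overlaps the predicted view, else "low".

    Hypothetical helper: tile i covers yaw angles
    [i * 360 / num_tiles, (i + 1) * 360 / num_tiles).
    """
    tile_width = 360.0 / num_tiles
    qualities = []
    for i in range(num_tiles):
        center = (i + 0.5) * tile_width
        # The tile overlaps the view when its center is closer to the
        # predicted yaw than half the field of view plus half a tile width.
        in_view = angular_distance(center, predicted_yaw) < (fov + tile_width) / 2
        qualities.append("high" if in_view else "low")
    return qualities


if __name__ == "__main__":
    # Eight 45-degree tiles, viewer predicted to look toward yaw 180.
    print(assign_quality(8, predicted_yaw=180.0))
```

Only the tiles labeled `"high"` would be sent at full quality in this sketch. The bandwidth saved in practice depends on the accuracy of the orientation prediction and on the bit rates chosen for each quality level, neither of which is modeled here.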



Maureen Daum

Related Publications


This work is supported by the NSF through grants CCF-1703051, IIS-1546083, CCF-1518703, and CNS-1563788; DARPA award FA8750-16-2-0032; DOE award DE-SC0016260; a Google Faculty Research Award; an award from the University of Washington Reality Lab; gifts from the Intel Science and Technology Center for Big Data, Intel Corporation, Adobe, Amazon, Facebook, Huawei, and Google; and by CRISP, one of six centers in JUMP, a Semiconductor Research Corporation (SRC) program sponsored by DARPA.