Monitoring applications enable users to continuously observe the current state of a system and receive alerts when interesting events occur. For example, an administrator can monitor a cluster of computers, a computer network, the car traffic in some area, etc. In many situations, historical information about current events may help users address ongoing or imminent problems in the monitored system. However, providing timely historical information for real-time events is challenging because of the large volume of historical data.
In this project, we are building a new type of continuous monitoring system called Moirae. The goal of Moirae is to complement newly detected events with useful historical information in near real time. To achieve this goal, Moirae allows users to describe what constitutes the interesting context of an event. The system then delivers, for each new event, a set of k results derived from the recent events that are most similar in terms of the given context.
In the Moirae project, we are addressing the following challenges related to history-enhanced monitoring.
When a new event occurs in a monitored system, a large fraction of the relevant historical information corresponds to those times in the past when the state of the system was the same as, or similar to, the state at the time of the event. The first challenge is thus to efficiently compare the state of a monitored system at different points in time. Of course, we want to compare only those parts of the state that are relevant to the current event (e.g., the list of logged-in users and the list of running processes). We call this part of the state the context of the event.
Because the historical log is large, the complete set of past events related to a current event can be large. To avoid overwhelming the user, the monitoring system may present only a small set of the k most similar events and their own contexts. These types of queries are often called k-NN (k-nearest-neighbor) queries. However, unlike previous systems, which define similarity over individual tuples or objects, we expand the notion of similarity to a set of tuples that constitute an event context.
Naive techniques that materialize all past events and their contexts, then scan these events at runtime to report the most similar ones, do not work well because of the huge volume of historical data. Users also want the system to respond in real time so that they do not miss any useful past information when an event occurs. The goal of Moirae is thus to examine only small parts of the history yet still return relevant and useful past information.
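To make the idea of similarity over event contexts concrete, here is a minimal Python sketch. It is not Moirae's actual method: it assumes a context can be modeled as a set of (attribute, value) tuples and uses Jaccard similarity as a stand-in metric. The names `Context`, `jaccard`, and `top_k_similar` are hypothetical, and the linear scan is only meant to illustrate the query semantics, not an efficient execution strategy.

```python
import heapq
from typing import FrozenSet, List, Tuple

# Assumption: a context is a set of (attribute, value) tuples,
# e.g. ("user", "alice") or ("process", "sshd").
Context = FrozenSet[Tuple[str, str]]


def jaccard(a: Context, b: Context) -> float:
    """Jaccard similarity between two contexts (an assumed metric)."""
    if not a and not b:
        return 1.0  # two empty contexts are treated as identical
    return len(a & b) / len(a | b)


def top_k_similar(
    current: Context, history: List[Tuple[str, Context]], k: int
) -> List[Tuple[float, str]]:
    """Return (similarity, event_id) pairs for the k most similar past events."""
    scored = ((jaccard(current, ctx), event_id) for event_id, ctx in history)
    return heapq.nlargest(k, scored)


# Hypothetical data: the context of a new event and three archived events.
current = frozenset({("user", "alice"), ("process", "sshd"), ("process", "httpd")})
history = [
    ("e1", frozenset({("user", "alice"), ("process", "sshd")})),
    ("e2", frozenset({("user", "bob"), ("process", "cron")})),
    ("e3", frozenset({("user", "alice"), ("process", "sshd"), ("process", "httpd")})),
]

result = top_k_similar(current, history, k=2)
for score, event_id in result:
    print(f"{event_id}: {score:.2f}")
```

With this data, the event whose context matches exactly ranks first, followed by the event that shares two of the three context tuples. Any set-based similarity measure could be substituted for `jaccard` without changing the rest of the sketch.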
The system must work well in the presence of multiple concurrent events. Because continuous queries can produce different events at different or varying rates, the system is likely to have multiple concurrent historical queries. These queries compete for shared resources, and the system should be able to schedule these queries and allocate resources properly.
Moirae is a framework which tightly integrates a stream processing engine (e.g., Borealis), for continuous monitoring, and an RDBMS, for archiving historical information.
The main insight behind the design of Moirae is that users will be more interested in receiving a few relevant results soon after each new event (especially if these events are recent), rather than a complete set of results or the best results with higher latency. We thus proposed a system architecture based on hierarchical log partitioning and hierarchical query execution, where the recent past is stored at a higher cost, but can be queried faster than older history.
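The following Python sketch illustrates one way that querying the recent past before older history can deliver early results. It is an illustration under stated assumptions, not Moirae's implementation: the log is assumed to be split into a list of partitions ordered from newest to oldest, contexts are modeled as sets, and Jaccard similarity is an assumed metric. The names `similarity` and `hierarchical_query` are hypothetical.

```python
from typing import FrozenSet, Iterator, List, Sequence, Tuple

Event = Tuple[str, FrozenSet[str]]  # (event_id, context), an assumed layout


def similarity(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard similarity between two contexts (an assumed metric)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def hierarchical_query(
    current: FrozenSet[str], partitions: Sequence[List[Event]], k: int
) -> Iterator[List[Tuple[float, str]]]:
    """Yield the running top-k after each partition, newest partition first.

    The caller sees a few results as soon as the most recent partition
    has been scanned, and refined results as older partitions are read.
    """
    best: List[Tuple[float, str]] = []
    for partition in partitions:  # partitions[0] holds the most recent events
        for event_id, ctx in partition:
            best.append((similarity(current, ctx), event_id))
        best = sorted(best, reverse=True)[:k]  # keep only the k best so far
        yield list(best)


# Hypothetical log: one recent partition and one older partition.
current = frozenset({"a", "b", "c"})
partitions = [
    [("recent1", frozenset({"a"})), ("recent2", frozenset({"a", "b"}))],
    [("old1", frozenset({"a", "b", "c"}))],
]

steps = list(hierarchical_query(current, partitions, k=2))
for i, step in enumerate(steps):
    print(f"after partition {i}: {[event_id for _, event_id in step]}")
```

In this example the first answer is available after reading only the recent partition, and it is later refined when the older partition turns out to contain a closer match. A consumer that only needs quick results can stop iterating early.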
The Moirae project is partially supported by NSF grant IIS-0713123, NSF CRI grant CNS-0454425, a gift from Cisco Systems Inc., a Mitre contract, and Balazinska's Microsoft Research New Faculty Fellowship.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Last modified on 2008-09-08 by yongchul.