A Flexible Data-Centric Approach for Modeling and Analyzing Hyper Connected Megacities
K. Selçuk Candan, Shade T. Shutters and Christian Fortunato
The hyper-connected megacity epitomizes the multi-domain environment. As global centers of trade, communications, and migration, megacities, or more generally Dense Urban Areas (DUAs), are broadly recognized as the likely epicenters of future events necessitating military intervention. The often picturesque skyline of the megacity belies the interconnected complexity of the systems and people that function within the city itself. DUAs extend vertically from outer space, where the satellites that enable navigation and communications reside, down through urban canyons to the underground where subways operate, and horizontally across the city from the waterways that support trade, to distinct cultural centers, to the hinterlands that sustain the city. Urban systems crisscross hundreds of kilometers of sprawl and are further linked to systems that span land, oceans, space, and airways. The interconnected nature of these systems means that the second- and third-order effects of a change in one system can have drastic, unintended consequences for the system as a whole. In multi-domain operations, the breadth, scale, and scope of this interconnected complexity can have a significant impact on a commander's decision-making process.
Military discussions of urban operations often conjure images of quagmires of urban combat. Though this historical perspective is arguably accurate, such images frame operations in purely kinetic terms. Future military operations in urban areas will be defined by data. Understanding data in terms of the urban operational environment, and acting upon that data seamlessly across military domains, enables exploitation of the situation and the Operating Environment. To enable this capability, a flexible, data-centric approach for modeling and analyzing hyper-connected megacities is needed.
Commanders, analysts, and operators need a capability that provides a meaningful level of understanding to enable decision making, shape the situation, and exploit the Operating Environment. Though capabilities must be developed to support situational understanding and decision making for DUAs, several interrelated problems inhibit the effective development of a multi-model monitoring or decision support capability for DUAs:
- Dense Urban Areas, though broadly similar to one another, each require a multi-model approach.
- Given the wide variety of available models, identifying the appropriate model(s) to answer a specific question is a daunting task.
- Integration of multiple models is hampered by inconsistent measures between models: spatial resolution, temporal scale, units of measure, data requirements, and variables (see the sketch following this list).
- Data for use by the modeling capability must be identified, extracted from its native format, aligned with the appropriate dictionary, and cleaned.
- Current visualization methods do not reflect the multidimensional nature of DUAs, nor do they adequately enable an operator to understand second- and third-order effects.
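To make the measurement-inconsistency problem concrete, the following minimal Python sketch shows how mismatches in spatial resolution, temporal scale, and units between two models might be detected before integration. The `ModelMetadata` class, its fields, and the example models are hypothetical illustrations, not part of any existing framework.

```python
from dataclasses import dataclass

@dataclass
class ModelMetadata:
    """Hypothetical descriptor of a model's measurement conventions."""
    name: str
    spatial_resolution_m: float   # meters per grid cell
    temporal_scale_s: float       # seconds per simulation time step
    units: dict                   # variable name -> unit string

def integration_conflicts(producer: ModelMetadata, consumer: ModelMetadata) -> list:
    """List the mismatches that would need resampling or unit conversion
    before the producer's output can feed the consumer."""
    conflicts = []
    if producer.spatial_resolution_m != consumer.spatial_resolution_m:
        conflicts.append("spatial resolution: "
                         f"{producer.spatial_resolution_m} m vs {consumer.spatial_resolution_m} m")
    if producer.temporal_scale_s != consumer.temporal_scale_s:
        conflicts.append("temporal scale: "
                         f"{producer.temporal_scale_s} s vs {consumer.temporal_scale_s} s")
    for var, unit in producer.units.items():
        if var in consumer.units and consumer.units[var] != unit:
            conflicts.append(f"unit mismatch for '{var}': {unit} vs {consumer.units[var]}")
    return conflicts

# Example: a traffic-flow model feeding an evacuation model (values are invented)
traffic = ModelMetadata("traffic_flow", 100.0, 60.0, {"flow": "vehicles/hour"})
evacuation = ModelMetadata("evacuation", 500.0, 300.0, {"flow": "vehicles/minute"})
print(integration_conflicts(traffic, evacuation))
```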
To address these issues and to effectively understand the interconnected complexity of Dense Urban Areas, an open, adaptive framework of integrated modeling capabilities is needed. The framework should enable users to query it to select the best models for a specific operational question, create an instance of a modeling system for that question, tie models to the appropriate datasets, provide operational norms for the factors used in the models, and create an aggregated view of the data, interpreted within the models, for the operator using the visualization framework.
Figure 1. Conceptual Modeling framework depicting (A) the operational question, (B) an instantiation of the modeling framework, (C) the appropriate data for the model sets, and (D) the visualized data.
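The following interface-only Python sketch illustrates how the A-through-D flow of Figure 1 might be wired together in software. Every class, method, and attribute name here is a hypothetical placeholder chosen for illustration; an actual framework would define its own interfaces.

```python
# Interface-only sketch of the Figure 1 flow; all names are hypothetical placeholders.

class ModelingFramework:
    def __init__(self, model_registry, data_catalog):
        self.model_registry = model_registry   # models described by the common ontology
        self.data_catalog = data_catalog       # discoverable data sources

    def select_models(self, question):
        """(A) -> (B): match an operational question against model descriptions."""
        return [m for m in self.model_registry if m.answers(question)]

    def instantiate(self, models, question):
        """(B) -> (C): bind each selected model to the datasets it requires."""
        return [(m, self.data_catalog.resolve(m.data_requirements, question.region))
                for m in models]

    def run_and_visualize(self, bound_models):
        """(C) -> (D): execute the models and aggregate outputs for a dashboard."""
        results = [model.run(data) for model, data in bound_models]
        return {"panels": [r.summary() for r in results]}
```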
A cornerstone of developing an integrated DUA modeling capability is the use of a high-level environment, or framework, that links together sophisticated computational models so that the output of one sub-model or process can provide input to another. Structuring models in a single framework in this way creates an adaptive framework capable of evolving to add greater capabilities over time. An open-source framework enables the modeling and simulation community to develop models and analytical capabilities specifically designed for integration into the framework. The framework should enable workflows that automatically process and aggregate numerical and graphical outputs into web-based graphical dashboards. Using workflows in this manner allows the models to run in high-speed computing environments while the interface to the models can function on browser-enabled, bandwidth-limited devices.
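As a minimal illustration of such a workflow, the sketch below chains two placeholder model steps so that the output of one feeds the next, then serializes the aggregated results into a compact JSON payload that a web-based dashboard could fetch. The step functions and values are invented for the example.

```python
import json

def population_density(region):
    # placeholder model step; returns an invented result for illustration
    return {"region": region, "density_per_km2": 12000}

def mobility_model(density):
    # placeholder model step consuming the previous step's output
    return {"daily_trips": density["density_per_km2"] * 1.8}

def run_workflow(region, steps):
    """Run each step on the previous step's output and collect the results."""
    output, results = region, []
    for step in steps:
        output = step(output)
        results.append({"step": step.__name__, "output": output})
    return results

payload = json.dumps(run_workflow("Jakarta", [population_density, mobility_model]))
print(payload)  # lightweight payload a browser-based dashboard could fetch
```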
In conjunction with the framework, a common modeling ontology is required. An ontology establishes a common language to describe and evaluate models. This allows a homogeneous description of models coming from different communities, using an established and defined set of criteria. An example ontology is shown in the following graphic (Figure 2).
Figure 2. An ontology used to describe, evaluate, and compare computational models.
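Because the exact criteria of Figure 2 are not reproduced here, the sketch below assumes a plausible set of descriptor fields (model type, required inputs, outputs, spatial scope, temporal window) to show how ontology entries for models might be encoded; the class name, fields, and catalog entries are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class ModelDescriptor:
    """Illustrative ontology entry for a computational model.
    The fields are assumed criteria; the actual ontology of Figure 2 may differ."""
    name: str
    model_type: str            # e.g. "social network analysis", "agent-based simulation"
    inputs: set                # required data types, e.g. {"tweets"}
    outputs: set               # e.g. {"influence ranking", "network graph"}
    spatial_scope: str         # e.g. "city", "region", "global"
    temporal_window: str       # e.g. "days", "months", "years"
    languages: set = field(default_factory=set)

CATALOG = [
    ModelDescriptor("tweet_influence", "social network analysis",
                    {"tweets"}, {"influence ranking", "network graph"},
                    "city", "months", {"id", "en"}),
    ModelDescriptor("traffic_sim", "agent-based simulation",
                    {"road network", "census"}, {"congestion forecast"},
                    "city", "days"),
]
```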
Most importantly, an ontology enables an operator to evaluate, compare, or contrast models for a specific purpose. When fully integrated into a modeling framework, the ontology supports structured, operator-defined queries that configure the modeling framework. For example, given a question such as “Which models can be used to discover important figures in a religious organization, given their Tweets in the last 6 months in Jakarta, and help describe their network?”, the information in the framework enables the identification and comparison of candidate models across their critical dimensions, depicted as green circles in the following graphic (Figure 3).
Figure 3. The application of the modeling ontology to the specific query, “Which models can be used to discover important figures in a religious organization given their Tweets in the last 6 months in Jakarta and help describe their network?”
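Continuing the hypothetical `ModelDescriptor` catalog sketched above, the example question can be expressed as a structured query and matched against the ontology's dimensions with a simple filter; the query fields and matching rule below are illustrative assumptions rather than part of any existing system.

```python
# The example question, expressed as a structured query over the assumed ontology fields.
query = {
    "inputs": {"tweets"},                                # "given their Tweets"
    "outputs": {"influence ranking", "network graph"},   # important figures + their network
    "spatial_scope": "city",                             # Jakarta
    "temporal_window": "months",                         # the last 6 months
}

def matches(descriptor, query):
    """Naive matcher over the ontology's critical dimensions."""
    return (query["inputs"] <= descriptor.inputs
            and query["outputs"] <= descriptor.outputs
            and descriptor.spatial_scope == query["spatial_scope"]
            and descriptor.temporal_window == query["temporal_window"])

candidates = [d.name for d in CATALOG if matches(d, query)]
print(candidates)  # -> ['tweet_influence']
```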
Further, when an ontology is used in this manner as part of a modeling framework, an instantiation of a modeling system can be created to specifically address the question being asked. Essentially, the ontology and framework interact to assemble the most appropriate set of models for a specific question.
In conjunction with the modeling framework and ontology, a data framework should be implemented to ingest data sources into an open-source data storage architecture and provide them to the models on demand. This could be achieved with data profiling and mining techniques that automatically discover and integrate data sources for DUAs. While the process cannot, in general, be fully automated, machine learning algorithms and similar technologies would reduce user involvement to only those cases where it is needed to guarantee the desired level of data quality. Once a new source is brought into the data framework, historical and static data can be stored in a central repository. Data, once collected, can then be analyzed to determine relationships and establish operational norms. Once these norms are quantified, continuous evaluation of incoming data provides an indicator of trends toward abnormal behavior.
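One simple way such operational norms and abnormality indicators might be computed is sketched below: a historical series is summarized as a mean and standard deviation, and new observations are scored against that norm. The threshold and example values are illustrative assumptions, not prescribed by the framework.

```python
from statistics import mean, stdev

def operational_norm(history):
    """Summarize historical observations as a mean and standard deviation."""
    return mean(history), stdev(history)

def abnormality(observation, norm, threshold=3.0):
    """Return a z-score for the observation and whether it exceeds the alerting threshold."""
    mu, sigma = norm
    z = (observation - mu) / sigma if sigma else 0.0
    return z, abs(z) > threshold

# Example: daily counts of some monitored activity in a district (invented data)
history = [102, 98, 110, 95, 105, 99, 101, 104, 97, 100]
norm = operational_norm(history)
print(abnormality(150, norm))   # large positive z-score -> trend toward abnormal behavior
```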
Development of advanced visualization capabilities should focus operators on relevant data through simple, user-centric visualization concepts that utilize minimal bandwidth, enabling operation on bandwidth-limited networks. The visualization framework should focus on representing the large, inter-related datasets typical of DUAs. A key attribute in developing situational understanding through visualization is providing the tools to understand the relationships among data, from the current situation through predictive analysis, inclusive of second- and third-order effects within the DUA.
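As one illustration of keeping visualization traffic within a limited bandwidth budget, the sketch below aggregates a dense server-side time series into per-bucket summaries before shipping it to the browser; the bucket size and data are invented for the example.

```python
import json

def downsample(series, bucket_size):
    """Collapse a dense series into per-bucket min/mean/max summaries."""
    summary = []
    for i in range(0, len(series), bucket_size):
        bucket = series[i:i + bucket_size]
        summary.append({"min": min(bucket),
                        "mean": sum(bucket) / len(bucket),
                        "max": max(bucket)})
    return summary

# e.g., one reading per minute for a day -> 24 hourly summaries for the browser
minute_readings = [50 + (i % 60) for i in range(1440)]
payload = json.dumps(downsample(minute_readings, 60))
print(len(payload), "characters instead of", len(json.dumps(minute_readings)))
```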
A flexible, data-centric approach to understanding megacities provides critical capabilities in the hyper-connected urban multi-domain environment. First and foremost, the approach enables users to define the specific questions that need to be answered based upon an ontology, connects the appropriate models and data to a visualization system, and provides graphical and numerical data tailored to answer the specific question posed to the framework. Second, hosting the modeling framework on a high-speed computing platform and serving the visualization system to any network-connected, browser-enabled device means a standard, synchronized view of the Operating Environment is potentially accessible at the tactical and strategic levels of all services. Finally, the approach creates an open, extensible structure that uses a defined set of architectural standards to integrate additional data and modeling capabilities as needed.