D1.1 Understanding of challenges and their impact on network resilience
Delivery date:April 2009
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:NEC
Relevant work package:WP1 Framework for resilience
PDF file:Download
The deliverable proposes a risk assessment-based approach for the characterization of challenges and their impact. It presents the results of Task 1.2 on "Understanding Challenges". We call any event that has the potential to degrade delivered service a challenge to the networked system. Understanding these events and how they affect the system is essential for building defensive measures and remediation strategies in WP2 and WP3. We gather and analyse a variety of challenges from the literature on incidents that have led to system failures or a degradation of service in the past. These incidents encompass classical threats such as attacks in the physical or virtual world and large-scale natural and human-caused catastrophes, but also misconfigurations, interoperability problems, and challenging communication environments. Our investigations lead to a taxonomy that classifies challenges according to their nature. This taxonomy is presented and provides the basis for a risk assessment process for resilience. This methodology drives the prioritization of challenges to be treated; the assumption is that, from the large space of defensive measures and detection algorithms, the ones to be adopted and built should make the system resilient against the challenges with the highest impact. Last, a case study demonstrating the application of the proposed risk assessment process is presented. We use a small community mesh network in the north of England for our studies. An excerpt of our findings in this case study exemplifies our methodology and its outcomes.
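The prioritization idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Challenge` class, the example challenges, their scores, and the common "risk = likelihood x impact" formulation are hypothetical and are not taken from the deliverable, which may define its risk measure differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Challenge:
    """A hypothetical challenge record used for ranking."""
    name: str
    likelihood: float  # assumed probability-like score in [0, 1]
    impact: float      # assumed severity score in [0, 1]

    @property
    def risk(self) -> float:
        # Assumption: risk is the product of likelihood and impact.
        return self.likelihood * self.impact


def prioritise(challenges):
    """Return the challenges ordered from highest to lowest risk."""
    return sorted(challenges, key=lambda c: c.risk, reverse=True)


# Illustrative values only; they are not findings from the case study.
challenges = [
    Challenge("link misconfiguration", likelihood=0.6, impact=0.4),
    Challenge("storm damage to mast", likelihood=0.1, impact=0.9),
    Challenge("denial-of-service attack", likelihood=0.3, impact=0.7),
]

for c in prioritise(challenges):
    print(f"{c.name}: risk={c.risk:.2f}")
```

Defensive measures would then be selected for the challenges at the top of the ranking first.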
D1.2a Defining metrics for resilient networking (interim)
Delivery date:Feb 2010
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:TU Delft
Relevant work package:WP1 Framework for resilience
This is the first deliverable on defining metrics for network resilience. In this document, we present a general framework for assessing the robustness of a system, report results from a case study intended to test the approaches described here, and outline a number of open issues that will need to be investigated throughout the remainder of the project.
D1.2b Defining metrics for resilient networking (final)
Delivery date:Sep 2011
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:TU Delft
Relevant work package:WP1 Framework for resilience
This deliverable concerns the definition of metrics for network resilience. In this document, we present a general framework for assessing the robustness of a system, the R-model, and demonstrate how a variety of metrics from simultaneous exploration of a networking system can be captured through one characterization. We propose the concept of resilience envelopes which can be used as an intuitive way to understand the resilience and performance of a network under perturbation, and introduce the notion of resilience classes. Finally, we present results from a number of case studies intended to test the approaches described in this report.
D1.3a Policies for resilience (interim)
Delivery date:Feb 2010
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:NEC
Relevant work package:WP1 Framework for resilience
PDF file:Download
The ResumeNet project proposes a policy-based approach to network and service resilience. This deliverable presents an overview of the policy frameworks that we are considering as the basis for building a resilience architecture. Currently, three policy frameworks are being investigated: XACML, Ponder2, and Or-BAC. For each framework we present the features that are usable for a resilience architecture. The assessments presented in this deliverable are used by various activities on network and service resilience, which are expected to provide further insights into how well a resilience framework can be designed and built on the presented policy frameworks. Therefore, this deliverable presents only preliminary results and will be updated at the end of the project.
D1.3b Policies for resilience (final)
Delivery date:Sep 2011
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:NEC
Relevant work package:WP1 Framework for resilience
PDF file:Download
The ResumeNet project proposes to use policies to set up the configuration of resilience mechanisms, which implement a resilience strategy. Such strategies, for example, define the configuration of a set of resilience mechanisms in order to detect and mitigate a specific challenge. In this deliverable, we outline a body of requirements for policy-based management frameworks when applied to this problem domain. These requirements are used as a basis for evaluating three prominent policy frameworks, namely XACML, Ponder2 and Or-BAC. In short, we suggest that, whilst these frameworks address many of our requirements, they do not meet all of them, and that no single framework is suitable in all network deployment contexts or can be applied to all resilience problems. Evidently, careful consideration is needed when applying them to a specific problem domain, i.e., a network context and a set of anticipated challenges. We have made use of policy-based management frameworks as part of our activities on the project; here, we summarise how Ponder2 and Or-BAC have been applied. Finally, we identify areas for further research; addressing these items could expedite the adoption of a policy-based management approach to network resilience.
D1.4 Cross-layer optimization and multilevel resilience
Delivery date:Sep 2010
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:ULANC
Relevant work package:WP1 Framework for resilience
PDF file:Download
In the ResumeNet project, the D2R2+DR resilience strategy is being evaluated. Part of this work is an exploration of the usefulness of cross-layer optimization in helping to assure multi-level resilience in the network. Cross-layering has been envisaged as a support mechanism for the resilience framework. Extracting useful information from the protocol stack and providing essential information to the analysis component should be seen as one of the fundamental tasks of the resilience framework. An essential question in resolving the resilience puzzle is what information should be provided and where it should come from. Once the information that might affect the decision on remediation has been identified, it can serve as the basis for reasoning about effective resilience actions. Exploring this direction should lead us to the information sources to be adopted in the framework. We suggest cross-layering, monitoring, and context as potential candidates for helping us to solve the puzzle: cross-layering for obtaining information across the protocol layers, distributed correlated network monitoring for getting a view from different vantage points across the network, and context for getting essential information from wherever is appropriate to inform the decision-making. Context information, indeed, is likely to come from outside the network, from the environment in which the network operates. Together, these three aspects constitute an information framework that, we propose, should form the basis for an implementation of the D2R2+DR resilience strategy. From this study, we have learned that a cross-layer sharing database design could be the most applicable approach for building resilient networks. Next, the use of task-centric monitoring tools could be more applicable for network resilience than device-centric ones. Finally, by considering three case studies, we see the way ahead for further research on the information framework.
D1.5a First interim strategy document for resilient networking
Delivery date:Oct 2009
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:ULANC
Relevant work package:WP1
PDF file:Download
Resilience will need to be a key property of the future Internet because of our unrelenting demand for Internet-based services, the challenging environments it will operate in, and the continued existence of intelligent adversaries. In this deliverable, we describe our current understanding, after one year in the ResumeNet project, of the challenge of making resilience an integral component of future networks. Our overall strategy for resilience is neatly summarized by the so-called D2R2+DR acronym, standing for Defend, Detect, Remediate, Recover, Diagnose and Refine. This otherwise straightforward approach is only an abstraction of the actual tools that have to be in place to realize it. We first focus on what we consider fundamental components of our resilience framework: resilience metrics, understanding challenges, policies for resilience, and cross-layer information sharing and control. These are the kinds of ingredients that can be used to inform the design and implementation of resilience mechanisms. Based upon our current research findings, we reflect on the suitability of the strategy, and describe our on-going work populating the strategy with algorithms and mechanisms. This work, carried out within the context of WP2 and WP3 of the project, is presented in more detail in Deliverable D6.3 "Report of technical work in WP2 and WP3 during the 1st year of the project". Therefore, we limit ourselves to a summary of the undertaken work, pointing the reader to D6.3 for more details. Finally, we discuss the experimental scenarios we will use to evaluate our research. The four case studies exemplify the applicability of the framework concepts and mechanisms and assess their effectiveness in widely variable networking scenarios. At the same time, they feed back into the overall framework design process, allowing a more quantitative characterization of framework components.
D1.5b Second interim strategy document for resilient networking
Delivery date:Sep 2010
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:ULANC
Relevant work package:WP1
PDF file:Download
This second interim strategy document for resilient networking is realised as two publications that discuss our current and developing understanding of the resilience problem. Resilience principles and the relationship of resilience to the various related disciplines, such as fault tolerance, along with other matters, are discussed by Sterbenz et al. [SHc+10]. This article is an introductory piece to an Elsevier Computer Networks Special Issue on resilient networking. The second paper that forms part of the deliverable is to be sent to a magazine outlet, such as IEEE Communications, and shows how the ongoing activities on the resilience framework in WP1 of the project relate to the mechanisms, architectures and experimentation work that are being investigated in WPs 2-4. Of course, given the limited size of such publications (4500 words) we cannot be comprehensive, but instead select highlights that give a flavour of our research. We found it useful to understand the relationship of activities in the project to the D2R2+DR strategy, and therefore to each other, by using an analogy of a control loop from systems theory and annotating it with our activities. We have developed this analogy by casting an alternative architectural view on it, which further helps us to understand how the activities in the project interrelate.
D1.5c Final strategy document for resilient networking
Delivery date:Sep 2011
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:ULANC
Relevant work package:WP1
PDF file:Download
A continuous thread running through the ResumeNet project has been the development of a framework for resilient networking. The framework reflects what the project has learned about how to design and implement network and service resilience, and as such takes input from all the technical work packages. What has emerged is a systematic approach to network and service resilience, whose core component is a resilience control loop - the central element of our resilience framework. We propose our framework as a consistent approach that identifies how multilevel resilience mechanisms should be deployed. This is based on an understanding of resilience metrics and probable high impact challenges. Resilience mechanisms are managed using a loosely coupled policy-driven management architecture. Our framework improves on the state of the art through this coherent approach.

In this deliverable, we present our resilience framework, which includes implementation guidelines, processes, and toolsets that can be used to underpin the design of resilient networks and services with resilience mechanisms that function at various protocol levels. As one might expect, the deliverable (and the framework described therein) highlights research carried out in the project; throughout we point the reader to key deliverables that provide greater detail of our research outcomes. The elements of our framework that form the key contributions of our research include:

A multilevel resilience metrics framework: Being able to specify and measure desired levels of resilience is of critical importance, and is understood to be an area in which there is little consensus on how to approach it. We have developed a multilevel resilience metrics framework, summarised in Section 2, that can be used to understand and describe the resilience of networks and services, and the relationship between metrics from different levels of the protocol stack, e.g., whether they exhibit correlated or orthogonal behaviour. Accompanying the framework is a set of tools, such as simulation models and software libraries for examining metrics [ScH+11, DMH10], that can be used to evaluate a given network topology in the presence of various challenges.

Processes for understanding challenges: Deployed resilience mechanisms should be targeted at addressing the most probable high-impact challenges the network may face. In the context of network resilience, the challenges that could occur transcend those normally considered in other thematic areas, such as information security, fault tolerance and disruption tolerant networks. Without considering this broad spectrum of challenges, mechanisms could be inappropriately deployed. To manage this problem, we have developed a risk assessment process that can be used to identify high-impact challenges [SSH11]. This process builds on an informal categorisation of the forms of challenges that one must consider to ensure network resilience [SS09]. Our challenge categorisation and risk assessment process are described in Section 3.

A resilience management architecture: The management of multilevel resilience mechanisms that potentially interact across different administrative domains can be complicated. Furthermore, the operation of resilience mechanisms should in many cases be done in real-time with potentially limited human intervention; incorrect operation could have significant negative consequences. To tackle these issues, we have developed a loosely coupled network management architecture, outlined here in Section 4, which makes use of policies to specify multi-stage resilience strategies - configurations of mechanisms that address a given challenge set [DSS+10]. By using policies, strategies can be carefully crafted and evaluated, using a policy-driven network simulator we have developed, without the need to take resilience mechanisms off-line [YFSF+11].

A set of resilience mechanisms: We have developed a number of resilience mechanisms that can be applied to a wide range of challenges. They span a number of stages of the D2R2+DR strategy and function at the network and service level. In particular, we have produced mechanisms to address malicious behaviour in networks, such as monetary-less cooperation incentives to mitigate selfish nodes in wireless mesh networks [PGLK10], game-theoretic approaches to protection against malware propagation [OOVM09], and an anomaly detection approach to detect and trace back attacks on encrypted protocols [FTV+10]. Furthermore, our mechanisms can be applied at different levels of the protocol stack in light of node and link failure, and include novel approaches to multipath routing in multi-hop wireless networks [LSZB10] and algorithms for creating resilient large-scale overlay networks [GHK11]. In Section 5, we highlight the novel aspects of the mechanisms developed in the project, and their likely deployment timescales. An enemy of network resilience is complexity; using multilevel resilience mechanisms that share information and perform cross-layer control has the potential to increase complexity and produce undesirable emergent behaviours. To address this problem, we have developed a cross-layer framework, which uses a formalism to evaluate the optimal layer to place resilience functionality, thus reducing replicated functionality at different layers [BBF+10]. We introduce the formalism in Section 5.

Approaches to challenge detection: Our understanding of the purpose of the detect stage of the D2R2+DR strategy has evolved over the lifetime of the project. We understand that its primary goal is to build situational awareness to inform decision making regarding remediation and recovery. How we have applied theories of situational awareness to the detect stage is discussed in Section 6. To identify challenges, we propose an incremental multi-stage approach that enables rapid remediation to reduce the likelihood of challenges causing catastrophic failure [YFSF+11]. Subsequently, remediation can be refined using improved challenge identification mechanisms. To support this multi-stage approach, we have developed a challenge identification architecture, which can be implemented using model-driven fault localisation techniques [SFM+10]. Ensuring resilience is a venture that should be tackled at multiple levels of the protocol stack in diverse topological (and geographical) locations. This involves information sharing across protocol layers, to build situational perception. We have investigated what multilevel metrics should be measured for resilience, and which tools should be used to collect and distribute this information. Our initial findings on this matter are presented in Section 6.

Aspects of the framework are readily applicable, whereas other elements represent our longer-term vision of how to realise network resilience. For example, the toolsets that are part of the multilevel metrics framework can be applied immediately to gain an understanding of the resilience of networks and services to various challenges. Furthermore, some of the resilience mechanisms we have developed, particularly those that operate at the service level, can be used to address challenges in the near-term future. Our longer-term vision for ensuring network resilience is embedded in our resilience management architecture and challenge detection approaches. These are arguably more disruptive from a (business) processes and technical implementation perspective, and further research is required in some cases to confirm their applicability. In Section 7, we discuss important areas for future research.
D1.6 Collaborative cross-layer monitoring as resilience enabler
Delivery date:Jan 2012
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:University of Passau
Relevant work package:WP1 Framework for resilience
PDF file:Download
Distributed correlated network monitoring is important for network resilience in order to identify faults and threats that are otherwise hard to track. In particular, challenges involving a combination of different services may pose a problem for troubleshooting. To diagnose the root cause of a given challenge, a multitude of monitoring tools and methods are in use, each having its own advantages and disadvantages. Nowadays, end-user services are often composed of a multitude of network services and will fail if one of those provider services fails. From the user perspective, it is difficult to remediate these challenges without intricate knowledge of the underlying process. Cooperative correlation of events on different network layers is an important building block in uncovering complex challenges. This deliverable presents a use-case scenario and highlights the importance of distributed correlated network monitoring that enables information to be gathered from multiple layers and renders networks and services resilient to challenges, such as attacks and failures.
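The observation that a composed end-user service fails when any of its provider services fails can be illustrated with a small dependency graph. The sketch below is hypothetical: the service names, the `DEPENDS_ON` map, and the helper functions are assumptions made for illustration and do not describe the monitoring system presented in the deliverable.

```python
# Hypothetical dependency map: each service lists the services it relies on.
DEPENDS_ON = {
    "webmail": ["dns", "imap", "auth"],
    "imap": ["storage", "auth"],
    "auth": ["ldap"],
    "dns": [],
    "storage": [],
    "ldap": [],
}


def all_dependencies(service, graph):
    """Collect every direct and transitive dependency of a service."""
    seen = set()
    stack = list(graph.get(service, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(graph.get(dep, []))
    return seen


def root_cause_candidates(service, graph, unhealthy):
    """Return unhealthy dependencies that have no unhealthy dependency themselves."""
    failing = all_dependencies(service, graph) & unhealthy
    return {d for d in failing if not (all_dependencies(d, graph) & unhealthy)}


# Assumed input: services reported as unhealthy by monitors on several layers.
unhealthy = {"imap", "auth", "ldap"}
print(root_cause_candidates("webmail", DEPENDS_ON, unhealthy))
```

Correlating the reports in this way narrows three symptoms down to the one service whose own dependencies are all healthy, which is the kind of knowledge an end user normally lacks.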
D2.1a First draft on defensive measures for resilient networks
Delivery date:
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:FT
Relevant work package:WP2
PDF file:Download
This is the first deliverable concerning defensive measures for network resilience. Being the first D in the D2R2+DR framework, defensive measures are also the first step toward building resilient networks. Defensive measures include all actions that can be performed before the challenges effectively occur so that the network is sufficiently well-armed to resist most of these challenges without considerable performance degradation. In this deliverable, we concentrate on selected defensive measures in these areas that are being addressed in ResumeNet. These measures include the network design process (define the topology and dimension the links), the choice of efficient routing mechanisms, and a protection strategy assessed by relevant models.
D2.1b Defensive measures for resilient networks
Delivery date:Sep 2010
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:TU Delft
Relevant work package:WP2
This is the second deliverable concerning defensive measures for network resilience. Being the first D in the D2R2+DR framework, defensive measures are also the first step toward building resilient networks. Defensive measures include all actions that can be performed before the challenges effectively occur so that the network is sufficiently well-armed to resist most of these challenges without considerable performance degradation. In this deliverable, we concentrate on selected defensive measures in these areas that are being addressed in ResumeNet. These measures include the network design process (define the topology and dimension the links), the choice of efficient routing mechanisms, and a protection strategy assessed by relevant models.
D2.2a First draft on new challenge detection approaches
Delivery date:Feb 2010
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:ULg
Relevant work package:WP2
PDF file:Download
This deliverable presents the problem of challenge detection, that is, how do we allow a system to "understand" a challenging situation by letting it identify its occurrences and assess its impact. These two outputs are required to later select a proper mitigation strategy. We motivate this approach and detail how it improves on previous anomaly, intrusion and DoS detection techniques. We exemplify this approach in the context of wireless mesh networks and opportunistic networks on specific challenges. The different stages of detection are mapped onto monitoring, analysis and configuration activities suited to the considered scenario. We then present the design of a distributed information store, which assists and captures information exchanges between the detection and correlation algorithms. The gathered data are then automatically organised and stored so that they can be fed to the machine-learning-based processes involved in detection, remediation and later diagnostic tasks. We illustrate its behaviour in the well-studied case of resource starvation challenges such as DDoS.
D2.2b New challenge detection approaches
Delivery date:Sep 2010
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:ULg
Relevant work package:WP2
PDF file:Download
This deliverable supersedes D2.2a [SFM+10] in presenting the problem of challenge detection, that is, how do we allow a system to understand a challenging situation by letting it identify its occurrences and assess its impact. These two outputs are required to later select a proper mitigation strategy as discussed in the companion deliverable D2.3b [SDS+10]. We motivate this approach and detail how it improves on previous anomaly, intrusion and DoS detection techniques. We then present the design of a policy-driven challenge identification architecture and the distributed information store, which assists and captures information exchanges between the detection and correlation algorithms. The gathered data are then automatically organised and stored so that they can be fed to the machine-learning-based processes involved in detection, remediation and later diagnostic tasks. We illustrate their behaviour in the well-studied case of resource starvation challenges such as DDoS. We further exemplify our approach in the context of wireless mesh networks and opportunistic networks on specific challenges. The different stages of detection are mapped onto monitoring, analysis and configuration activities suited to the considered scenario.
D2.3a 1st draft on the remediation, recovery, and measurement framework
Delivery date:Feb 2010
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:NEC
Relevant work package:WP2
PDF file:Download
This deliverable presents the preliminary version of the ResumeNet project's adaptation framework, which implements the two phases of remediation and recovery of our D2R2+DR strategy. Initially, three scenarios are presented which provide the requirements the adaptation framework has to fulfil. From these requirements, we derived an architecture for the framework. It is based on the DISco system, introduced in D2.2a, which triggers remediation and recovery steps by informing the adaptation framework about ongoing challenges and their cessation, respectively. Designs for realizing this architecture are presented thereafter. One design is based on access control policies, whilst the second is based on obligation policies. Finally, we introduce the Graph Explorer - a technology to assist the policy-based adaptation framework with solving topology-related optimization problems. This deliverable will be updated by its final version in six months. We provide information throughout the deliverable on how we are proceeding with our work and its expected integration.
D2.3b Remediation, recovery, and measurement framework
Delivery date:Sep 2010
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:NEC
Relevant work package:WP2
PDF file:Download
This deliverable presents the ResumeNet project's adaptation framework, which implements the two phases of remediation and recovery of our D2R2+DR strategy. It updates Deliverable 2.3a, which presented the preliminary version of this document. Initially, three scenarios are presented which provide the requirements the adaptation framework has to fulfil. From these requirements, we derived an architecture for the framework. It is based on the DISco system, introduced in D2.2b, which triggers remediation and recovery steps by informing the adaptation framework about ongoing challenges and their cessation, respectively. Designs for realizing this architecture are presented thereafter. One design is based on access control policies, whilst the second is based on obligation policies. Finally, we introduce the Graph Explorer - a technology to assist the policy-based adaptation framework with solving topology-related optimization problems. The main updates of this deliverable can be found in the network resilience technology section (Section 4.2): the section on dynamic adaptation of access policies has been significantly updated. In addition, two new sections have been added there: one on how to utilize the Graph Explorer to optimize our rope ladder forwarding scheme - a multi-path forwarding scheme for high resiliency - and one on isolating services during remediation.
D2.4a First draft of the learning framework for resilient networks
Delivery date:Dec 2010
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:NEC
Relevant work package:WP2
PDF file:Download
This deliverable presents the ResumeNet project's work objectives on the evolution framework, which implements the background control loop of our D2R2+DR strategy. It is based on the ResumeNet Deliverables D2.1b, D2.2b, and D2.3b, which presented the real-time control loop part of our strategy. The work items on the evolution of a networked system use information from past cycles of this real-time control loop to learn how to improve future adaptation, and to extract new features from the collected data, such as placement strategies of multipath trunks. Note that we consider learning schemes an essential building block of this system evolution, but do not intend to propose new learning algorithms. In this deliverable we focus on the whole background control loop enabling evolution, rather than on learning only.
D2.4b The learning framework for resilient networks
Delivery date:Dec 2010
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:NEC
Relevant work package:WP2
PDF file:Download
This deliverable presents the ResumeNet project's work objectives on the evolution framework, which implements the background control loop of our D2R2+DR strategy. It is based on the ResumeNet Deliverables D2.1b, D2.2b, and D2.3b, which presented the real-time control loop part of our strategy. The work items on the evolution of a networked system use information from past cycles of this real-time control loop to learn how to improve future adaptation, and to extract new features from the collected data, such as placement strategies of multipath trunks. Note that we consider learning schemes an essential building block of this system evolution, but do not intend to propose new learning algorithms. In this deliverable we focus on the whole background control loop enabling evolution, rather than on learning only.
D3.1a Taxonomy of P2P, Overlays and Virtualization Techniques with respect to service resilience
Delivery date:Oct 2009
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:UP
Relevant work package:WP3
PDF file:Download
The Future Internet will be built using recent and future technological advancements, but the ever increasing speed of network development makes it difficult to gain a complete overview of all technologies. The scope of this document is to present the technologies that are being considered in ResumeNet to build resilient services. In particular, system virtualization and Peer-to-Peer overlay networks are analyzed in order to offer a starting point into Future Internet research.
D3.1b Resilient service architecture (interim)
Delivery date:Apr 30, 2010
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:TUM
Relevant work package:WP3
PDF file:Download
The ResumeNet project's overall strategy for resilience is neatly summarized by the so-called D2R2+DR acronym, standing for Defend, Detect, Remediate, Recover, Diagnose and Refine. This otherwise straightforward approach is only an abstraction of the actual mechanisms that have to be in place to realize it. The ResumeNet Work Package 3 (Service-level resilience) investigates resilience mechanisms particularly at the application layer. The general approach is to use techniques which can provide abstractions from the underlying hardware resources and thus can proactively (Defence) improve the resilience of the services provided by the network. These techniques are, more precisely: i) P2P signalling for session setup and service lookup; ii) overlay-based end-to-end connectivity; and iii) virtualisation. These techniques facilitate Remediation in the presence of challenges, and thus allow for a faster Recovery.

In addition to the proactive mechanisms, the network and services are monitored continuously in order to Detect challenges in a timely manner. With Detection, all four steps of the inner loop of the ResumeNet resilience strategy (Defence, Detect, Remediate, Recover: D2R2) are realised at the service level.

This deliverable summarizes the status of research work in WP3. It can be regarded as an aggregate intermediate deliverable for multiple upcoming deliverables (D3.2 "Service Surveillance" in M22, D3.3 "P2P Overlays and Virtualisation for Service Resilience" in M24, and D3.4 "Overlay-based Connectivity" in M30). The final ResumeNet architecture for resilient services will be presented in an updated version of this deliverable, due in M36 (D3.1c "Resilient Service Architecture (Final)").
D3.1c Resilient service architecture (final)
Delivery date:Aug 31, 2011
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:TUM
Relevant work package:WP3
PDF file:Download
The services market has been growing rapidly worldwide in recent years, with strong margins. Networking of services is a fundamental requirement, not only because a large portion of services is provided via the Internet, but also because a service is often composed of different networked components which need to function together in order to provide the service properly.

This document provides a framework and guidelines for resilient networked services. It summarizes the results of the research work within the ResumeNet project in the area of resilient services (WP3). This includes innovations in the areas of P2P networking, virtualisation, overlay-based connectivity, service surveillance, challenge detection using chronicles, and policy-based remediation. Moreover, we describe further service resilience mechanisms typically used today. This allows for an overview of service resilience mechanisms, including the innovative ones developed in the project.

This document is complementary to other ResumeNet project deliverables which result from WP2, "Network-level resilience", where resilience aspects are addressed lower in the network stack, e.g., resilient physical network topologies, path diversity, and resilience in wireless mesh networks (WMNs). Work on both network-level and service-level resilience follows the D2R2+DR strategy (Defence, Detect, Remediate, Recover, Diagnose and Refine).
D3.2 Service surveillance and detection of challenging situation (interim)
Delivery date:Jun 30, 2010
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:FT
Relevant work package:WP3
PDF file:Download
This interim deliverable, the result of task 3.4 entitled "Service surveillance and detection of challenging situations", covers the Detection/Remediation/Recovery (DR2) phases of ResumeNet's D2R2+DR strategy.

The task aims to build a framework for monitoring any service requiring a certain level of resilience. To this end, probes need to be inserted at the proper locations in order to detect abnormal events. Their outputs, called alarms, are analyzed and treated using a correlation engine, which requires as input a clear definition of challenging situations for the service under study, including the resilience metrics used. The outcome of this analysis, i.e., a challenge detection called an alert, triggers remediation, e.g., through policy deployment or modification. Continuous monitoring and event analysis provide information about the end of the threat, leading to actions to recover the system and bring the service back to its normal operating performance.

As challenge detection is also realized in ResumeNet at the network level (task 2.2), the general architecture adopted in the project for this DR2 purpose, including the Publish/Subscribe distributed store for challenges and remediation, is described. After a general description of the service monitoring problem, the approach proposed for this task is presented. Basic monitoring principles are then surveyed, focusing on general policy, techniques, data, and maintenance operations. Finally, the scope of event correlation, i.e., fault localization, is presented.

The correlation engine to be used is based on chronicle recognition, which is described, followed by some applications of this technique to network and service alarm analysis for security-related objectives (intrusion detection, reflexive DDoS attack) or dependability-related concerns (recovery action monitoring, handover initiation in mobile systems).
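
As a rough illustration of the idea behind chronicle recognition, the sketch below treats a chronicle as an ordered list of alarm types that must all occur within a time window, and reports a match when the pattern is found in a stream of alarms. This is a deliberately simplified model under stated assumptions, not the correlation engine described in the deliverable; the names `Alarm`, `Chronicle`, and `recognize`, as well as the alarm types in the usage note, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Alarm:
    time: float  # timestamp in seconds
    kind: str    # alarm type emitted by a probe (hypothetical labels)


@dataclass(frozen=True)
class Chronicle:
    name: str
    pattern: Tuple[str, ...]  # alarm kinds, in the order they must occur
    window: float             # max seconds between first and last alarm


def recognize(chronicle: Chronicle,
              alarms: List[Alarm]) -> Optional[Tuple[float, float]]:
    """Return (start, end) times of the first match, or None if no match."""
    ordered = sorted(alarms, key=lambda a: a.time)
    for i, first in enumerate(ordered):
        if first.kind != chronicle.pattern[0]:
            continue
        step = 1          # index of the next pattern element to match
        end = first.time  # time of the last matched alarm
        for later in ordered[i + 1:]:
            if step == len(chronicle.pattern):
                break     # pattern already complete
            if later.time - first.time > chronicle.window:
                break     # temporal constraint violated for this start
            if later.kind == chronicle.pattern[step]:
                step += 1
                end = later.time
        if step == len(chronicle.pattern):
            return (first.time, end)
    return None
```

For example, with alarms `link_down` at 0.0 s, `cpu_high` at 1.0 s, `route_flap` at 2.0 s, and `service_timeout` at 3.5 s, the chronicle `("link_down", "route_flap", "service_timeout")` with a 5-second window is recognized, whereas the same pattern with a 3-second window is not. In a real deployment, a recognized chronicle would correspond to an alert handed to the remediation stage.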

The Remediation phase, i.e., the reaction to an alert generated by the correlation analysis, is illustrated with a dynamic policy engine, through the experimentation scenario "Communicating objects' data platform", which will be deployed as a service use case in task 4.4. The conclusion reviews the principal contributions of this deliverable, as well as some aspects of the work planned during year 3, which will be reported in the final version of D3.2.
D3.3 P2P overlays and virtualization for service resilience
Delivery date:Aug 31, 2010
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:TUM
Relevant work package:WP3
PDF file:Download
P2P networking and virtualisation are promising techniques among the resilience mechanisms which have been investigated in ResumeNet for providing resilient network services. A major and common motivation behind both these techniques is that they provide an abstraction from the underlying hardware resources. Thus, they allow for isolating failures at lower layers, or at least for reducing the Mean-Time-To-Repair (MTTR) if failures occur. Therefore, P2P networking and virtualisation inherently address several aspects of the ResumeNet D2R2+DR strategy, particularly the inner control loop. This deliverable describes the results of our investigations in these two topics. We take Voice over IP (VoIP) as an example of a critical application. Based on reliability theory and traces from the Skype network, we provide a quantification of the reliability that can be provided by a P2P network for VoIP session setup.
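
To make the role of MTTR and redundancy concrete, the following sketch uses two standard results from reliability theory: the steady-state availability of a repairable component, MTTF / (MTTF + MTTR), and the availability of a redundant group that works as long as at least one member works. It is only an illustration; the function names and all numbers are assumptions, they are not taken from the Skype traces or from the deliverable, and the redundancy formula assumes that members fail independently.

```python
from math import prod
from typing import Iterable


def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability of a single repairable component."""
    return mttf_hours / (mttf_hours + mttr_hours)


def parallel_availability(availabilities: Iterable[float]) -> float:
    """Availability of a group that works if any one member works.

    Assumes members fail independently of each other.
    """
    return 1.0 - prod(1.0 - a for a in availabilities)


if __name__ == "__main__":
    # Illustrative numbers only: shortening repair time raises availability.
    slow_repair = availability(mttf_hours=1000.0, mttr_hours=10.0)
    fast_repair = availability(mttf_hours=1000.0, mttr_hours=1.0)
    print(f"MTTR 10 h: {slow_repair:.5f}, MTTR 1 h: {fast_repair:.5f}")

    # Several individually unreliable peers can still form a reliable group.
    peers = [0.9, 0.9, 0.9]
    print(f"three peers in parallel: {parallel_availability(peers):.5f}")
```

The second function hints at why a P2P network of individually unreliable peers can still offer a reliable session setup service, provided that peer failures are not strongly correlated.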

In contrast to a pure P2P network approach, we propose a supervised P2P network approach to address the security issues in P2P networks. The "supervisor" provides the peers with verifiable identities. It is involved in the session setup under normal operation. However, even if the supervisor is unavailable, the service can still be provided by the P2P network. A supervised P2P network can be considered as a solution between server-based and pure P2P-based signaling solutions. The goal is the combination of the advantages of both architectures, leading to improved reliability and security. An extensive security threat analysis evaluates to what extent the supervisor addresses the security issues of P2P networks and which additional security mechanisms are still required.

Virtualisation allows for migration of services with a variety of migration strategies. However, resilience requirements of services can vary widely. Thus, there is a need for managing service migration in a resilient way, picking the best strategy for service migration. In the paper attached to this deliverable, we present an architecture for resilient service migration, taking into account changing service requirements and different properties of migration strategies.
D3.4 Overlay-based end-to-end connectivity
Delivery date:Feb 28, 2011
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:TUM
Relevant work package:WP3
PDF file:Download
This deliverable describes ongoing work on an overlay framework that helps to protect IP connectivity between hosts. We describe the challenges against which we protect, and we develop the requirements and design of the overlay framework. Our main contribution is a description of Perco-Pastry, our overlay routing algorithm. Furthermore, we describe our method to derive a node's geographic location from RTT measurements.
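
The abstract does not detail how locations are derived from RTTs, so the sketch below is not the deliverable's method. It only illustrates the physical constraint that any RTT-based approach can rely on: a signal cannot travel faster than light, so a measured RTT to a landmark gives an upper bound on the distance to it. The propagation factor of 2/3 of the speed of light (a common approximation for optical fibre), the function names, and the landmark names are assumptions.

```python
from typing import Dict, Tuple

SPEED_OF_LIGHT_KM_S = 299_792.458
FIBRE_FACTOR = 2.0 / 3.0  # assumed propagation speed relative to c


def max_distance_km(rtt_ms: float, factor: float = FIBRE_FACTOR) -> float:
    """Upper bound on the distance to a landmark, given a round-trip time."""
    one_way_s = (rtt_ms / 1000.0) / 2.0
    return one_way_s * SPEED_OF_LIGHT_KM_S * factor


def tightest_landmark(rtts_ms: Dict[str, float]) -> Tuple[str, float]:
    """Return the landmark with the smallest RTT and its distance bound."""
    name = min(rtts_ms, key=lambda k: rtts_ms[k])
    return name, max_distance_km(rtts_ms[name])


if __name__ == "__main__":
    # Hypothetical landmarks and RTTs in milliseconds.
    measurements = {"landmark-a": 40.0, "landmark-b": 10.0}
    name, bound = tightest_landmark(measurements)
    print(f"node lies within about {bound:.0f} km of {name}")
```

Queuing and processing delays inflate real RTTs, so the bound is usually loose; combining bounds from several landmarks narrows the region in which the node can lie.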
D4.1a Federation requirements (interim)
Delivery date:Apr 2009
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:NEC
Relevant work package:WP4 Experimental evaluation of resilient networking
PDF file:Download
This is the first of the two "light" deliverables requested by the European Commission in the context of the project's commitment to close interaction with the FIREworks Coordination Action. The deliverable aims at providing inputs to FIREworks for the compilation of a deliverable on federation requirements, which will aggregate contributions from all FIRE projects.

The deliverable consists of three main sections. The first is an introduction briefly summarizing the scope of the experimentation within ResumeNet and the current thinking regarding the kind of experimentation facilities to be used. Note that since the project is only six months old and the experimentation activities officially start in M18, some of these decisions will certainly be revisited and may be modified in the year to come.

The second section outlines the four experimental scenarios in the project and the testbeds that will be used to realize them, whereas the last section brings together some thoughts on the, rather limited, federation requirements coming out of ResumeNet.
D4.1b Federation requirements
Delivery date:Mar 2010
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:ETHZ
Relevant work package:WP4 Experimental evaluation of resilient networking
PDF file:Download
Deliverable 4.1 is one of the project's "light" deliverables requested by the European Commission in the context of the project's commitment to close interaction with the FIREworks Coordination Action. The deliverable aims at providing inputs to FIREworks for the compilation of a deliverable on federation requirements, which will aggregate contributions from all FIRE projects.

This is the second release of the document, which updates and revises the first version as issued in February 2009.

The deliverable consists of three main sections. The first is an introduction briefly summarizing the scope of the experimentation within ResumeNet and the current thinking regarding the kind of experimentation facilities to be used. The main part of the experimentation activities will be carried out within WP4, officially launched in M18 of the project (March 2010); yet, there has already been considerable interaction on experimentation issues in the project. The second section outlines the four experimental scenarios in the project and the testbeds that will be used to realize them. The level of detail in this description varies in line with the progress made so far in each experimentation scenario. Finally, the last section brings together some thoughts on the, rather limited, federation requirements coming out of ResumeNet.
D4.2a Interim report on experimental evaluation of resilient networking
Delivery date:Oct 2010
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:Uppsala University
Relevant work package:WP4 Experimentation with resilient networking
PDF file:Download
This deliverable concerns the experimental evaluation of resilient networking. Our approach is to evaluate the ResumeNet framework in four different scenarios, highlighting the versatility of the framework: wireless mesh networks, opportunistic networks, cooperative SIP, and a publish/subscribe platform for communicating objects, i.e., spaces filled with sensors and actuators. The scenarios are complementary with respect to challenges, mechanisms, and framework components, so as to address as many resilience aspects as possible in the context of different types of networks and network operations.

The scenarios are described following the ResumeNet strategy and mapped to the parts of the resilience framework they address. We highlight which mechanisms proposed within the project are evaluated by the experiments. A table summarizing the mapping shows good coverage. Finally, we discuss what kind of results can be expected.

Ongoing work has already led to preliminary results presented in the appendix of this deliverable. Most experimentation has also generated collaboration between the partners, either explicitly in WP4 or in the context of the other work packages.
D4.2b Final report on experimental evaluation of resilient networking
Delivery date:Jan 2012
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:ETH Zurich
Relevant work package:WP4 Experimentation with resilient networking
PDF file:Download
This document presents the set of experiments performed in WP4 on a per-scenario basis. Each scenario tests the D2R2+DR strategy in a number of practical situations, thereby showing that it is generally applicable. The final results obtained during activities in work package 4 are presented and explained in the context of the proposed network resilience strategy. They show how various components are mapped in practice to the functional blocks of the D2R2+DR concept. An interesting observation is that the mapping may take a different form depending on the scenario. For instance, the first scenario shows that in self-enforcing protocols, defense is the dominant component which fulfils the main functions, the remaining blocks having minimal functionality. For each scenario, the conclusions and lessons learnt regarding the practical use of the proposed resilience strategy are gathered in a separate section. Finally, since opportunistic networks are a particularly challenging setting, we give a more formal account of the experiments through the four papers attached in the appendix.
D4.s Special deliverable on WP4 experimentation plans (final)
Delivery date:Jan 2012
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:ETH Zurich
Relevant work package:WP4 Experimentation with resilient networking
PDF file:Download
The present deliverable provides the information required by reviewers during the second review meeting in Brussels. This document presents in more detail the experimentation plans in WP4 and their schedule until the end of the project. We address the review remarks by motivating the chosen experimentation scenarios and the practical significance of the chosen parameters, explaining the integration of different components, and showing the implicit multi-level character of the resilience mechanisms devised. For the first two scenarios, wireless mesh and opportunistic networking, the component integration consists of implementing mechanisms which have only been analyzed in WP2 from a theoretical perspective. From this point of view, WP4 is not only the point where these components are integrated, but also where they are built in the first place. As we have decided to validate the D2R2+DR strategy for complementary scenarios covering different types of networks in order to demonstrate its generality, software components are largely developed by each partner for its own experimentation scenario. Integration between partners at the implementation level is represented in a more complex scenario (service-level resilience), which requires VoIP and P2P as well as virtualization technologies.
D5.2a First year report on dissemination activities
Delivery date:Oct 1st 2009
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:C. Lac (FT)
Relevant work package:WP5
D5.2a First year report on dissemination activities:Download
This deliverable reports on the numerous dissemination and standardization activities carried out during the first year of the ResumeNet project. These activities include:

D5.2b Second year report on dissemination activities
Delivery date:Sep 2010
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:FT
Relevant work package:WP5
D5.2b Second year report on dissemination activities:Download
This deliverable reports on the different dissemination and standardization activities carried out during the second year of the ResumeNet project. These activities include:



For each activity, the involved consortium partners are explicitly mentioned.
D5.2c Final year report on dissemination and standardization activities
Delivery date:Jan 2012
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:FT
Relevant work package:WP5
D5.2c Final year report on dissemination and standardization activities:Download
This deliverable reports on the different dissemination and standardization activities carried out during the third year of the ResumeNet project, and provides in its annex a synthesis of the publications during the first and second year of the project. These activities include:
· the maintenance of a public Web site (for external use) and Wiki pages (for daily internal use);



For each activity, the involved project consortium partners are explicitly mentioned.
D5.3a Exploitation plans (Interim)
Delivery date:Sep 2010
Nature:R (Report)
Dissemination:CO (Confidential, only for members of the consortium, including the Commission Services)
Lead partner:FT
Relevant work package:WP5
As described in [D5_2b], the ResumeNet consortium as a whole makes maximum use of the capabilities provided by clustering activities to disseminate research results to all target audiences worldwide. In addition to this action, this deliverable describes the exploitation tasks the project has planned, some of which have already been carried out. After an introductory chapter describing the exploitation strategy adopted both by the consortium's academic members and by its industrial partners, the deliverable first reports on the following diverse academic actions:



Finally, the exploitation plan followed by the two industrial partners of the consortium is detailed.
D5.3b Exploitation plans (Final)
Delivery date:Jan 2012
Nature:R (Report)
Dissemination:CO (Confidential, only for members of the consortium, including the Commission Services)
Lead partner:FT
Relevant work package:WP5
As indicated in [D5_2c], the ResumeNet consortium as a whole makes maximum use of the capabilities provided by clustering activities to disseminate research results to all target audiences worldwide. In addition to this action, this deliverable describes the exploitation tasks realized by the project. After an introductory chapter describing the exploitation strategy adopted both by the consortium's academic members and by its industrial partners, the deliverable first reports on the following diverse academic actions:



Finally, the exploitation plan followed by the two industrial partners of the consortium is detailed.
This deliverable is a revised version of D5.3a "Exploitation plans (Interim)" [D5_3a].
D6.1 Project management guidelines
Delivery date:Oct 2008
Nature:R (Report)
Dissemination:PP (Restricted to other programme participants, including the Commission services)
Lead partner:ETHZ
Relevant work package:WP6 Project management
This deliverable sets out the overall management structure for planning, carrying out and monitoring the day-to-day co-operative work within the ResumeNet project (Grant Agreement 224619).

The first section outlines the management entities in ResumeNet, the full details being given in Article 6: "Governance structure" in the ResumeNet Consortium Agreement.

The second section addresses the management tools that have been set up to support the management of the project, ranging from meetings and PhCs to software infrastructure, such as mailing lists and Wiki software.

Finally, the third section reviews the main management procedures that may be used in the course of the project.
D6.2a Links between research and experimentation
Delivery date:Apr 2009
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:ULANC
Relevant work package:WP6 Project management
PDF file:Download
This is the second "light" deliverable requested by the EC in the context of the project's commitment to close interaction with the FIREworks Coordination Action. The deliverable aims at providing inputs to FIREworks for the compilation of a deliverable on links between research and experimentation, which will aggregate contributions from all projects running under the FIRE initiative.

The deliverable explains how experimentation will be applied to validate our research findings within the ResumeNet project. In short, the ResumeNet work package structure is such that experimentation results feed into a core task that runs for the duration of the project. This task is intended to consolidate these results, and produce a number of deliverables throughout its lifetime that describe best practices and strategies for building resilient networked systems.

To conduct our experiments, we will use a number of bespoke testbeds. While there are no overt plans to integrate these into wider testbeds, the project is open to suggestions from the FIREworks coordination action regarding this matter. Finally, the deliverable discusses open issues relating to testbed sustainability and desired features.
D6.2b Links between research and experimentation
Delivery date:Apr 2010
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:ULANC
Relevant work package:WP6 Project management
PDF file:Download
The deliverable is one of the project's "light" deliverables requested by the EC in the context of the project's commitment to close interaction with the FIREworks Coordination Action. The deliverable aims at providing inputs to FIREworks for the compilation of a deliverable on links between research and experimentation, which will aggregate contributions from all projects running under the FIRE initiative.

The deliverable explains how experimentation will be applied to validate our research findings within the ResumeNet project. In short, the ResumeNet work package structure is such that experimentation results feed into a core task that runs for the duration of the project. This task is intended to consolidate these results, and produce a number of deliverables throughout its lifetime that describe best practices and strategies for building resilient networked systems. To conduct our experiments, we will use a number of bespoke testbeds. While there are no overt plans to integrate these into wider testbeds, the project is open to suggestions from the FIREworks coordination action regarding this matter. Finally, the deliverable discusses open issues relating to testbed sustainability and desired features.

Experimentation within the project did not officially start until M18; therefore, the discussion presented here describes the current thinking and should be viewed as ongoing work. Aspects of it could be changed in due course.
D6.3 Report of technical work in WP2 and WP3 during the 1st year
Delivery date:Oct 2009
Nature:R (Report)
Dissemination:PU (Public)
Lead partner:ETHZ
Relevant work package:WP6 Project management
PDF file:Download
The deliverable D6.3 originates from the management WP (WP6) of the ResumeNet project in response to a request coming from the European Union. Since WP2 and WP3 were officially initiated in month M9 of the project lifetime, no deliverables reporting on technical work progress in these two WPs are due by the first annual review of the project (M12). Therefore, the current deliverable aims at summarizing the status of research work in these two WPs. It can be viewed as an aggregate intermediate deliverable for multiple deliverables that are due in M18 and report on the research work progress in specific WP2 and WP3 tasks.

The structure of the deliverable is rather straightforward. There is an introductory section giving the overall view of the project and focusing on the positioning of the two WPs in it. Emphasis is placed on the way WP2 and WP3 enable the resilience framework described in WP1. Then the report for each of the two WPs is organized along the research themes pursued in them. The presentation of each theme includes: a) a description of the problem along with an outline of the starting point of investigations (i.e., state of the art); b) the research methodology adopted for addressing the problem together with the work carried out so far within the project, which in some cases includes activities that were running prior to the project's official launch and are now further pursued in the context of ResumeNet; c) the work plan for the next six months.

There are nine research themes in WP2 and four in WP3 addressing various aspects of the generic D2R2 + DR (Defence, Detection, Remediation, Recovery plus Diagnosis, and Refinement) resilience strategy adopted in ResumeNet. Most of the themes, e.g., Survivable Network Design (2.2), Distributed Challenge Detection (2.6), Distributed Store (2.7), Virtualisation (3.4), look into fundamental mechanisms that should be deployed in the network. Others, such as themes 2.9 (Coping with Node Misbehaviours in Wireless Mesh Networks) or 2.10 (Coping with Node Misbehaviours in Opportunistic Networks), tailor the more generic work in WP1 and other themes in WP2 and WP3 to the experimentation cases pursued in WP4.

In all cases, the attempt has been to keep the description of themes as compact as possible without obscuring the technical elements. Pointers to scientific publications with more exhaustive detail on the themes are provided where appropriate.
D6.4a Periodic progress report
Delivery date:Apr 2009
Nature:R (Report)
Dissemination:RE (Restricted to a group specified by the consortium, including the Commission services)
Lead partner:ETHZ
Relevant work package:WP6 Project management
PDF file:Download(finance-related info is omitted)
This represents the project progress report for the period Sep 2008 - Feb 2009.

The work in the context of the ResumeNet project proposes a fundamentally new architectural approach to Internet resilience that is multilevel, systemic, and systematic. At the same time, we aim to maximize interoperability with legacy network components. We define resilience as the ability of the network to provide and maintain an acceptable level of service in the face of various faults and challenges to normal operation.

Our architectural approach can be summarized as follows: first, we develop a set of architectural principles on which resilient systems in general, and the Internet in particular, should be based. Then we characterize the challenges for the network operation to understand the threats against which the network must be resilient; the resilience aim can be generally achieved via a six-step strategy: defense, detection, remediation, recovery, diagnosis and refinement.

In ResumeNet, besides detailing and quantifying the aforementioned framework, the aim is also to look into particular mechanisms that can be viewed as its building blocks (monitoring, learning processes, decision engines). Last, the project picks particular network-level and service provision scenarios in order to deepen the mechanism-level analysis and carry out their experimental evaluation.

ResumeNet aims at having a broader socio-economic impact by contributing, though not to the same extent, to the following four points, as quoted from the FP7 ICT:


The emphasis in these first six months of the project has been on the development of the framework for embedding resilience in future networks.
D6.4b Periodic progress report
Delivery date:Oct 2009
Nature:R (Report)
Dissemination:RE (Restricted to a group specified by the consortium, including the Commission services)
Lead partner:ETHZ
Relevant work package:WP6 Project management
PDF file:Download(finance-related info is omitted)
This represents the project progress report for the period Sep 2008 - August 2009.
D6.4c Periodic progress report
Delivery date:May 2010
Nature:R (Report)
Dissemination:RE (Restricted to a group specified by the consortium, including the Commission services)
Lead partner:ETHZ
Relevant work package:WP6 Project management
PDF file:Download(finance-related info is omitted)
This represents the project progress report for the period September 2009 - February 2010.
D6.4d Periodic progress report
Delivery date:Oct 2010
Nature:R (Report)
Dissemination:RE (Restricted to a group specified by the consortium, including the Commission services)
Lead partner:ETHZ
Relevant work package:WP6 Project management
PDF file:Download(finance-related info is omitted)
This represents the project progress report for the period September 2009 - October 2010.
D6.4e Periodic progress report
Delivery date:May 2011
Nature:R (Report)
Dissemination:RE (Restricted to a group specified by the consortium, including the Commission services)
Lead partner:ETHZ
Relevant work package:WP6 Project management
PDF file:Download(finance-related info is omitted)
This represents the project progress report for the period September 2009 - October 2010.

Public deliverables


ID | Title | Delivery date
D1.1 | Understanding of challenges and their impact on network resilience | Feb 2009
D1.2a | Defining metrics for resilient networking (interim) | Feb 2010
D1.2b | Defining metrics for resilient networking (final) | Sep 2011
D1.3a | Policies for resilience (interim) | Mar 2010
D1.3b | Policies for resilience (final) | Sep 2011
D1.4 | Cross-layer optimization and multilevel resilience | Oct 2010
D1.5a | First interim strategy document for resilient networking | Oct 2009
D1.5b | Second interim strategy document for resilient networking | Sep 2010
D1.5c | Final strategy document for resilient networking | Sep 2011
D1.6 | Collaborative cross-layer monitoring as resilience enabler | Jan 2012
D2.1a | First draft on defensive measures for resilient networks | Dec 2009
D2.1b | Defensive measures for resilient networks | Sep 2010
D2.2a | First draft on new challenge detection approaches | Mar 2010
D2.2b | New challenge detection approaches | Sep 2010
D2.3a | First draft on the remediation, recovery, and measurement framework | Mar 2010
D2.3b | Remediation, recovery, and measurement framework | Sep 2010
D2.4a | First draft of the learning framework for resilient networks | Dec 2010
D2.4b | The learning framework for resilient networks | Jan 2012
D3.1a | Taxonomy of P2P, overlays and virtualization techniques with respect to service resilience | Oct 2009
D3.1b | Resilient service architecture (interim) | Apr 2010
D3.1c | Resilient service architecture (Final) | Sep 2011
D3.2 | Service surveillance and detection of challenging situation (interim) | Jul 2010
D3.3 | P2P overlays and virtualization for service resilience | Sep 2010
D3.4a | Overlay-based end-to-end connectivity | Mar 2011
D3.4b | Overlay-based end-to-end connectivity (Final) | Dec 2011
D4.1a | Federation requirements (interim) | Feb 2009
D4.1b | Federation requirements (final) | Mar 2010
D4.2a | Interim report on experimental evaluation of resilient networking | Oct 2010
D4.2b | Final report on experimental evaluation of resilient networking | Jan 2012
D4.s | Special deliverable on WP4 experimentation plans (final) | Feb 2011
D5.1 | ResumeNet website and Wiki pages | Oct 2008
D5.2a | First year report on dissemination activities | Oct 2009
D5.2b | Second year report on dissemination activities | Sep 2010
D5.2c | Final report on dissemination and standardization activities | Jan 2012
D5.3a | Exploitation plans (Interim) | Sep 2010
D5.3b | Exploitation plans (final) | Jan 2012
D6.1 | Project management guidelines | Oct 2008
D6.2a | Links between research and experimentation (interim) | Feb 2009
D6.2b | Links between research and experimentation (final) | Apr 2010
D6.3 | Report on technical work in WP2 and WP3 during first year | Oct 2009
D6.4a | Periodic progress report | Mar 2009
D6.4b | Periodic progress report | Oct 2009
D6.4c | Periodic progress report | May 2010
D6.4d | Periodic progress report | Oct 2010
D6.4e | Periodic progress report | May 2011
D6.4f | Periodic progress report | Mar 2012
D6.5 | Final report | Mar 2012

©Copyright by ResumeNet