A recursive orchestration and control framework for large-scale, federated SDN experiments: the FELIX architecture and use cases

Programmable networks are a substantial part of current R&D on future internet (FI) in Europe and worldwide, with considerable impact generated by large-scale test bed infrastructures. In such test beds, researchers validate proof-of-concept prototypes for new algorithms and mechanisms for efficiently controlling and managing network resources. One of the key domains for FI research is software-defined networking (SDN), which creates innovations in existing Internet architectures by shifting the control and logic outside the network equipment to Data Centres. International cooperation among leading research centres in Europe, the Americas and Asia is key to validating SDN foundations and tools. The EU and Japan have jointly funded the FELIX project (federated test-beds for large-scale infrastructure experiments), which defines a common control and orchestration framework to manage federated FI test beds across continents. This framework enables an experimenter to (i) request and obtain resources across different test bed infrastructures dynamically; (ii) manage and control the network paths connecting the federated SDN test beds; (iii) monitor the underlying resources and (iv) use distributed applications executed on the federated infrastructures. This paper describes the high-level architecture of the FELIX framework and details six use cases that will be employed for validation. We present our analysis and end-user considerations, highlighting the necessity for resource accessibility and coherent use of physical connections over a large-scale test bed where different control technologies such as OpenFlow and the network service interface (NSI) are simultaneously used.


Introduction
Programmable networks, based on software-defined networking (SDN) principles, decouple the control and data planes and allow for remote software to assume the control and management of the underlying network. These networks are a substantial part of existing R&D on the future internet (FI) in Europe and worldwide. Both academic and industrial SDN researchers around the world are embracing large-scale test bed infrastructures to validate their proof-of-concept prototypes and experiment with new algorithms, protocols or network functions in large-scale, efficient, predictable, realistic environments.
In general, FI test beds differ in the resources provided as well as in the geographic regions in which they operate. With the aim of promoting the use of heterogeneous resources across different infrastructures, FI test beds usually provide the experimenter with common interfaces and workflows. This is typically referred to as test bed federation and is a way of abstracting different internal infrastructures, resources and procedures to enable the definition of larger experiments with unified resources that are handled in the same way. An open federation of heterogeneous test beds is non-trivial, however, and requires the design of a suitable architecture and framework. The FELIX (http://www.ict-felix.eu) project aims to facilitate the federation and integration of different network and computing resources residing in a multi-domain heterogeneous environment across different continents. To achieve this, the FELIX architecture extends and advances assets previously developed in other FI projects (e.g. OFELIA), for instance by realising the federation concepts defined in the slice-based federation architecture (SFA) [10] and implemented by the GENI Aggregate Manager API (http://groups.geni.net/geni/wiki/GeniApi). In particular, FELIX uses a combination of recursive and hierarchical configurations for orchestration, request delegation and inter-domain dependency management. Resource orchestrating entities are responsible for the synchronisation of resources available in particular administrative domains. These entities and other key building blocks of the FELIX architecture are introduced in the following sections.
This paper details six use case scenarios for validating and demonstrating the FELIX framework over its distributed SDN virtual infrastructure spanning multiple domains. These use cases are grouped in two major domains: Data Domain and Infrastructure Domain. The Data Domain use cases focus on the efficient use of SDN technologies to provide interconnections across geographically dispersed test beds with the ability to realise data migration dynamically and efficiently. The Infrastructure Domain use cases are mainly oriented towards the use of a virtually distributed infrastructure that can be employed for migrating entire data processing workloads. This paper reports early-phase work focusing on use-case identification [6] and architecture definition [5]. Future work will address the validation of these use cases on the FELIX federated infrastructure.
The remainder of this paper is organised as follows: Section 2 describes the different resources and key concepts considered in the FELIX experimental facility, Section 3 is an in-depth discussion of the FELIX architecture, Section 4 details the use cases considered and Section 5 presents conclusions and future work.

Resources in the FELIX federated test bed
Resources in FELIX include both networking and computing capacities available at geographically dispersed facilities. Resources are under the administrative control of different but cooperating (federated) stakeholders. Federated resources in FELIX are used to create a virtual infrastructure that spans multiple domains. Note that this environment is starkly different from the case of (a) a single administrative domain with resources geographically distributed across the world, e.g. data centres of a single cloud operator, and (b) loosely coupled, interconnected islands that allow for remote access to certain resources. FELIX is primarily interested in network enablers and, in particular, the integration of SDN test beds with network service interface (NSI) [12] controlled transit domains, with particular focus on the Connection Services (NSI-CS). These services, in turn, can be used to solve the dynamic establishment and teardown of network connectivity services (based on L2 switching and L3+ flow routing/forwarding) across multiple domains and technologies.

Virtual Infrastructures through federation
Monga et al. [8] note that connecting facilities at continental and inter-continental scale is not a trivial task and they motivate the need for connecting facilities (such as those considered in FELIX) at the lower layers (e.g. L2), thus avoiding the system overheads introduced by the connections established at L3 and above. However, the resulting proposals [7,8,13] do not conform to emerging standards, such as NSI. Moreover, previous work does not comprehensively consider the elements of each island from a network control perspective, and does not account for policies and trust. We believe that these aspects will play a crucial role in determining the adoption of a framework suitable for federated resources. Our analysis of the latest research literature on this topic has highlighted the need to introduce new APIs and logic for globally distributed heterogeneous facilities (e.g. OFELIA islands and JGN-X RISE test beds). It is clear that these new APIs and logic should capitalise on SDN and NSI mechanisms and protocols to facilitate the dynamic, on-demand establishment of end-to-end, cross-continental virtual network infrastructures.
While SDN test bed infrastructures are constructed from the viewpoint of network research and development, computing and storage resources are also important components in each test bed. FI services can be grouped into two categories: those that use network resources to move data, and those that use the whole infrastructure (including computing and storage resources) to provide network-based services. Therefore, we consider two major classes of use cases for the demonstration of virtual infrastructure based on federated test bed resources. Namely, the first category of use cases belongs to the data domain since the primary focus here is the use of data. The second category of use cases forms part of the infrastructure domain, which includes the three resource types in a test bed: networking, storage and computing.

Key system concepts and definitions
The foundation of the FELIX experimental facility consists of the key system concepts summarised in this section.
FI experimental facilities (or SDN-controlled network domains) are controlled by dedicated software, exposing interfaces that can be used by a federation framework to orchestrate resources in a multi-domain environment. The SDN-controlled network domains are illustrated in Figure 1. An SDN island is a set of virtualised network and computing resources under the same administrative ownership and control. It may consist of multiple SDN zones, each characterised by a specific set of control tools and interfaces. Each SDN zone is a set of resources grouped together by common technologies and/or control tools and/or interfaces, e.g. L2 switching zone, optical switching zone, OpenFlow protocol controlled zone and other transit domain zones with a control interface. The major goal of defining SDN zones is to implement appropriate policies for increasing availability, scalability and control of the different resources of the SDN islands. Examples of zone definitions can be found in widely deployed Cloud Management Systems (CMS) such as CloudStack, where infrastructure is partitioned into regions, zones [1], pods and so on. In addition, OpenStack offers infrastructure partitioning through availability zones and host aggregates (Scaling Openstack, http://docs.openstack.org/openstack-ops/content/scaling.html).
Transit network domains use NSI to expose either automatic or on-demand control of the connectivity services and, optionally, to exchange inter-domain topology information. On-demand interconnectivity with a specific granularity must be provided in order to federate resources that belong to distant experimental facilities. In FELIX, it is assumed that all experimental facilities will be interconnected with networks running NSI-compatible network controllers. The NSIv2.0 standard interface will be used as a means to orchestrate network resources for an experiment set-up.
In Figure 1, a slice is a user-defined subset of virtual networking and computing resources. Each slice is an abstraction created upon the set of physical resources available in the federated SDN Zones and SDN Islands. Every slice is isolated from other slices running simultaneously on the same physical resources, thereby avoiding interferences from other separate experiments. A slice should also be dynamically extensible across multiple SDN Islands. Each slice is instantiated when the experimenter's control tools need to access or traverse the specific zones defined within the slice.
The slice concept originates from the SFA, which defines a number of abstractions to identify provisioned resources, enable their aggregation and identify entities to manage resources (i.e. slivers, slices and component managers). SFA also provides a minimal set of structures and interfaces, which consist of two parts: (1) a specific data type per resource encapsulated in an RSpec to define, for example, a computing node, and (2) a list of methods following a specific workflow in order to reserve and provision any resource. These interfaces and data models facilitate and standardise the process of providing a federated slice composed of heterogeneous resources located in any other test bed in the federation. In order to do this, every SFA-compliant test bed must share the same interfaces and data models with its federated members to be able to understand resource requests. More information and samples can be found in the FELIX project implementation deliverables, available at the FELIX Project website (http://www.ict-felix.eu).
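The relationship between islands, zones, slices and the reserve-then-provision workflow can be illustrated with a minimal Python sketch. This is not the FELIX or GENI data model: the class names (`Resource`, `Slice`, `SliverState`), the method names and the two-state lifecycle are hypothetical and chosen only to make the concepts concrete.

```python
from dataclasses import dataclass, field
from enum import Enum


class SliverState(Enum):
    ALLOCATED = "allocated"      # reserved, not yet instantiated
    PROVISIONED = "provisioned"  # instantiated and ready for use


@dataclass(frozen=True)
class Resource:
    island: str  # administrative owner (hypothetical label, e.g. "island-eu")
    zone: str    # technology grouping inside the island, e.g. "openflow"
    name: str    # identifier of the resource within its zone


@dataclass
class Slice:
    """User-defined subset of resources that may span several islands."""
    name: str
    slivers: dict = field(default_factory=dict)  # Resource -> SliverState

    def allocate(self, resource):
        # Step 1 of the assumed workflow: reserve the resource for this slice.
        if resource in self.slivers:
            raise ValueError(f"{resource.name} is already part of {self.name}")
        self.slivers[resource] = SliverState.ALLOCATED

    def provision(self, resource):
        # Step 2: only a previously reserved resource can be instantiated.
        if self.slivers.get(resource) is not SliverState.ALLOCATED:
            raise ValueError(f"{resource.name} must be allocated before provisioning")
        self.slivers[resource] = SliverState.PROVISIONED

    def islands(self):
        # A slice is extensible across islands; report which ones it touches.
        return {resource.island for resource in self.slivers}
```

In this sketch a slice that holds a switch in one island and a virtual machine in another reports both islands from `islands()`, and provisioning a resource that was never allocated is rejected, mirroring the ordered workflow described above.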

The FELIX architecture
The FELIX architecture is the result of a careful analysis of the state of the art in relevant FI research projects in both Europe and Japan. From the European side, the OFELIA [15], FIBRE [14] and Fed4FIRE [19] projects were taken into account, thereby providing a working approach to large-scale distributed systems and federation (e.g. through SFA and GENI), as well as addressing federation between heterogeneous test beds. From the Japanese test beds, GridARS [17] and RISE [4] were considered in order to manage seamlessly the establishment of dynamic inter-domain communication through the NSI protocol. Taken together, we have defined a modular and multi-layer architecture for the FELIX control framework. As illustrated in Figure 2, we use the combination of two different 'spaces', namely the FELIX Space and User Space, which cooperate to build, manage, control and monitor a large-scale virtual infrastructure.

Spaces in the architecture
The FELIX Space is composed of management and control tools that coordinate the creation of a virtual environment in heterogeneous, multi-domain and geographically distributed facilities. The elements in this layer operate in both hierarchical and recursive models for efficient multi-domain information management and sharing. The User Space is composed of any tool or application a user wants to deploy to control his or her virtual network environment or to run a particular experiment within it. These two logical spaces glue together different functional building blocks, as shown in Figure 2.
In the FELIX Space, the Resource Orchestrators (ROs) are responsible for orchestrating the end-to-end network service and resources reservation in the entire infrastructure, as well as delegating end-to-end resource and service provisioning in a technology-agnostic way. ROs are connected to the different types of Resource Managers (RMs), which control and manage different kinds of technological resources, similar to the concept of Component Manager in SFA. For example, the Transit Network RM and Stitching Entity RM provide connectivity between L1/L2 transport network domains and manage physical devices by using frame, packet or circuit switching technologies, whilst supporting different protocols.
The SDN resource manager (SDN RM) manages the user traffic environment and the network infrastructure, composed of SDN-enabled devices, by updating the flow tables of the physical devices. In addition, the Computing RM is responsible for setting up and configuring computing resources, i.e. creating new virtual machine instances, network interface card configuration, etc. Moreover, the FELIX Space can provide essential functionalities to the FELIX architecture using dedicated modules such as the authentication, authorisation and accounting (AAA) for authenticating and authorising users, or the Monitoring Functions module to retrieve, aggregate and store metering information from networking and computing resources to be used as feedback in the experiments.
In the User Space, the Slice Controller can dynamically control the physical and virtual resources belonging to the user's slice environment. In other words, it can request more bandwidth, virtual CPU or RAM, add new resources such as storage, or even completely reconfigure the slice behaviour.

Common design considerations
The modules of the FELIX architecture address a number of common design requirements that are inherent and key to any successful federated FI test bed.
Resource Orchestration: This entity performs the necessary orchestration of the different types of virtualised resources present in the test beds, such as compute, storage or network nodes.
Resource Allocation Planning: This entity controls the reservation or allocation status for any resource subject to it. Both the experimenter and administrator dimensions are considered. For instance, the experimenter may define or request the maximum allocation time desired, or the provisioning requirements of a resource could be an input for load balancing decisions.
Provisioning: The system must be able to instantiate and initialise the resources requested by the experimenter. The infrastructure at the different islands must provide applications with a virtual flat environment that behaves like a dedicated cluster and in which some specific user-space resource information, such as IP addresses, is made available to the experimenter after proper instantiation and configuration tasks.
Domain Resource Management: The heterogeneous resources that are provided by their corresponding resource management systems need to be managed accordingly, within that same domain.
Authentication, Authorisation and Accounting: In order to achieve a controlled environment where any action is authorised and can be traced back in detail, we need to ensure that (a) an actor has a valid claim on the presented identity, (b) any action is exclusively performed by an authorised actor and (c) every action is tracked by the accounting system.
Monitoring: The framework must provide the user with a coordinated set of monitoring data for both virtual and physical resources provisioned or located in different domains. These monitoring data are used primarily for the accounting of the infrastructure use, but also for enhanced control applications based on traffic measurements.
User Access: The federated experimental facility provides friendly interfaces to simplify the operations of the experiment lifecycle for the user, as well as to facilitate the general management of the test bed resources for the administrator.

Building blocks in the FELIX architecture
The FELIX Architecture is composed of a number of modules that implement the functionalities identified in the common design considerations. These modules or 'building blocks' are intended to be as generic as possible in order to deal with different environments. Figure 3 shows a schema with the different modules and the interactions between them.

Resource orchestrator
The FELIX RO is a key element of the FELIX architecture and the cornerstone of the management and orchestration system design.
We consider that the RO operates over a federated test bed infrastructure of SDN 'islands', which are interconnected by Transit Networks subject to dynamic configuration through the NSI. The RO module is responsible for orchestrating the end-to-end network service as well as for instructing the resource reservation and provisioning for the entire FELIX infrastructure.
There are two different levels at which the RO may operate: (a) the upper layer, right below the User Access level, and (b) the layer immediately underneath. In a typical scenario, the RO in the upper layer can operate at continental level, whilst the ROs in the lower layer may communicate with the RMs or with other ROs.
The main functionalities of RO are as follows: (i) proxying requests between the experimenter and the RMs, (ii) recursively delegating requests between ROs of other federated infrastructures according to pre-defined policies, (iii) maintaining an updated and aggregated topological view of its managed, underlying infrastructures and (iv) verifying proper workflow and notifying the experimenter of any detected error condition.
Request forwarding for allocating and provisioning resources is performed in a technology-agnostic way within the infrastructure and depends on the conditions defined in the federation policy engine previously configured for the particular domain. It is therefore necessary to ensure similar interfaces for each orchestrator. The RO must also interpret the experimenter's request to be able to perform any request forwarding, as well as evaluate the set of actions received from the user for correctness and notify the user of error conditions. An internal view of the cross-island topology (for computing, SDN and NSI nodes) is necessary for the RO to properly forward requests, for instance by detecting where computing resources can be provisioned to meet the experimenter's requirements. Such topological information is propagated from the lower layers (whose scope is resource management) to the upper layers, where this abstraction and mapping takes place. This resource discovery procedure minimises the data being transmitted by taking advantage of the data interchanged between the modules during the expected workflow for SFA test beds.
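The recursive delegation and bottom-up topology aggregation described above can be sketched in a few lines of Python. The sketch assumes that ROs and RMs expose the same two calls (`list_resources` and `allocate`); these names, and the returned delegation path, are illustrative assumptions rather than the FELIX interfaces, and policy evaluation is omitted.

```python
class ResourceManager:
    """Leaf of the hierarchy: manages one kind of resource in one domain."""

    def __init__(self, kind, resources):
        self.kind = kind
        self.resources = set(resources)

    def list_resources(self):
        return {(self.kind, name) for name in self.resources}

    def allocate(self, kind, name):
        # Return the delegation path on success, None if not managed here.
        if kind == self.kind and name in self.resources:
            return [self.kind]
        return None


class ResourceOrchestrator:
    """Inner node: children are RMs or lower-level ROs with the same calls."""

    def __init__(self, name, children):
        self.name = name
        self.children = list(children)

    def list_resources(self):
        # Aggregated view: the union of what every child advertises.
        view = set()
        for child in self.children:
            view |= child.list_resources()
        return view

    def allocate(self, kind, name):
        # Recursive delegation: forward until some child accepts the request.
        for child in self.children:
            path = child.allocate(kind, name)
            if path is not None:
                return [self.name] + path
        return None
```

Because both classes share one interface, an upper-layer RO can treat a lower-layer RO exactly like an RM, which is the property the text requires when it asks for similar interfaces on each orchestrator.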

Transit network resource manager (TN RM)
The TN RM enhances the FELIX architecture with mechanisms for network connectivity within and between particular domains. In order to deliver the network services in the FELIX architecture, the TN RM must be integrated with its southbound interfaces within a particular network domain. Such a domain can use different L1/L2 technologies and may be controlled by specific interfaces, systems such as the network management system (NMS), or protocols that are technology-dependent and unique in each case.
A single TN RM must communicate with a single RO in order to (i) advertise resources under its control, (ii) receive requests and (iii) notify the RO about success and failure events. A single TN RM is responsible for a group of particular network resources, which belong to a network domain and are usually managed by a single entity, i.e. a network administrator or NMS.
TN RM usually manages L1/L2 transport networks that are composed of physical devices using frames/packets or circuit switching technologies and support different protocols, e.g. MPLS/GMPLS. In order to support inter-island connectivity between existing OFELIA islands in Europe realised with VPN services over the Internet, the TN RM also supports the management of VPN set-up and tear down procedures.
In the FELIX architecture, the TN RM southbound interface is based on the NSI-CS protocol for L1/L2 transport networks and on a proprietary interface for L3 VPN services, whilst the northbound interface uses SFA-based APIs that can be understood by the RO.

Stitching entity resource manager (SE RM)
The stitching entity (SE) RM is a software element of the FELIX architecture that controls the SE, a network element providing the necessary translation mechanisms for a slice setup on top of the L2 protocol stack. SE RM hides from a user the actual complexity of the multi-domain transport network topology. SE must provide at least one of the following network functions: (i) QinQ, to encapsulate slice traffic into transport network Ethernet frames, or (ii) a VLAN translation mechanism to hide from users the actual VLAN tagging, which is used by carrier networks while interconnecting two or more FELIX islands.
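The two network functions named above can be illustrated with a small Python sketch that models a frame as a dictionary carrying a stack of VLAN IDs (outermost first). The frame representation, function names and VLAN values are hypothetical; a real SE performs these operations in the data plane on Ethernet headers.

```python
def qinq_push(frame, outer_vlan):
    """QinQ: push a transport-network tag on top of the slice's own tag."""
    return {**frame, "vlans": [outer_vlan] + frame["vlans"]}


def qinq_pop(frame):
    """Remove the outermost tag when the frame leaves the transport network."""
    return {**frame, "vlans": frame["vlans"][1:]}


def translate_vlan(frame, mapping):
    """VLAN translation: rewrite the outermost tag using a lookup table."""
    outer, *inner = frame["vlans"]
    return {**frame, "vlans": [mapping[outer]] + inner}


# Hypothetical values: the experimenter uses VLAN 100, the carrier uses 3001.
to_carrier = {100: 3001}
from_carrier = {carrier: user for user, carrier in to_carrier.items()}
```

In both cases the experimenter only ever sees the slice VLAN: with QinQ the carrier tag is added and later removed, and with translation the tag is rewritten on entry and restored on exit.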
The SE RM communicates with a single RO to (i) advertise an internal topology and capabilities of the SE under its control, (ii) receive requests and (iii) notify the RO about success and failure events.
A single SE RM must be implemented in each FELIX island and is responsible for single or multiple SEs, which belong to a network domain and act as an entry point to the island infrastructure.

SDN resource manager
The domains in FELIX provide SDN-enabled infrastructures, usually equipped with OpenFlow-enabled network devices. This provides experimenters with tools to control their network behaviour in a programmatic way, that is, by defining a set of flow rules through a software controller that communicates with the physical network devices. Each rule defines a matching condition (any OpenFlow header, such as VLAN, and a value to match against) as well as an action, and can be either polled from the controller by the network device, or directly inserted into special tables within the latter.
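A flow rule of the kind described above can be sketched as a match dictionary plus an action. The following Python fragment is a simplified model, not an OpenFlow implementation: the field names, the action strings and the `send_to_controller` table-miss default are assumptions made for illustration. Selecting the highest-priority matching rule follows standard OpenFlow behaviour.

```python
def matches(match, packet):
    """A rule matches when every field it specifies equals the packet's value.

    Fields absent from the match act as wildcards.
    """
    return all(packet.get(name) == value for name, value in match.items())


def lookup(flow_table, packet, table_miss="send_to_controller"):
    """Return the action of the highest-priority matching rule."""
    candidates = [rule for rule in flow_table if matches(rule["match"], packet)]
    if not candidates:
        # No rule installed yet: in the reactive model the controller decides.
        return table_miss
    return max(candidates, key=lambda rule: rule["priority"])["action"]


flow_table = [
    {"priority": 10, "match": {"vlan": 100}, "action": "output:2"},
    {"priority": 20, "match": {"vlan": 100, "in_port": 1}, "action": "output:3"},
]
```

A packet on VLAN 100 arriving on port 1 hits the more specific, higher-priority rule, while a packet on an unknown VLAN falls through to the table-miss action, which corresponds to the case where the device has to ask the controller.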
These switches and routers are configured and controlled through the SDN RM. This module can interact with a special purpose controller (e.g. FlowVisor) that is able to proxy the OpenFlow packets to the corresponding user controller, thus keeping each environment for experimentation properly isolated from others. The SDN RM also allows the administrator to observe the experimenter's set of rules (which define her 'virtual network') and grant or revoke it.

Computing resource manager
The Computing resource manager (C RM) provides experimenters with mechanisms to configure, instantiate and operate on computing resources, and gives administrators the means to manage and monitor them.
This module interacts with the underlying infrastructure (virtualisation servers and associated hypervisors) through an agent that acts as a proxy between the FELIX Control Framework side and the hypervisor. The agent is therefore able to communicate with the hypervisor and perform the operations that the hypervisor provides, such as creating, deleting and changing the status of the machines; reporting the status of each operation; and synchronising the status of such resources between the infrastructure and management layers.

Authentication, authorisation and accounting
AAA makes available the necessary mechanisms to authenticate and authorise users and provides accounting for the user actions. These mechanisms may be invoked from any other module (see Figure 3).
The FELIX architecture implements a Certificate-Based Clearinghouse (CH) [18] that establishes the root of a trust chain. By using a certificate-based approach, the architecture has the flexibility to federate different SDN islands easily and allows verifying the identity and privileges of all actors in the FELIX architecture.
The CH comprises a set of related services supporting AAA operations and acts as a location to look up information about members, slices and other available services in the test bed. These services are offered through the Member Authority for certificate and credentials management, the Slice Authority for slice registry and privilege-mapping against its members, the Project Service for experiment registry and role evaluation, the Logging Service for accounting purposes and finally the Service Registry to register the aforementioned services. These authorities (which derive from SFA principals) and the services complement the processes of (i) registration and management, (ii) authentication and authorisation, and (iii) accounting.
For federating purposes, these services can be accessed via the Common Federation API [2], thus allowing compatibility with tools such as OMNI and test beds complying with the GENI Aggregate Manager API [2].
The AAA module is closely related to the User Access and is ultimately responsible for granting access to resources, as authentication and especially authorisation procedures are invoked on many operations. It may also be extended through policies, which are a set of rules defined by administrators to tailor the upper-level control on resources and test beds usage.
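The order in which authentication, authorisation and accounting are applied to an operation can be sketched as a single gate in Python. This is a toy model: a shared token stands in for the certificate and credential checks performed by the Clearinghouse, and the class and field names (`AAAGate`, `known_tokens`, `privileges`) are hypothetical.

```python
from dataclasses import dataclass, field


class AccessDenied(Exception):
    pass


@dataclass
class AAAGate:
    known_tokens: dict  # user -> token (stand-in for a verified certificate)
    privileges: dict    # user -> set of actions the user may perform
    log: list = field(default_factory=list)

    def perform(self, user, token, action, operation):
        # (a) Authentication: the actor must hold a valid claim on the identity.
        authenticated = (
            user in self.known_tokens and self.known_tokens[user] == token
        )
        # (b) Authorisation: only permitted actions may proceed.
        authorised = authenticated and action in self.privileges.get(user, set())
        # (c) Accounting: every attempt is recorded, including rejected ones.
        self.log.append((user, action, "ok" if authorised else "denied"))
        if not authorised:
            raise AccessDenied(f"{user} may not perform {action}")
        return operation()
```

Recording the attempt before raising ensures that denied requests remain traceable, which is the point of requiring that every action be tracked by the accounting system.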

Monitoring functions
The monitoring functions are responsible for retrieving monitoring data for heterogeneous resources (e.g. compute, SDN, TN nodes) in the diverse test beds of the federation, as well as aggregating and storing it. Monitoring data can be categorised in two types: facility monitoring and infrastructure monitoring.
The facility monitoring data encompass the basic status information about the facility, such as the status of the servers' availability or network connectivity, and may also include the status of the functional components in the FELIX Control Framework, which is aggregated from RMs and ROs. This information can be offered through the user portal for both experimenters and administrators.
The infrastructure monitoring checks the status of the resources currently available or provisioned, as well as some data for the experiments. This includes availability of virtualisation and networking hardware or other information such as uptime and resource usage statistics.
The slice monitoring developed in FELIX builds upon state of the art monitoring tools, as well as test bed-specific (facility) monitoring tools. Among these tools, we use SNMP and OpenFlow flow space monitoring statistics. Monitoring data are provided not only graphically, but also via a Monitoring API so that the statistics can be directly used by control mechanisms to make routing decisions based on traffic measurements or link status. The use of monitoring data is key to implement reactive decisions in all use cases considered in the FELIX project, guaranteeing feedback to the Planning and Provisioning tools to ensure that the required slice resources are available and can be used for specific workloads.

User access
The FELIX graphical user interface offers intuitive access to the lifecycle management of an experiment for experimenters and general management operations for administrators. To do this, the User Portal communicates with the underlying modules of the FELIX architecture to use a subset of their functionalities to authenticate and authorise users, as well as track their actions and provide them with any requested resource management, operation or observation.
The experimenters are thus able to list the available resources, define a subset of resources and allocate or provision them, perform operational actions on the resources, retrieve a description of the resources made available to them as well as release the resources when no longer needed.
On the other hand, the administrators may configure and manage resources, define different types of policies, grant or revoke resource requests and monitor different sets of resources, among others.

The FELIX use cases
As previously mentioned, FELIX usage scenarios are clustered into two groups: Data Domain and Infrastructure Domain. Use cases in the Data Domain include virtual infrastructure consisting of SDN islands interconnected with dynamic circuit-switched (inter-continental) networks. One important goal is to optimise the use of interconnectivity between test beds to realise data migration. The Infrastructure Domain use cases describe user scenarios based upon federated resources placing emphasis on the optimised use of the infrastructure as a whole.

Data domain use cases
Data Domain use cases are primarily oriented towards the efficient utilisation of the physical network by taking advantage of SDN and NSI operations for the dynamic interconnection of test beds dispersed across different continents. The focus here is on the coordination of caching, processing and network services rather than on the exact caching algorithms to be used, which are in the full scope of FELIX user/experimenter priorities and control.
The test beds for the Data Domain use cases form a virtual infrastructure that consists of SDN islands (L2 domains) interconnected with dynamic circuit-switched networks (multi-domain transit networks). In this large-scale facility, data must be transferred from the origin to its destination end-point, typically in another SDN island. The following subsections summarise each use case and explain how the aforementioned flows of data traverse a real network.

Data-on-Demand: delivery of distributed data by setting data flows over the network
This use case investigates how to process large amounts of data stored in distributed sites. For instance, several applications, such as astronomical observations or collaborative investigations, generate huge data-sets, typically stored in dedicated storage servers or devices in a nearby data centre. A user may want to run a post-processing algorithm on the data collected by different data providers. In this context, it is neither suitable nor efficient to move all the data from the original sites to the final location; instead, it may be more convenient to perform sequential processing on partial data. Figure 4 depicts the main components of the scenario, the relevant actors and their potential interactions.
In this use case, an SDN-based (e.g. OpenFlow) controller maintains a global view of the whole network topology and could access monitoring information from the Data Centres (e.g. delay and jitter per link) and then use this information at any given time to automatically select adequate paths based on any selected metric. Connectivity would be established by sending requests to the FELIX framework in each Data Centre, thus effectively setting the SDN flowspace and inter-domain Transit Network connections, and stitching them together. By setting the most appropriate connection at each moment, an optimal use of the physical network resources could be achieved.
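Metric-based path selection of this kind can be sketched with Dijkstra's algorithm over the monitored link values. The Python fragment below is an illustration under stated assumptions, not the FELIX controller logic: links are treated as bidirectional, the chosen metric is assumed to be additive and non-negative (which holds for delay and is only an approximation for jitter), and the topology and values are invented.

```python
import heapq


def best_path(links, src, dst, metric):
    """Return (total_cost, path) minimising the chosen per-link metric."""
    graph = {}
    for (a, b), values in links.items():
        graph.setdefault(a, []).append((b, values[metric]))
        graph.setdefault(b, []).append((a, values[metric]))

    queue = [(0.0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in graph.get(node, []):
            if neighbour not in visited:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return None  # destination unreachable


# Hypothetical monitoring snapshot for three Data Centres.
links = {
    ("A", "B"): {"delay_ms": 10, "jitter_ms": 5},
    ("B", "C"): {"delay_ms": 10, "jitter_ms": 5},
    ("A", "C"): {"delay_ms": 50, "jitter_ms": 1},
}
```

With these values the delay-optimal route from A to C goes through B, whereas the jitter-optimal route is the direct link, which shows how the selected metric changes the path that would be requested from the framework.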
4.1.2 Data pre-processing for minimising network latency effect for live data
This use case aims to provide near real-time data, e.g. satellite images, to users located in different and very distant places without incurring the large round trip delay (RTD) values typically found with transfers through the public Internet. Figure 5 presents an overview of this scenario. Common examples of such a condition are congested links during peak working hours of the day, which generate fast re-adaptations of the TCP transmission windows and, consequently, critical degradation of the throughput. This is particularly evident in inter-continental data transfers that typically have high values of RTD (> 200 ms) between the communicating servers due to long-distance transmissions. In these situations, a dedicated platform would be placed near the receiver station and perform a suitable pre-processing of the data. This platform would be able to allocate computing, caching and networking resources at both source and destination islands. It would also be able to implement on-demand and application-driven network services for the specific data transfers, which require well-defined network parameters. Consequently, this approach can significantly reduce the size of data to be delivered across the transit network and improve the overall system performance.
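The sensitivity of TCP throughput to RTD can be illustrated with the standard window-limited bound, in which a single connection cannot exceed one window of data per round trip. The sketch below is illustrative only: the function name and the 65535-byte window (the largest TCP window without window scaling) are assumptions chosen for the example, not values from the FELIX test beds.

```python
def max_tcp_throughput_mbps(window_bytes, rtd_ms):
    """Upper bound on single-connection TCP throughput: window / RTD, in Mbit/s."""
    rtd_s = rtd_ms / 1000.0
    return window_bytes * 8 / rtd_s / 1e6

# Same window, regional versus inter-continental round trip delay.
regional = max_tcp_throughput_mbps(65535, 20)
intercontinental = max_tcp_throughput_mbps(65535, 200)
```

Under these assumptions, a tenfold increase in RTD lowers the bound tenfold, and any congestion-induced reduction of the window lowers it further. This is why reducing the data volume before it crosses the transit network can pay off.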

Figure 5. Data pre-processing: minimising network latency effect for live data.

High-quality media transmission over long-distance networks
In the last few years, we have been experiencing a rapid evolution in media content delivery, especially in the context of ultra-high definition video streaming, i.e. 4K and 8K resolution. This evolution directly relates to a higher quality of media playback, but also imposes higher bandwidth and lower delay constraints on the network. In this scenario, illustrated in Figure 6, hardware optimisation is required for the transmission and reception of the data content, especially in a very long-distance environment.
At the same time, network streamlining is needed both in the transport segments and in the inter-data centre networks (NSI- and SDN-enabled). In this use case, any defect in the management and control of the network will manifest itself as a visible playback artefact: jitter, incorrect frame sequencing, transmission disruption, etc. Moreover, strict requirements are imposed in order to serve 3D video to the user, as two flows have to be delivered separately for the left and right eye. In this scenario, proper synchronisation is extremely important to achieve a satisfactory quality of service. This is measured through quality of service and quality of experience metrics.

Infrastructure domain use cases
The Infrastructure Domain use cases are mainly concerned with the services and workloads that can be facilitated by a software platform built on top of the federated resources. It is important to note that both the Infrastructure and Data Domain use cases share common architectural, trust and security assumptions. In the Infrastructure Domain use cases, we consider network, computing and storage resources that can dynamically migrate over the allocated physical environment. This work is in line with recent developments in leading standardisation fora, such as IETF and ETSI, where significant attention has been drawn from both industry and academia towards network service chaining and the ability to relocate network functions, infrastructure scale-out and scale-up, as well as continuous service delivery [3]. The following subsections introduce the Infrastructure Domain use cases and explain how the services can be deployed in a large-scale facility, such as FELIX.

Inter-cloud use case: data mobility service by SDN technologies
This use case focuses on cloud systems and the services provided by them in carrier-grade, mission-critical areas, including electronic administration, medical care and finance. To satisfy the requirements of these areas, such complex cloud systems should meet the demands of an end-to-end guaranteed service quality, reliability of compliance and energy efficiency. In this context, every single cloud system is limited by its available resources. This limit can be overcome with a flexible reassignment of resources belonging to different cloud systems. Therefore, it is important to establish cooperation between data centres, at least on a temporary basis.
For example, consider a user who travels to a remote location on a business trip. The user wants to use a number of cloud-based services with the same quality of experience as in their home network, where local resources are used. Note that in this case, traditional mobility management solutions [9] would not be able to mitigate the expected large propagation delays between the present user location and the data centre processing the user's workload. Instead, the scenario illustrated in Figure 7 shows that it would be preferable to transfer user data (such as credentials, applications and services) to a cloud system closer to the visiting place (e.g. the cloud with minimum delay relative to the user's visiting place), thereby reconstructing the user's workplace in a remote location.
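The selection of the target cloud can be sketched as a constrained minimisation: among the federated clouds with enough free resources to host the user's workload, pick the one with the lowest measured delay to the visiting place. The function, field names and values below are hypothetical and serve only to illustrate the idea.

```python
def pick_target_cloud(clouds, required_vcpus):
    """Return the name of the lowest-delay cloud that can host the workload, or None.

    clouds: list of dicts with hypothetical keys "name", "delay_ms" (delay measured
    from the user's current location) and "free_vcpus".
    """
    candidates = [c for c in clouds if c["free_vcpus"] >= required_vcpus]
    if not candidates:
        return None
    return min(candidates, key=lambda c: c["delay_ms"])["name"]

# Illustrative measurements taken from the user's visiting place.
clouds = [
    {"name": "home-cloud", "delay_ms": 260.0, "free_vcpus": 64},
    {"name": "nearby-cloud-small", "delay_ms": 8.0, "free_vcpus": 2},
    {"name": "nearby-cloud-large", "delay_ms": 15.0, "free_vcpus": 32},
]

target = pick_target_cloud(clouds, required_vcpus=8)
```

In this example the closest cloud is skipped because it lacks capacity, which reflects the resource limits of individual cloud systems discussed above.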

Follow the sun (or moon) principles
As detailed in [11], Internet usage curves follow a similar daily pattern everywhere in the world, and there is a natural shift in the load of data centres to places in the world where it is currently daytime. The opposite holds during the night, when the load on the local data centres changes accordingly. This is often referred to as the 'follow the sun/moon' principle. Moreover, the prices of renewable energy strongly depend on the availability of wind and solar energy (green energy). As a result, several data centres have been moved to locations such as Iceland and Finland, and perhaps in the future will be moved to desert areas.
In this case, one could shift the load of one data centre to another one following two different approaches (Figure 8): (a) move the entire workload to a more efficient data centre, essentially by rerouting the user's traffic, or (b) handle the user's requests at the less efficient data centres by delegating the workflow to more efficient data centres. It is important to note that both scenarios require dynamic and on-demand end-to-end connections between the federated data centres. Moreover, when the workload is moved from one data centre to another, a number of different resources (network, compute and storage) need to be configured accordingly.
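A 'follow the moon' placement decision could be sketched as follows: compute the local time at each federated site and prefer sites where it is currently night, ordered by energy price. This is a minimal sketch under stated assumptions: the site names, UTC offsets, prices and the 22:00 to 06:00 night window are all illustrative, and a real policy would also weigh load, capacity and migration cost.

```python
def local_hour(utc_hour, utc_offset_hours):
    """Local hour of day (0-23) at a site, given the current UTC hour."""
    return (utc_hour + utc_offset_hours) % 24

def is_night(hour, start=22, end=6):
    """True if the hour falls in the night window, which wraps past midnight."""
    return hour >= start or hour < end

def follow_the_moon(sites, utc_hour):
    """Names of sites where it is currently night, cheapest energy first."""
    night_sites = [s for s in sites if is_night(local_hour(utc_hour, s["utc_offset"]))]
    return [s["name"] for s in sorted(night_sites, key=lambda s: s["energy_price"])]

# Hypothetical sites; energy prices are in arbitrary units.
sites = [
    {"name": "dc-east-asia", "utc_offset": 9, "energy_price": 0.20},
    {"name": "dc-central-europe", "utc_offset": 1, "energy_price": 0.12},
    {"name": "dc-north-atlantic", "utc_offset": 0, "energy_price": 0.05},
]

afternoon_utc = follow_the_moon(sites, utc_hour=14)
late_evening_utc = follow_the_moon(sites, utc_hour=23)
```

Either approach (a) or (b) above would then need on-demand connectivity towards the first site in the returned list.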

Disaster recovery by migrating IaaS to a remote data centre
This use case is inspired by the business continuity planning of key services to cloud providers. This is particularly pertinent after the experience of the Great East Japan earthquake in 2011. Typically, the cloud systems are managed by Infrastructure as a Service (IaaS) software, such as OpenStack or CloudStack, and provide isolated tenants on physical resources (computers, storage and network) in a data centre with multiple IaaS users. These users expect a stable and fault-free environment, but under particular conditions, in a serious disaster, it can be difficult to continue providing the desired services. In such a case, middleware can assist in enabling the migration of the cluster of servers and virtual machines to some remote data centre and guarantee business continuity. Another factor used in creating this use case is the Hardware as a Service (HaaS) paradigm [16], which can dynamically configure virtual IaaS-enabled resources using nested virtualisation technologies (e.g. KVM and FlowVisor). These resources can be migrated on the HaaS layer of another data centre, as depicted in Figure 9, coordinating the configuration of the hypervisor resources with the network bandwidth constraints to allow a fast and efficient migration of the IaaS instance from one site to another.

Conclusions and future work
We presented the FELIX Control Framework architecture and six use cases for large-scale SDN experiments over cross-continental federated environments. We grouped the set of use cases into two major categories, namely the Data Domain and Infrastructure Domain, in order to better reflect their primary applicability area and stakeholders. These scenarios highlight the necessity of unified management and control of the intra- and inter-connectivity of the data centres. They also serve as the foundation for the development of complex architectural models and software platforms able to manage resources in more efficient ways.
From the users' perspective, all presented use cases apply to the same and unique FELIX framework architecture that includes common system functionalities derived from the specific use cases and the users' goals, in the form of requirements. The current list of use cases is not meant to be exhaustive.
The resulting architecture allows experimenters of the test bed to request, manage, and monitor a slice in a heterogeneous, distributed, and multi-domain environment. This high-level specification is generic enough to allow flexible and scalable deployments in the different test beds that are part of the federated environment.
The work in the FELIX project is now proceeding towards the implementation of the FELIX system components. As part of our future work, we aim to complete the development of the feature set defined for each software component and, in parallel, test, validate and refine the implemented functionalities through the use cases presented in this paper.