Network Functions Virtualization: Testing Best Practices
NFV is a network architecture concept that proposes using IT virtualization related technologies to virtualize entire classes of network node functions into building blocks that may be connected, or chained, together to create communication services.
Feb 25, 2021
17 min read
This document provides a list of best practices for ensuring a smooth migration of Network Elements and Services to an NFV environment.
It is recognized that a certain portion of the best practices and recommendations are not required in all cases of NFV implementations.
Some areas of testing, such as security and usability, are not addressed in this document, but should still be addressed when defining the right strategy for NFV testing.
This document may apply to:
ISVs developing VNFs and NFV-based network services and functions
Telco operators in the process of implementing NFV
2. References
This document is based on ETSI ISG NFV Standards.
The following referenced documents serve as a baseline for this document:
GS NFV 001 Network Functions Virtualisation (NFV); Use Cases
GS NFV 003 Network Functions Virtualisation (NFV); Terminology for Main Concepts
Network Functions Virtualization (NFV) is a network architecture concept that proposes using IT virtualization related technologies to virtualize entire classes of network node functions into building blocks that may be connected, or chained, together to create communication services.
NFV relies upon, but differs from, traditional server virtualization techniques such as those used in enterprise IT. A virtualized network function, or VNF, may consist of one or more virtual machines running different software and processes, on top of industry standard high volume servers, switches and storage, or even cloud computing infrastructure, instead of having custom hardware appliances for each network function.
The European Telecommunications Standards Institute (ETSI) has formed an Industry Specification Group on Network Function Virtualization (ISG NFV).
The contributors of the NFV Introductory white paper, as well as the ETSI ISG, have identified Testing and QoE monitoring as one of the main use cases and subjects to address when implementing an NFV environment.
NFV Framework
The NFV framework consists of three main components.
Virtualized Network Functions (VNF) are software implementations of network functions that can be deployed on a Network Function Virtualization Infrastructure (NFVI).
NFV Infrastructure (NFVI) is the totality of all hardware and software components which build up the environment in which VNFs are deployed. The NFV-Infrastructure can span across several locations. The network providing connectivity between these locations is regarded as part of the NFV-Infrastructure.
Network Functions Virtualization Management and Orchestration Architectural Framework (NFV-MANO Architectural Framework) is the collection of all functional blocks, data repositories used by these functional blocks, and reference points and interfaces through which these functional blocks exchange information for the purpose of managing and orchestrating NFVI and VNFs.
High level NFV framework
The building block for both the NFVI and the NFV-MANO is the NFV platform. In the NFVI role, it consists of both virtual and physical compute and storage resources, and virtualization software. In its NFV-MANO role it consists of VNF and NFVI managers and virtualization software operating on a hardware controller. The NFV platform implements carrier-grade features used to manage and monitor the platform components, recover from failures and provide effective security – all required for the public carrier network.
NFV Use Cases
The first standard issued by ETSI identified 9 use cases for NFV in GS NFV 001:
Use Case #1: NFV Infrastructure (NFVI) as a Service
Use Case #2: Virtual Network Functions as a Service (VNFaaS)
Use Case #3: Virtual Network Platform as a Service (VNPaaS)
Use Case #4: VNF Forwarding Graphs
Use Case #5: Virtualisation of Mobile Core Network and IMS
Use Case #6: Virtualisation of Mobile Base Station
Use Case #7: Virtualisation of the Home Environment
Use Case #8: Virtualisation of CDNs (vCDN)
Use Case #9: Fixed Access Network Functions Virtualisation
Each of those use cases requires a different level of technique and has a different set of QoS and QoE KPIs. Nevertheless, from a system-test point of view, the same workload scenarios may apply to all use cases.
5. Assuring compliance and conformance to ETSI ISG Standards
Compliance assurance aims to find deviations from the ETSI ISG Standards. It determines whether the NFV implementation meets the defined standards.
Compliance is performed on two levels:
MANO
VNF/NE
When performing compliance testing on the NFV layer, each component is analyzed independently (MANO, VNFM, VIM, NFVI and interfaces).
VNF/NE compliance is achieved mainly by comparing the VNF and NE descriptors to the main
Test Procedure
Compliance tests are done using the following steps:
Step 1: Analyze ISG current standards
Step 2: Prepare compliance requirements
Step 3: Analyze each NFV component and interfaces
Step 4: Define gap points and analyze
Step 5: Prepare correction plan
Step 6: Monitor correction plan execution
Step 7: Repeat steps 1-6 for each VNF and NE
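The gap-analysis loop in Steps 3 and 4 can be sketched as a simple comparison of prepared compliance requirements against observed component attributes. All requirement identifiers and values below are illustrative, not taken from the ETSI standards:

```python
# Sketch of the gap-analysis loop (Steps 3-4), assuming a flat
# requirement-id -> expected-value mapping. Names are illustrative.

def find_gaps(requirements, observed):
    """Compare each compliance requirement with the observed value
    of the corresponding NFV component or interface attribute."""
    gaps = []
    for req_id, expected in requirements.items():
        actual = observed.get(req_id)
        if actual != expected:
            gaps.append({"requirement": req_id,
                         "expected": expected,
                         "actual": actual})
    return gaps

requirements = {"vnfd.format": "ETSI-VNFD",
                "redundancy.model": "active-standby"}
observed = {"vnfd.format": "ETSI-VNFD",
            "redundancy.model": "active-active"}
print(find_gaps(requirements, observed))
```

Each gap found this way feeds the correction plan in Step 5 and is re-checked when the loop repeats.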
MANO Compliance Criteria
The following criteria are evaluated:
VNF Descriptor format
VNF redundancy model
VNF state transitions
Host affinity and anti-affinity rules for deployment of VNFC instances
Authorization of the lifecycle management request
Validation of the lifecycle management request
VNF Package
Support multiple SWA-1 interfaces
Support multiple SWA-5 interfaces
Allocation of addresses
Dynamic allocation and deallocation of addresses
Scaling event types and format
Auto-scaling
Maintain records on VNF Packages
VNF Compliance Criteria
The following criteria are evaluated:
VNF Design Patterns
VNF Internal Structure
VNF Instantiation
VNFC States
VNF Load Balancing Models
VNF Scaling Models
VNF Component Re-Use
VNF Update and Upgrade
Automatic procedure
Control Update and Upgrade process
Requesting virtual resources
Roll-back
VNF’s Properties
Hardware Independence
Virtualization and Container Awareness
Elasticity
VNF Policy Management
Migration operations
VNF State
VNF Internal Structure
Reliability
Location Awareness
Application Management
Diversity and Evolution of VNF Properties
VNF Topological Characteristics
Deployment Behaviour
Virtualisation containers
NFVI Resources
Components and Relationship
Location
VNF States and Transitions
States and Transitions as Architectural Patterns
The VNF Descriptor
VNF Instantiation
VNFC Instantiation
VNFC Instance Termination
VNF Instance Termination
VNF Instance Scaling
Start and Stop VNF
VNF Instance Configuration
VNF Fault Management
Virtualised resource faults
VNF faults
6. NFV Test Environment
Build NFV Test Environment
NFV provides great flexibility when building Test Environment. Using NFV, the end user can:
Build parallel test environments to address different needs
Scale up and down compute resources allocated based on type of testing (performance, functional etc.)
Copy any existing environment, either from development environment (once testing starts) or from production environment (to investigate production problems)
When building a Test Environment for NFV, the following guidelines should be followed:
Management and Orchestration should be similar to the production, including all elements of NFV (MANO, VNFM, VIM, NFVI)
NS and VNF Instantiation should be similar to production environment (hypervisors, computing, storage and network resources)
If possible, based on the available resources, the same level of resources as production should be allocated. If not enough free resources are available, a dedicated load environment should be established for performance and scalability testing
Test Appliances
Test Appliances are required in order to:
Simulate valid workload traffic on the client and server
Simulate both data plane and control plane traffic
Measure key metrics, both data plane metrics and control plane metrics
Assure the appropriate compute, storage and network resources are allocated
Actively and passively monitor a single function, a set of functions or the entire service chain to ensure services adhere to established QoS and QoE metrics
Monitor production SLAs (not covered in this document)
Virtual Test Appliances vs. Physical Test Appliances
Existing physical test appliances may be used, but most test equipment vendors have developed virtual test appliances that run on separate VMs in the NFV Test Environment. Virtual test appliances offer equivalent capabilities for almost all scenarios and provide the flexibility required when testing NFV across multiple, geographically disparate sites. Virtual test appliances are usually much more cost effective.
Physical test appliances are mainly recommended for testing virtual environments that require the highest levels of data-plane performance (line rate) or microsecond-level timing accuracy.
Simulating real workload traffic
When simulating real workload traffic, consider the following steps:
Step 1 – measure production busy hours traffic rates (BHTR)
Step 2 – Simulate a traffic mix on the client and server including:
HTTP
FTP
DNS
Streaming video
NTP
SSH
Syslog
NFS
Messaging
VoIP
Social Networks
Unicast Video
P2P
CIFS
Background Traffic
Step 3 – create different workload scenarios based on percentage of BHTR. The following traffic rates are recommended:
Level     % of BHTR   Used for
Low       5%          Functional Testing
Average   50%         Scalability, on-going, fail-over
Busy      100%        Performance
Stress    150%        Stress Testing, auto-scaling
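Deriving the target rate for each workload level from a measured BHTR is straightforward. The 400 Gbit/h figure below is an illustrative measurement, not a recommendation:

```python
# Target traffic rates per workload level, as a fraction of the
# measured busy-hour traffic rate (BHTR). Percentages follow the
# table above; the example BHTR value is illustrative.

BHTR_LEVELS = {"Low": 0.05, "Average": 0.50, "Busy": 1.00, "Stress": 1.50}

def traffic_rates(bhtr):
    """Map each workload level to its absolute target rate."""
    return {level: bhtr * pct for level, pct in BHTR_LEVELS.items()}

print(traffic_rates(400))  # {'Low': 20.0, 'Average': 200.0, 'Busy': 400.0, 'Stress': 600.0}
```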
Metrics
NFV Service Metrics
VM Provisioning Latency (instantiation latency): time between VM instantiation and the first available packet
VM Stall (event duration and frequency)
VM Scheduling Latency
QoS and Data-plane metrics:
Latency on each of tens of thousands of data streams
Throughput and forwarding rate
Frame loss rate
Packet-delay variation and short-term average latency
Dropped frames and errored frames
Service Disruption Time for Fail-over Convergence
QoE and Control-plane metrics:
HTTP: page load time, load time variance
Video: MOS-AV score, range = 2-5 with 5 being the best
HTML5 video – AS score, with 100% as the maximum
Direct metrics:
Peak Signal to Noise Ratio (PSNR)
Structural Similarity (SSIM) – compares the original image with the received image
Video Quality Metric (VQM)
Mean Opinion Score (MOS) – This metric combines delays, perceived jitter at application layer, codec used for communication and packet loss at application layer
Indirect metrics:
Startup time: Time difference between sending the request for content and the time when the user actually received the content
Delivery synchronization – In a multicast many-to-many scenario it is important that the content is received by all participants at the same time. Consider online gaming or video conferencing
Freshness: The time difference between the time when the content is actually generated and the time when the user receives it, e.g. celebrating a goal with friends while watching a sports event
Blocking: When the buffers on the receiver are empty and the user has to wait for content.
Connections establishment rate, and transactions per second
Total number of connections, round trip time and goodput
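Of the direct metrics above, PSNR is the simplest to compute. A minimal sketch of the formula PSNR = 10·log10(MAX² / MSE), for 8-bit frames represented as flat lists of pixel values (a real deployment would use a video-quality measurement tool, not hand-rolled code):

```python
import math

# Minimal PSNR sketch for 8-bit frames as flat pixel lists.
# Illustrative only; production QoE tooling computes this per frame
# over full video sequences.

def psnr(original, received, max_value=255):
    """Peak Signal to Noise Ratio in dB between two equal-length frames."""
    mse = sum((o - r) ** 2 for o, r in zip(original, received)) / len(original)
    if mse == 0:
        return float("inf")  # identical frames: no noise
    return 10 * math.log10(max_value ** 2 / mse)

print(round(psnr([52, 55, 61, 66], [50, 55, 60, 65]), 2))  # ≈ 46.37 dB
```

Higher PSNR indicates a received image closer to the original; identical frames yield infinity.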
7. Test Automation in NFV
Disclaimer: The solution described in this document is based on CloudShell for SDN/NFV by QualiSystems. Similar solution may be built using a different tool.
The Need for Heterogeneous SDN/NFV Self-Service Test Infrastructure Automation
Software Defined Networking introduces the concept of network programmability from applications that interact with centralized SDN controllers via northbound APIs. This API-driven network paradigm opens the way for agile SDN application development, but also creates the need for networking organizations to deliver access to end-to-end network environments to application delivery stakeholders in support of a DevOps process. Network Function Virtualization (NFV) adds to the complexity of this picture by allowing agile service chaining of network functions hosted on virtual machines rather than hardware appliances. Automation of access to network sandboxes is complicated by the fact that SDN and NFV will gradually phase into networks, leaving significant portions of the network operating in a legacy mode, potentially for years. Networking teams need a self-service automation platform that can handle both SDN/NFV and legacy networks in a unified manner.
Network DevOps Orchestration & Automation
Testing NFV requires a comprehensive automation platform, including resource management, provisioning, test automation, and integrated reporting and business intelligence, for delivering self-service automation and continuous network certification processes for SDN and NFV.
The following capabilities should be included:
Centralized inventory management of all legacy and SDN/NFV network resources allowing engineers to gain visibility to any components needed to design and publish network topologies required by developers and testers
Integration with all the existing and future infrastructure, including legacy network devices, SDN-enabled switches, SDN controllers and virtualized network functions
Visual network topology and service chain design and publishing
Visual workflow and test automation creation to build continuous integration
Integrated reporting and business intelligence
Sustainable object-based automation architecture
Easy to use web-based self-service portal
NFV Test infrastructure automation framework based on CloudShell
Object-Based Test Automation Approach with TestShell
TestShell allows all automation elements to be captured as small-scope objects to enable high reusability for test workflow construction.
Objects are divided into the following categories:
NFV-Mano – an automation object for each MANO operation, such as instantiation of Network Service, Disable VNF package etc.
NFV Infrastructure – an automation object for each NFVI operation, such as create, shutdown, destroy and update virtual machine (VM)
VNF – Each VNF shall have dedicated objects based on its internal workflow.
Legacy Network Services – each legacy network service shall have dedicated objects
Test Appliances – each Test Appliance should have a list of Test Automation objects, such as simulate control plane traffic and retrieve QoE parameters
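The object categories above can be modeled as small-scope, reusable elements composed into workflows. The class and operation names below are illustrative, not the actual TestShell API:

```python
# Sketch of the object-based approach: each automation element is a
# small-scope, reusable object grouped by category. Names and
# operations are illustrative, not the TestShell API.

class AutomationObject:
    def __init__(self, category, name, action):
        self.category = category   # e.g. "NFV-MANO", "NFVI", "VNF"
        self.name = name
        self.action = action       # callable implementing the operation

    def run(self, **kwargs):
        return self.action(**kwargs)

# Two illustrative objects composed into a test workflow
create_vm = AutomationObject("NFVI", "create_vm",
                             lambda name: f"VM {name} created")
instantiate_ns = AutomationObject("NFV-MANO", "instantiate_network_service",
                                  lambda ns_id: f"NS {ns_id} instantiated")

workflow = [create_vm.run(name="vFW-1"), instantiate_ns.run(ns_id="ns-42")]
print(workflow)
```

Because each object has a single narrow operation, the same objects can be recombined across many test workflows without duplication.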
8. Testing MANO
The first step is to test NFV Management and Orchestration Architectural Framework.
NFV Management and Orchestration Architecture
NFV Orchestrator
On-boarding of new Network Service (NS) and VNF Packages
Instantiate Network Service i.e. create a Network Service using the NS on-boarding artefacts
Query VNF – retrieve VNF instance state and attributes
Check VNF instantiation feasibility
Scale Network Service, i.e. grow or reduce the capacity of the Network Service
Update Network Service by supporting Network Service configuration changes of various complexity such as changing inter-VNF connectivity or the constituent VNF instances
Create, delete, query, and update of VNF Forwarding Graphs (VNF-FG) associated to a Network Service
Create, delete, query, and update of Virtual Links (VL)
Query Network Service and assure all attributes retrieved properly
Terminate Network Services, i.e. request the termination of constituent VNF instances, request the release of NFVI resources associated to NSs, and return them to NFVI resource pool if applicable
Get VNF performance metrics and notification
Simulate Notification and different fault information:
physical infrastructure (compute, storage, and networking related faults)
application logic (i.e. VNF instance related faults)
Monitoring and collection of information related to resource usage, including mapping of usage
Scheduled request regarding VNF instances
Global resource management, validation and authorization of NFVI resource requests
Resources sharing between VNFs
Create, update, delete, query, activate and de-activate policy (e.g., policies related with affinity/anti-affinity, management of VNF or NS scaling operations, access control, resource management, fault management, NS topology etc.)
Constraints management
SLA parameters
Network capacity adaptation to load
Coexistence with legacy network equipment
Controlling and managing Inventory of versions, releases and patches of all units of hardware and software
Maintenance, hardware and software exchange, SW upgrades, Firmware upgrades, repair
Manage log of all changes to inventory, unexpected events and maintenance activities
Administration domains and permissions
VNF Manager:
VNF instantiation, including VNF configuration if required by the VNF deployment template (e.g., VNF initial configuration with IP addresses before completion of the VNF instantiation operation)
VNF instantiation feasibility checking
VNF instance software update/upgrade
VNF instance modification
VNF instance scaling out/in and up/down
VNF instance-related collection of NFVI performance measurements and faults/events information, and correlation to VNF instance-related events/faults
VNF instance assisted or automated healing
VNF instance termination
VNF lifecycle management change notifications
Configuration and event reporting between NFVI and the E/NMS
Liveness checking of a VNF, e.g. watchdog timer or keepalive
Failure detection
Fault remediation of each VNF resiliency category
Virtualised Infrastructure Manager (VIM) and NFV Infrastructure (NFVI):
Resource catalog management – controlling and managing Inventory of software (hypervisors), computing, storage and network resources
Collection and forwarding of performance measurements and faults/events
NFV Infrastructure faults collection and remediation
Create, shutdown, destroy and update virtual machine (VM)
Create list of virtual VMs, query, reboot, suspend, resume, save and restore VM
Create, modify, list, delete and query storage pool
Create, delete, list, and query VM storage
Allocate, query, update scale, migrate, operate, release of NFVI resources, and managing the association of the virtualised resources to the compute, storage, networking resources, e.g. increase resource to VMs, improve energy efficiency and resource reclamation
Create, query, update and release resource reservation
Create, query, update, delete and notify of VNF Forwarding Graphs, e.g., by creating and maintaining Virtual Links, virtual networks, sub-nets, and ports
Add, delete, update, query and copy of software images
Root cause analysis of performance issues from the NFV infrastructure perspective
Mechanism for time-stamping of hardware (e.g. network interface cards, NICs and NIDs)
Create, update, list, query and delete hypervisor policies
Create, delete, update, list and query virtual network
Create, update, list, query, and delete subnet
Create, update, list, query and delete port
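A test harness for the VM lifecycle operations listed above (create, query, suspend, resume, destroy) can be exercised first against an in-memory stand-in to validate the expected state transitions. This toy class is purely illustrative, not a real VIM API:

```python
# A toy in-memory VIM used to illustrate the VM lifecycle operations
# listed above. A real test would drive the actual VIM interface and
# assert the same state transitions.

class FakeVIM:
    def __init__(self):
        self.vms = {}  # vm name -> state

    def create_vm(self, name):
        self.vms[name] = "running"

    def query_vm(self, name):
        return self.vms.get(name)

    def suspend_vm(self, name):
        if self.vms.get(name) == "running":
            self.vms[name] = "suspended"

    def resume_vm(self, name):
        if self.vms.get(name) == "suspended":
            self.vms[name] = "running"

    def destroy_vm(self, name):
        self.vms.pop(name, None)

vim = FakeVIM()
vim.create_vm("vnf-a")
vim.suspend_vm("vnf-a")
print(vim.query_vm("vnf-a"))  # suspended
```

The same assertion pattern (operate, then query and verify the state) applies to storage pools, virtual networks, subnets and ports.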
Orchestrator – VNF Manager (Or-Vnfm)
Resource related requests, e.g. authorization, validation, reservation, allocation by VNF Manager(s)
Sending configuration information to the VNF Manager, so that the VNF can be configured appropriately to function within the VNF Forwarding Graph in the NS
Virtualised hardware resource configuration and state information (e,g. events)
OSS/BSS – NFV MANO (Os-Nfvo)
Requests for network service lifecycle management
Requests for VNF lifecycle management
Forwarding of NFV related state information
Policy management exchanges
Data analytics exchanges
Forwarding of NFV related accounting and usage records
NFVI capacity and inventory information exchanges
VNF – VNF Manager
Requests for VNF lifecycle management
Exchanging configuration information
Exchanging state information
9. Scalability and Performance Testing
Scalability refers to the maximum number of control plane sessions that can be established. Examples include the number of PPPoX sessions or number of routing peers. For routers, number of routes per session and total number of routes in the routing table are also a measure of scalability.
In NFV environments, VNFs should have an auto-scale feature, so resources (compute, memory, network etc.) are scaled automatically in response to varying network function performance needs.
The purpose of this testing is to:
Ensure auto-scale works properly
Ensure resource consumption is efficient
Verify performance meets the SLA under high workloads
Scalability
Disable auto-scale and run low (5%), Average (50%) and busy (100%) traffic. Measure metrics for each traffic level.
Manually allocate resources for average traffic to meet SLA. Increase traffic to busy and manually scale resources until SLA is met. Compare resources allocations and analyze results
Enable auto-scale and measure SLA in average and busy traffic. Analyze resources allocation results
Run a dynamic test that shifts between different traffic rates every 5 minutes at random. Analyze results
Performance
Run tests with low traffic. Increase traffic by 5% every 1 hour and analyze metrics
Run on-going busy traffic rate for at least 5 days and measure metrics over time
Run on-going busy traffic rate and execute all MANO tests
Stress testing
Run stress traffic rate (150%) over 5 hours and measure metrics
Define resource limit to support full traffic and run stress traffic rate (150%) over 5 hours and measure metrics
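The hourly ramp in the performance test above (start at the low rate, increase offered load by 5% of BHTR every hour) can be expressed as a simple schedule. Parameters are illustrative defaults:

```python
# Ramp profile for the performance test: start at the low rate (5% of
# BHTR) and add 5 percentage points every hour until the busy rate
# (100%). Returns (hour, %BHTR) pairs. Parameters are illustrative.

def ramp_schedule(start=5, step=5, stop=100):
    """Return (hour, percent-of-BHTR) pairs for the hourly ramp."""
    return list(enumerate(range(start, stop + step, step)))

schedule = ramp_schedule()
print(schedule[:3])   # [(0, 5), (1, 10), (2, 15)]
print(schedule[-1])   # (19, 100)
```

Metrics are collected and analyzed at each step, so a degradation can be attributed to the specific load level at which it first appeared.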
10. Failover Convergence Testing
Convergence time is one of the key metrics for validating SLAs and high availability in service provider networks. The requirements for failover convergence times can be in the order of milliseconds depending on the services and applications that are under consideration. These constraints are equally valid in an NFV environment as well.
In NFV deployments, there is an added factor of variability where failover convergence time for a VNF can be impacted by the number of VNFs on the physical server that is converging to alternate routes. Convergence measurement involves the measurement of processing time of the trigger event in the control plane and the traffic switchover time. It is important in a multiple VNF deployment scenario that the convergence time of any VNF is not impacted by the other VNFs on the same physical server, so that the VNF continues to satisfy the SLAs for which it was provisioned.
Test Setup
Fail-over Convergence Test Topology (source: ETSI GS NFV-PER 001)
Provision a VNF for DUT with routing protocol functionality.
Provision two virtual test endpoints (VTA) connected to the virtual DUT. In order to avoid the performance impact of resource sharing between DUT and virtual test boxes, install the virtual test endpoints on a separate physical server.
Configure the pre-determined routing protocol on the two virtual test appliances and advertise the same set of routes from both. Use a preferred metric in the routes advertised from one of the virtual test appliances (primary next hop/path).
Configure L3 traffic between endpoints advertised by VTA 3 and endpoints advertised by VTA 1
Configure liveness detection mechanisms such as BFD protocol on the VNF as well as VTA1 and VTA2.
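With this setup, the traffic switchover time is commonly derived with the frame-loss method: packets lost during convergence divided by the offered rate. The traffic figures below are illustrative:

```python
# Frame-loss method for estimating fail-over convergence time:
# packets lost during the switchover divided by the offered rate.
# Traffic figures are illustrative.

def service_disruption_time(tx_packets, rx_packets, offered_rate_pps):
    """Estimate service disruption time in seconds."""
    lost = tx_packets - rx_packets
    return lost / offered_rate_pps

# 1M packets offered at 100k pps; 50,000 lost during switchover
print(service_disruption_time(1_000_000, 950_000, 100_000))  # 0.5 s
```

The measured disruption time is then compared against the SLA, and the test is repeated with varying numbers of co-resident VNFs to confirm they do not affect each other's convergence.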
11. VNF Migration Testing
Each VNF deployed in the NFV environment requires independent testing.
VNF Testing shall cover:
Static testing: static testing involves reviewing requirements and specifications to ensure completeness or appropriateness for the VNF
Conformance testing: see VNF Compliance Criteria above.
VNF-Mano Integration:
VNF Instantiation
VNFC States
VNF Scaling Models
VNF Component Re-Use
VNF Update and Upgrade
Virtualization and Container Awareness
Elasticity
VNF Policy Management
Migration operations
VNF State
The VNF Descriptor
Start and Stop VNF
VNF Instance Configuration
Virtualised resource faults
VNF faults
VNF internal functionality – testing VNF-specific functions
Scalability/Performance – run Performance testing and validate resources allocation
End-to-end/system test: testing VNF performance together with other VNFs and validating that the introduction of the new VNF doesn't affect other VNFs' functionality and performance