OpenStack Environment Architecture

September 23, 2016 Ricky Adams

OpenStack Environment Architecture

Fuel deploys an OpenStack Environment with nodes that provide a specific set of functionality. Beginning with Fuel 5.0, a single architecture model can support HA (High Availability) and non-HA deployments; you can deploy a non-HA environment and then add additional nodes to implement HA rather than needing to redeploy the environment from scratch.

The OpenStack environment consists of multiple physical server nodes (or an equivalent VM), each of which is one of the following node types:

Controller:

The Controller manages all activities in the environment. The nova-controller maintains the life cycle of the OpenStack controller.

Note

HA environment must consist of at least 3 controllers in order to achieve HA for MySQL/Galera cluster. And while two controllers could be enough for most of cases, such as HA for highly available OpenStack API services or reliable RabbitMQ AMQP messaging or resilient virtual IP addresses and load balancing, a third controller is required for quorum-based clusters, such as MySQL/Galera or Corosync/Pacemaker. The configuration for stateless and statefull services in HA differs a lot. HA environments also contain active/active and active/passive components. Please, see HA-guide for more details. Fuel configures all stateless OpenStack API services and RabbitMQ HA cluster as active/active. The MySQL/Galera cluster is configured as active/passive. For database clusters, active/active is sometimes referred to as multi-master environments. Such environments should be able to successfully handle multi-node writing conflicts. But OpenStack support for multi-node writing to MySQL/Galera nodes is not production ready yet. “The simplest way to overcome this issue from the operator’s point of view is to use only one writer node for these types of transactions”. That is why Fuel configures HAProxy frontend for MySQL/Galera to use only one active node, while the other nodes in the cluster are retained standby (passive) state. Mongo DB backend for Ceilometer is also configured as active/passive. Note that it is possible to configure MySQL/Galera HA with two controller nodes and a lightweight arbitrator service running at some other node, but this deployment layout is not supported in Fuel.

For more information about how Fuel deploys HA controllers, see Multi-node with HA Deployment.

Compute:

Compute servers are the workhorses of your installation; they are the servers on which your users’ virtual machines are created. nova-compute controls the life cycle of these VMs; Neutron Agent and Ceilometer Compute Agent may also run on Compute nodes.

Note

In environments that Fuel deploys using vCenter as the hypervisor, the Nova-compute service can run only on Controller nodes. Because of this, Fuel does not allow you to assign the “Compute” role to any node when using vCenter.

Storage:

OpenStack requires block and object storage to be provisioned. These can be provisioned as Storage nodes or as roles that run on Compute nodes. Fuel provides the following storage options out of the box:

Cinder LVM provides persistent block storage to virtual machines over iSCSI protocol. The Cinder Storage node runs a Cinder Volume.
Swift object store can be used by Glance to store VM images and snapshots; it may also be used directly by applications Swift is the default storage provider that is provisioned if another storage option is not chosen when the environment is deployed.
Ceph combines object and block storage and can replace either one or both of the above. The Ceph Storage node runs Ceph OSD.

The key principle is that your controller(s) are separate from the compute servers on which your user’s VMs run.

Multi-node with HA Deployment

High availability is recommended for production environments. This provides replicated servers to prevent single points of failure. An HA deployment must have at least three controllers as well as replicas of other servers. You can combine compute, storage, and network nodes to reduce the hardware requirements for the environment, although this may degrade the performance and robustness of the environment.

Details of Multi-node with HA Deployment

OpenStack services are interconnected by RESTful HTTP-based APIs and AMQP-based RPC messages. So redundancy for stateless OpenStack API services is implemented through the combination of Virtual IP (VIP) management using Pacemaker and load balancing using HAProxy. Stateful OpenStack components, such as the state database and messaging server, rely on their respective active/active and active/passive modes for high availability. For example, RabbitMQ uses built-in clustering capabilities, while the database uses MySQL/Galera replication.

Lets take a closer look at what an OpenStack deployment looks like, and what it will take to achieve high availability for an OpenStack deployment.

HA Logical Setup

An OpenStack Multi-node HA environment involves three types of nodes: controller nodes, compute nodes, and storage nodes.

Controller Nodes

The first order of business in achieving high availability (HA) is redundancy, so the first step is to provide multiple controller nodes.

The MySQL database uses Galera to achieve HA, and Galera is a quorum-based system. That means that you should have at least 3 controller nodes.

Every OpenStack controller runs HAProxy, which manages a single External Virtual IP (VIP) for all controller nodes and provides HTTP and TCP load balancing of requests going to OpenStack API services, RabbitMQ, and MySQL.

Note

OpenStack services use Oslo messaging and are directly connected to the RabbitMQ nodes and do not need HAProxy.

Note

Fuel deploys HAProxy inside its own dedicated network namespace. In order to achieve this, custom resource agent scripts for Pacemaker are used instead of classic heartbeat provider for VIP addresses.

When an end user accesses the OpenStack cloud using Horizon or makes a request to the REST API for services such as nova-api, glance-api, keystone-api, neutron-api, nova-scheduler or MySQL, the request goes to the live controller node currently holding the External VIP, and the connection gets terminated by HAProxy. When the next request comes in, HAProxy handles it, and may send it to the original controller or another in the environment, depending on load conditions.

Each of the services housed on the controller nodes has its own mechanism for achieving HA:

OpenStack services, such as nova-api, glance-api, keystone-api, neutron-api, nova-scheduler, cinder-api are stateless services that do not require any special attention besides load balancing.
Horizon, as a typical web application, requires sticky sessions to be enabled at the load balancer.
RabbitMQ provides active/active high availability using mirrored queues and is deployed with custom resource agent scripts for Pacemaker.
MySQL high availability is achieved through Galera deployment and custom resource agent scripts for Pacemaker. Please, note that HAProxy configures MySQL backends as active/passive because OpenStack support for multi-node writes to Galera nodes is not production ready yet.
Neutron agents are active/passive and are managed by custom resource agent scripts for Pacemaker.
Ceph monitors implement their own quorum based HA mechanism and require time synchronization between all nodes. Clock drift higher than 50ms may break the quorum or even crash the Ceph service.

Compute Nodes

OpenStack compute nodes are, in many ways, the foundation of your environment; they are the servers on which your users will create their Virtual Machines (VMs) and host their applications. Compute nodes need to talk to controller nodes and reach out to essential services such as RabbitMQ and MySQL. They use the same approach that provides redundancy to the end-users of Horizon and REST APIs, reaching out to controller nodes using the VIP and going through HAProxy.

Storage Nodes

Depending on the storage options you select for your environment, you may have Ceph, Cinder, and Swift services running on your storage nodes.

Ceph implements its own HA; all you need is enough controller nodes running the Ceph Monitor service to form a quorum, and enough Ceph OSD nodes to satisfy the object replication factor.

Swift API relies on the same HAProxy setup with VIP on controller nodes as the other REST APIs. If don’t expect too much data traffic in Swift, you can also deploy Swift Storage and Proxy services on controller nodes. For a larger production environment you’ll need dedicated nodes: two for Swift Proxy and at least three for Swift Storage.

Whether or not you’d want separate Swift nodes depends primarily on how much data you expect to keep there. A simple test is to fully populate your Swift object store with data and then fail one controller node. If replication of the degraded Swift objects between the remaining nodes controller generates enough network traffic, CPU load, or disk I/O to impact performance of other OpenStack services running on the same nodes, you should separate Swift from controllers.

If you select Cinder LVM as the block storage backend for Cinder volumes, you should have at least one Cinder LVM node. Unlike Swift and Ceph, Cinder LVM doesn’t implement data redundancy across nodes: if a Cinder node is lost, volumes stored on that node cannot be recovered from the data stored on other Cinder nodes. If you need your block storage to be resilient, use Ceph for volumes.

How HA with Pacemaker and Corosync Works

Corosync Settings

Corosync uses Totem protocol, which is an implementation of Virtual Synchrony protocol. It uses it in order to provide connectivity between cluster nodes, decide if cluster is quorate to provide services, to provide data layer for services that want to use features of Virtual Synchrony.

Corosync functions in Fuel as the communication and quorum service via Pacemaker cluster resource manager (crm). It’s main configuration file is located in/etc/corosync/corosync.conf.

The main Corosync section is the totem section which describes how cluster nodes should communicate:

totem {
  version:                             2
  token:                               3000
  token_retransmits_before_loss_const: 10
  join:                                60
  consensus:                           3600
  vsftype:                             none
  max_messages:                        20
  clear_node_high_bit:                 yes
  rrp_mode:                            none
  secauth:                             off
  threads:                             0
  interface {
    ringnumber:  0
    bindnetaddr: 10.107.0.8
    mcastaddr:   239.1.1.2
    mcastport:   5405
  }
}

Corosync usually uses multicast UDP transport and sets up a “redundant ring” for communication. Currently Fuel deploys controllers with one redundant ring. Each ring has it’s own multicast address and bind net address that specifies on which interface Corosync should join corresponding multicast group. Fuel uses default Corosync configuration, which can also be altered in Fuel manifests.

Pacemaker Settings

Pacemaker is the cluster resource manager used by Fuel to manage Neutron resources, HAProxy, virtual IP addresses and MySQL Galera cluster. It is done by use of Open Cluster Framework (see http://linux-ha.org/wiki/OCF_Resource_Agents) agent scripts which are deployed in order to start/stop/monitor Neutron services, to manage HAProxy, virtual IP addresses and MySQL replication. These are located at /usr/lib/ocf/resource.d/mirantis/ocf-neutron-[metadata|ovs|dhcp|l3]-agent,/usr/lib/ocf/resource.d/fuel/mysql, /usr/lib/ocf/resource.d/ocf/haproxy. Firstly, MySQL agent is started, HAproxy and virtual IP addresses are set up. Open vSwitch, metadata, L3, and DHCP agents are started as Pacemaker clones on all the nodes.

How Fuel Deploys HA

Fuel installs Corosync service, configures corosync.conf, and includes the Pacemaker service plugin into /etc/corosync/service.d. Then Corosync service starts and spawns corresponding Pacemaker processes. Fuel configures the cluster properties of Pacemaker and then injects resource configurations for virtual IPs, HAProxy, MySQL and Neutron agent resources.

The running configuration can be retrieved from an OpenStack controller node by running:

# crm configure show

MySQL and Galera

My SQL with Galera implements true active/active HA. Fuel configures MySQL/Galera to have a single active node that receives write operations and serves read operations. You can add one or two Galera slave nodes; this is recommended for environments that have six or more nodes:

Only one MySQL/Galera node is considered active at a time; the remaining cluster nodes are standby masters.
The standby masters do not have the “slave lag” that is typical for MySQL master/slave topologies because Galera employs synchronous replication and ensures that each cluster node is identical.
Mirantis OpenStack uses Pacemaker and HAProxy to manage MySQL/Galera:
- Pacemaker manages the individual MySQL+Galera nodes, HAProxy, and the Virtual IP Address (VIP).
- HAProxy runs in the dedicated network namespace and manages connections between the MySQL/Galera active master, backup masters, and the MySQL Clients connecting to the VIP.
Only one MySQL/Galera master is active in the VIP; this single direction synchronous replication usually provides better performance than other implementations.

The workflow is:

The node that is tied to the VIP serves new data updates and increases its global transaction ID number (GTID).
Each other node in the Galera cluster then synchronizes its data with the node that has a GTID greater than its current values.
If the status of any node falls too far behind the Galera cache, an entire replica is distributed to that node. This causes a master to switch to the Donor role, allowing an out-of-sync node to catch up.

VMware vSphere Integration

This section provides technical details about how vCenter support is implemented in Mirantis OpenStack.

See Preparing for vSphere Integration for information about planning the deployment;
Deploying vCenter gives instructions for creating and deploying a Mirantis OpenStack environment that is integrated with VMware vSphere.

VMware provides a vCenter driver for OpenStack. This driver enables the Nova-compute service to communicate with a VMware vCenter server that manages one or more ESXi host clusters. The vCenter driver makes management convenient from both the OpenStack Dashboard (Horizon) and from vCenter, where advanced vSphere features can be accessed.

This enables Nova-compute to deploy workloads on vSphere and allows vSphere features such as vMotion workload migration, vSphere High Availability, and Dynamic Resource Scheduling (DRS). DRS is enabled by architecting the driver to aggregate ESXi hosts in each cluster to present one large hypervisor entity to the Nova scheduler. This enables OpenStack to schedule to the granularity of clusters, then call vSphere DRS to schedule the individual ESXi host within the cluster. The vCenter driver also interacts with the OpenStack Image Service (Glance) to copy VMDK (VMware virtual machine) images from the back-end image store to a database cache from which they can be quickly retrieved after they are loaded.

The vCenter driver requires the Nova Network topology, which means that OVS (Open vSwitch) does not work with vCenter.

The Nova-compute service runs on a Controller node, not on a separate Compute node. This means that, in the Multi-node Deployment mode, a user has a single Controller node with both compute and network services running.

Unlike other hypervisor drivers that require the Nova-compute service to be running on the same node as the hypervisor itself, the vCenter driver enables the Nova-compute service to manage ESXi hypervisors remotely. This means that you do not need a dedicated Compute node to use the vCenter hypervisor; instead, Fuel puts the Nova-compute service on a Controller node.

Dual hypervisor support

Beginning with Fuel 6.1, you can deploy an environment with two hypervisors: vCenter, KVM/QEMU using availability zones.

_images/dual-hyperv-arch.png

Multi-node HA Deployment with vSphere integration

_images/vcenter-ha-architecture.png

On a highly-available Controller cluster (meaning that three or more Controller nodes are configured), the Nova-compute and Nova-network services can either run on the same or on different Controller nodes. If some service fails, it is restarted by Pacemaker several times; if service fails to start or the whole Controller node fails, service is started on one of the available Controllers.

Example of network topology

_images/vcenter-network-topology.png

This is an example of the default Fuel OpenStack network configuration that a user should have if the target nodes have at least two NICs and are connected to a Fuel Admin (PXE) network with eth0 interfaces.

The Nova-network service must serve DHCP requests and NAT translations of the VMs’ traffic, so the VMs on the ESXi nodes must be connected directly to the Fixed (Private) network. By default, this network uses VLAN 103 for the Nova-Network Flat DHCP topology. So, a user can create a tagged Port Group on the ESXi servers with VLAN 103 and connect the corresponding vmnic NIC to the same switch as the OpenStack Controller nodes.

The Nova Compute service must be able to reach the vCenter management IP from the OpenStack Public network in order to connect to the vSphere API.

Fuel running under vSphere

_images/Fuel_in_vCenter_networking.png

For information about configuring your vSphere environment so that you can install Fuel in it, see Preparing to run Fuel on vSphere.

Ceph Monitors

Ceph monitors (MON) manage various maps like MON map, CRUSH map, and others. The CRUSH map is used by clients to deterministically select the storage devices (OSDs) to receive copies of the data. Ceph monitor nodes manage where the data should be stored and maintain data consistency with the Ceph OSD nodes that store the actual data.

Ceph monitors implement HA using a master-master model:

One Ceph monitor node is designated the “leader.” This is the node that first received the most recent cluster map replica.
Each other monitor node must sync its cluster map with the current leader.
Each monitor node that is already sync’ed with the leader becomes a provider; the leader knows which nodes are currently providers. The leader tells the other nodes which provider they should use to sync their data.

Ceph Monitors use the Paxos algorithm to determine all updates to the data they manage. All monitors that are in quorum have consistent up-to-date data because of this.

You can read more in Ceph documentation.

Network Architecture

Logical Networks

For better network performance and manageability, Fuel places different types of traffic into separate logical networks. This section describes how to distribute the network traffic in an OpenStack environment.

Admin (PXE) network (“Fuel network”)

The Fuel Master node uses this network to provision and orchestrate the OpenStack environment. It is used during installation to provide DNS, DHCP, and gateway services to a node before that node is provisioned. Nodes retrieve their network configuration from the Fuel Master node using DHCP, which is why this network must be isolated from the rest of your network and must not have a DHCP server other than the Fuel Master running on it.

Public network

The word “Public” means that these addresses can be used to communicate with the cluster and its VMs from outside of the cluster (the Internet, corporate network, end users).

The public network provides connectivity to the globally routed address space for VMs. The IP address from the public network that has been assigned to a compute node is used as the source for the Source NAT performed for traffic going from VM instances on the compute node to the Internet.

The public network also provides Virtual IPs for public endpoints, which are used to connect to OpenStack services APIs.

Finally, the public network provides a contiguous address range for the floating IPs that are assigned to individual VM instances by the project administrator. Nova Network or Neutron services can then configure this address on the public network interface of the Network controller node. Environments based on Nova Network use iptables to create a Destination NAT from this address to the private IP of the corresponding VM instance through the appropriate virtual bridge interface on the Network controller node.

For security reasons, the public network is usually isolated from other networks in the cluster.

If you use tagged networks for your configuration and combine multiple networks onto one NIC, you should leave the Public network untagged on that NIC. This is not a requirement, but it simplifies external access to OpenStack Dashboard and public OpenStack API endpoints.

Storage network (Storage Replication)

Part of a cluster’s internal network. It carries replication traffic from Ceph or Swift. Ceph public traffic is dispatched through br-mgmt bridge (Management network).

Management network

Part of the cluster’s internal network. It is used to put tagged VLAN traffic from private tenant networks on physical NIC interfaces. This network can also be used for serving iSCSI protocol exchanges between Compute and Storage nodes. As to the Management, it serves for all other internal communications, including database queries, AMQP messaging, high availability services).

Private network (Fixed network)

The private network facilitates communication between each tenant’s VMs. Private network address spaces are not a part of the enterprise network address space; fixed IPs of virtual instances cannot be accessed directly from the rest of the Enterprise network.

Just like the public network, the private network should be isolated from other networks in the cluster for security reasons.

Internal Network

The internal network connects all OpenStack nodes in the environment. All components of an OpenStack environment communicate with each other using this network. This network must be isolated from both the private and public networks for security reasons. The internal network can also be used for serving iSCSI protocol exchanges between Compute and Storage nodes. The Internal Network is a generalizing term; it means that any network except for Public can be regarded as Internal: for example, Storage or Management. Do not confuse Internal with Private, as the latter is only related to the networks within a tenant, that provides communication between VMs within the specific tenant.

Note

If you want to combine another network with the Admin network on the same network interface, you must leave the Admin network untagged. This is the default configuration and cannot be changed in the Fuel UI although you could modify it by manually editing configuration files.

HA deployment for Networking

Fuel leverages Pacemaker resource agents in order to deploy highly avaiable networking for OpenStack environments.

Virtual IP addresses deployment details

Starting from the Fuel 5.0 release, HAProxy service and network interfaces running virtual IP addresses reside in separate haproxy network namespace. Using a separate namespace forces Linux kernel to treat connections from OpenStack services to HAProxy as remote ones, this ensures reliable failover of established connections when the management IP address migrates to another node. In order to achieve this, resource agent scripts for ocf:fuel:ns_haproxy and ocf:fuel:ns_IPaddr2 were hardened with network namespaces support.

Successfull failover of public VIP address requires controller nodes to perform active checking of the public gateway. Fuel configures the Pacemaker resourceclone_ping_vip__public that makes public VIP to migrate in case the controller can’t ping its public gateway.

TCP keepalive configuration details

Failover sometimes ends up with dead connections. The detection of such connections requires additional assistance from the Linux kernel. To speed up the detection process from the default of two hours to a more acceptable 3 minutes, Fuel adjusts kernel parameters for net.ipv4.tcp_keepalive_time, net.ipv4.tcp_keepalive_intvl,net.ipv4.tcp_keepalive_probes and net.ipv4.tcp_retries2.

Implementing Multiple Cluster Networks

Mirantis OpenStack supports configuring multiple network domains per single OpenStack environment. This feature is used for environments that deploy a large number of target nodes, to avoid the broadcast storms that can occur when all nodes share a single L2 domain. Multiple Cluster Networks can be configured for OpenStack environments that use an encapsulation protocol such as Neutron GRE and are deployed using Fuel 6.0 and later.

This section discusses how support for multiple cluster networks is implemented. Configuring Multiple Cluster Networks tells how to configure this feature for your Fuel environments.

The Multiple Cluster Network feature is based on Node Groups, which are groupings of nodes in the current cluster:

Each of the major logical network (public, management, storage, and fuelweb_admin) is associated with a Node Group rather than a cluster.
Each Node Group belongs to a cluster.
A default Node Group is created for each cluster. The default values are derived from Fuel Menu (for the Fuel Admin (PXE) network) and release metadata.
Each cluster can support multiple Node Groups.

Nailgun manages multiple cluster networks:

A node serializes its network information based on its relationship to networks in its Node Group.
Each node must have a Node Group; if it is not explicitly assigned to one, it is assumed to be a member of the default network group. If it is not configured properly, it will result in the cluster failing to deploy.
A set of default networks is generated when a Node Group is created. These networks are deleted when the Node Group is deleted.
Each logical network is associated with a Node Group rather than with a cluster.
Each fuelweb_admin network must have a DHCP network configured in the dnsmasq.template file.
DHCP requests can be forwarded to the Fuel Master node using either of the following methods:
- configure switches to relay DHCP
- using a relay client such as dhcp-helper

The nodegroups table stores information about all configured Node Groups. To view the contents of this table, issue the fuel nodegroup command

[root@nailgun ~]# fuel nodegroup

id | cluster | name
---|---------|---------------
1  | 1       | default
2  | 1       | alpha

The fields displayed are:

id:	Sequential ID number assigned when the Node Group is created and used as the Public Key for the Node Group.
cluster:	Cluster with which the Node Group is associated.
name:	Display name for the Node Group, assigned by the operator.

The network_groups table can be viewed in the network_1.yaml file.

Public and Floating IP address requirements

This section describes the OpenStack requirements for Public and Floating IP addresses that are available. Each network type (Nova-Network and Neutron) has distinct requirements.

Note

Public and Floating IP ranges must not intersect!

Nova-Network requirements

Both Public and Floating IP ranges should be defined within the same network segment (CIDR). If this is not possible, additional routing settings between these ranges are required on your hardware router to connect the two ranges.

Public range with Nova-Network requirements:

Each deployed node requires one IP address from the Public IP range. In addition, two extra IP addresses for the environment’s Virtual IPs and one for the default gateway are required.

Floating range with Nova-Network requirements:

Every VM instance connected to the external network requires one IP address from the Floating IP range. These IP addresses are assigned on demand and may be released from the VM and returned back to the pool of non-assigned Floating IP addresses.

Neutron requirements

Both Public and Floating IP ranges must be defined inside the same network segment (CIDR)! Fuel cannot configure Neutron with external workarounds at this time.

Public range with Neutron requirements:

Each deployed Controller node and each deployed Zabbix node requires one IP address from the Public IP range. This IP address goes to the node’s bridge to the external network (“br-ex”).
Two additional IP addresses for the environment’s Virtual IPs and one for the default gateway are required.

Note

For 5.1 and later Neutron environments, Public IP addresses can be allocated either to all nodes or just to Controllers and Zabbix servers. By default, IP addressess are allocated to Controllers and Zabbix servers only. To get them allocated to all nodes, Public network assignment -> Assign public network to all nodes should be selected on the Settings tab.
When using Fuel 6.1 to manage 5.0.x environments, the environment must conform to the 5.0.x practice, so each target node must have a Public IP assigned to it, even when using Neutron.
In Fuel 6.1, nodes that do not have Public IP addresses use Controllers to reach out the outside networks. There is a virtual router running on Controller nodes (controlled by Corosync), which utilizes a pair of Public and Management Virtual IPs to NAT traffic from Management to Public network. And nodes with no Public IPs assigned have the default gateway pointed to that Virtual IP from Management network.

Floating range with Neutron requirements:

Each defined tenant, including the Admin tenant, requires one IP address from the Floating range.
This IP address goes to the virtual interface of the tenant’s virtual router. Therefore, one Floating IP is assigned to the Admin tenant automatically as part of the OpenStack deployment process.
Each VM instance connected to the external network requires one IP address from the Floating IP range. These IP addresses are assigned on demand and may be released from the VM and returned back to the pool of non-assigned Floating IP addresses.

Example

Calculate the numbers of the required Public and Floating IP addresses using these formulas:

Neutron

for the Public IP range: [(X+Y) + N];
for the Floating range: [K+M].

Nova-Network

for the Public IP range: [(X+Y+Z) + N];
for the Floating IP range: [M].

Where:

Number of nodes:
- X = controller nodes
- Y = Zabbix nodes
- Z = other nodes (Compute, Storage, and MongoDB)
K = the number of virtual routers for all the tenants (on condition all of them are connected to the external network)
M = the number of virtual instances you want to provide the direct external access to
N = the number of extra IP addresses. It is 3 in total for the following:
- for environment’s virtual IP:
  - virtual IP address for a virtual router
  - public virtual IP address
- 1 for the default gateway

Lets consider the following environment:

X = 3 controller nodes
Y = 1 Zabbix node
Z = 10 compute + 5 Ceph OSD + 3 MongoDB nodes
K = 10 tenants with one router for each tenant connected to the external network
M = 100 VM instances with the direct external access required
N = 3 extra IP addresses

Your calculations will result in the following number of the required IP addresses:

Environment details	Neutron \| \| Nova-Network requirements for \| \| requirements for
Environment details	Public IPs	Floating IPs	Public IPs	Floating IPs
X = 3	✓		✓
Y = 1	✓		✓
Z = 18	✓*		✓
K = 10		✓	n/a	n/a
M = 100		✓		✓
N = 3	✓		✓
Total:	7/25*	110	25	100

Tip

✓* – it is the additional requirement for Public IP range for the 6.1 Neutron environment with Public network assignment -> Assign public network to all nodes set. In the example, it is [(X+Y+Z) + N] = 25.

n/a – this value is not applicable to Nova-Network environments.

Router

Your network must have an IP address in the Public Logical Networks set on a router port as an “External Gateway”. Without this, your VMs are unable to access the outside world. In many of the examples provided in these documents, that IP is 12.0.0.1 in VLAN 101.

If you add a new router, be sure to set its gateway IP:

_images/new_router.png _images/set_gateway.png

The Fuel UI includes a field on the networking tab for the gateway address. When OpenStack deployment starts, the network on each node is reconfigured to use this gateway IP address as the default gateway.

If Floating addresses are from another L3 network, then you must configure the IP address (or multiple IPs if Floating addresses are from more than one L3 network) for them on the router as well. Otherwise, Floating IPs on nodes will be inaccessible.

Consider the following routing recommendations when you configure your network:

Use the default routing via a router in the Public network
Use the the management network to access your management infrastructure (L3 connectivity if necessary)
The Storage and VM networks should be configured without access to other networks (no L3 connectivity)

Switches

You must manually configure your switches before deploying your OpenStack environment. Unfortunately the set of configuration steps, and even the terminology used, is different for different vendors; this section provides some vendor-agnostic information about how traffic should flow. We also provide sample switch configurations:

To configure your switches:

Configure all access ports to allow non-tagged PXE booting connections from each slave node to the Fuel Master node. This network is referred to as the Fuel network.
By default, the Fuel Master node uses the eth0 interface to serve PXE requests on this network, but this can be changed during installation of the Fuel Master node.
If you use the eth0 interface for PXE requests, you must set the switch port for eth0 on the Fuel Master node to access mode.
We recommend that you use the eth0 interfaces of all other nodes for PXE booting as well. Corresponding ports must also be in access mode.
Taking into account that this is the network for PXE booting, do not mix this L2 segment with any other network segments. Fuel runs a DHCP server, and, if there is another DHCP on the same L2 network segment, both the company’s infrastructure and Fuel’s are unable to function properly.
You must also configure each of the switch’s ports connected to nodes as an “STP Edge port” (or a “spanning-tree port fast trunk”, according to Cisco terminology). If you do not do that, DHCP timeout issues may occur.

As soon as the Fuel network is configured, Fuel can operate. Other networks are required for OpenStack environments, and currently all of these networks live in VLANs over the one or multiple physical interfaces on a node. This means that the switch should pass tagged traffic, and untagging is done on the Linux hosts.

Note

For the sake of simplicity, all the VLANs specified on the networks tab of the Fuel UI should be configured on switch ports, pointing to Slave nodes, as tagged.

Of course, it is possible to specify as tagged only certain ports for certain nodes. However, in the current version, all existing networks are automatically allocated for each node, with any role. And network check also checks if tagged traffic can pass, even if some nodes do not require this check. for example, Cinder nodes do not need fixed network traffic)

This is enough to deploy the OpenStack environment. However, from a practical standpoint, it is still not really usable because there is no connection to other corporate networks yet. To make that possible, you must configure uplink port(s).

One of the VLANs may carry the office network. To provide access to the Fuel Master node from your network, any other free physical network interface on the Fuel Master node can be used and configured according to your network rules (static IP or DHCP). The same network segment can be used for Public and Floating ranges. In this case, you must provide the corresponding VLAN ID and IP ranges in the UI. One Public IP per node is used to SNAT traffic out of the VMs network, and one or more floating addresses per VM instance are used to get access to the VM from your network, or even the global Internet. To have a VM visible from the Internet is similar to having it visible from corporate network – corresponding IP ranges and VLAN IDs must be specified for the Floating and Public networks. One current limitation of Fuel is that the user must use the same L2 segment for both Public and Floating networks.

Example configuration for one of the ports on a Cisco switch:

interface GigabitEthernet0/6               # switch port
description s0_eth0 jv                     # description
switchport trunk encapsulation dot1q       # enables VLANs
switchport trunk native vlan 262           # access port, untags VLAN 262
switchport trunk allowed vlan 100,102,104  # 100,102,104 VLANs are passed with tags
switchport mode trunk                      # To allow more than 1 VLAN on the port
spanning-tree portfast trunk               # STP Edge port to skip network loop
                                           # checks (to prevent DHCP timeout issues)
vlan 262,100,102,104                       # Might be needed for enabling VLANs

Neutron Network Topologies

Neutron (formerly Quantum) is a service which provides Networking-as-a-Service functionality in OpenStack. It has a rich tenant-facing API for defining network connectivity and addressing in the cloud, and gives operators the ability to leverage different networking technologies to power their cloud networking.

There are various deployment use cases for Neutron. Fuel supports the most common of them, called Per-tenant Routers with Private Networks. Each tenant has a virtual Neutron router with one or more private networks, which can communicate with the outside world. This allows full routing isolation for each tenant private network.

Neutron is not, however, required in order to run an OpenStack environment. If you don’t need (or want) this added functionality, it’s perfectly acceptable to continue using nova-network.

In order to deploy Neutron, you need to enable it in the Fuel configuration. Fuel sets up Neutron components on each of the controllers to act as a virtual Neutron router in HA (if deploying in HA mode).

Neutron versus Nova-Network

OpenStack networking with Neutron has some differences from Nova-network. Neutron is able to virtualize and manage both layer 2 (logical) and layer 3 (network) of the OSI network model, as compared to simple layer 3 virtualization provided by nova-network. This is the main difference between the two networking models for OpenStack. Virtual networks (one or more) can be created for a single tenant, forming an isolated L2 network called a “private network”. Each private network can support one or many IP subnets. Private networks can be segmented using one of two different topologies:

VLAN segmentation Ideally, “Private network” traffic is located on a dedicated network adapter that is attached to an untagged network port. It is, however, possible for this network to share a network adapter with other networks. In this case, you should use non-intersecting VLAN-ID ranges for “Private network” and other networks.
GRE segmentation In this mode of operation, Neutron does not require a dedicated network adapter. Neutron builds a mesh of GRE tunnels from each compute node and controller nodes to every other node. Private networks for each tenant make use of this mesh for isolated traffic.

Both Neutron topologies are based on OVS (Open vSwitch).

Neutron with VLAN segmentation and OVS

The following diagram shows the network isolation using OVS (Open vSwitch) and VLANs:

Note

You must have at least three network interfaces for this configuration

Neutron with GRE segmentation and OVS

A typical network configuration for Neutron with GRE segmentation might look like this:

Open vSwitch (OVS) GRE tunnels are provided through Management Network.

Note

This setup does not include physical Private network.

Neutron VLAN Segmentation Planning

Depending on the number of NICs you have in your node servers, you can use the following examples to plan your NIC assignment to the OpenStack Logical Networks. Note that you must have at least three NICS configured to use the Neutron VLAN topology.

3 NIC deployment

eth0 – untagged port for Administrative network
eth1 (br-eth1) – port for networks: Public/Floating, Management, Storage
eth2 (br-eth2) – port for Private network (where the number of VLANs depends on the number of tenant networks with a continuous range)

4 NIC deployment

eth0 – port for Administrative network
eth1 (br-eth1) – port for networks: Public/Floating, Management
eth2 (br-eth2) – port for Private network, with defined VLAN range IDs
eth3 (br-eth1) – port for Storage network

Routing recommendations

Use the default routing via a router in the Public network
Use the the management network to access to your management infrastructure (L3 connectivity if necessary)
The administrative network or only the Fuel server (via dedicated NIC) should have Internet access
The Storage and Private network (VLANs) should be configured without access to other networks (no L3 connectivity)

Neutron GRE Segmentation Planning

Depending on the number of NICs you have in your node servers, you can use the following examples to plan your NIC assignment:

2 NIC deployment

eth0 – untagged port for Administrative network
eth1 (br-eth1) – port for networks: Public/Floating, Management, Storage

3 NIC deployment

eth0 – untagged port for Administrative network
eth1 (br-eth1) – port for networks: Public/Floating, Management
eth2 (br-eth2) – port for Storage network

4 NIC deployment

eth0 – untagged port for Administrative network
eth1 (br-eth1) – port for Management network
eth2 (br-eth2) – port for Public/Floating network
eth3 (br-eth3) – port for Storage network

Routing recommendations

Use the default routing via router in the Public network
Use the management network access to your management infrastructure (L3 connectivity if necessary)
The administrative network or only Fuel server (via dedicated NIC) should have Internet access
The Storage and Private network (VLANs) should be configured without access to other networks (no L3 connectivity)

Known limitations

Neutron will not allocate a floating IP range for your tenants. After each tenant is created, a floating IP range must be created. Note that this does not prevent Internet connectivity for a tenant’s instances, but it would prevent them from receiving incoming connections. You, the administrator, should assign a floating IP addresses for the tenant. Below are steps you can follow to do this:
```
# get admin credentials:
source /root/openrc
# get admin tenant-ID:
keystone tenant-list
```
id name enabled

b796f91df6b84860a7cd474148fb2229 admin True

cba7b0ff68ee4985816ac3585c8e23a9 services True
```
# create one floating-ip address for admin tenant:
neutron floatingip-create --tenant-id=b796f91df6b84860a7cd474148fb2229 net04_ext
```
You can’t combine Private or Admin network with any other networks on one NIC.
To deploy OpenStack using Neutron with GRE segmentation, each node requires at least 2 NICs.
To deploy OpenStack using Neutron with VLAN segmentation, each node requires at least 3 NICs.

id	name	enabled
b796f91df6b84860a7cd474148fb2229	admin	True
cba7b0ff68ee4985816ac3585c8e23a9	services	True

NIC Assignment Example (Neutron VLAN)

The current architecture assumes the presence of 3 NICs, but it can be customized for two or 4+ network interfaces. Most servers are built with at least two network interfaces. In this case, let’s consider a typical example of three NIC cards. They are utilized as follows:

eth0:: The Admin (PXE) network, used for communication with Fuel Master for deployment.
eth1:: The public network and floating IPs assigned to VMs
eth2:: The private network, for communication between OpenStack VMs, and the bridge interface (VLANs)

The figure below illustrates the relevant nodes and networks in Neutron VLAN mode.

Nova Network Topologies

Nova-network offers two options for deploying private network for tenants:

FlatDHCP Manager
VLAN Manager

This section describes the Nova-network topologies. For more information about how the network managers work, you can read these blogs:

Nova-network FlatDHCP Manager

In this topology, a bridge (i.e. br100) is configured on every Compute node and one of the machine’s physical interfaces is connected to it. Once the virtual machine is launched, its virtual interface connects to that bridge as well. The same L2 segment is used for all OpenStack projects, which means that there is no L2 isolation between virtual hosts, even if they are owned by separate projects. Additionally, only one flat IP pool is defined for the entire environment. For this reason, it is called the Flat manager.

The simplest case here is as shown on the following diagram of the FlatDHCPManager used with the multi-host scheme. Here the eth1 interface is used to give network access to virtual machines, while the eth0 interface is the management network interface.

Fuel deploys OpenStack in FlatDHCP mode with the multi-host feature enabled. Without this feature enabled, network traffic from each VM would go through the single gateway host, which creates a single point of failure. In multi-host mode, each Compute node becomes a gateway for all the VMs running on the host, providing a balanced networking solution: if one of the Compute nodes goes down, the rest of the environment remains operational.

The current version of Fuel uses VLANs, even for the FlatDHCP network manager. On the Linux host, it is implemented in such a way that it is not the physical network interfaces that connects to the bridge, but the VLAN interface (i.e. eth0.102).

The following diagram illustrates FlatDHCPManager used with the single-interface scheme:

In order for FlatDHCPManager to work, one designated switch port where each Compute node is connected needs to be configured as a tagged (trunk) port with the required VLANs allowed (enabled, tagged). Virtual machines communicate with each other on L2 even if they are on different Compute nodes. If the virtual machine sends IP packets to a different network, they are routed on the host machine according to the routing table. The default route points to the gateway specified on the Network settings tab in the UI as the gateway for the Public network.

The following diagram describes network configuration when you use Nova-network with FlatDHCP Manager:

Nova-network VLAN Manager

The Nova-network VLANManager topology is more suitable for large scale clouds. The idea behind this topology is to separate groups of virtual machines owned by different projects into separate and distinct L2 networks. In VLANManager, this is done by tagging IP frames, identified by a given VLAN. It allows virtual machines inside a specific project to communicate with each other and not to see any traffic from VMs of other projects. Again, as with FlatDHCPManager, switch ports must be configured as tagged (trunk) ports to allow this scheme to work.

The following diagram describes network configuration when you use Nova-network with VLAN Manager:

Fuel Deployment Schema

OpenStack Compute nodes untag the IP packets using VLAN tagging on a physical interface packets and send them to the appropriate VMs. Simplifying the configuration of VLAN Manager, there is no known limitation that Fuel could add in this particular networking mode.

Configuring the network

Once you choose a networking topology (Nova-network FlatDHCP or VLAN), you must configure equipment accordingly. The diagram below shows an example configuration (with a router network IP 12.0.0.1/24).

Fuel operates with a set of logical networks. In this scheme, these logical networks are mapped as follows:

Admin (Fuel) network: untagged on the scheme
Public network: VLAN 101
Floating network: VLAN 101
Management network: VLAN 100
Storage network: VLAN 102
Fixed network: VLANs 103-200

Nova-network Planning Examples

Nova-network FlatDHCP

Depending on the number of NICs you have in your node servers, you can use the following examples to plan your NIC assignment:

1 NIC deployment

eth0 – VLAN tagged port for networks: Storage, Public/Floating, Private, Management and Administrative (untagged)

2 NIC deployment

eth0 – Management network (tagged), Storage network (tagged) and Administrative network (untagged)
eth1 – VLAN tagged port with VLANs for networks: Public/Floating, Private

3 NIC deployment

eth0 – untagged port for Administrative network
eth1 – VLAN tagged port with VLANs for networks: Public/Floating, Private, Management
eth2 – untagged port for Storage network

4 NIC deployment

eth0 – untagged port for Administrative network
eth1 – tagged port for networks: Public/Floating, Management
eth2 – untagged port for Private network
eth3 – untagged port for Storage network

Routing recommendations

Use the default routing via a router in the Public network
Use the the management network to access to your management infrastructure (L3 connectivity if necessary)
The administrative network or only the Fuel server (via dedicated NIC) should have Internet access
The Storage and Private network (VLANs) should be configured without access to other networks (no L3 connectivity)

Nova-network VLAN Manager

Depending on the number of NICs you have in your node servers, you can use the following examples to plan your NIC assignment to the OpenStack Logical Networks:

1 NIC deployment

eth0 – VLAN tagged port for networks: Storage, Public/Floating, Private (where the number of VLANs depends on the number of tenant networks with a continuous range), Management and Administrative network (untagged)

2 NIC deployment

eth0 – Management network (tagged), Storage network (tagged) and Administrative network (untagged)
eth1 – VLAN tagged port with minimum two VLANs for networks: Public/Floating, Private (where number of VLANs depend on number of tenant networks – continuous range)

3 NIC deployment

eth0 – untagged port for Administrative network
eth1 – VLAN tagged port with two VLANs for networks: Public/Floating, Management Private (where the number of VLANs depends on the number of tenant networks with a continuous range)
eth2 – untagged port for Storage network

4 NIC deployment

eth0 – untagged port for Administrative network
eth1 – tagged port for networks: Public/Floating, Management
eth2 – VLAN tagged port for Private network, with defined VLAN range IDs – continuous range
eth3 – untagged port for Storage network

Routing recommendations

Use the default routing via a router in the Public network
Use the the management network to access to your management infrastructure (L3 connectivity if necessary)
The administrative network or only the Fuel server (via dedicated NIC) should have Internet access
The Storage and Private network (VLANs) should be configured without access to other networks (no L3 connectivity)

Advanced Network Configuration using Open VSwitch

The Neutron networking model uses Open VSwitch (OVS) bridges and the Linux namespaces to create a flexible network setup and to isolate tenants from each other on L2 and L3 layers. Mirantis OpenStack also provides a flexible network setup model based on Open VSwitch primitives, which you can use to customize your nodes. Its most popular feature is link aggregation. While the FuelWeb UI uses a hardcoded per-node network model, the Fuel CLI tool allows you to modify it in your own way.

Note

When using encapsulation protocols for network segmentation, take header overhead into account to avoid guest network slowdowns from packet fragmentation or packet rejection. With a physical host MTU of 1500 the maximum instance (guest) MTU is 1430 for GRE and 1392 for VXLAN. When possible, increase MTU on the network infrastructure using jumbo frames. The default OpenVSwitch behavior in Mirantis OpenStack 6.0 and newer is to fragment packets larger than the MTU. In prior versions OpenVSwitch discards packets exceeding MTU. See the Official OpenStack documentation for more information.

Reference Network Model in Neutron

The FuelWeb UI uses the following per-node network model:

Create an OVS bridge for each NIC except for the NIC with Admin network (for example, br-eth0 bridge for eth0 NIC) and put NICs into their bridges
Create a separate bridge for each OpenStack network:
- br-ex for the Public network
- br-prv for the Private network
- br-mgmt for the Management network
- br-storage for the Storage network
Connect each network’s bridge with an appropriate NIC bridge using an OVS patch with an appropriate VLAN tag.
Assign network IP addresses to the corresponding bridges.

Note that the Admin network IP address is assigned to its NIC directly.

This network model allows the cluster administrator to manipulate cluster network entities and NICs separately, easily, and on the fly during the cluster life-cycle.

Adjust the Network Configuration via CLI

On a basic level, this network configuration is part of a data structure that provides instructions to the Puppet modules to set up a network on the current node. You can examine and modify this data using the Fuel CLI tool. Just download (then modify and upload if needed) the environment’s ‘deployment default’ configuration:

[root@fuel ~]# fuel --env 1 deployment default
directory /root/deployment_1 was created
Created /root/deployment_1/compute_1.yaml
Created /root/deployment_1/controller_2.yaml
[root@fuel ~]# vi ./deployment_1/compute_1.yaml
[root@fuel ~]# fuel --env 1 deployment --upload

Note

Please, make sure you read the Fuel CLI documentation carefully.

The part of this data structure that describes how to apply the network configuration is the ‘network_scheme’ key in the top-level hash of the YAML file. Let’s take a closer look at this substructure. The value of the ‘network_scheme’ key is a hash with the following keys:

interfaces – A hash of NICs and their low-level/physical parameters. You can set an MTU feature here.
provider – Set to ‘ovs’ for Neutron.
endpoints – A hash of network ports (OVS ports or NICs) and their IP settings.
roles – A hash that specifies the mappings between the endpoints and internally-used roles in Puppet manifests (‘management’, ‘storage’, and so on).
transformations – An ordered list of OVS network primitives.

Here is an example of a “network_scheme” section in a node’s configuration, showing how to change MTU parameters:

network_scheme:
 endpoints:
   br-ex:
     IP:
     - 172.16.0.7/24
     gateway: 172.16.0.1
   br-fw-admin:
     IP:
     - 10.20.0.7/24
   br-mgmt:
     IP:
     - 192.168.0.7/24
   br-prv:
     IP: none
   br-storage:
     IP:
     - 192.168.1.6/24
 interfaces:
   eth0:
     mtu: 1234
     L2:
       vlan_splinters: 'off'
   eth1:
     mtu: 4321
     L2:
       vlan_splinters: 'off'
   eth2:
     L2:
       vlan_splinters: 'off'

The “Transformations” Section

You can use four OVS primitives:

add-br – To add an OVS bridge to the system
add-port – To add a port to an existent OVS bridge
add-bond – To create a port in OVS bridge and add aggregated NICs to it
add-patch – To create an OVS patch between two existing OVS bridges

The primitives will be applied in the order they are listed.

Here are the the available options:

{
  "action": "add-br",         # type of primitive
  "name": "xxx"               # unique name of the new bridge
},
{
  "action": "add-port",       # type of primitive
  "name": "xxx-port",         # unique name of the new port
  "bridge": "xxx",            # name of the bridge where the port should be created
  "type": "internal",         # [optional; default: "internal"] a type of OVS
                              # interface # for the port (see OVS documentation);
                              # possible values:
                              # "system", "internal", "tap", "gre", "null"
  "tag": 0,                   # [optional; default: 0] a 802.1q tag of traffic that
                              # should be captured from an OVS bridge;
                              # possible values: 0 (means port is a trunk),
                              # 1-4094 (means port is an access)
  "trunks": [],               # [optional; default: []] a set of 802.1q tags
                              # (integers from 0 to 4095) that are allowed to
                              # pass through if "tag" option equals 0;
                              # possible values: an empty list (all traffic passes),
                              # 0 (untagged traffic only), 1 (strange behaviour;
                              # shouldn't be used), 2-4095 (traffic with this
                              # tag passes); e.g. [0,10,20]
  "port_properties": [],      # [optional; default: []] a list of additional
                              # OVS port properties to modify them in OVS DB
  "interface_properties": [], # [optional; default: []] a list of additional
                              # OVS interface properties to modify them in OVS DB
},
{
  "action": "add-bond",       # type of primitive
  "name": "xxx-port",         # unique name of the new bond
  "interfaces": [],           # a set of two or more bonded interfaces' names;
                              # e.g. ['eth1','eth2']
  "bridge": "xxx",            # name of the bridge where the bond should be created
  "tag": 0,                   # [optional; default: 0] a 802.1q tag of traffic which
                              # should be catched from an OVS bridge;
                              # possible values: 0 (means port is a trunk),
                              # 1-4094 (means port is an access)
  "trunks": [],               # [optional; default: []] a set of 802.1q tags
                              # (integers from 0 to 4095) which are allowed to
                              # pass through if "tag" option equals 0;
                              # possible values: an empty list (all traffic passes),
                              # 0 (untagged traffic only), 1 (strange behaviour;
                              # shouldn't be used), 2-4095 (traffic with this
                              # tag passes); e.g. [0,10,20]
  "properties": [],           # [optional; default: []] a list of additional
                              # OVS bonded port properties to modify them in OVS DB;
                              # you can use it to set the aggregation mode and
                              # balancing # strategy, to configure LACP, and so on
                              # (see the OVS documentation)
},
{
  "action": "add-patch",      # type of primitive
  "bridges": ["br0", "br1"],  # a pair of different bridges that will be connected
  "peers": ["p1", "p2"],      # [optional] abstract names for each end of the patch
  "tags": [0, 0] ,            # [optional; default: [0,0]] a pair of integers that
                              # represent an 802.1q tag of traffic that is
                              # captured from an appropriate OVS bridge; possible
                              # values: 0 (means port is a trunk), 1-4094 (means
                              # port is an access)
  "trunks": [],               # [optional; default: []] a set of 802.1q tags
                              # (integers from 0 to 4095) which are allowed to
                              # pass through each bridge if "tag" option equals 0;
                              # possible values: an empty list (all traffic passes),
                              # 0 (untagged traffic only), 1 (strange behavior;
                              # shouldn't be used), 2-4095 (traffic with this
                              # tag passes); e.g., [0,10,20]
}

A combination of these primitives allows you to make custom and complex network configurations.

NICs Aggregation

The NIC bonding allows you to aggregate multiple physical links to one link to increase speed and provide fault tolerance.

Documentation

The Linux kernel documentation about bonding can be found in Linux Ethernet Bonding Driver HOWTO
You can find shorter introduction to bonding and tips on link monitoring here
Cisco switches configuration guide
Switches configuration tips for Fuel can be found here

Types of Bonding

Open VSwitch supports the same bonding features as the Linux kernel. Fuel supports bonding either via Open VSwitch or by using Linux native bonding interfaces. Open vSwitch mode is supported in Fuel UI and should be used by default. You may want to fall back to Linux native interfaces if Open vSwitch bonding is not working for you or is not compatible with your hardware.

Linux supports two types of bonding: * IEEE 802.1AX (formerly known as 802.3ad) Link Aggregation Control Protocol (LACP). Devices on both sides of links must communicate using LACP to set up an aggregated link. So both devices must support LACP, enable and configure it on these links. * One side bonding does not require any special feature support from the switch side. Linux handles it using a set of traffic balancing algorithms.

One Side Bonding Policies:

Balance-rr – Round-robin policy. This mode provides load balancing and fault tolerance.
Active-backup – Active-backup policy: Only one slave in the bond is active.This mode provides fault tolerance.
Balance-xor – XOR policy: Transmit based on the selected transmit hash policy. This mode provides load balancing and fault tolerance.
Broadcast – Broadcast policy: transmits everything on all slave interfaces. This mode provides fault tolerance.
balance-tlb – Adaptive transmit load balancing based on a current links’ utilization. This mode provides load balancing and fault tolerance.
balance-alb – Adaptive transmit and receive load balancing based on the current links’ utilization. This mode provides load balancing and fault tolerance.
balance-slb – Modification of balance-alb mode. SLB bonding allows a limited form of load balancing without the remote switch’s knowledge or cooperation. SLB assigns each source MAC+VLAN pair to a link and transmits all packets from that MAC+VLAN through that link. Learning in the remote switch causes it to send packets to that MAC+VLAN through the same link.
balance-tcp – Adaptive transmit load balancing among interfaces.

LACP Policies:

Layer2 – Uses XOR of hardware MAC addresses to generate the hash.
Layer2+3 – uses a combination of layer2 and layer3 protocol information to generate the hash.
Layer3+4 – uses upper layer protocol information, when available, to generate the hash.
Encap2+3 – uses the same formula as layer2+3 but it relies on skb_flow_dissect to obtain the header fields which might result in the use of inner headers if an encapsulation protocol is used. For example this will improve the performance for tunnel users because the packets will be distributed according to the encapsulated flows.
Encap3+4 – Similar to Encap2+3 but uses layer3+4.

Policies Supported by Fuel

Fuel supports the following policies: Active Backup, Balance SLB and LACP Balance TCP. These interfaces can be configured in Fuel UI when nodes are being added tho the environment or by using Fuel CLI and editing YAML configuration manually.

Network Verification in Fuel

Fuel has limited network verification capabilities when working with bonds. Network connectivity can be checked for the new cluster only (not for deployed one) so check is done when nodes are in bootstrap and no bonds are up. Connectivity between slave interfaces can be checked but not bonds themselves.

An Example of NIC Aggregation using Fuel CLI tools

Suppose you have a node with 4 NICs and you want to bond two of them with LACP enabled (“eth2” and “eth3” here) and then assign Private and Storage networks to them. The Admin network uses a dedicated NIC (“eth0”). The Management and Public networks use the last NIC (“eth1”).

To create bonding interface using Open vSwitch, do the following:

Create a separate OVS bridge “br-bond0” instead of “br-eth2” and “br-eth3”.
Connect “eth2” and “eth3” to “br-bond0” as a bonded port with property “lacp=active”.
Connect “br-prv” and “br-storage” bridges to “br-bond0” by OVS patches.
Leave all of the other things unchanged.

Here is an example of “network_scheme” section in the node configuration:

'network_scheme':
  'provider': 'ovs'
  'version': '1.0'
  'interfaces':
    'eth0': {}
    'eth1': {}
    'eth2': {}
    'eth3': {}
  'endpoints':
    'br-ex':
      'IP': ['172.16.0.2/24']
      'gateway': '172.16.0.1'
    'br-mgmt':
      'IP': ['192.168.0.2/24']
    'br-prv': {'IP': 'none'}
    'br-storage':
      'IP': ['192.168.1.2/24']
    'eth0':
      'IP': ['10.20.0.4/24']
  'roles':
    'ex': 'br-ex'
    'fw-admin': 'eth0'
    'management': 'br-mgmt'
    'private': 'br-prv'
    'storage': 'br-storage'
  'transformations':
  - 'action': 'add-br'
    'name': 'br-ex'
  - 'action': 'add-br'
    'name': 'br-mgmt'
  - 'action': 'add-br'
    'name': 'br-storage'
  - 'action': 'add-br'
    'name': 'br-prv'
  - 'action': 'add-br'
    'name': 'br-bond0'
  - 'action': 'add-br'
    'name': 'br-eth1'
  - 'action': 'add-bond'
    'bridge': 'br-bond0'
    'interfaces': ['eth2', 'eth3']
    'properties': ['lacp=active']
    'name': 'bond0'
  - 'action': 'add-port'
    'bridge': 'br-eth1'
    'name': 'eth1'
  - 'action': 'add-patch'
    'bridges': ['br-bond0', 'br-storage']
    'tags': [103, 0]
  - 'action': 'add-patch'
    'bridges': ['br-eth1', 'br-ex']
    'tags': [101, 0]
  - 'action': 'add-patch'
    'bridges': ['br-eth1', 'br-mgmt']
    'tags': [102, 0]
  - 'action': 'add-patch'
    'bridges': ['br-bond0', 'br-prv']

If you are going to use Linux native bonding, follow these steps:

Create a new interface “bond0” instead of “br-eth2” and “br-eth3”.
Connect “eth2” and “eth3” to “bond0” as a bonded port.
Add ‘provider’: ‘lnx’ to choose Linux native mode.
Add properties as a hash instead of an array used in ovs mode. Properties are same as options used during the bonding kernel modules loading. You should provide which mode this bonding interface should use. Any other options are not mandatory. You can find all these options in the Linux Kernel Documentation.

‘properties’:

‘mode’: 1
Connect “br-prv” and “br-storage” bridges to “br-bond0” by OVS patches.
Leave all of the other things unchanged.

'network_scheme':
  'provider': 'ovs'
  'version': '1.0'
  'interfaces':
    'eth0': {}
    'eth1': {}
    'eth2': {}
    'eth3': {}
  'endpoints':
    'br-ex':
      'IP': ['172.16.0.2/24']
      'gateway': '172.16.0.1'
    'br-mgmt':
      'IP': ['192.168.0.2/24']
    'br-prv': {'IP': 'none'}
    'br-storage':
      'IP': ['192.168.1.2/24']
    'eth0':
      'IP': ['10.20.0.4/24']
  'roles':
    'ex': 'br-ex'
    'fw-admin': 'eth0'
    'management': 'br-mgmt'
    'private': 'br-prv'
    'storage': 'br-storage'
  'transformations':
  - 'action': 'add-br'
    'name': 'br-ex'
  - 'action': 'add-br'
    'name': 'br-mgmt'
  - 'action': 'add-br'
    'name': 'br-storage'
  - 'action': 'add-br'
    'name': 'br-prv'
  - 'action': 'add-br'
    'name': 'br-bond0'
  - 'action': 'add-br'
    'name': 'br-eth1'
  - 'action': 'add-bond'
    'bridge': 'br-bond0'
    'interfaces': ['eth2', 'eth3']
    'provider': 'lnx'
    'properties':
      'mode': '1'
    'name': 'bond0'
  - 'action': 'add-port'
    'bridge': 'br-eth1'
    'name': 'eth1'
  - 'action': 'add-patch'
    'bridges': ['br-bond0', 'br-storage']
    'tags': [103, 0]
  - 'action': 'add-patch'
    'bridges': ['br-eth1', 'br-ex']
    'tags': [101, 0]
  - 'action': 'add-patch'
    'bridges': ['br-eth1', 'br-mgmt']
    'tags': [102, 0]
  - 'action': 'add-patch'
    'bridges': ['br-bond0', 'br-prv']

How Fuel upgrade works

Users running Fuel 6.0 can upgrade the Fuel Master Node to the latest release. See Upgrading and Updating from Earlier Releases for instructions. This section discusses the processing flow for the Fuel upgrade.

The upgrade is implemented with three upgrade engines (also called upgraders or upgrade stages). The engines are python modules that are located in a separate directory:

Host system engine — Copies new repositories to Fuel Master node, installs the fuel-6.1.0.rpm package and all the dependencies such as Puppet manifests, bootstrap images, provisioning images and so on.
Docker engine:
1. Point the supervisor to a new directory with the configuration files. Since it is empty, no containers will be started by the supervisor.
2. Stop old containers.
3. Upload new Docker images.
4. Run containers one by one, in the proper order.
5. Generate new supervisor configs.
6. Verify the services running in the containers.
OpenStack engine — Installs all data that is required for the OpenStack patching feature.
1. Adds new releases using the Nailgun REST API. This allows the full list of OpenStack releases to be displayed in the Fuel UI.

Design considerations:

The Docker engine does not use supervisord to run the services during upgrade because it can cause race conditions, especially if the iptables clean-up script runs at the same time. In addition, supervisord may not always be able to start all containers, which can result in NAT rules that have the same port number but different IP addresses.
Stopping containers during the upgrade process may interrupt non-atomic actions such as database migration in the Keystone container.
Running containers one by one prevents IP duplication problems that could otherwise occur during the upgrade because of a Docker IP allocation bug.
A set of pre-upgrade hooks are run before the upgrade engines to perform some necessary preliminary steps for upgrade. This is not the optimal implementation, but is required for Fuel to manage environments that were deployed with earlier versions that had a different design. For example, one of these hooks adds default login credentials to the configuration file before the upgrade process runs; this is required because earlier versions of Fuel did not have the authentication feature.

How the Operating System Role is provisioned

Fuel provisions the Operating System Role with either the CentOS or Ubuntu operating system that was selected for the environment but Puppet does not deploy other packages on this node or provision the node in any way.

The Operating System role is defined in the openstack.yaml file; the internal name is base-os. Fuel installs a standard set of operating system packages similar to what it installs on other roles; use the dpkg -l command on Ubuntu or the rpm -qa command on CentOS to see the exact list of packages that are installed.

A few configurations are applied to an Operating System role. For environments provisioned with the traditional tools, these configurations are applied by Cobbler snippets that run during the provisioning phase. When using image-based provisioning, cloud init applies these configurations. These include:

Disk partitioning. The default partitioning allocates a small partition (about 15GB) on the first disk for the root partition and leaves the rest of the space unallocated; users can manually allocate the remaining space.
The public key that is assigned to all target nodes in the environment
The Kernel parameters that are applied to all target nodes
Network settings configure the Admin logical networks with a static IP address. No other networking is configured.

The following configurations that are set in the Fuel Web UI have no effect on the Operating System role:

Mapping of logical networks to physical interfaces. All connections for the logical networks that connect this node to the rest of the environment need to be defined.
Debug logging
Syslog

See Configuring an Operating System node for information about configuring a provisioned Operating System role.

Two provisioning methods

There are two possible methods of provisioning an operating system on a node. They are:

Classic method — Anaconda or Debian-installer is used to build the operating system from scratch on each node using online or local repositories.
Image based method — A base image is created and copied to each node to be used to deploy the operating system on the local disks.

Starting with Mirantis Openstack 6.1, the image based method is used by default. It significantly reduces the time required for provisioning and it is more reliable to copy the same image on all nodes instead of building an operating system from scratch on each node.

Image Based Provisioning

Image based provisioning is implemented using the Fuel Agent. The image based provisioning process consists of two independent steps, which are:

Operating system image building.

This step assumes that we build an operating system image from a set of repositories in a directory which is then packed into the operating system image. The build script is run once no matter how many nodes one is going to deploy.

Currently, the CentOS image is built at the development stage and then this image is put into Mirantis OpenStack ISO and used for all CentOS based environments.

Ubuntu images are built on the master node, one operating system image per environment. We need to build different images for each environment because each environment has its own set of repositories. In order to deal with package differences between repository sets, we create an operating system image for each environment. When the user clicks the “Deploy changes” button, we check if the operating system package is already available for a particular environment, and if it is not, we build a new one just before starting the actual provisioning.

Copying of operating system image to nodes.

Operating system images that have been built can be downloaded via HTTP from the Fuel Master node. So, when a node is booted into the so called Bootstrap operating system, we can run an executable script to download the necessary operating system image and put it on a hard drive. We don’t need to reboot the node into the installer OS like we do when we use an Anaconda or Debian-installer. Our executable script in this case plays the same role. We just need it to be installed into the Bootstrap operating system.

For both of these steps we have a special program component which is called Fuel Agent. Fuel Agent is nothing more than just a set of data driven executable scripts. One of these scripts is used for building operating system images and we run this script on the master node passing a set of repository URIs and a set of package names to it. Another script is used for the actual provisioning. We run it on each node and pass provisioning data to it. These data contain information about disk partitions, initial node configuration, operating system image location, etc. So, this script being run on a node, prepares disk partitions, downloads operating system images and puts these images on partitions. It is necessary to note that when we say operating system image we actually mean a set of images, one per file system. If, for example, we want / and /boot be two separate file systems, then this means we need to separate the operating system images, one for / and another for /boot. Images in this case are binary copies of corresponding file systems.

Fuel Agent

Fuel Agent is a set of data driven executable scripts. It is written in Python. Its high level architecture is depicted below:

When we run one of its executable entry, we pass the input data to it where it is written what needs to be done and how. We also point out which data driver it needs to use in order to parse these input data. For example:

/usr/bin/provision --input_data_file /tmp/provision.json --data_driver nailgun

The heart of Fuel Agent is the manager fuel_agent/manager.py, which does not directly understand input data, but it does understand sets of Python objects defined infuel_agent/objects. Data driver is the place where raw input data are converted into a set of objects. Using this set of objects manager then does something useful like creating partitions, building operating system images, etc. But the manager implements only high-level logic for all these cases and uses a low-level utility layer which is defined in fuel_agent/utils to perform real actions like launching parted or mkfs commands.

The Fuel Agent config file is located in /etc/fuel-agent/fuel-agent.conf. There are plenty of configuration parameters that can be set and all these parameters have default values which are defined in the source code. All configuration parameters are well commented.

The Fuel Agent leverages cloud-init for the Image based deployment process. It also creates a cloud-init drive which allows for post-provisioning configuration. The config drive uses jinja2 templates which can be found in /usr/share/fuel-agent/cloud-init-templates. These templates are filled with values given from the input data.

Image building

When Ubuntu based environment is being provisioned, there is a pre-provisioning task which runs the /usr/bin/fa_build_image script. This script is one of the executable Fuel Agent entry points. This script is installed in the ‘mcollective’ docker container on the Fuel master node. As input data we pass a list of Ubuntu repositories from which an operating system image is built and some other metadata. When launched, Fuel Agent checks if there is a Ubuntu image available for this environment and if there is not, it builds an operating system image and puts this image in a directory defined in the input data so as to make it available via HTTP. See the sequence diagram below:

Operating system provisioning

The Fuel Agent is installed into a bootstrap ramdisk. An operating system can easily be installed on a node if the node has been booted with this ramdisk. We can simply run the/usr/bin/provision executable with the required input data to start provisioning. This allows provisioning to occur without a reboot unlike the classic provisioning method using Anaconda or Debian-installer.

The input data need to contain at least the following information:

Partitioning scheme for the node. This scheme needs to contain information about the necessary partitions and on which disks we need to create these partitions, information about the necessary LVM groups and volumes, about software raid devices. This scheme contains also information about on which disk a bootloader needs to be installed and about the necessary file systems and their mount points. On some block devices we are assumed to put operating system images (one image per file system), while on other block devices we need to create file systems using the mkfs command.
Operating system images URIs. Fuel Agent needs to know where to download the images and which protocol to use for this (by default, HTTP is used).
Data for initial node configuration. Currently, we use cloud-init for the initial configuration and Fuel Agent prepares the cloud-init config drive which is put on a small partition at the end of the first hard drive. Config drive is created using jinja2 templates which are to be filled with values given from the input data. After the first reboot, cloud-init is run by upstart or similar. It then finds this config drive and configures services like NTP, MCollective, etc. It also performs an initial network configuration to make it possible for Fuel to access this particular node via SSH or MCollective and run Puppet to perform the final deployment.

The sequence diagram is below:

Viewing the control files on the Fuel Master node

To view the contents of the bootstrap ramdisk, run the following commands on the Fuel Master node:

cd /var/www/nailgun/bootstrap
mkdir initramfs
cd initramfs
gunzip -c ../initramfs.img | cpio -idv

You are now in the root file system of the ramdisk and can view the files that are included in the bootstrap node. For example:

cat /etc/fuel-agent/fuel-agent.conf

Troubleshooting image-based provisioning

The following files provide information for analyzing problems with the Fuel Agent provisioning.

Bootstrap
- etc/fuel-agent/fuel-agent.conf — main configuration file for the Fuel Agent, defines the location of the provision data file, data format and log output, whether debugging is on or off, and so forth.
- tmp/provision.json — Astute puts this file on a node (on the in-memory file system) just before running the provision script.
- usr/bin/provision — executable entry point for provisioning. Astute runs this; it can also be run manually.
Master
- var/log/remote/node-N.domain.tld/bootstrap/fuel-agent.log — this is where Fuel Agent log messages are recorded when the provision script is run; <N> is the nodeID of the provisioned node.

Task-based deployment

Task schema

Tasks that are used to build a deployment graph can be grouped according to the common types:

- id: graph_node_id
  type: one of [stage, group, skipped, puppet, shell etc]
  role: [match where this tasks should be executed]
  requires: [requirements for a specific node]
  required_for: [specify which nodes depend on this task]

Stages

Stages are used to build a graph skeleton. The skeleton is then extended with additional functionality like provisioning, etc.

The deployment graph of Fuel 6.1 has the following stages:

- pre_deployment_start
- pre_deployment_end
- deploy_start
- deploy_end
- post_deployment_start
- post_deployment_end

Here is the stage example:

- id: deploy_end
  type: stage
  requires: [deploy_start]

Groups

In Fuel 6.1, groups are a representation of roles in the main deployment graph:

- id: controller
  type: group
  role: [controller]
  requires: [primary-controller]
  required_for: [deploy_end]
  parameters:
    strategy:
      type: parallel
        amount: 6

Note

Primary-controller should be installed when Controller starts its own execution. The execution of this group should be finished to consider deploy_end done.

Here is the full graph of groups, available in 6.1:

_images/groups.png

Strategy

You can also specify a strategy for groups in the parameters section. Fuel 6.1 supports the following strategies:

parallel – all nodes in this group will be executed in parallel. If there are other groups that do not depend on each other, they will be executed in parallel as well. For example, Cinder and Compute groups.
parallel by amount – run in parallel by a specified number. For example, amount: 6.
one_by_one – deploy all nodes in this group in a strict one-by-one succession.

Skipped

Making a task skipped will guarantee that this task will not be executed, but all the task’s depdendencies will be preserved:

- id: netconfig
  type: skipped
  groups: [primary-controller, controller, cinder, compute, ceph-osd,
           zabbix-server, primary-mongo, mongo]
  required_for: [deploy_end]
  requires: [logging]
  parameters:
    puppet_manifest: /etc/puppet/modules/osnailyfacter/other_path/netconfig.pp
    puppet_modules: /etc/puppet/modules
    timeout: 3600

Puppet

Task of type: puppet is the preferable way to execute the deployment code on nodes. Only mcollective agent is capable of executing code in background.

In Fuel 6.1, this is the only task that can be used in the main deployment stages, between deploy_start and deploy_end.

Example:

- id: netconfig
    type: puppet
    groups: [primary-controller, controller, cinder, compute, ceph-osd,
             zabbix-server, primary-mongo, mongo]
    required_for: [deploy_end]
    requires: [logging]
    parameters:
      puppet_manifest: /etc/puppet/modules/osnailyfacter/other_path/netconfig.pp
      puppet_modules: /etc/puppet/modules
      timeout: 3600

Shell

Shell tasks should be used outside of the main deployment procedure. Basically, shell tasks will just execute the blocking command on specified roles.

Example:

- id: enable_quorum
  type: shell
  role: [primary-controller]
  requires: [post_deployment_start]
  required_for: [post_deployment_end]
  parameters:
    cmd: ruby /etc/puppet/modules/osnailyfacter/modular/astute/enable_quorum.rb
    timeout: 180

Upload file

This task will upload data specified in data parameters to the path destination:

- id: upload_data_to_file
  type: upload_file
  role: '*'
  requires: [pre_deployment_start]
  parameters:
    path: /etc/file_name
    data: 'arbitrary info'

Sync

Sync task will distribute files from src direcory on the Fuel Master node to dst directory on target hosts that will be matched by role:

- id: rsync_core_puppet
  type: sync
  role: '*'
  required_for: [pre_deployment_end]
  requires: [upload_core_repos]
  parameters:
    src: rsync://<FUEL_MASTER_IP>:/puppet/
    dst: /etc/puppet
    timeout:

Copy files

Task with copy_files type will read data from src and save it in the file specified in dst argument. Permissions can be specified for a group of files, as provided in example:

- id: copy_keys
  type: copy_files
  role: '*'
  required_for: [pre_deployment_end]
  requires: [generate_keys]
  parameters:
    files:
      - src: /var/lib/fuel/keys/{CLUSTER_ID}/neutron/neutron.pub
        dst: /var/lib/astute/neutron/neutron.pub
    permissions: '0600'
    dir_permissions: '0700'

API

If you want to change or add some tasks right on the Fuel Master node, just add the tasks.yaml file and respective manifests in the folder for the release that you are interested in. Then run the following command:

fuel rel --sync-deployment-tasks --dir /etc/puppet

If you want to overwrite the deployment tasks for any specific release/cluster, use the following commands:

fuel rel --rel <id> --deployment-tasks --download
fuel rel --rel <id> --deployment-tasks --upload

fuel env --env <id> --deployment-tasks --download
fuel env --env <id> --deployment-tasks --upload

After this is done, you will be able to run a customized graph of tasks. To do that, use a basic command:

fuel node --node <1>,<2>,<3> --tasks upload_repos netconfig

The developer will need to specify nodes that should be used in deployment and task IDs. The order in which these are provided does not matter. It will be computed from the dependencies specified in the database.

Note

The node will not be executed, if a task is mapped to Controller role, but the node where you want to apply the task does not have this role.

Skipping tasks

Use the skip parameter to skip tasks:

fuel node --node <1>,<2>,<3> --skip netconfig hiera

The list of tasks specified with the skip parameter will be skipped during graph traversal in Nailgun.

If there are task dependencies, you may want to make use of a “smarter” traversal – you will need to specify the start and end nodes in the graph:

fuel node --node <1>,<2>,<3> --end netconfig

This will deploy everything up to the netconfig task, including it. This means, that this commands will deploy all tasks that are a part of pre_deployment: keys generation, rsync manifests, sync time, upload repos, including such tasks as hiera setup, globals computation and maybe some other basic preparatory tasks:

fuel node --node <1>,<2>,<3> --start netconfig

Start from netconfig task (including it), deploy all the tasks that are a part of post_deployment.

For example, if you want to execute only the netconfig successors, use:

fuel node --node <1>,<2>,<3> --start netconfig --skip netconfig

You will also be able to use start and end at the same time:

fuel node --node <1>,<2>,<3> --start netconfig --end upload_cirros

Nailgun will build a path that includes only necessary tasks to join these two points.

Graph representation

Beginning with Fuel 6.1, in addition to commands above, there also exists a helper that allows to download deployment graph in DOT DOT format and later render it.

Commands for downloading graphs

Use the following commands to download graphs:

To download a full graph for environment with id 1 and print it on the screen, use the command below. Note, that it will print its output to the stdout.
```
fuel graph --env <1> --download
```
To download graph and save it to the graph.gv file:
```
fuel graph --env <1> --download > graph.gv
```

It is also possible to specify the same options as for the deployment command. Point out start and end nodes in graph:

fuel graph --env <1> --download --start netconfig > graph.gv

fuel graph --env <1> --download --end netconfig > graph.gv

You can also specify both:

fuel graph --env <1> --download --start netconfig --end upload_cirros > graph.gv

To skip the tasks (they will be grayed out in the graph visualization), use:
```
fuel graph --env <1> --download --skip netconfig hiera  > graph.gv
```

To completely remove skipped tasks from graph visualization, use --remove parameter:

fuel graph --env <1> --download --start netconfig --end upload_cirros --remove skipped > graph.gv

To see only parents of a particular tasks:

fuel graph --env 1 --download --parents-for hiera  > graph.gv

Commands for rendering graphs

Downloaded graph in DOT format can be rendered. It requires additional packages to be installed:
- Graphviz using apt-get install graphviz or yum install graphviz commands.
- pydot-ng using pip install pydot-ng command or pygraphivz using pip install pygraphivz command.
After installing the packages, you can render the graph using the command below. It will take the contents of graph.gv file, render it as a PNG image and save asgraph.gv.png.
```
fuel graph --render graph.gv
```
To read graph representation from the st, use:
```
fuel graph --render -
```
To avoid creating an intermediate file when downloading and rendering graph, you can combine both commands:
```
fuel graph --env <1> --download | fuel graph --render -
```

FAQ

What can I use for deployment with groups?

In Fuel 6.1, it is possible to use only Puppet for the main deployment.

All agents, except for Puppet, work in a blocking way. The current deployment model cannot execute some tasks that are blocking and non-blocking.

In the pre_deployment and post_deployment stages, any of the supported task drivers can be used.

Is it possible to specify cross-dependencies between groups?

In Fuel 6.0 or earlier, there is no model that will allow to run tasks on a primary Controller, then run on a controlle with getting back to the primary Controller.

In Fuel 6.1, cross-dependencies are resolved by the post_deployment stage.

How I can end at the provision state?

Provision is not a part of task-based deployment in Fuel 6.1.

How to stop deployment at the network configuration state?

Fuel CLI call can be used: it will execute the deployment up to the network configuration state:

fuel node --node <1>,<2>,<3> --end netconfig

Additional task for an existing role

If you would like to add extra task for an existing role, follow these steps:

Add the task description to /etc/puppet/2014.2.2-6.1/modules/my_tasks.yaml file.

- id: my_task
type: puppet
groups: [compute]
required_for: [deploy_end]
requires: [netconfig]
parameters:
   puppet_manifest: /etc/puppet/modules/my_task.pp
   puppet_modules: /etc/puppet/modules
   timeout: 3600

Run the following command:

fuel rel --sync-deployment-tasks --dir /etc/puppet/2014.2.2-6.1

After syncing the task to nailgun database, you will be able to deploy it on the selected groups.

Skipping task by API or by configuration

There are several mechanisms to skip a certain task.

To skip a task, you can use one of the following:

Change the task’s type to skipped:

- id: horizon
type: skipped
role: [primary-controller]
requires: [post_deployment_start]
required_for: [post_deployment_end]

Add a condition that is always false:

 - id: horizon
type: puppet
role: [primary-controller]
requires: [post_deployment_start]
required_for: [post_deployment_end]
condition: 'true != false'

Do an API request:

fuel node --node <1>,<2>,<3> --skip horizon

Creating a separate role and attaching a task to it

To create a separate role and attach a task to it, follow these steps:

Create a file with redis.yaml with the following content:

meta:
  description: Simple redis server
  name: Controller
name: redis
volumes_roles_mapping:
  - allocate_size: min
    id: os

Create a role:

fuel role --rel 1 --create --file redis.yaml

After this is done, you can go to the Fuel web UI and check if a role redis is created.
You can now attach tasks to the role. First, install redis puppet module:
```
puppet module install thomasvandoren-redis
```
Write a simple manifest to /etc/puppet/modules/redis/example/simple_redis.pp and include redis.

Create a configuration for Fuel in /etc/puppet/modules/redis/example/redis_tasks.yaml:

# redis group
  - id: redis
    type: group
    role: [redis]
    required_for: [deploy_end]
    tasks: [globals, hiera, netconfig, install_redis]
    parameters:
      strategy:
        type: parallel

# Install simple redis server
  - id: install_redis
    type: puppet
    requires: [netconfig]
    required_for: [deploy_end]
    parameters:
      puppet_manifest: /etc/puppet/modules/redis/example/simple_redis.pp
      puppet_modules: /etc/puppet/modules
      timeout: 180

Run the following command:

fuel rel --sync-deployment-tasks --dir /etc/puppet/2014.2.2-6.1/

Create an enviroment. Note the following:
- configure public network properly since redis packages are fetched from the upstream.
- enable Assign public network to all nodes option on the Settings tab of the Fuel web UI.

Provision the redis node:

fuel node --node <1> --env <1> --provision

Finish the installation on install_redis (there is no need to execute all tasks from post_deployment stage):
```
fuel node --node <1> --end install_redis
```

Swapping a task with a custom task

To swap a task with a custom one, you should change the path to the executable file:

- id: netconfig
  type: puppet
  groups: [primary-controller, controller, cinder, compute, ceph-osd, zabbix-server, primary-mongo, mongo]
  required_for: [deploy_end]
  requires: [logging]
  parameters:
      # old puppet manifest
      # puppet_manifest: /etc/puppet/modules/osnailyfacter/netconfig.pp

      puppet manifest: /etc/puppet/modules/osnailyfacter/custom_network_configuration.pp
      puppet_modules: /etc/puppet/modules
      timeout: 3600

The Fuel Master node containers structure

Most services hosted on the Fuel Master node, require connectivity to PXE network. The services used only for internal Fuel processes (such as Nailgun and Postgres) are limited to local connections only.

Containers structure

Container	Ports	Allow connections from
Cobbler	TCP 80, 443 UDP 53, 69	PXE network only
Postgres	TCP 5432	the Fuel Master node only
RabbitMQ	TCP 5672,4369 15672,61613	PXE network only
Rsync	TCP 873	PXE network only
Astute	none	N/A
Nailgun	TCP 8001	the Fuel Master node only
OSTF	TCP 8777	the Fuel Master node only
Nginx	TCP 8000,8080	the Fuel Master node only
Rsyslog	TCP 8777,25150 UDP 514	PXE network only
MCollective	none	N/A
Keystone	TCP 5000,35357	PXE network only

Fuel Repository Mirroring

Starting in Mirantis OpenStack 6.1, the location of repositories now extends beyond just being local to the Fuel Master. It is now assumed that a given user will have Internet access and can download content from Mirantis and upstream mirrors. This impacts users with limited Internet access or unreliable connections.

Internet-based mirrors can be broken down into three categories:

Ubuntu
MOS DEBs
MOS RPMs

There are two command-line utilities, fuel-createmirror and fuel-package-updates, which can replicate the mirrors.

Use fuel-createmirror for Ubuntu and MOS DEBs packages.

Use fuel-package-updates for MOS RPMs packages.

fuel-createmirror is a utility that can be used as a backend to replicate part or all of an APT repository. It can replicate Ubuntu and MOS DEBs repositories. It uses rsync as a backend. See Downloading Ubuntu system packages.

fuel-package-updates is a utility written in Python that can pull entire APT and YUM repositories via recursive wget or rsync. Additionally, it can update Fuel environment configurations to use a given set of configuration.

Issue the following command to check the fuel-package-updates options:

fuel-package-updates -h

Note

If you change the default password (admin) in Fuel web UI, you will need to run the utility with the --password switch, or it will fail.

Mirantis OpenStack 6.1 Network Performance Changes and Results

Architecture in 6.1 as compared to 6.0

The network architecture in Mirantis OpenStack 6.1 has undergone considerable changes when compared to Mirantis OpenStack 6.0 and older releases.

In Mirantis OpenStack 6.0 (MOS 6.0) bridging, bonding, and VLAN segmentation were provided by Open vSwitch.

In Mirantis OpenStack 6.1 (MOS 6.1) Open vSwitch provides infrastructure only required for Neutron. All other networks, bridges, bonds are provided by native Linux means.

_images/6061network.png

Mirantis OpenStack 6.1 Network Performance Hardware

The following hardware was used to run the network performance tests:

Compute nodes:
- Node is a part of DELL microcloud with 10Gbps NICs.
- Node has 4xIntel(R) Xeon(R) CPU E3-1230 v3 @ 3.30GHz, 32GB RAM.
Network infrastructure:
- 10GE network built by one Dell PowerConnect 8132 10GE switch (PCT8132).

Storage network performance

Storage network on the 10ge interface.

The following results were achieved for default MTU and no NIC tuning:

Centos/MOS-6.0 — 9.4 Gbit/s
Ubuntu/MOS-6.0 — 8.3 Gbit/s
Ubuntu/MOS-6.1 — 9.4 Gbit/s

The following results were achieved for MTU=9000 and NIC with offloading enabled:

CentOS/MOS-6.0 — 9.9 Gbit/s
Ubuntu/MOS-6.0 — 9.4 Gbit/s
Ubuntu/MOS-6.1 — 9.9 Gbit/s

Virtual network (VM to VM) performance (VLAN segmentation)

Private network on the 10ge interface.

The following results were achieved for default MTU and no NIC tuning:

CentOS/MOS-6.0 — 2.8 Gbit/s
Ubuntu/MOS-6.0 — 3.8 Gbit/s
Ubuntu/MOS-6.1 — 3.3 Gbit/s

The following results were achieved for MTU=9000 and NIC with offloading enabled:

CentOS/MOS-6.0 — 7.4 Gbit/s
Ubuntu/MOS-6.0 — 9.9 Gbit/s
Ubuntu/MOS-6.1 — 9.9 Gbit/s

Virtual network (VM to VM) performance (GRE segmentation)

The following results were achieved for Mirantis OpenStack 6.1 Ubuntu based environments:

Non-optimized network — 3.5 Gbit/s
Optimized network — 9.7 Gbit/s

source:
https://docs.mirantis.com/openstack/fuel/fuel-6.1/reference-architecture.html

http://sdnfv.blogspot.com/2015/10/openstack-environment-architecture.html