Abiquo scheduler

1 Design
2 Implementation
3 Undeploy
4 Deploy
- 4.1 Scheduler steps
5 Reconfigure

Design

The scheduler design allows scheduling requests to arrive concurrently without any of them acting on stale data.

The scheduling model isolates the current state of the computational resources (CPU, RAM and storage capacity) in a table that is 'outside' the ORM.

In order to avoid race conditions and stale data, scheduling must be serial. This means that at a given time, only one scheduling operation can be in progress.

Most of the rules and filters used to select a hypervisor for a VM change very rarely. Many of them are set when the infrastructure is configured and never change afterwards. This slow-changing data is stored in the DB and inside the ORM:

CPU total after over-subscription rules
RAM total after over-subscription rules
Storage size
Allocation rules (progressive, performance).

The write-intensive data will be outside the ORM and updates will be made using JDBC.

The platform must also manage:

the virtual infrastructure check that updates the resources
the infrastructure check
handlers that synchronize the outcome of the tasks with the Abiquo DB, and so must also update Redis.

All of the above must send their requests to the same queue as the common hypervisor-related operations (undeploy, deploy, reconfigure).

This means that the UNDEPLOY, DEPLOY and RECONFIGURE tasks each need a job that updates the resource data:

UNDEPLOY has FREE_RESOURCES to delete the relation between the VM and the hypervisor and datastore. This also frees the resources that were used by the VM.
DEPLOY has SCHEDULE to reserve the resources and set the relation between the VM and the hypervisor and datastore.
RECONFIGURE has UPDATE_RESOURCES to update the resource usage of the VM from the reconfigure task.
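
The three jobs above can be pictured as operations on a per-hypervisor usage record. The following is a minimal Java sketch under stated assumptions, not the Abiquo implementation: an in-memory map stands in for the table outside the ORM, storage and datastores are left out, and every name (ResourceLedger, schedule, freeResources, updateResources) is hypothetical. The capacity check in updateResources is also an assumption, since the text does not say whether a reconfigure is validated against free capacity.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for the resource usage table that lives outside the ORM. */
final class ResourceLedger {

    /** Capacity and usage of one hypervisor; totals are after over-subscription rules. */
    private static final class Usage {
        final int cpuTotal;
        final int ramTotalMb;
        int cpuUsed;
        int ramUsedMb;

        Usage(int cpuTotal, int ramTotalMb) {
            this.cpuTotal = cpuTotal;
            this.ramTotalMb = ramTotalMb;
        }
    }

    private final Map<String, Usage> usageByHypervisor = new HashMap<>();
    private final Map<String, String> hypervisorByVm = new HashMap<>();
    private final Map<String, int[]> reservedByVm = new HashMap<>(); // {cpu, ramMb}

    synchronized void addHypervisor(String id, int cpuTotal, int ramTotalMb) {
        usageByHypervisor.put(id, new Usage(cpuTotal, ramTotalMb));
    }

    /** SCHEDULE: reserve the resources and record the VM-to-hypervisor relation. */
    synchronized boolean schedule(String vm, String hypervisor, int cpu, int ramMb) {
        Usage u = usageByHypervisor.get(hypervisor);
        if (u == null || hypervisorByVm.containsKey(vm)) {
            return false;
        }
        if (u.cpuUsed + cpu > u.cpuTotal || u.ramUsedMb + ramMb > u.ramTotalMb) {
            return false; // not enough free capacity
        }
        u.cpuUsed += cpu;
        u.ramUsedMb += ramMb;
        hypervisorByVm.put(vm, hypervisor);
        reservedByVm.put(vm, new int[] {cpu, ramMb});
        return true;
    }

    /** FREE_RESOURCES: delete the relation and give the resources back. */
    synchronized boolean freeResources(String vm) {
        String hypervisor = hypervisorByVm.remove(vm);
        if (hypervisor == null) {
            return false;
        }
        int[] reserved = reservedByVm.remove(vm);
        Usage u = usageByHypervisor.get(hypervisor);
        u.cpuUsed -= reserved[0];
        u.ramUsedMb -= reserved[1];
        return true;
    }

    /** UPDATE_RESOURCES: replace the VM's reservation with its new size. */
    synchronized boolean updateResources(String vm, int newCpu, int newRamMb) {
        String hypervisor = hypervisorByVm.get(vm);
        if (hypervisor == null) {
            return false;
        }
        int[] reserved = reservedByVm.get(vm);
        Usage u = usageByHypervisor.get(hypervisor);
        int cpuAfter = u.cpuUsed - reserved[0] + newCpu;
        int ramAfter = u.ramUsedMb - reserved[1] + newRamMb;
        if (cpuAfter > u.cpuTotal || ramAfter > u.ramTotalMb) {
            return false; // assumed check, see the note above
        }
        u.cpuUsed = cpuAfter;
        u.ramUsedMb = ramAfter;
        reserved[0] = newCpu;
        reserved[1] = newRamMb;
        return true;
    }

    synchronized int cpuUsed(String hypervisor) {
        return usageByHypervisor.get(hypervisor).cpuUsed;
    }

    synchronized int ramUsedMb(String hypervisor) {
        return usageByHypervisor.get(hypervisor).ramUsedMb;
    }
}
```

In the design described here, the same three operations would be JDBC updates against the resource table rather than map mutations.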

Implementation

To maintain data consistency at all times, the scheduler implementation is asynchronous and uses a queue to process the requests one by one in a single instance of the API.

All access to compute resource data is serialized using a JVM lock.

A queue serializes all scheduling task requests, and two auxiliary queues set the priority of the processes (more on this later). All API nodes are producers and add messages to this queue. In a multi-node setup, one API node is the leader, which means it is in charge of consuming these messages.

Messages that are queued:

UNDEPLOY: All APIs in the cluster can queue this, but only one (the leader) will consume it.
DEPLOY: All APIs can queue.
RECONFIGURE: All APIs can queue.
UPDATES: These are special messages that are queued by the API that performs the periodic check. The infrastructure check must update the data in the DB, so its update is also a message in this queue.
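
The producer and consumer roles above can be sketched as follows. This is an illustrative Java sketch only: an in-process queue stands in for the queue shared by the API nodes, the two auxiliary priority queues are left out, and the names SchedulingQueueSketch, Message, enqueue and drainAsLeader are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantLock;

final class SchedulingQueueSketch {

    /** The message kinds listed above. */
    enum Type { UNDEPLOY, DEPLOY, RECONFIGURE, UPDATES }

    static final class Message {
        final Type type;
        final String target; // for example a VM identifier (hypothetical field)

        Message(Type type, String target) {
            this.type = type;
            this.target = target;
        }
    }

    // Thread-safe queue: any number of producers may add messages concurrently.
    private final Queue<Message> queue = new ConcurrentLinkedQueue<>();
    // JVM lock that serializes all access to the compute resource data.
    private final ReentrantLock resourceLock = new ReentrantLock();
    private final List<Message> processed = new ArrayList<>();

    /** Any API node may call this. */
    void enqueue(Message message) {
        queue.add(message);
    }

    /** Only the leader calls this; messages are handled one at a time, in queue order. */
    int drainAsLeader() {
        int handled = 0;
        Message message;
        while ((message = queue.poll()) != null) {
            resourceLock.lock();
            try {
                // A real consumer would run the scheduling job here.
                processed.add(message);
                handled++;
            } finally {
                resourceLock.unlock();
            }
        }
        return handled;
    }

    List<Message> processedMessages() {
        return new ArrayList<>(processed);
    }
}
```

Because only one consumer ever reads the queue, no two scheduling jobs run at the same time, whichever node queued them.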

Undeploy

The undeploy operation is queued in two parts: DECONFIGURE and FREE_RESOURCES. The platform first deletes the VM in the hypervisor and then queues a FREE_RESOURCES job. The API that is in charge of the process will eventually consume this job, so there are no timeouts.

The task now looks like this:

POWER OFF (optional)
DECONFIGURE: Handled by the virtual factory, which only performs the destroy and related operations in the hypervisor.
FREE_RESOURCES: The API that is in charge applies the changes in the Abiquo database. This involves deleting the relation between the VM and the hypervisor and datastore (and related shared resources), and updating the free resources.
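
As a small illustration, the job list of the undeploy task could be built like this. The Java names are hypothetical, and the condition that makes POWER OFF optional (here, whether the VM is currently powered on) is an assumption that the text does not state.

```java
import java.util.ArrayList;
import java.util.List;

final class UndeployTaskSketch {

    enum Job { POWER_OFF, DECONFIGURE, FREE_RESOURCES }

    /** Returns the jobs of an undeploy task in execution order. */
    static List<Job> jobsFor(boolean vmIsPoweredOn) {
        List<Job> jobs = new ArrayList<>();
        if (vmIsPoweredOn) {
            jobs.add(Job.POWER_OFF); // optional step (assumed condition)
        }
        jobs.add(Job.DECONFIGURE);    // destroy the VM in the hypervisor
        jobs.add(Job.FREE_RESOURCES); // queued afterwards, consumed by the API in charge
        return jobs;
    }
}
```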

Deploy

When a deploy is requested, Abiquo checks the layer and the limits. This check is done again during the actual selection of the hypervisor, but it is also worth doing in advance so that the HTTP request can be marked and the client can act. When the queuing is successful, the leader selects the hypervisor according to the resources and business constraints.

The task looks like this:

SCHEDULE: Select the hypervisor and reserve the resources.
CONFIGURE: Define the machine on the hypervisor (virtual factory).
POWER ON: Power on the machine on the hypervisor.

Scheduler steps

Rules in DB:
- Rack selection
- VLAN available
- Reserved physical machines
- Enterprise exclusion rules
- Layer
- Over subscription
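
The selection step can be pictured as filtering the candidate hypervisors with the rules and then ordering the survivors with the allocation rule. The Java sketch below is illustrative only and applies just two filters (reserved machines and CPU capacity after over-subscription). All names are hypothetical. It also assumes that "progressive" packs VMs onto the fullest machine that still fits and that "performance" spreads them onto the emptiest one; the text does not define either rule.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

final class HypervisorSelectionSketch {

    enum AllocationRule { PROGRESSIVE, PERFORMANCE }

    static final class Candidate {
        final String id;
        final int cpuTotal; // CPU total after over-subscription rules
        final int cpuUsed;
        final boolean reservedForOtherEnterprise;

        Candidate(String id, int cpuTotal, int cpuUsed, boolean reservedForOtherEnterprise) {
            this.id = id;
            this.cpuTotal = cpuTotal;
            this.cpuUsed = cpuUsed;
            this.reservedForOtherEnterprise = reservedForOtherEnterprise;
        }

        int cpuFree() {
            return cpuTotal - cpuUsed;
        }
    }

    /** Filters the candidates and picks one according to the allocation rule. */
    static Optional<Candidate> select(List<Candidate> candidates, int cpuNeeded,
                                      AllocationRule rule) {
        Comparator<Candidate> byFreeCpu = Comparator.comparingInt(Candidate::cpuFree);
        // Assumption: PROGRESSIVE prefers the least free CPU, PERFORMANCE the most.
        Comparator<Candidate> order =
                rule == AllocationRule.PROGRESSIVE ? byFreeCpu : byFreeCpu.reversed();
        return candidates.stream()
                .filter(c -> !c.reservedForOtherEnterprise) // reserved physical machines
                .filter(c -> c.cpuFree() >= cpuNeeded)      // capacity after over-subscription
                .min(order);
    }
}
```

The remaining rules (rack selection, VLAN availability, enterprise exclusion, layer) would be further filters of the same shape.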

Reconfigure

When a reconfigure is processed, the resources are updated asynchronously.

The virtual machine must be in state OFF.

The task:

RECONFIGURE