OpenStack Newton Design Summit CheatSheet Part 2: Wednesday Nova Sessions

This is the second in a series of Design Summit “Cheatsheet” posts that provide high level summaries and boiled down bullet points of the upcoming sessions. While these summaries are by no means comprehensive, they should provide enough high level context to get the gist of the discussions.

Part 2 covers select Nova Sessions hosted on Wednesday, April 27. The sessions are ordered chronologically per the schedule. Please see the link for the official session description and any additional documentation provided by the session organizer.

Nova: Scheduler and resource tracking evolution


Note that this will be a double session and will take up two time slots.


The Nova scheduler decides which host should launch a VM. The default is the filter scheduler (nova.scheduler.filter_scheduler.FilterScheduler), which uses filtering and weighting to make its decisions.
When the Scheduler gets a resource request, it first applies filters to determine the list of available hosts. Filtering is a binary decision, either a host meets the filter criteria or it doesn’t. Weights are then applied to rank the available hosts and determine the most suitable one in response to the request. See the Configuration References for more details on how filters and weights can be configured.
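The filter-then-weigh flow described above can be sketched in a few lines of Python. This is a minimal illustration of the concept, not actual Nova code; the `Host`, `ram_filter`, and `ram_weigher` names are made up for the example, and real Nova weighers are normalized and combined with configurable multipliers.

```python
# Minimal sketch of filter-then-weigh host selection, loosely modeled on
# Nova's FilterScheduler. Names here are illustrative, not Nova classes.
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    free_ram_mb: int


def ram_filter(host, request):
    # Binary decision: the host either satisfies the request or it doesn't.
    return host.free_ram_mb >= request["ram_mb"]


def ram_weigher(host):
    # Rank surviving hosts; here, more free RAM ranks higher.
    return host.free_ram_mb


def select_host(hosts, request):
    candidates = [h for h in hosts if ram_filter(h, request)]
    if not candidates:
        raise RuntimeError("No valid host found")
    return max(candidates, key=ram_weigher)


hosts = [Host("node1", 2048), Host("node2", 8192), Host("node3", 512)]
best = select_host(hosts, {"ram_mb": 1024})  # node3 is filtered out first
```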
Key Projects:
  • Resource Classes
    • Problem: Adding new types of resources causes potential downtime for OpenStack users because this requires database schema changes.
    • Solution: Introduce a generic “resource pools” type (see the “Goals for Newton” section below)
  • Compute Node Inventory
    • Problem: Resource information reported back to end-users is inaccurate because Nova only calculates resource usage and availability for the compute resources it tracks itself. External resource information is not stored in Nova’s database and is therefore excluded from these calculations.
    • Solution:
      • Store inventory data in its own table rather than in the compute_nodes table
      • Add a globally unique id for the resource provider (requires deployer downtime for a db migration)
        • all compute nodes are resource providers, but not all resource providers are compute nodes.
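The proposed split can be pictured as inventory records hanging off a generic resource provider, rather than columns on `compute_nodes`. The sketch below approximates the shape described in the spec; the field names are simplified stand-ins, not the exact Nova schema.

```python
# Hedged sketch: inventory lives in its own records, keyed by a globally
# unique resource provider id, instead of on the compute_nodes table.
import uuid
from dataclasses import dataclass, field


@dataclass
class Inventory:
    resource_class: str  # e.g. "VCPU", "MEMORY_MB", "DISK_GB"
    total: int
    reserved: int = 0


@dataclass
class ResourceProvider:
    # Every compute node is a resource provider, but a provider can also
    # be something external (e.g. a shared storage pool), hence the
    # generic model rather than compute-node-specific columns.
    name: str
    uuid: str = field(default_factory=lambda: str(uuid.uuid4()))
    inventories: list = field(default_factory=list)


node = ResourceProvider("compute-01")
node.inventories.append(Inventory("VCPU", total=16))
node.inventories.append(Inventory("MEMORY_MB", total=32768))
```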
Further Reading:

Goals for Newton

  • Add mechanism to migrate existing compute_nodes inventory data to the allocations table
    • move resource allocation amount data in the `compute_nodes` and `instance_extra` tables to the new `allocations` table via a method called _migrate_allocations() in nova.objects.ComputeNode
    • called when unmigrated compute node object is detected
    • Updates to Nova code for the new Inventories and Allocations tables
      • `vcpus_used`, `memory_mb_used`, and `local_gb_used` fields read via a single query to the `allocations` table
      • `nova.objects.Instance` object’s `save()` method handles allocation field writes
  • Create support for Resource Pools in Nova
    • Create a generic “resource pools” type
      • Thin object model layer over the resource_pools table
      • Allows querying for the aggregates associated with the resource pool along with the inventory and allocation records for the pool
    • Create a lookup table “resource_provider_aggregates” that maps an aggregate to one or more resource pools
    • Implement REST API interface for admin users and applications to interact with resource pools
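The migration step above boils down to turning aggregate `*_used` fields into per-consumer allocation rows. Here is an illustrative sketch of that transformation, in the spirit of the `_migrate_allocations()` step; the table and field names are simplified stand-ins, not the real Nova schema.

```python
# Illustrative sketch: convert aggregate usage on a compute node into
# one allocation row per (instance, resource class) pair.
def migrate_allocations(compute_node, instances):
    """Turn a compute node's aggregate usage into allocation rows."""
    allocations = []
    for inst in instances:
        for resource_class, amount in (
            ("VCPU", inst["vcpus"]),
            ("MEMORY_MB", inst["memory_mb"]),
            ("DISK_GB", inst["local_gb"]),
        ):
            if amount:
                allocations.append({
                    "resource_provider": compute_node["uuid"],
                    "consumer": inst["uuid"],
                    "resource_class": resource_class,
                    "used": amount,
                })
    return allocations


rows = migrate_allocations(
    {"uuid": "rp-1"},
    [{"uuid": "i-1", "vcpus": 2, "memory_mb": 2048, "local_gb": 20}],
)
```

Once usage lives in rows like these, fields such as `vcpus_used` can be answered with a single summing query against the `allocations` table, as described above.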
Discussion Points:

Nova Cells v2

Note that this will be a double session and will take up two time slots.



Cells v1 is/was an experimental feature developed to address the scaling limitations of multiple compute nodes sharing the same messaging queue and database in the default Nova setup. Hosts are partitioned into cells, which are configured in a tree structure. The topmost parent cell runs only nova-api, while the other cells run all Nova services except the API. Each individual cell has its own database and message queue.
There are two implementations, referred to as Cells v1 and Cells v2. Cells v1 was the original implementation, and new deployments are discouraged from using it. The Cells v2 effort seeks to implement the same concept with improvements. The goal is for Nova installs to default to a “single cell” configuration using this model.
For more details, see the Cells developer documentation –
(UPDATE: it looks like that link was updated since I wrote this summary originally. I apologize for any confusion and have updated my summary to be a little more clear).
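The core Cells v2 idea can be pictured as a top-level mapping that tells the API layer which cell (database plus message queue) owns each instance, so each cell scales independently. The structures below are purely conceptual, not Nova's actual mapping tables.

```python
# Conceptual sketch of Cells v2: each cell has its own database and
# message queue, and an API-level mapping routes requests to the right
# cell. Connection strings and names are illustrative only.
cells = {
    "cell1": {"db": "mysql://cell1/nova", "mq": "rabbit://cell1"},
    "cell2": {"db": "mysql://cell2/nova", "mq": "rabbit://cell2"},
}

# API-level mapping from instance to its owning cell.
instance_mappings = {"instance-a": "cell1", "instance-b": "cell2"}


def connections_for(instance_uuid):
    # The API never talks to a cell's DB/MQ directly without first
    # resolving which cell the instance lives in.
    cell = cells[instance_mappings[instance_uuid]]
    return cell["db"], cell["mq"]


db, mq = connections_for("instance-b")  # cell2's database and queue
```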

Goals for Newton

Nova: Live Migration



Live Migration, in a nutshell, is the ability to move a running instance from one host to another without interruption (i.e., without shutting it down).
Editor’s Note: There are a lot of articles around the web that provide step-by-step instructions but official OpenStack docs are pretty limited. I am still working on a good summary, so sorry for the brevity!
Further Reading:

Discussion Points

Editor’s Note: Some of these summaries are a little quick and dirty. I might come back and update them if I have time.
  • Improve Live Migration API usability
    • Problem 1: block_migration flag can conflict with the host flag
      • Block migration copies the instance’s disks and is used when the source and destination hosts do not share storage
        • Non-block migration requires the source and destination hosts to be on the same shared storage
      • Host is a required flag that supports a value of “None”
        • When None is specified, the Scheduler picks a host
      • The user has no way of knowing whether the target host the Scheduler picks is using the same shared storage as the source host
    • Problem 2: The flag disk_over_commit exposes too much unnecessary information about the underlying hypervisor
      • If True, libvirt virt driver checks the image’s virtual size for usable disk space; if False, it checks the actual size
      • Most users don’t care about checking the actual vs the virtual size, they just want to use the same resource usage policy as the Scheduler
    • Solutions:
      • Make both the block_migration and host flags optional with None as the default value
      • Remove the disk_over_commit flag and the libvirt disk usage check
        • This should be using the ResourceTracker, for consistency, instead of doing its own thing anyways, but that is outside of the scope of this spec
      • Add a reference URL to the response header to allow the user to query for migration details
  • Accept a list of hosts to mitigate failure
  • Use Libvirt Storage Pool Methods to Migrate Libvirt Volumes
  • Improve communication between compute nodes during live migration
  • Support live migration with macvtap SR-IOV
  • Update the networks’ VIF information when an instance is live migrated
  • Add post-copy live migration support to Nova
  • Add instance availability profiles to Nova
  • Decrease required human intervention for Live Migrations
  • Allow Live Migration of Rescued Instances
    • Problem: The xml file that the libvirt driver creates when an instance is rescued is not necessary to restore the instance and reliance on that file prevents live migration of a rescued instance
      • The live migration process already preserves the current instance status whether the migration succeeds or is rolled back.
    • Solution:
      • Change the libvirt driver rescue() method to not generate the xml file
      • Remove the attempt to read the file from the unrescue() method
      • Allow the live migration process to migrate instances in a rescued state
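The rescued-instance proposal above can be sketched as pseudocode: `rescue()` stops persisting an XML file on the host, `unrescue()` no longer reads one, and the live-migration path accepts the RESCUED state. The class and method names mirror the discussion but this is not actual Nova libvirt driver code.

```python
# Hedged sketch of the proposed change: no per-host XML file is written
# on rescue, so a rescued instance is no longer tied to its host and can
# be live migrated. State handling is reduced to a simple dict here.
ALLOWED_LIVE_MIGRATION_STATES = {"ACTIVE", "PAUSED", "RESCUED"}


class LibvirtDriverSketch:
    def rescue(self, instance):
        # Before: the original domain XML was written to disk so that
        # unrescue() could restore it. After: state is rebuilt from the
        # instance record instead, so no file pins the instance to a host.
        instance["vm_state"] = "RESCUED"

    def unrescue(self, instance):
        # No attempt to read a persisted XML file.
        instance["vm_state"] = "ACTIVE"


def can_live_migrate(instance):
    return instance["vm_state"] in ALLOWED_LIVE_MIGRATION_STATES


inst = {"vm_state": "ACTIVE"}
driver = LibvirtDriverSketch()
driver.rescue(inst)
# can_live_migrate(inst) now holds even though the instance is rescued.
```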



