The OpenStack Design Summit was held April 25-29, 2016, in Austin, TX. OpenStack contributors got together to discuss key issues and to plan for the Newton release. This is the first in a series providing a summary of key items that were discussed by the Nova team during that time.
Note that these recaps will be Nova-centric, focusing primarily on sessions and discussions pertaining to Nova interests. Also note that these are not comprehensive recaps, and links to additional resources will be provided later in the articles for reference.
The Nova PTL, Matt Riedemann, wrote up summaries for key Nova sessions. For each item, I’ll provide the link to the openstack-dev mailing list archive of his summary and some TL;DR bullet points.
Nova Newton Priorities Tracking
The Nova team maintains an etherpad with an updated list of items to review. This helps the team maintain focus on key issues during the cycle.
All Open Specifications
Here is a link to all open specifications in Nova that need reviewing. As you can see, there are a lot, so cross-referencing them with the Priorities etherpad linked above is useful for narrowing things down.
The long-term roadmap is to move the Scheduler out of Nova and to modularize it so it can use external placement decision libraries. This is a long process, with incremental changes happening each cycle, keeping backwards compatibility in mind and minimizing the impact on end users as these changes land.
PCI and NUMA database differences need to be addressed in order to move forward
- NUMA is stored differently than PCI in the database
- NUMA topology, including compute node information (capacity, allocation, usage, etc.), is stored as a JSON blob in a single field, compute_nodes.NUMA_topology
- PCI device information is stored in a separate table
- PCI requests for an instance are stored in instance_extra
- The goal is to split those things into an inventories table and an allocations table
- Callers should not be aware of any backend changes; the resource tracker still needs to report the same values
The allocations/inventories table will go into the API database
- Deployers and operators were already unhappy with the new API database, so adding yet another database would make them even more unhappy
- Ultimately the Scheduler will be moved out into its own service, but this is an interim solution
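As a rough illustration of the split, here is a minimal sketch using SQLite. The column names are illustrative of the resource-providers discussion, not the final Nova schema:

```python
import sqlite3

# Illustrative sketch of the inventories/allocations split discussed for the
# resource tracker; column names are hypothetical, not Nova's actual schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventories (
    id INTEGER PRIMARY KEY,
    resource_provider_id INTEGER NOT NULL,  -- e.g. a compute node
    resource_class TEXT NOT NULL,           -- VCPU, MEMORY_MB, DISK_GB, ...
    total INTEGER NOT NULL,                 -- capacity
    reserved INTEGER NOT NULL DEFAULT 0     -- amount held back from placement
);
CREATE TABLE allocations (
    id INTEGER PRIMARY KEY,
    resource_provider_id INTEGER NOT NULL,
    consumer_id TEXT NOT NULL,              -- e.g. an instance UUID
    resource_class TEXT NOT NULL,
    used INTEGER NOT NULL                   -- amount consumed by the consumer
);
""")

# One row per (provider, resource class) of capacity, one row per consumer
# claim; free capacity is derived at query time, never stored.
conn.execute("INSERT INTO inventories VALUES (1, 1, 'VCPU', 16, 2)")
conn.execute("INSERT INTO allocations VALUES (1, 1, 'inst-1', 'VCPU', 4)")
free, = conn.execute("""
    SELECT i.total - i.reserved - COALESCE(SUM(a.used), 0)
    FROM inventories i LEFT JOIN allocations a
      ON a.resource_provider_id = i.resource_provider_id
     AND a.resource_class = i.resource_class
    WHERE i.resource_class = 'VCPU'
    GROUP BY i.id
""").fetchone()
print(free)  # 16 total - 2 reserved - 4 used = 10
```

The point of the two-table shape is that any kind of resource (NUMA, PCI, disk, etc.) fits the same capacity-versus-claims model, which is what lets the resource tracker keep reporting the same values to callers.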
Capabilities are still undefined
- Proposal: a capability is a single value representing a specific feature
- Proposal: Create a set of enum classes that very distinctly describe what a particular capability is
- We need to consolidate the different values reported by different resources
- e.g., libvirt returns a feature flag and VMware returns something else; we need to present a unified value to the caller
- This is what is currently proposed:
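One way to read the proposal, sketched with hypothetical names (the capability names and driver flag strings below are invented purely to illustrate the normalization idea):

```python
from enum import Enum, unique

# Hypothetical sketch of the "capability as a single enum value" proposal;
# these names are illustrative, not Nova's actual API.
@unique
class Capability(Enum):
    SUPPORTS_SRIOV = "supports-sriov"
    SUPPORTS_UEFI_BOOT = "supports-uefi-boot"
    SUPPORTS_DEVICE_HOTPLUG = "supports-device-hotplug"

# Each virt driver reports features in its own vocabulary; a per-driver map
# normalizes them to the unified enum exposed to callers.
LIBVIRT_FLAGS = {
    "VIR_SRIOV": Capability.SUPPORTS_SRIOV,
    "VIR_UEFI": Capability.SUPPORTS_UEFI_BOOT,
}
VMWARE_FLAGS = {
    "sriovSupported": Capability.SUPPORTS_SRIOV,
}

def normalize(driver_flags, mapping):
    """Translate driver-specific feature flags to unified capabilities."""
    return {mapping[f] for f in driver_flags if f in mapping}

caps = normalize(["VIR_SRIOV", "VIR_UEFI"], LIBVIRT_FLAGS)
print(Capability.SUPPORTS_SRIOV in caps)  # True
```

The `@unique` decorator enforces the "very distinctly describe what a particular capability is" requirement: no two enum members can share a value.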
Completed in Mitaka
- Resource Providers Database Schema
- Online data migration for inventory (capacity, reserve amounts, cpu, disk, etc)
Slated for Newton
- Migrating allocation fields for instances
- Define what a capability is (and isn’t) and define a standard representation
- Generic Resource Pool Object Modeling + REST API
- blueprint: generic-resource-pools
- Migration of NUMA fields
One of the biggest challenges Nova faces is its own upper limit on scale. The concept of "cells", or the idea of many independent compute "containers" running simultaneously, was born as a solution to this problem. Unfortunately, the Cells v1 effort was not successful at addressing this issue, but some important lessons were learned. Cells v2 is the current attempt to solve the compute scalability issue and to generally improve the Nova code base as a handy side effect.
Andrew Laski gave a fantastic talk providing an overview of Cells v1 and the plan for Cells v2 in Newton. The talk provides some great context and high-level architecture. You can view the talk here.
One key difference between Cells v1 and Cells v2 is that Cells v1 implemented the cells architecture as an alternate path (with all the complexity maintaining an alternate path brings), whereas Cells v2 will be *the only* path. The default will be a single-cell deployment, with one Compute instance living in one cell.
- The API cell is the cell responsible for running the Nova API to handle requests to instances (even ones located in other cells)
- Has its own API Database
- The Scheduler lives in this cell and manages instance “scheduling” from here
- Cell 0 is a special cell that lives outside of the regular cell hierarchy.
- It is the default location in the event of instance “scheduling” failures
- Cell 0 is also the default cell in a single-cell deployment (e.g., Devstack)
- Cell 0 can be combined with the API cell for simplicity
- In Cells v2, managing instance requests requires additional data in order to route messages to the appropriate cell.
- Instead of just looking up the hostname of the compute node where an instance lives (what we currently do), we will also need the database and queue connection information.
- Database is split between “local” and “global” data.
- Local – stuff that only the compute nodes within the cell need to know about
- Global – stuff the Nova API (living in the API cell) needs to know about to handle instance requests
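The routing lookup can be sketched as follows. The class and field names are assumptions mirroring the discussion (a per-cell mapping in the "global" API database carrying that cell's database and queue connections), not Nova's actual objects:

```python
from dataclasses import dataclass

# Illustrative sketch of Cells v2 request routing; names are hypothetical.
@dataclass
class CellMapping:
    name: str
    database_connection: str  # where this cell's "local" data lives
    transport_url: str        # message queue for this cell's compute nodes

# "Global" data in the API cell: which cell each instance landed in.
INSTANCE_TO_CELL = {
    "inst-42": CellMapping(
        name="cell1",
        database_connection="mysql://nova@cell1-db/nova_cell1",
        transport_url="rabbit://cell1-mq:5672/",
    ),
}

def route(instance_uuid):
    """Return the connections needed to reach an instance's cell."""
    cell = INSTANCE_TO_CELL[instance_uuid]
    return cell.database_connection, cell.transport_url

db, mq = route("inst-42")
print(mq)  # rabbit://cell1-mq:5672/
```

This is the "additional data" the API cell needs: instead of only a compute hostname, each request resolves to a database connection plus a transport URL before any message can be sent.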
Completed in Mitaka
- Created the API database
- Database connection switching – tell Nova which database connection to use
- Tools to help with upgrades
- Scheduling Interaction focused items
- Implement BuildRequest object + storage in database
- Persist instance data when a boot request is received by the API but the instance hasn’t been created and written to the database yet
- RequestSpec Object to persist instance details needed for allocation
- Used internally by the BuildRequest object and contains the specific “scheduling” details
- Make cell_id nullable
- A null cell_id defines a state where a boot request was received, but the instance still needs to be allocated to a cell. Because there is no cell_id, information needs to come from the BuildRequest object rather than from the database.
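The null-cell_id branch can be sketched like this; the function and object shapes are hypothetical, purely to show where the data comes from in each state:

```python
# Illustrative sketch of the nullable cell_id lookup discussed above;
# not Nova's actual implementation.
def show_instance(instance_mapping, build_requests, cell_databases):
    """Return instance details, from the BuildRequest if not yet scheduled."""
    if instance_mapping["cell_id"] is None:
        # Boot request accepted but instance not yet allocated to a cell:
        # the BuildRequest is the only record of it.
        return build_requests[instance_mapping["uuid"]]
    # Otherwise read from the database of the cell it was scheduled to.
    cell_db = cell_databases[instance_mapping["cell_id"]]
    return cell_db[instance_mapping["uuid"]]

build_requests = {"inst-1": {"uuid": "inst-1", "status": "BUILD"}}
cell_databases = {"cell1": {"inst-2": {"uuid": "inst-2", "status": "ACTIVE"}}}

pending = show_instance({"uuid": "inst-1", "cell_id": None},
                        build_requests, cell_databases)
placed = show_instance({"uuid": "inst-2", "cell_id": "cell1"},
                       build_requests, cell_databases)
print(pending["status"], placed["status"])  # BUILD ACTIVE
```

The useful property is that the API can answer "show me my instance" immediately after accepting the boot request, before scheduling has picked a cell.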
Slated for Newton
- Data Migration to the API database
- Key Pairs
- Implementation of Cell 0
- Creation of additional upgrade tools
- Message Queue connection switching
- Specify which message queue a message should go to
- Start work on multiple cells support
Editor’s Note: Please feel free to contact me or post comments with any corrections!