Using the number of CPUs as a default for the workers leads to problems with big setups

Bug #2063451 reported by Martin Morgenstern
This bug affects 1 person
Affects                    Status   Importance   Assigned to   Milestone
OpenStack Compute (nova)   New      Undecided    Unassigned

Bug Description

Nova uses the CPU count as a default for the following worker settings, which is problematic for people deploying on machines with a large number of CPUs:

    [DEFAULT]
    osapi_compute_workers=
    metadata_workers=

    [conductor]
    workers=

    [scheduler]
    workers=

In our case, on a setup with more than 100 CPUs, the huge number of workers led to a lot of traffic to the cell1 database (MariaDB Galera) for an otherwise empty OpenStack cluster, which in turn quickly filled the database filesystem because of the growing MariaDB binlog. These problems disappeared as soon as we explicitly configured the workers for nova-scheduler and nova-conductor with a count of 8 each (we also lowered the other worker counts for the sake of consistency).

I suggest that nova apply an upper limit to the default. I couldn't find guidelines for the worker counts in the nova docs. However, judging by other OpenStack projects, there seems to be some kind of consensus on using a worker count well below 20:

* Kolla Ansible sets a maximum of 5 workers [1]
* puppet-openstacklib sets a maximum of 12 workers [2]

[1] https://github.com/openstack/kolla-ansible/blob/5a663aec1dc6ede45a860eecab84af05cd06b67f/ansible/group_vars/all.yml#L742
[2] https://github.com/openstack/puppet-openstacklib/blob/master/lib/facter/os_workers.rb#L45
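
A minimal sketch of what such a capped default could look like, in Python. This is not nova's actual code: `default_workers` and `MAX_DEFAULT_WORKERS` are hypothetical names, and the cap of 8 is simply the value that worked in the setup described above, not a figure taken from the nova docs.

```python
import os

# Hypothetical cap (assumption): 8 is the count that resolved the problem
# in the reported setup; the projects cited above use 5 and 12.
MAX_DEFAULT_WORKERS = 8


def default_workers(cpu_count=None, cap=MAX_DEFAULT_WORKERS):
    """Return a default worker count: the CPU count, bounded by `cap`.

    Always returns at least 1, even if the CPU count cannot be determined.
    """
    if cpu_count is None:
        # os.cpu_count() may return None on some platforms.
        cpu_count = os.cpu_count() or 1
    return max(1, min(cpu_count, cap))
```

With a default like this, a host with more than 100 CPUs would start 8 workers per service instead of one per CPU, while small hosts keep the current behaviour. An explicitly configured value would still override the default.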
