The very small, two-person DevOps team within Red Hat Performance/Scale Engineering has developed a set of open source, Python-based automation tools that provision large-scale systems and network switches end to end, using Foreman, Ansible, and other open source bits.
QUADS, or “quick and dirty scheduler”, lets a normally overburdened DevOps warrior fully automate large swaths of systems and network devices based on a schedule. You can even set systems provisioning to fire off in the future, so you can focus on important things like Netflix and popcorn, or not reading your emails while your datacenter burns in an inferno of rapid, automated Skynet provisioning. QUADS will also auto-generate up-to-date infrastructure documentation, track scheduling and systems assignments, and more.
In this talk we’ll show you how we’re using QUADS (backed by Foreman) to empower rapid, meaningful performance and scale testing of Red Hat products and technologies. While QUADS is a new project and under constant development, both its design approach to large-scale systems provisioning and the current codebase are usable by others interested in improving the efficiency and level of automation within their infrastructure.
On the "what happens if a node fails" question:
QUADS has automated network validation that will not release any machine in a group of machines (a cloud assignment) until every machine passes. When validation fails we get alerted via email with the name(s) of the trouble machines, and rechecks happen until they pass. On the systems/provisioning side this catches most issues too, since machines need a valid OS to run the network tests, which usually means they were provisioned as expected. Sometimes, however, this is not enough, so we have a GitHub issue open to implement system-level verification using the Foreman API:
https://github.com/redhat-performance/quads/issues/48
There is more granular detail on slide 20 if you're interested: https://hobosource.files.wordpress.com/2016/11/skynet_quads_europython_2017_wfoster.pdf
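To make the "nothing is released until everything passes" behavior described above concrete, here is a minimal Python sketch. It is not the QUADS code: the function names, the `check` and `notify` callables, and the `max_rounds` cap are all hypothetical and exist only for illustration. In particular, the reply above says rechecks continue until validation passes, whereas this sketch stops after a fixed number of rounds so the example terminates, and it leaves out any delay between rechecks.

```python
from typing import Callable, Iterable, List


def failed_hosts(hosts: Iterable[str], check: Callable[[str], bool]) -> List[str]:
    """Return the hosts that do not pass the supplied validation check."""
    return [host for host in hosts if not check(host)]


def validate_and_release(
    hosts: List[str],
    check: Callable[[str], bool],        # hypothetical per-host network test
    notify: Callable[[List[str]], None],  # hypothetical alert hook (e.g. email)
    max_rounds: int = 3,                 # assumption: bounded so the sketch ends
) -> bool:
    """Gate the release of a whole group on every host passing validation.

    Returns True when all hosts pass within max_rounds, otherwise False.
    A single failing host keeps the entire group from being released.
    """
    for _ in range(max_rounds):
        failing = failed_hosts(hosts, check)
        if not failing:
            return True  # every host passed, the group can be released
        notify(failing)  # report the names of the trouble machines
    return False


if __name__ == "__main__":
    # Stand-in check: pretend host02 is unreachable on the test network.
    reachable = {"host01": True, "host02": False}
    released = validate_and_release(
        ["host01", "host02"],
        check=lambda host: reachable[host],
        notify=lambda bad: print("validation failed for:", ", ".join(bad)),
    )
    print("released:", released)
```

Passing the check and the alert in as callables keeps the gating logic separate from how hosts are actually tested or how people are notified, so a system-level check (such as the Foreman-based one proposed in the issue above) could in principle be swapped in without touching the loop.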