Jenkins, David (CfAI, Department of Physics, Durham University, DH1 3LE, UK), Basden, Alastair (CfAI, Department of Physics, Durham University, DH1 3LE, UK), Myers, Richard (CfAI, Department of Physics, Durham University, DH1 3LE, UK)
The next generation of ELTs massively increases both the computational demands and the complexity of AO real-time control (RTC) systems, requiring investigation into architectures that are both flexible and computationally performant. Current technologies such as FPGAs, GPUs and standard CPUs are either complex to develop for or would require far too many individual processing units to meet the demands of the RTC. The Intel Xeon Phi is proposed as a platform that can support the computational demands of the real-time control systems for ELT-scale telescopes while providing a standard CPU-like development environment. Other architectures, such as the IBM Power8 and many-core ARM CPU systems, have also been considered. The large number of processor cores and the high-bandwidth memory of the Xeon Phi and Power8 allow for massive parallelisation of the workload, whilst also allowing these systems to act as socketed host CPUs. Being the host and not an accelerator removes many of the latency and/or bandwidth concerns that come with offloading, and allows existing software to be compiled and run without any modifications; certain optimisations then allow the software to take full advantage of the system's hardware. A single Knights Landing Xeon Phi can currently demonstrate ELT-scale SCAO running at a frame rate $> 1\,\mathrm{kHz}$ (which we hope to improve on). This shows promise not only for it to act as the sole RTC processor in an SCAO system, but also as a competent computational node in a distributed computing architecture for more complex AO regimes such as ELT MCAO, LTAO, and MOAO, where the computational demands are much higher and the data sharing is much more complex. We will present an overview of the current-generation Knights Landing Xeon Phi, along with an overview of the investigation into the software and hardware optimisations to improve the performance of an ELT-scale RTC system for many-core architectures.
DOI: 10.26698/AO4ELT5.0046