The dense packing of components enables high levels of redundancy: there is space to fit extra components, provided the power budget is not exceeded. The cost of the extra components is minimized for bare die chips and chiplets, since packaging material costs and packaging-related yield factors are eliminated. Accordingly, an adaptable and resilient computer system may comprise redundant components and redundant tiles, with an infrastructure for intelligently switching them in and out as required. For example, machine learning applications may be switched between different tiles for training, inference, and analysis, where the power budget may be similar for each. Thus, the proposed architecture allows a single system to run multiple workloads, saving space and reducing hardware cost at the expense of more complex software.
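The switching of workloads between tiles under a shared power budget can be illustrated with a minimal sketch. It is not the described system's control software; the names `Tile` and `TileSwitcher`, the per-tile power figures, and the simple first-fit activation policy are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Tile:
    name: str            # hypothetical tile identifier
    workload: str        # e.g. "training", "inference", "analysis"
    power_w: float       # assumed power draw when the tile is active
    active: bool = False


class TileSwitcher:
    """Activates the tiles for one workload at a time within a power budget."""

    def __init__(self, tiles, power_budget_w):
        self.tiles = list(tiles)
        self.power_budget_w = power_budget_w

    def active_power(self):
        # Total assumed draw of all currently active tiles.
        return sum(t.power_w for t in self.tiles if t.active)

    def switch_to(self, workload):
        # Switch every tile out, then switch in matching tiles that still fit.
        for t in self.tiles:
            t.active = False
        for t in self.tiles:
            fits = self.active_power() + t.power_w <= self.power_budget_w
            if t.workload == workload and fits:
                t.active = True
        return [t.name for t in self.tiles if t.active]


# Illustrative values only: four tiles sharing a 500 W budget.
switcher = TileSwitcher(
    [
        Tile("T0", "training", 300.0),
        Tile("T1", "training", 300.0),
        Tile("T2", "inference", 250.0),
        Tile("T3", "analysis", 200.0),
    ],
    power_budget_w=500.0,
)
print(switcher.switch_to("training"))
print(switcher.switch_to("inference"))
```

In this sketch the second training tile stays switched out because activating it would exceed the assumed budget; a real controller would presumably apply a richer policy.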
The block diagram shows the use of test/monitor chips (TMCs) and power distribution devices (PDDs) under control of a system controller. Tiles are depicted as independently operable clusters of components on both the top side and the bottom side. Redundancy is supported via selectable devices, redundant selectable devices, and tiles networked together in reconfigurable arrangements. For example, Compute Express Link (CXL) protocols may provide communication options for high capacity and/or low latency.
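One way to picture the interaction between the system controller, a TMC, and a PDD is the following sketch of a failover from a selectable device to its redundant counterpart. The class names, the `on_tmc_report` method, and the device names are hypothetical; the source does not specify the controller's interfaces, so the whole example rests on assumed ones.

```python
class PowerDistributionDevice:
    """Stand-in for a PDD: tracks which devices are currently powered."""

    def __init__(self):
        self.powered = set()

    def power_on(self, name):
        self.powered.add(name)

    def power_off(self, name):
        self.powered.discard(name)


class SystemController:
    """Switches a failed selectable device out and its redundant device in."""

    def __init__(self, pdd, spares):
        self.pdd = pdd
        self.spares = dict(spares)   # assumed map: primary name -> spare name
        self.failed = set()

    def on_tmc_report(self, device, healthy):
        # Handle one health report from a test/monitor chip (assumed format).
        if healthy or device in self.failed:
            return None
        self.failed.add(device)
        self.pdd.power_off(device)
        spare = self.spares.get(device)
        if spare is None or spare in self.failed:
            return None              # no usable redundant device remains
        self.pdd.power_on(spare)
        return spare


# Illustrative device names only.
pdd = PowerDistributionDevice()
pdd.power_on("dev0")
controller = SystemController(pdd, {"dev0": "dev0_redundant"})
replacement = controller.on_tmc_report("dev0", healthy=False)
print(replacement, sorted(pdd.powered))
```

Keeping the power switching behind a separate PDD object mirrors the division of roles in the block diagram, where the controller decides and the PDD acts.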