# Knocking at your back door (O.H.D.W.M.I.A.C.A.Y.S.)

### **ARM**

Marc Zyngier <marc.zyngier@arm.com>

ELCE16, October 13, 2016

© ARM 2016

#### Content

- Basics of an interrupt
- Interrupt controllers
- Linux's data structures
- Chained interrupt controllers
- Hierarchical interrupt controllers
- Generic MSIs
- ...
- Profit!

# Please interrupt me

#### Talk you should not have missed

- IRQs: the Hard, the Soft, the Threaded and the Preemptible
  - Alison Chaiken, Peloton Technology
  - Took place on Tuesday<sup>1</sup>
  - Covers the dynamic aspects of interrupt handling

<sup>1</sup> Use your TARDIS or wait for it to appear on some website

#### What is an interrupt?

- A hardware signal
  - Emitted from a peripheral to a CPU
  - Indicating that a device-specific condition has been satisfied

#### Multiplexing interrupts

- Having a single interrupt for the CPU is usually not enough
  - Most systems have tens, hundreds of them
- An interrupt controller allows them to be multiplexed
  - Very often architecture or platform specific
  - Offers specific facilities
    - Masking/unmasking individual interrupts
    - Setting priorities
    - SMP affinity
    - Exotic things like wake-up interrupts

Figure: GIC-400, simplified view

#### Interrupt triggers

- Level triggered (high or low)
  - Indicates a persistent condition
  - An action has to be performed on the device to clear the interrupt
- Edge triggered (rising or falling)
  - Indicates an event
  - May have happened once or more...
- Some systems do not expose the trigger type to software
  - Either the interrupt is abstracted (virtualization)
  - Or this is more an exception than an interrupt...

# "And now for something completely different..."

Monty Python's Flying Circus

#### How does Linux deal with interrupts

- struct irq\_chip
  - A set of methods describing how to drive the interrupt controller
  - Directly called by core IRQ code
- struct irq\_domain
  - A pointer to the firmware node for a given interrupt controller (fwnode)
  - A method to convert a firmware description of an IRQ into an ID local to this interrupt controller (hwirq)
  - A way to retrieve the Linux view of an IRQ from the hwirq
- struct irq\_desc
  - Linux's view of an interrupt
  - Contains all the core stuff
  - 1:1 mapping to the Linux interrupt number
- struct irq\_data
  - Contains the data that is relevant to the irq\_chip managing this interrupt
  - Both the Linux IRQ number and the hwirq
  - A pointer to the irq\_chip
  - Embedded in irq\_desc (for now)

#### In a nutshell

- CPU gets an interrupt
- Find out the hwirq from the interrupt controller
  - Usually involves reading some HW register
- Look up the irq\_desc in the irqdomain using the hwirq
  - Actually returns an IRQ number, which is equivalent to the irq\_desc
- The core kernel then handles the interrupt

#### Multiplexing more interrupts

- Not enough interrupt lines?
  - Dedicate a single line for a secondary interrupt controller
  - And add more stuff to it!
- Requires two-level handling
  - First handle the interrupt on the primary interrupt controller
  - Then at the secondary one to find out which device has caused the interrupt
  - See irq\_set\_chained\_handler\_and\_data, chained\_irq\_enter, chained\_irq\_exit
  - Never treat this as a normal interrupt handler
- Used in each and every x86 system
  - The infamous i8259 cascade
- You can also share a single interrupt between devices
  - And that really stinks. Please avoid doing it if possible.
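
The two-level handling described above can be sketched as a small userspace toy model. This is not kernel code: every `toy_*` name is hypothetical, the "pending register" is just a bitmask in memory, and the real kernel uses irq\_set\_chained\_handler\_and\_data, chained\_irq\_enter and chained\_irq\_exit instead. The sketch only shows the shape of the flow: read the hwirq from the primary controller, look it up in the primary domain, and let a chained handler repeat the same steps on the secondary domain.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_HWIRQ 8   /* inputs per toy controller (assumption) */
#define TOY_MAX_VIRQ  32  /* size of the toy Linux IRQ number space (assumption) */

typedef void (*toy_handler_t)(unsigned int virq, void *data);

/* Toy stand-in for irq_desc: one entry per Linux IRQ number. */
struct toy_desc {
    toy_handler_t handler;
    void *data;
    unsigned int count;   /* how many times this IRQ was handled */
};

/* Toy stand-in for an irqdomain plus its controller's pending register. */
struct toy_domain {
    unsigned int map[TOY_MAX_HWIRQ];  /* hwirq -> Linux IRQ, 0 = unmapped */
    uint32_t pending;                 /* one bit per pending hwirq */
};

struct toy_desc toy_descs[TOY_MAX_VIRQ];

/* Associate a hwirq of this domain with a Linux IRQ number and a handler. */
void toy_map(struct toy_domain *d, unsigned int hwirq, unsigned int virq,
             toy_handler_t handler, void *data)
{
    assert(hwirq < TOY_MAX_HWIRQ && virq > 0 && virq < TOY_MAX_VIRQ);
    d->map[hwirq] = virq;
    toy_descs[virq].handler = handler;
    toy_descs[virq].data = data;
}

/* Simulate a device asserting one input line of a controller. */
void toy_raise(struct toy_domain *d, unsigned int hwirq)
{
    assert(hwirq < TOY_MAX_HWIRQ);
    d->pending |= 1u << hwirq;
}

/* Read the "HW register", translate each hwirq, and run its handler. */
void toy_dispatch(struct toy_domain *d)
{
    for (unsigned int hwirq = 0; hwirq < TOY_MAX_HWIRQ; hwirq++) {
        if (!(d->pending & (1u << hwirq)))
            continue;
        d->pending &= ~(1u << hwirq);       /* acknowledge the interrupt */
        unsigned int virq = d->map[hwirq];  /* domain lookup */
        if (virq && toy_descs[virq].handler) {
            toy_descs[virq].count++;
            toy_descs[virq].handler(virq, toy_descs[virq].data);
        }
    }
}

/* Chained handler: its data pointer is the secondary domain to demultiplex. */
void toy_chained_handler(unsigned int virq, void *data)
{
    (void)virq;
    toy_dispatch(data);
}

/* Final device handler: records which Linux IRQ it was called for. */
void toy_device_handler(unsigned int virq, void *data)
{
    *(unsigned int *)data = virq;
}
```

As a usage example, map hwirq 3 of a primary `toy_domain` to `toy_chained_handler` with a secondary domain as its data, map hwirq 5 of that secondary domain to `toy_device_handler`, raise both lines, and call `toy_dispatch()` on the primary: the device handler runs only after the chained handler has demultiplexed the secondary controller. Storing the secondary domain as the chained handler's data mirrors the convention of stashing a pointer to the secondary domain in the top-level descriptor.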
#### Chained irqchips, the irqdomain view

- Each interrupt controller has its own irqdomain
- The kernel deals with two interrupts
  - and two interrupt handlers
  - the first one being a chained handler
  - convention is to stash a pointer to the secondary domain inside the top-level irq\_desc
- We walk the interrupt chain in reverse order
- Once we reach the last-level irq\_desc, we can process the actual interrupt handler

#### The DT view

- A secondary irqchip points to the one implementing the first level
  - Use interrupts to describe the signal path between irqchips
- The secondary chip owns the cascade interrupt
  - It doesn't appear in /proc/interrupts
- Use interrupt-parent to point the device at the right interrupt controller

```
interrupt-parent = <&gic>;

gic: interrupt-controller@01c81000 {
	compatible = "arm,cortex-a7-gic", "arm,cortex-a15-gic";
	interrupt-controller;
	#interrupt-cells = <3>;
	interrupts = <GIC_PPI 9 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_HIGH)>;
};

nmi_intc: interrupt-controller@01c00030 {
	compatible = "allwinner,sun7i-a20-sc-nmi";
	interrupt-controller;
	#interrupt-cells = <2>;
	interrupts = <GIC_SPI 0 IRQ_TYPE_LEVEL_HIGH>;
};

axp209: pmic@34 {
	interrupt-parent = <&nmi_intc>;
	interrupts = <0 IRQ_TYPE_LEVEL_LOW>;
};
```

#### When multiplexing doesn't fit

- There is more than just cascading irqchips
- Some setups have a 1:1 mapping between input and output
  - Interrupt routers
  - Wake-up controllers
  - Programmable line inverters
- Most of them are not interrupt controllers
  - Still, they do impact the interrupt delivery
  - We choose to represent them as irq\_chip
- This is a hierarchical/stacked configuration
  - The chained irqchip paradigm doesn't match it

#### Hierarchical (stacked) IRQ domains

- We want the same irq\_desc to be valid across all irqchips
  - This ensures that the Linux IRQ number is unique for a given signal path
- For a given irq\_desc, each irqchip should be responsible for the hwirq
  - This fits the irq\_data properties
- Most of the data structures now have a parent field representing the hierarchy
- The handling is done by walking the signal path in delivery order
  - A given irqchip can perform some local action before forwarding the request to its parent
  - Or even terminate the handling early

#### Hierarchical domains, the DT view

- Each intermediate irqchip points to its parent
  - Do not use interrupts to describe the signal path between irqchips
  - Use a device-specific property to describe an interrupt range/space if necessary
- The root irqchip points to itself
  - A DT oddity...
- Devices can point to any element of the stack
  - The device interrupt specifiers must match the first irqchip in the signal path

```
interrupt-parent = <&sysirq>;

sysirq: intpol-controller@10200620 {
	interrupt-controller;
	#interrupt-cells = <3>;
	interrupt-parent = <&gic>;
};

gic: interrupt-controller@10231000 {
	#interrupt-cells = <3>;
	interrupt-parent = <&gic>;
	interrupt-controller;
};

uart0: serial@11002000 {
	interrupts = <GIC_SPI 91 IRQ_TYPE_LEVEL_LOW>;
};
```

## "Message in a bottle"

The Police, Reggatta de Blanc

#### More than wired interrupts: MSIs

Message Signaled Interrupts are an essential part of the interrupt infrastructure

- A simple 32bit write (the message) from the device to a doorbell
  - The doorbell is usually the interrupt controller itself
  - The generated interrupt depends on the data being written
  - By definition edge triggered
- Avoid the spider web syndrome
  - Routing interrupts to the periphery of a SoC is a constraint
  - MSIs allow the use of the same busses as the data
  - Having multiple interrupts per device costs nothing
- Acts as a memory barrier w.r.t. DMA
  - Avoid the "got an interrupt but data is not there yet" problem
- Bus agnostic
  - Historically tied to PCI(e)
  - Now implemented on all kinds of busses...
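
The doorbell mechanism above can be sketched in a few lines of userspace C. This is a toy model under stated assumptions, not kernel code: all `toy_*` names are hypothetical, the message layout (a doorbell address split into two 32-bit halves plus a 32-bit payload) is chosen for illustration, and the "bus" is a plain function call. It shows the three roles involved: the controller composes a message, the device emits it as a 32-bit write, and the controller picks the interrupt from the data written to its doorbell.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NR_MSI 64  /* number of distinct events the toy controller knows */

/* Hypothetical message: where to write (doorbell) and what to write (data). */
struct toy_msi_msg {
    uint32_t address_lo;
    uint32_t address_hi;
    uint32_t data;
};

/* Toy MSI controller: one doorbell address, one counter per event. */
struct toy_msi_controller {
    uint64_t doorbell;
    unsigned int fired[TOY_NR_MSI];
};

/* Controller side: build the message a device must emit to signal 'event'. */
struct toy_msi_msg toy_compose_msg(const struct toy_msi_controller *c,
                                   uint32_t event)
{
    assert(event < TOY_NR_MSI);
    struct toy_msi_msg msg = {
        .address_lo = (uint32_t)(c->doorbell & 0xffffffffu),
        .address_hi = (uint32_t)(c->doorbell >> 32),
        .data = event,
    };
    return msg;
}

/*
 * Bus side: a 32-bit write somewhere in the address space.
 * Returns 1 if it hit the doorbell and raised an interrupt, 0 otherwise.
 */
int toy_bus_write32(struct toy_msi_controller *c, uint64_t addr, uint32_t data)
{
    if (addr != c->doorbell || data >= TOY_NR_MSI)
        return 0;      /* ordinary write or unknown event: no interrupt */
    c->fired[data]++;  /* edge semantics: every write is one event */
    return 1;
}

/* Device side: emit the message the device was programmed with. */
int toy_device_signal(struct toy_msi_controller *c,
                      const struct toy_msi_msg *msg)
{
    uint64_t addr = ((uint64_t)msg->address_hi << 32) | msg->address_lo;
    return toy_bus_write32(c, addr, msg->data);
}
```

For example, composing a message for event 7 and signalling it twice leaves `fired[7]` at 2, while a write of the same data to any other address raises nothing. There is no line to deassert, which is why the model counts events rather than tracking a level.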
#### The goals of supporting MSIs in a generic way

- We'd like to support MSIs on any bus
- We want to cater for the weird and wonderful stuff
  - Intel's DMAR
  - ARM's GICv3 ITS
  - Freescale's MC bus
  - Platform devices
  - Hisilicon's MBIGEN
- Must nicely cohabit with the current PCI/MSI implementation
- Hierarchical domains are a good solution for this<sup>2</sup>
- Entirely implemented as part of the core IRQ code (kernel/irq/msi.c)
- Per-bus front-ends
  - drivers/pci/msi.c
  - drivers/base/platform-msi.c
  - drivers/staging/fsl-mc/bus/mc-msi.c

<sup>2</sup> Please trust me on that one...

#### Generic MSI

- irq\_chip grows two new methods
  - irq\_compose\_msi\_msg: populate a msi\_msg
    - Address of the doorbell + data to be written
    - Implemented by the MSI controller, bus agnostic
  - irq\_write\_msi\_msg
    - Write the content of the msi\_msg to a given device
    - Implemented by the bus front-end, bus specific
- msi\_domain\_info to describe a MSI domain
  - A struct irq\_chip
    - Must at least contain a irq\_write\_msi\_msg method
  - A struct msi\_domain\_ops
    - A set of functions used to build an irqdomain
  - A set of flags (some bus specific), allowing most of the above to get sensible defaults
- Bus specific irqdomain creation functions

```
/*
 * PCI/MSI setup
 */
static struct irq_chip my_msi_irq_chip = {
	.name			= "MSI",
	.irq_eoi		= irq_chip_eoi_parent,
	.irq_write_msi_msg	= pci_msi_domain_write_msg,
};

static struct msi_domain_info my_msi_dom_info = {
	.flags	= (MSI_FLAG_USE_DEF_DOM_OPS |
		   MSI_FLAG_USE_DEF_CHIP_OPS |
		   MSI_FLAG_PCI_MSIX),
	.chip	= &my_msi_irq_chip,
};

[...]

/*
 * Build the PCI/MSI domain on top of the IRQ domain
 * representing the MSI hardware
 */
pci_domain = pci_msi_create_irq_domain(fwnode, &my_msi_dom_info,
				       irq_domain);
```

```
/*
 * platform-msi setup
 */
static struct irq_chip my_pmsi_irq_chip = {
	.name	= "pMSI",
};

static struct msi_domain_ops my_pmsi_ops = {
};

static struct msi_domain_info my_pmsi_dom_info = {
	.flags	= (MSI_FLAG_USE_DEF_DOM_OPS |
		   MSI_FLAG_USE_DEF_CHIP_OPS),
	.ops	= &my_pmsi_ops,
	.chip	= &my_pmsi_irq_chip,
};

[...]

/*
 * Build the platform-msi domain on top of the IRQ domain
 * representing the MSI hardware
 */
plat_domain = platform_msi_create_irq_domain(fwnode, &my_pmsi_dom_info,
					     irq_domain);
```

#### Generic MSI in pictures

- At configuration time
  - The MSI controller irqchip composes the message
  - The bus-specific irqchip programs the device
- Everything is just like the stacked irqchip scenario
- The only notable difference is that we have a bus-specific irqdomain that doesn't correspond to any HW
  - Its main function is to cater for different programming interfaces at the device level

#### A platform MSI special

- There is no such thing as a "standard" platform device
  - No way to implement a irq\_write\_msi\_msg in a standard way
- Worked around by providing it at allocation time
  - The function is per-device
  - Allows for any crazy stuff

```
static void arm_smmu_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
{
	[...]
	/* Rebuild the 64-bit doorbell address from its two halves */
	doorbell = (((u64)msg->address_hi) << 32) | msg->address_lo;
	writeq_relaxed(doorbell, smmu->base + cfg[0]);
	writel_relaxed(msg->data, smmu->base + cfg[1]);
}

static void arm_smmu_setup_msis(struct arm_smmu_device *smmu)
{
	[...]
	/* The per-device write function is passed at allocation time */
	ret = platform_msi_domain_alloc_irqs(dev, nvec, arm_smmu_write_msi_msg);
	[...]
	for_each_msi_entry(desc, dev) {
		switch (desc->platform.msi_index) {
			/* request desc->irq */
		}
	}
}
```

## "I'm going slightly mad"

Queen, Innuendo

#### The interrupt strikes back

- Just as we thought we had fixed the world by giving MSIs to everyone...
- People now build wired interrupt controllers...
  - ... that use MSI as their transport
  - Allows wired devices to be placed far away from the irqchip
  - Conveniently, one MSI per wire
- Stacked domains to the rescue!
  - The irqchip is a MSI-capable device
  - We can give it its own irqdomain

#### Wire-MSI bridges, the programmatic view

- At probe time, create a device-specific domain
  - Automatically attached to the device's msi-parent's own domain
- When allocating its MSIs, place them in that domain
- Dish out wired interrupts as a normal irqchip

```
static int mbigen_irq_domain_alloc(struct irq_domain *domain,
				   unsigned int virq,
				   unsigned int nr_irqs,
				   void *args)
{
	struct irq_fwspec *fwspec = args;
	[...]
	mbigen_domain_translate(domain, fwspec, &hwirq, &type);
	/* Allocate the backing MSIs in the device-specific domain */
	platform_msi_domain_alloc(domain, virq, nr_irqs);
	mgn_chip = platform_msi_get_host_data(domain);

	for (i = 0; i < nr_irqs; i++)
		irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i,
					      &mbigen_irq_chip, mgn_chip->base);
	[...]
}

static struct irq_domain_ops mbigen_domain_ops = {
	.alloc	= mbigen_irq_domain_alloc,
	[...]
};

static int mbigen_device_probe(struct platform_device *pdev)
{
	[...]
	domain = platform_msi_create_device_domain(&child->dev, num_pins,
						   mbigen_write_msg,
						   &mbigen_domain_ops,
						   mgn_chip);
	[...]
}
```

## Thank you!

The trademarks featured in this presentation are registered and/or unregistered trademarks of ARM Limited (or its subsidiaries) in the EU and/or elsewhere. All rights reserved. All other marks featured may be trademarks of their respective owners.

Copyright © 2016 ARM Limited