Skip to content

Make Sabre generic over analysis type#15799

Open
jakelishman wants to merge 7 commits intoQiskit:mainfrom
jakelishman:sabre/lightweight-layout
Open

Make Sabre generic over analysis type#15799
jakelishman wants to merge 7 commits intoQiskit:mainfrom
jakelishman:sabre/lightweight-layout

Conversation

@jakelishman
Copy link
Member

This lets callers of Sabre routing (swap_map_trial) inject the type of analysis they want performing, at compile time. This in turn lets Sabre layout avoid storing the full rebuild information during its forwards-backwards passes, when all it actually cares about is the swap counting and final layout (and really, it doesn't even care about the swap counting, but that's effectively free).


Built on top of #14912.

On the HWB12 benchmark we recently merged to main, this PR (on top of #14912) shows a modest (~2-3%) performance improvement in SabreLayout, which isn't a huge amount, but it isn't nothing.

The documented behaviour of `SabreSwap` is that we assume that
multi-qubit gates have already been decomposed.  Previously, we silently
ignored multi-qubit gates, most likely because this initial handling
predated the introduction of the `directive` tracking in the Sabre
interaction graph, so it was necessary to silently pass through
barriers, which likely are multi-qubit.

The new Sabre interaction graph does more work to categorise the types
of interactions, with the intention of compressing interactions that can
statically be known to have no impact on the routing, to avoid multiple
Sabre trials from having to work through them.  This means that Sabre
now very clearly _sees_ a multi-qubit gate as it comes in, and can
clearly distinguish it from a directive.

The removed test case has always provided a nonsensical input to Sabre,
as it includes 3q gates without decomposing them.  The test was
originally trying to assert that measures that appeared in the
lexicographical output of the OpenQASM 2 exporter remained
lexicographically after any inserted swaps by routing (see Qiskitgh-691 [1]).
The swap mapper was not (in that case) violating the data-flow
requirements, and the general case of forcing measurements into the
final position was the inclusion of `BarrierBeforeFinalMeasurements`
anyway, it was just so happening that the conversion from data-flow
graph back to sequential data happened to go in a different order.

[1]: d277521: Measurements before swap mapping
This allows `NLayout` to be indexed by `VirtualQubit` and
`PhysicalQubit`, which is a shorter way of writing
`VirtualQubit::to_phys` and `PhysicalQubit::to_virt`.
This commit separates out the three logical components of the Sabre
routing tracking into three separate structs, where each struct groups
objects that have the same mutation tendency.

The previous Sabre state stored its problem description, internal
tracking, and output tracking altogether in the same flat structure.
Those three components have different tendencies to mutate: the problem
description never mutates, the internal tracking frequently does, and
the output tracking only occasionally does and has a lifetime validity
tied to that of the problem description.

Putting them together made it impossible to call methods that mutated
the state while passing an object that borrowed from the problem
description, such as when recursing into control-flow operations,
because the borrow checker could not validate it.  This applied
interface pressure to inline more into the same method, which made
code-reuse of separate concerns harder.
Sabre uses several objects that are logically maps from an index-like
newtype (like `NodeIndex` or `PhysicalQubit`) to some value, and are
implemented as fixed-slice `Vec`s for lookup efficiency.    The newtype
provides type safety while it's in use, but we have to cast it away to
index, which makes it easy to index slices with the wrong object.

This introduces a `VecMap` object, which provides a (minimal) slice-like
interface, but indexes using the relevant newtype.

The current implementation of Sabre does not use this _too_ much, but a
refactoring of the layer structures will have them store one slice
indexed by `PhysicalQubit` and one by `VirtualQubit`, which are
trivially easy to get switched (a frequent mistake that is the base
reason those new types were introduced in the first place).
The previous Sabre extended set was just the "next N" 2q gates
topologically on from the front layer, where Qiskit reliably used
`N = 20` ever since its introduction.  For small-width circuits (as were
common when the original Sabre paper was written, and when it was first
implemented in Qiskit), this could mean the extended set was reliably
several layers deep.  This could also be the case for star-like
circuits.  For the wider circuits in use now, at the 100q order of
magnitude, the 20-gate limit reliably means that denser circuits cannot
have their entire next layer considered by the lookahead set.

This commit modifies the lookahead heuristic to be based specifically on
layers.  This regularises much of the structure of the heuristic with
respect to circuit and target topology; we reliably "look ahead" by the
same "distance" as far as routing is concerned.  It comes with the
additional benefits:

- we can use the same `Layer` structure for both the front layer and the
  lookahead layers, which reduces the amount of scoring code

- the lookahead score of a swap can now affect at most two gates per
  layer, just like the front-layer scoring, and we can do this
  statically without loops

- we no longer risk "biasing" the lookahead heuristic in either case of
  long chains of dependent gates (e.g. a gate that has 10 predecessors
  weights the score the same as a gate with only 1) or wide circuits
  (some qubits have their next layer counted in the score, but others
  don't because the extended set reached capacity).

- applying a swap to the lookahead now has a time complexity that is
  constant per layer, regardless of the number of gates stored in it,
  whereas previously it was proportional to the number of gates stored
  (and the implementation in the parent of this commit is proportional
  to the number of qubits in the circuit).

This change alone is mostly a set up, which enables further
computational complexity improvements by modifying the lookahead layers
in place after a gate routes, rather than rebuilding them from scratch,
and subsequently only updating _swap scores_ based on routing changes,
rather than recalculating all from scratch.
Ever since Sabre was first introduced to Qiskit (in [1]), the extended
set has been recalculated from the front layer, every time the front
layer is updated.  Now that the extended set is tracked as a series of
layers, it is easily possible to keep the layer structure updated in the
same way as the front layer, by tracking separate points in the
topological iteration through the interaction graph.

To speed up synchronisation of the layers during updates, a new outer
tracker, `Layers`, now means that a removal can jump immediately to the
correct layer, without having to test multiples.  This outer structure
co-operates with the individual layers to effect faster lookups and
removals; rather than each layer storing its own `IndexMap`, they
instead store only a `Vec`, and the outer `Layers` tracks _where_ in the
`Vec` any given gate is.  On removal, the layer returns which, if any,
gate has moved to occupy the otherwise empty slot (since removals are by
swap, not shift).  This avoids all hash lookups in the layer structures;
everything is simple contiguous indexed access.

With the tracking done elsewhere, and with `VecMap` providing type-safe
indexing, each individual `Layer` now stores its contained gates in
terms of `[VirtualQubit; 2]`; this minimises the amount of data that
needs to be modified on a swap, at the cost that actions that need to
associate actual gates with the current physical qubits (like iterating
over active swaps) now need access to the `NLayout`, rather than the
layout being (implicitly) tracked by each `Layer` via `apply_swap`.
This is a performance improvement; as more overhead is reduced, the cost
of the hash lookups in swap application was ever more visible.

[1]: fab61d2: Sabre layout and routing transpiler passes (Qiskitgh-4537)
This lets callers of Sabre routing (`swap_map_trial`) inject the type of
analysis they want performing, at compile time.  This in turn lets Sabre
layout avoid storing the full rebuild information during its
forwards-backwards passes, when all it actually cares about is the swap
counting and final layout (and really, it doesn't even care about the
swap counting, but that's effectively free).
@jakelishman jakelishman added this to the 2.5.0 milestone Mar 12, 2026
@jakelishman jakelishman requested a review from a team as a code owner March 12, 2026 16:13
@jakelishman jakelishman added on hold Can not fix yet performance Changelog: Added Add an "Added" entry in the GitHub Release changelog. mod: transpiler Issues and PRs related to Transpiler labels Mar 12, 2026
@qiskit-bot
Copy link
Collaborator

One or more of the following people are relevant to this code:

  • @Qiskit/terra-core

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Changelog: Added Add an "Added" entry in the GitHub Release changelog. mod: transpiler Issues and PRs related to Transpiler on hold Can not fix yet performance

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants