Commit 0d964ae: CV-Mesh PC/PL draft
# Title of Project - "CORE-V CV-Mesh"
# Project Concept Proposal
## Date of proposal - 2023-07-24
## Author(s) - Jonathan Balkind, Assistant Professor, UC Santa Barbara

## High Level Summary of project, project components, and deliverables

OpenPiton is a manycore processor design and research framework, in development since late 2013 and open-source since mid 2015. Its coherence system, known as P-Mesh, enables creation of large meshes of cores and other heterogeneous elements. A number of designs have been taped out, including Piton (25 tiles/cores, using the OpenSPARC T1 core), CIFER (8 tiles: 22 cores, including 4 CVA6 cores, plus an eFPGA), and DECADES (108 tiles: 60 CVA6 cores, 24 accelerators, 23 intelligent storage tiles, plus an eFPGA).

The goal of this project is to bring the P-Mesh coherence system into the OpenHW ecosystem to enable users to build large meshes of OpenHW CORE-V and other cores and accelerators. We propose to name this CV-Mesh and to separate it from OpenPiton itself as an independent IP block.

### Features of P-Mesh (to be adopted as CV-Mesh)

* Directory-based coherence model
* Three-level cache hierarchy
* MESI protocol
* Support for heterogeneous cores via the Transaction-Response Interface
* Support for coherent LLC access from other heterogeneous elements
* Can connect arbitrary point-to-point ordered NoCs
* Open source (BSD license)
* SystemVerilog

### Components

* Component 1: CV-Mesh protocol specification
* Component 2: CV-Mesh user guide
* Component 3: RTL implementation of the local private cache (L1.5 cache from OpenPiton P-Mesh), verified to TRL 5
* Component 4: RTL implementation of the shared last-level cache (L2 cache from OpenPiton P-Mesh), verified to TRL 5
* Component 5: RTL implementation of bridges to/from other data movement protocols (e.g. AXI-Lite, AXI), verified to TRL 3
* Component 6: RTL implementation of the physical 2D-mesh network-on-chip (dynamic node network from OpenPiton P-Mesh), verified to TRL 3
## Summary of market or input requirements

### Known market/project requirements at PC gate

* OpenPiton designs have been taped out by a number of teams, including the following chips:
  * Piton (25 tiles/cores, using the OpenSPARC T1 core) in 32nm technology
  * CIFER (8 tiles: 22 cores, including 4 CVA6 cores, plus an eFPGA) in 12nm technology
  * DECADES (108 tiles: 60 CVA6 cores, 24 accelerators, 23 intelligent storage tiles, plus an eFPGA) in 12nm technology
  * Intel (8 tiles/cores, using the CVA6 core) in Intel 4 technology

### Potential future enhancements

## Who would make use of OpenHW output

Those interested in building scalable clusters of OpenHW CORE-V and other cores.

## Initial Estimate of Timeline

* Separating coherence IP from OpenPiton repository (Q4 2023)
* Connection to CV-HPDC (initial support and validation complete in Q1 2024)
* Enhancing performance characteristics and parameterisation (Q2 2024)
* Standalone verification environment (Q4 2024)
* Improvement of documentation (Q4 2024)
* User guide (Q4 2024)
## Explanation of why OpenHW should do this project

The P-Mesh coherence system as established in OpenPiton has been in development since late 2013 and open-source since mid 2015. The system already supports the CVA6 core and has seen significant adoption. The IP has been well validated for use with a number of cores and ISAs, and offers heterogeneous capabilities for integrating accelerators, FPGAs, and more, including in large chips with hundreds of tiles and billions of transistors. The project will extend OpenHW's move into HPC and, combined with CV-HPDC, will enable connection of higher performance cores in the near future.
## Industry landscape: description of competing, alternative, or related efforts in the industry

### BedRock

The BedRock coherence protocol was established for the creation of coherent clusters of BlackParrot cores.

* Directory-based coherence model
* Two-level cache hierarchy
* Capable of coherence protocols like MOESIF (and many subsets)
* Uses a microcoded coherence engine
* Open source (BSD license)
* SystemVerilog

### ESP

ESP has a long history and focuses on accelerator-rich SoCs. Originally designed for the LEON3 core, it today supports CVA6 and Ibex as host cores.

* Directory-based coherence model
* Three-level cache hierarchy
* MESI protocol or Spandex heterogeneous coherence
* 32-bit physical addresses
* Open source (Apache license)
* SystemC & SystemVerilog

### TileLink 2

TileLink is the primary coherence protocol used among users of Rocket/BOOM, and was specified by SiFive. There are a number of configurable implementations, so we elide the details here.

* Supports both snooping and directory-based coherence models
* Open source (varies; BSD license for some IP)
* Chisel (primarily)

### AMBA ACE/CHI/etc.

Arm's AMBA protocols include ACE and CHI, which enable coherent operation.

* Primarily snoop-based coherence model
* Commercial protocols with some open source implementations
* OpenHW ACE implementation in SystemVerilog: "CORE-V tightly-coupled cache coherence mechanism for CVA6"
## OpenHW Members/Participants committed to participate

Jonathan Balkind, Assistant Professor, UC Santa Barbara
Miquel Moretó, Associate Researcher, Barcelona Supercomputing Center & Associate Professor, Universitat Politècnica de Catalunya (UPC)
Lluc Alvarez, Established Researcher, Barcelona Supercomputing Center
César Fuguet, CEA List, Grenoble

## Project Leader(s)
### Technical Project Leader(s)

Jonathan Balkind, Assistant Professor, UC Santa Barbara

### Project Manager, if a PM is designated

N/A

<hr/>
# Title of Project - "CORE-V CV-Mesh"
# Project Launch Proposal
## Date of proposal - 2023-07-24
## Author(s) - Jonathan Balkind, Assistant Professor, UC Santa Barbara

## Summary of project

OpenPiton is a manycore processor design and research framework, in development since late 2013 and open-source since mid 2015. Its coherence system, known as P-Mesh, enables creation of large meshes of cores and other heterogeneous elements. A number of designs have been taped out, including Piton (25 tiles/cores, using the OpenSPARC T1 core), CIFER (8 tiles: 22 cores, including 4 CVA6 cores, plus an eFPGA), and DECADES (108 tiles: 60 CVA6 cores, 24 accelerators, 23 intelligent storage tiles, plus an eFPGA).

The goal of this project is to bring the P-Mesh coherence system into the OpenHW ecosystem to enable users to build large meshes of OpenHW CORE-V and other cores and accelerators. We propose to name this CV-Mesh and to separate it from OpenPiton itself as an independent IP block.

### Components of the Project

* Component 1: CV-Mesh protocol specification
* Component 2: CV-Mesh user guide
* Component 3: RTL implementation of the local private cache (L1.5 cache from OpenPiton P-Mesh), verified to TRL 5
* Component 4: RTL implementation of the shared last-level cache (L2 cache from OpenPiton P-Mesh), verified to TRL 5
* Component 5: RTL implementation of bridges to/from other data movement protocols (e.g. AXI-Lite, AXI), verified to TRL 3
* Component 6: RTL implementation of the physical 2D-mesh network-on-chip (dynamic node network from OpenPiton P-Mesh), verified to TRL 3
#### Component 1 Description

OpenPiton provides a microarchitecture specification document which describes the P-Mesh coherence protocol. This document has fallen out of date relative to a number of more recent changes, and its sources are in LaTeX. As part of this component, it will be brought up to date and into a better open format.

#### Component 2 Description

The CV-Mesh user guide will describe the interfaces provided by the CV-Mesh caches, network, and bridges. It will provide users with the information needed to correctly instantiate these components to build their own system-on-chip, complementing the example design(s) provided in the Polara APU repository. It will also describe what types of requests and responses can be sent to/from the different caches, and which protocols (or subsets thereof) are supported by the protocol bridges.
#### Component 3 Description

The local private cache (L1.5 cache in P-Mesh) generally acts as a second layer of cache. It communicates with the shared last-level cache to maintain cache coherence for cores and other agents. To decouple the core from the coherence protocol itself, the cache offers the Transaction-Response Interface (TRI), which is implemented by any core connected into the system. This includes the write-through L1 cache used in CVA6. Support for TRI in CV-HPDC will be developed as part of the project, as will a number of performance enhancements.
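To illustrate the decoupling described above, the following toy Python model sketches a split request/response exchange between a core-side load and the private cache. The class names, states, and methods are hypothetical illustrations, not the actual TRI signal or message definitions from the P-Mesh specification.

```python
# Toy model of a TRI-style split-transaction exchange: the core issues a
# load to the private cache, which either hits locally or performs a
# request/response exchange with the shared last-level cache.
# All names here are illustrative, not the real TRI definitions.

class ToyLLC:
    """Stand-in for the shared last-level cache backing the private caches."""
    def __init__(self, memory):
        self.memory = memory

    def read_shared(self, addr):
        return self.memory.get(addr, 0)

class ToyPrivateCache:
    def __init__(self, llc):
        self.llc = llc
        self.lines = {}   # addr -> (state, data); "S" = Shared

    def load(self, addr):
        """Core load: hit served locally, miss filled from the LLC."""
        if addr in self.lines:
            return self.lines[addr][1]          # hit
        data = self.llc.read_shared(addr)       # miss: request to LLC
        self.lines[addr] = ("S", data)          # install in Shared state
        return data

llc = ToyLLC({0x40: 7})
cache = ToyPrivateCache(llc)
assert cache.load(0x40) == 7                    # miss, filled from LLC
assert cache.load(0x40) == 7                    # hit, served locally
assert cache.lines[0x40][0] == "S"
```

The point of the sketch is that the core only sees the load/response pair; how the line was obtained from the rest of the system is hidden behind the cache, which is the role TRI plays for connected cores.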
#### Component 4 Description

The shared last-level cache (L2 cache in P-Mesh) acts as the coherence directory and supports coherent and non-coherent access over the network-on-chip. Private caches interact with the L2 to maintain the coherence protocol, but other agents may also communicate directly with the last-level cache to perform coherent reads or writes without participating in the entire coherence protocol.
#### Component 5 Description

As not every peripheral implements the P-Mesh coherence protocol, P-Mesh provides bridges to and from other protocols, most relevantly AXI and AXI-Lite. These enable interaction with a variety of peripherals, accelerators, DMAs, etc.
#### Component 6 Description

OpenPiton uses three physical networks-on-chip to maintain deadlock-free communication between the caches and main memory. The platform supports replacement of the network routers with others, provided they maintain point-to-point ordering for messages. As a result, in CV-Mesh, the network routers from P-Mesh are provided only as an example, with the recognition that users may replace them with other network routers.
## Summary of market or input requirements

### Known market/project requirements at PL gate

* OpenPiton designs have been taped out by a number of teams, including the following chips:
  * Piton (25 tiles/cores, using the OpenSPARC T1 core) in 32nm technology
  * CIFER (8 tiles: 22 cores, including 4 CVA6 cores, plus an eFPGA) in 12nm technology
  * DECADES (108 tiles: 60 CVA6 cores, 24 accelerators, 23 intelligent storage tiles, plus an eFPGA) in 12nm technology
  * Intel (8 tiles/cores, using the CVA6 core) in Intel 4 technology

### Potential enhancements for future project phases

## Who would make use of OpenHW output

Those interested in building scalable clusters of OpenHW CORE-V and other cores.

## Summary of Timeline

* Separating coherence IP from OpenPiton repository (Q4 2023)
* Connection to CV-HPDC (initial support and validation complete in Q1 2024)
* Enhancing performance characteristics and parameterisation (Q2 2024)
* Standalone verification environment (Q4 2024)
* Improvement of documentation (Q4 2024)
* User guide (Q4 2024)
## Explanation of why OpenHW should do this project

The P-Mesh coherence system as established in OpenPiton has been in development since late 2013 and open-source since mid 2015. The system already supports the CVA6 core and has seen significant adoption. The IP has been well validated for use with a number of cores and ISAs, and offers heterogeneous capabilities for integrating accelerators, FPGAs, and more, including in large chips with hundreds of tiles and billions of transistors. The project will extend OpenHW's move into HPC and, combined with CV-HPDC, will enable connection of higher performance cores in the near future.
## Industry landscape: description of competing, alternative, or related efforts in the industry

### BedRock

The BedRock coherence protocol was established for the creation of coherent clusters of BlackParrot cores.

* Directory-based coherence model
* Two-level cache hierarchy
* Capable of coherence protocols like MOESIF (and many subsets)
* Uses a microcoded coherence engine
* Open source (BSD license)
* SystemVerilog

### ESP

ESP has a long history and focuses on accelerator-rich SoCs. Originally designed for the LEON3 core, it today supports CVA6 and Ibex as host cores.

* Directory-based coherence model
* Three-level cache hierarchy
* MESI protocol or Spandex heterogeneous coherence
* 32-bit physical addresses
* Open source (Apache license)
* SystemC & SystemVerilog

### TileLink 2

TileLink is the primary coherence protocol used among users of Rocket/BOOM, and was specified by SiFive. There are a number of configurable implementations, so we elide the details here.

* Supports both snooping and directory-based coherence models
* Open source (varies; BSD license for some IP)
* Chisel (primarily)

### AMBA ACE/CHI/etc.

Arm's AMBA protocols include ACE and CHI, which enable coherent operation.

* Primarily snoop-based coherence model
* Commercial protocols with some open source implementations
* OpenHW ACE implementation in SystemVerilog: "CORE-V tightly-coupled cache coherence mechanism for CVA6"
## OpenHW Members/Participants committed to participate

Jonathan Balkind, Assistant Professor, UC Santa Barbara
Miquel Moretó, Associate Researcher, Barcelona Supercomputing Center & Associate Professor, Universitat Politècnica de Catalunya (UPC)
Lluc Alvarez, Established Researcher, Barcelona Supercomputing Center
César Fuguet, CEA List, Grenoble

## Project Leader(s)
### Technical Project Leader(s)

Jonathan Balkind, Assistant Professor, UC Santa Barbara

### Project Manager, if a PM is designated

N/A

## Project Documents
### Project Planning Documents

* PL document (this document)

### Project Output Documents

* CV-Mesh protocol specification
* CV-Mesh user guide

## List of project technical outputs

* Enhanced versions of the five components which are already provided as inputs for the project
### Feature Requirements

#### Feature 1

* Connection to CV-HPDC

#### Feature 2

* Support for wider networks-on-chip

#### Feature 3

* Support for larger cache block sizes in the local private cache

#### Feature 4

* Improved parameterisation of caches

## External dependencies

* OpenPiton

## OpenHW TGs Involved

* TWG: Interconnect

## Resource Requirements

### Engineering resource supplied by members - requirement and availability

Team from UC Santa Barbara
Team from Barcelona Supercomputing Center

### OpenHW engineering staff resource plan: requirement and availability

N/A

### Marketing resource - requirement and availability

N/A

### Funding for project aspects - requirement and availability

N/A
## Architecture and/or context diagrams

The following figure shows a top-level view of a 16-core CVA6 system previously demonstrated with P-Mesh.

![Top-Level View of a 16-core CVA6 system previously demonstrated with P-Mesh.](./cva6_16core_pmesh.png)

The following figure shows a system enabled by P-Mesh compared with a future system enabled by CV-Mesh. The CV-Mesh IP is made of the components shown within the orange box.

![A system enabled by P-Mesh compared with a future system enabled by CV-Mesh. The CV-Mesh IP is made of the components shown within the orange box.](./cvmesh_and_pmesh.png)

## Project license model

* 3-Clause BSD (following existing OpenPiton P-Mesh)

## Description of initial code contribution, if required

P-Mesh code is already hosted by OpenHW as part of the Polara APU GitHub repository. The P-Mesh component will be brought into its own repository as CV-Mesh.

## Repository Requirements

* Repository for CV-Mesh IP
* Submodule linkage to branch(es) on the Polara APU repository (which is an OpenPiton fork)

## Project distribution model

OpenHW GitHub repositories

## Preliminary Project plan
*A full project plan is not required at PL. A preliminary plan, which can for instance be the schedule for completion of the component or feature list, together with the responsible resource, should be provided. Full details should be provided at the PA gate.*

## Risk Register
*A list of known risks, for example external dependencies, and any mitigation strategy*