BlackParrot aims to be the default open-source, Linux-capable, cache-coherent, RV64GC multicore used by the world. Although originally developed by the University of Washington and Boston University, BlackParrot strives to be community-driven and infrastructure-agnostic, with a core that is Pareto optimal in terms of power, performance, area and complexity. To ensure BlackParrot is easy to use, integrate, modify and trust, development is guided by three core principles: Be Tiny, Be Modular, and Be Friendly. Development efforts have prioritized ease of use and silicon validation as first-order design metrics, so that users can quickly get started and trust that their results will be representative of state-of-the-art ASIC designs. BlackParrot is ideal as the basis for a lightweight accelerator host, a standalone Linux core, or a hardware research platform.
- Be Tiny
  - When deliberating between two options, consider the one with the least hardware cost/complexity.
- Be Modular
  - Prevent tight coupling between modules by designing latency-insensitive interfaces.
- Be Friendly
  - Combat NIH, welcome external contributions and strive for infrastructure agnosticism.
BlackParrot v1.0 was released in March 2020, and quad-core silicon has been up and running in the lab since April 2020. It supports configurations scaling up to a 16-core cache-coherent multicore, including the baseline user and privilege mode functionality needed to run Linux. An optimized single-core variant of BlackParrot (also Linux-capable) is available as well. Currently, the core supports RV64IMAFD, with C support on the way!
Development of BlackParrot continues, and we are very excited about what we are releasing next!
A 12nm BlackParrot multicore chip was taped out in July 2019.
We presented BlackParrot at the December 2020 RISC-V Summit! (slides)
We presented BlackParrot at the ICS 2020 Workshop on RISC-V and OpenPOWER! (slides)
We first announced BlackParrot at FOSDEM 2020! (slides, video, pdf)
This RTL repo is intended to be used with a specific SDK and HDK depending on the simulation / FPGA / ASIC environment desired. For first-time users of BlackParrot, we recommend starting from the BlackParrot Simulation Environment, which packages the BlackParrot RTL and SDK in a compatible manner. We intend to release several examples of BlackParrot environments which package the RTL, SDK and HDK together for evaluation.
To set up your own BlackParrot environment, all that is strictly required is to clone a version of the BlackParrot SDK as 'sdk' in the same directory as the BlackParrot RTL. Note that the two repositories are not guaranteed to be in sync after pulling from master of each.
Once you have a BlackParrot environment set up, you can follow the RTL evaluation guide here to test the core RTL: Evaluation Guide
Although the information is collected in this repo, we recommend looking at these Slides for a quick overview of BlackParrot.
We welcome external contributions! Please join our mailing list at Google Groups and follow us on Twitter to discuss, ask questions or just tell us how you're using BlackParrot! For a smooth contribution experience, take a look at our Contribution Guide.
BlackParrot is written in standard SystemVerilog, using a subset of the language known to be both synthesizable and compatible with a wide variety of vendor tools. Details of these style choices, both functional and aesthetic, can be found in our Style Guide.
BlackParrot is Linux-capable, so it is possible to run all programs which run on BusyBox. However, for more targeted benchmarks which don't want O/S management overheads (or the overheads of a long Linux boot time in simulation!), it is preferable to write for bare-metal. Additionally, some platform-specific features are only available at the firmware level. Developers looking to write low-level BlackParrot code, or optimize for the BlackParrot platform should look at our SDK.
Once you've built and validated your BlackParrot program and are ready to run on RTL, look at our TestBench Guide.
Coming Soon!
BlackParrot heavily leverages the BaseJump STL library and builds upon many of the hardware design conventions from the corresponding BSG SystemVerilog Coding Guidelines which can aid in understanding how BlackParrot source code works.
BlackParrot is an aggressively modular design: communication between the components is performed over a set of narrow, latency-insensitive interfaces. The interfaces are designed to allow implementations of the various system components to change independently of one another, without worrying about cascading functional or timing effects. Read more about BlackParrot's standardized interfaces here: Interface Specification
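To make the idea of a latency-insensitive interface concrete, here is a minimal Python sketch of a single ready/valid channel, simulated cycle by cycle. It is an illustrative model only, not BlackParrot's actual interface: the function name `run_handshake` and the stall-probability parameters are hypothetical, and random stalls stand in for arbitrary implementation latencies on either side.

```python
import random
from collections import deque


def run_handshake(items, producer_stall, consumer_stall, seed=0, max_cycles=10_000):
    """Model one ready/valid channel, one loop iteration per clock cycle.

    A transfer happens only in a cycle where both `valid` (the producer
    is offering data) and `ready` (the consumer can accept it) are high.
    The stall probabilities are illustrative stand-ins for unknown
    latencies; they are not taken from the BlackParrot RTL.
    """
    rng = random.Random(seed)
    pending = deque(items)
    received = []
    cycles = 0
    while pending and cycles < max_cycles:
        valid = rng.random() >= producer_stall  # producer offers data this cycle
        ready = rng.random() >= consumer_stall  # consumer accepts data this cycle
        if valid and ready:
            received.append(pending.popleft())  # handshake completes
        cycles += 1
    return received, cycles


if __name__ == "__main__":
    data = list(range(8))
    fast, fast_cycles = run_handshake(data, 0.0, 0.0)
    slow, slow_cycles = run_handshake(data, 0.5, 0.7, seed=1)
    # Same data, same order; only the cycle count differs.
    print(fast == slow, fast_cycles, slow_cycles)
```

Changing the stall rates changes how many cycles the transfer takes, but not which values arrive or in what order. That is the property that lets a component behind such an interface be swapped out without functional effects on its neighbors.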
The BedRock coherence system maintains cache coherence between the BlackParrot processor cores and attached coherent accelerators in a BlackParrot multicore system. Please see the BedRock Coherence Protocol page for more details on the coherence protocol and system.
A key feature of using BlackParrot is that it has been heavily validated in both silicon and FPGA implementations. All BlackParrot tapeouts and FPGA environments can be found at BlackParrot Examples. Taped out BlackParrot yourself and want to share tips and tricks? Let us know and we can add it to the collection! Looking to implement BlackParrot in a physical system? Take a look at our CAD Backend Guide.
Upon commit to the listed branch, a functional regression consisting of full-system tests and module level tests is run and checked for correctness. Additionally, the design is checked with Synopsys DC to verify synthesizability. Work is in progress to continuously monitor PPA.
Our goal with BlackParrot is to bootstrap a community-maintained RISC-V core, and we would love for you to get involved. Here are a few starter projects you could do to get your feet wet! Contact us for more details.
- Our integer divider could be parameterized to do 2 or more cycles per iteration, or to skip zeros. (Note: Currently somebody is working on this.)
- We would like to configure BlackParrot for ultra-tiny caches (e.g. 8-way set associative, 4 sets, 2-word = 16-byte cache lines).
- We could use a stream buffer (prefetcher) implementation for our L2 cache.
- Add a parameter to enable / disable FPU logic (including register file, bypass paths, FP divider and FMAC, etc.)
- Improve the mapping to FPGA
- We use a portability layer for FPGA that can be optimized, e.g.,
- Other mappings, such as the multiplier to DSP48, could be improved.
- We have not looked at frequency tuning BP for FPGA at all. The ideal changes would not result in much ASIC/FPGA code bifurcation.
- We always appreciate pull requests to fix bugs in the documentation, or bug reports that instructions don't work correctly.
- The RISC-V GCC compiler has some inefficiencies that we have identified. If you have compiler experience, you could raise the benchmark numbers for all RISC-V cores versus other ISAs!
- Our current L2 cache implementation (bsg_cache) is blocking. We would like a non-blocking implementation that supports the same interface and features as the current one, so that it can be a configuration option for BlackParrot. It may even be possible to reuse the current code. Contact us to discuss possible implementation approaches! This is an advanced project, best attempted after you have already completed an intermediate project.
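As a quick sanity check on the ultra-tiny cache example in the list above (8 ways, 4 sets, 16-byte lines), the following sketch works out the resulting capacity and address-bit split. The helper `cache_geometry` is hypothetical and is not part of the BlackParrot configuration code; it only does the arithmetic.

```python
def cache_geometry(ways, sets, line_bytes):
    """Return capacity and address-bit split for a set-associative cache.

    Hypothetical helper for illustration; assumes `sets` and
    `line_bytes` are powers of two, as is usual for cache indexing.
    """
    for name, value in (("sets", sets), ("line_bytes", line_bytes)):
        if value <= 0 or value & (value - 1):
            raise ValueError(f"{name} must be a power of two")
    return {
        "capacity_bytes": ways * sets * line_bytes,
        "offset_bits": line_bytes.bit_length() - 1,  # log2(line_bytes)
        "index_bits": sets.bit_length() - 1,         # log2(sets)
    }


if __name__ == "__main__":
    print(cache_geometry(ways=8, sets=4, line_bytes=16))
```

For the example configuration this gives a 512-byte cache with 4 offset bits and 2 index bits, so nearly all of the address is tag. That is part of what makes such a small configuration an interesting corner case for parameterization.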
If used for academic research, please cite:
D. Petrisko, F. Gilani, M. Wyse, D. C. Jung, S. Davidson, P. Gao, C. Zhao, Z. Azad, S. Canakci, B. Veluri, T. Guarino, A. J. Joshi, M. Oskin, M. B. Taylor, "BlackParrot: An Agile Open Source RISC-V Multicore for Accelerator SoCs", in IEEE Micro Special Issue on Agile and Open-Source Hardware, July/August, 2020. doi: 10.1109/MM.2020.2996145