Kwxm/bls12-381/prototype (PLT-192, PLT-1557, PLT-1554, etc).#5231
Conversation
…/plutus into kwxm/BLS12_381/prototype
|
Thanks for the comments @effectfully, especially the one about has collisions. I think I've addressed all of them except for the one about explicit dictionaries: I might come back to that later. |
|
@angerman I think this PR is blocked because you requested changes because of the Windows build problem. I believe we decided that you'd look into that separately, so is it OK to merge this now? I had some trouble merging the nix files with those in |
|
@kwxm here's the haskell.nix bump PR: #5366 I've also tried to just bump haskell.nix on this PR, but that resulted in after I pulled the latest changes 😕 |
|
So this fails on windows with too. I don't think the Darwin -> Windows pipeline works, and I'm not sure why this is in CI. |
|
Notable I don't see that failure on master (with bumped haskell.nix): #5366 |
angerman
left a comment
There was a problem hiding this comment.
I think we should revert this back to the original for now.
That's weird: |
|
@kwxm yes, I'm also a bit surprised by this. Even more so that I don't see this in master 😅 |
|
Looks like it succeeded but the status isn't getting reported for some reason. |
|
I think this is good to merge. IDK why hydra complains. We could forcepush. I'll force push once. |
643900a to
646b7e6
Compare
|
Slightly premature, since I was just about to push some more changes! I'll open a new branch for that. |
…ctMBO#5231) * Initial version of BLS pairing builtins * WIP * WIP * WIP: implementations of GT operations * Tidy up instances * More-or-less complete implementation for UPLC * Initial costing benchmarks for BLS builtins * Update benchmarks * Update R code * Forgot source files * Wrong denotation for GT_mul * Partial updates to CreateBuiltinCostModel * Fix typos in function names * Update memory models for BLS12_381 builtins * Update memory models for BLS12_381 builtins * WIP * Rename BLS (de)serialise -> (un)compress * Reformat * inline-r workaround; corrections to cost model generation code * inline-r workaround * Update cost model tests for BLS functions * Update benching results and cost model file for BLS * Update benching results and cost model file for BLS * Update comment * Update comment * Add some extra stuff for the benefit of the QuickCheck shrinker * Add new builtins to plutus-tx * Add a few Haskell BLS examples * WIP * Update cabal file * Merge master * WIP * Update BLS throughput benchmarks * Update cost model for uncompress vs deserialise * Update BSL benchmark program * WIP * Nix weirdness * Updates after merge * Add Groth16 verification example * Tidying up; get rid of SourceSpans * Minor updates * Minor updates * Add proper Criterion benchmarks * Tidying up * Moved file * Forgot cabal file * Fix typo * Update comment * Fix cabal version constraints * Add missing cases for geqStep * Update deriving methods * WIP: property tests * Add Plutus versions of most of the property tests * Tidy up the test code a bit * Use folds for repeated addition; adjust sizes of test inputs * Better folding * Update FFI code to new version * Name change: millerLoop -> pairing * Reorganise files * Tidy up * Tidy up * More tidying up * Add comment * Abstraction for BLS property tests * Tidying up * WIP * Incorporate Inigo's updates * Banish Hedgehog * Add conformance tests for BLS12-381 constants * Add BLS12-381 addition conformance tests * Add BLS12-381 equality conformance tests * Add BLS12-381 negation conformance tests * Update comment on cofactors * More conformance tests * Correct test name * Add BLS12-381 scalar multiplication conformance tests * Remark about source of data for Groth16 verification example * Add BLS12-381 pairing operation conformance tests * Update comment in BLS12-381 peoperty tests * Update comment in BLS12-381 peoperty tests * Typo in file name * Fix types in bls-sizes executable * Update names in costing benchmark CSV file * Update names of built-in types and functions in plutus-core * Update names in conformance tests * Update BLS names in plutus-tx-plugin * Remove parser for MlResult; fix Groth16 example * Tidy up the Groth16 example * Update versions in plutus-benchmark.cabal * applyCode -> unsafeApplyCode after PLT-1552 * Update comment * Minor formatting updates * Make plutus-metatheory work with the BLS builtins to some extent * Fix incorrect test * Exclude failing BLS12-381 Agda tests * Exclude failing BLS12-381 Agda tests * Add property test for periodicity of scalar multiplication * Minor code rearrangement * Import scalarPeriod for tests * Add more property tests for BLS compression * Add conformance tests for BLS scalar mulitplication periodicity * Add descirptive comments to the BLS conformance tests * Improve printing of known builtins when parser encounters an unknown one * Reorganise files containing cryptographic functions * Reorganise Crypto files * hashToCurve -> hashToGroup * Adjust spacing in print-builtin-signatures * Justification * Attempt to update to work with iohk-nix version of libblst * Merge Mauro's metatheory updates * Update to new version of BLS bindings * Update to new version of BLS bindings * Update to new version of BLS bindings * Merge Mauro's metatheory improvements * Still trying to get libblst to work with nix * More informative BLS names in metatheory * Update some comments * Get BSL sizes from blst * Pairing.pairing -> Pairing.millerLoop * Backpatch cost model * Turn on immediate warnings in R * Restore golden budget results after cost model backpatch * Attempt to fix plututs-ledger-api version tests * Restore comment * Add comment * Fix name of plutus-tx-plugin-tests * Improve comments * Extend comments * Reformat comments * Add comments to ignore cbits * Comments on costing benchmarks * Tidying up * Add some more comments * Add changelog entries for BLS12-381 modifications * Remove unnecessary changelog directories from unversioned packages * Update plutus-core/plutus-core/src/PlutusCore/Parser/Builtin.hs Co-authored-by: Michael Peyton Jones <michael.peyton-jones@iohk.io> * Address some PR comments * Crypto -> PlutusCore.Crypto for stuff we've defined * Address some more PR comments * Update metatheory for package name change * Update metatheory for package name change * Update version numbers in cabal file * Update BLS branch to work with merged version of BLS bindings in cardano-base * Remove commented-out Haskell code in plutus-metatheory * Missing cases in metatheory * Missing cases in metatheory * Missing cases in metatheory * Missing cases in metatheory * Missing cases in metatheory * Reorder constructors in ledger-api cost model interface * Update the comments about wrapping BLS12-381 types * Remove Haskell property tests for BLS12-381 (tested in cardano-base) * Refactoring * Refactoring * Remove some stuff that was left in accidentally * Remove empty lines * Resolve merge problems * Fix comment * Delete unused boilerplate from changelog entry * Update cabal file * Fix alignment * Address some PR comments * Address some PR comments * fromIntegral -> fromSatInt * GHC.Tick -> GHC.HpcTick * Trying to get rid of wrong version of Expr.hs in plutus-tx-plugin * Add missing golden files * Fix weird results in TypeSynthesis/Golden/Bls12* * Try to fix blst * Add BLS builtins to metatheory * Fix Untyped/CEK.lagda; make Agda conformance tests pass * Correct spacing * Remove some remaining merge conflicts * Correct spacing * Correct formatting * Correct formatting * Some renaming * Update flake * Bump plutus-core at el from 1.5 -> 1.6 * Bump haskell.nix, iohk-nix, CHaP * bump cardano-base we needed to get proper blst discovery, until 2.2.0.0 is released. This also means we need to bring in cardano-mempool. * Fix missing blst symbols. This depends on IntersectMBO/cardano-base#412. * Agda fixes * liftCode -> liftCodeDef * Remove superlfuous dependencies * Add DST argument to hashToGroup builtins * Address a couple of PR comments * bump haskell.nix * Add NOINLINE to listOfSizedByteStrings * Address some PR comments * Bump haskell.nix To get input-output-hk/haskell.nix#1948 * Error type for overlong DSTs * Error type for overlong DSTs * Stuff about shrinking * Make CI happy. x is unused. * plutus windows cross 8.10 * Finish incomplete test * Typo * Address PR comments * Fix parser for bls12_381_mlresult type * WIP updating things * bump iohk-nix * Improve hash collision tests * Update benching.csv to new format to make merge easier * Resolve some remaining conflicts * Update ciJobs.nix * bump haskell.nix * Update nix/cells/automation/ciJobs.nix * Update nix/cells/automation/ciJobs.nix * Update plutus-metatheory/src/Algorithmic/Erasure.lagda --------- Co-authored-by: Michael Peyton Jones <michael.peyton-jones@iohk.io> Co-authored-by: Moritz Angermann <moritz.angermann@gmail.com> Co-authored-by: Hamish Mackenzie <Hamish.K.Mackenzie@gmail.com>
Introduction
This PR adds new built-in types and functions to Plutus Core and PlutusTx for pairings over BLS12-381, as specified in CIP-0381. These depend on Haskell bindings to the blst C library which are currently the subject of an open PR in
cardano-base. In order not to depend on a branch, the code for these has been coped intoplutus-core/cbitsandplutus-core/src/plutus-core/Crypto/External. We should be able to merge this PR as-is, but ideally we'd wait until thecardano-basePR is merged and remove the copied files fromplutus-core.Details
The PR introduces three new PLC types:
bls12_381_G1_elementbls12_381_G2_elementbls12_381_MlResultand seventeen new built-in functions:
The basic definitions of all of these things are in
plutus-core/plutus-core/src/PlutusCore/Crypto/BLS12_381/, in the filesG1.hs,G2.hs, andPairing.hs. The following sections will attempt to give an idea of what all of these do, but some mathematical background will be required first.Background
Abelian groups
An abelian group is a commutative monoid in which every element has an inverse. These are usually written in an additive style where the operation is
+, the identity element is0, and the inverse of an elementais-a, but they are sometimes written in a multiplicative style where the operation is*, the identity element is1, and the inverse of an elementaisa⁻¹. The types G1 and G2 above represent finite additive abelian groups andMlResultrepresents a multiplicative one.For any abelian group one can define a notion of scalar multiplication by integers. Given an element
aof an abelian group and an integer n,n*ais defined to be 0 if n=0,a+...+antimes ifn>0, and(-a) + ... + (-a)-ntimes ifn<0. In multiplicative notation this scalar "multiple" is written as a poweraⁿ.The
add,neg, andscalarMuloperations above represent these operations in the groups G1 and G2; in addition there is an equality operation, serialisation (compress) and deserialisation (uncompress) operations, and ahashToGroupoperation which takes an arbitrary bytestring and a domain separation tag and produces a hash in the form of an element of G1 or G2.This leaves three functions:
mulMlResult, which performs group multiplication in the groupMlResult;millerLoop, which takes an element ofG1and an element ofG2and produces an element ofMlResult(this is the actual pairing operation), andfinalVerify, which essentially compares two elements ofMlResultfor equality. The last two operations (millerLoopandfinalVerify) are very expensive and are typically only expected to be called a few times in on-chain code.To explain more of the details of the new functions we need some extra background.
Finite fields
A field
Fis a set equipped with two operations+and*and elements0and1such that(F, 0, +)is an abelian group(F, 1, *)is a commutative monoid such that every nonzero element ofFhas a multiplicative inverse (soF\{0}is an abelian group under multiplication).a*(b+c) = a*b + a*cfor all a,b,c in F.Familiar examples of fields are
ℚ,ℝ, andℂ(but notℤsince not every element has a multiplicative inverse inℤ).There are also finite fields, and these can be completely classified. It can be shown that for every prime number
pand every positive integernthere is a field of orderpⁿ(the order of a field or group is its cardinality, the number of elements it contains), and these account for all of the finite fields up to isomorphism. These fields are usually denoted by𝔽_{pⁿ}orGF(pⁿ)orGF(p,n)or something similar; I'll useGF(pⁿ)because it's easier to type.For
n=1,GF(p)is justℤₚwith the usual operations, the important point being that every nonzero element there has a multiplicative inverse modulo p, unlikeℤₙifnis not prime (what would be the inverse of 6 inℤ₂₄, for example?).For
n>1, we can regardGF(p,n)as consisting of n-tuples of elements ofℤₚ; addition is usual coordinatewise addition, but multiplication will be something much more convoluted (and expensive), where the i-th entry in a product will be a complicated combination of all of the elements of the two tuples being multiplied.Elliptic curves
For the purposes of this PR, an elliptic curve
Eover a fieldFcan be regarded as an equation of the formy² = x³+ax+b, whereaandbare constants fromF. The set of all pairs(x,y) ∈ F×Fwhich satisfy the equation is also referred to as "the curve". When the field isℝan elliptic curve is what you'd usually think of as a curve (see Google), but when F is a finite field the so-called curve is an apparently random set of pairs of field elements. However, although the geometric structure has vanished, there is an extremely rich algebraic structure that still makes sense.A fundamental property of elliptic curves is that they are also abelian groups: given two points
PandQon an elliptic curveEone can define another pointP+Qwhich also lies on the curve. There's a geometric picture for this overℝ(see Google again), but also an algebraic form which applies over any field. The coordinates ofP+Qare complicated algebraic functions of the coordinates ofPandQ. In particular, if you're given a pointPand the pointnP(=P+...+P) for some integer n, it's not easy to work out whatnis: this is the discrete logarithm problem on E and is believed to be computationally intractable in general, which is one reason why elliptic curves are useful in cryptography.Pairings
A pairing is a function
e: G₁ × G₂ → G_TwhereG₁,G₂andG_Tare abelian groups (the T's supposed to be a subscript, but I can't find a Unicode character for that). In our case,G₁andG₂are written additively, but to complicate matters,G_Tis written multiplicatively. A pairing is require to be bilinear: forp, p' ∈ G₁,q, q' ∈ G₂, andn ∈ ℤwe should havee(p+p',q) = e(p,q) * e(p',q)(remember that the operation inG_Tis multiplication)e(p,q+q') = e(p,q) * e(p,q')e(n*p,q) = e(p,q)ⁿ = e(p,n*q)The final property here is one reason that the concept of pairing useful in cryptography: it allows one person to perform a computation in
G₁and another to perform a computation inG₂and then the results can be compared inG_Twithout having to share information (or at least that's my simplistic interpretation: the details are complicated).Back to Plutus
The situation in BLS12-381 is as follows:
qwhere is a 381-bit prime (approximately 10^115)G₁is a subgroup of an elliptic curveE₁defined overGF(q)G₂is a subgroup of an elliptic curveE₂defined overGF(q²)G_Tis a subgroup of the multiplicative group of nonzero elements ofGF(q¹²)(the elements of this field are huge: you need 12x48 = 576 bytes to hold a single element)G₁,G₂, andG_Tare all isomorphic to each other and are of orderpwherepis a 255-bit prime (about 10^77). In fact, all of these groups are isomorphic to(ℤₚ, +), but in very obfuscated ways.p*a = 0for alla ∈ G₁, and similarly forG₂, so scalar multiplication is periodic modulop. This means that the efficiency of computing a scalar multiplen*acan be improved by reducingnmodulopbefore performing that multiplication (although scalar multiplication is still quite expensive since we've got up to 255 bits to multiply by). The reduction is performed automatically by thescalarMulfunctions.e: G₁ × G₂ → G_T, and combiningmillerLoopandfinalVerifyallows one to check whethere(a,b) = e(a',b')for elementsa,a' ∈ G₁andb,b' ∈ G₂. The fact that such a pairing exists is not obvious and depends on some rather complicated mathematics, which itself has undergone a lot of refinement to enable efficient computation.Serialisation and compression
We can now explain how elements of
G₁andG₂can be serialised as bytestrings.It follows from the facts above that every element of
G₁can be represented as a pair(x,y)withx,y ∈ GF(q), and each of x and y take up 381 bits, so each fits into 48 bytes with 3 bits left over. Thus(x,y)pairs can be serialised in this form, a pair taking up a total of 96 bytes. However, this is inefficient in terms of space: remember that the point satisfy an equationy² = x³ + ax +b, so if you know the x-coordinate of a point and can compute square roots inGF(q)then you can calculate the y-coordinate of the point and save yourself 48 bytes (with some complications that we'll see later). A similar situation holds for elements ofG₂, except that each coordinate requires 96 bytes. There are thus two possibilities for how group elements can be serialised into bytestrings:GF(q)has a square root, so your serialised value may not correspond to a point on the curve.The
blstlibrary supports both of these formats, using the final spare bit to indicate which serialisation method is intended (see the specification here) (the "spare" bits are the most significant (leftmost) bits in the bytestring encoding) . The first method is called "serialisation" and the second "compression". There's a trade-off here: the serialised format is space-inefficient but deserialisation is fast, whereas for the compressed format only half the space is required but uncompression is slower because of the need to extract a square root. In fact, experiments show that uncompression takes about 20 times longer than deserialisation forG₁and about 30 times longer forG₂. Despite this we have chosen to use the compressed format in Plutus Core, providing uncompression functionsbls12_381_G1_uncompressandbls12_381_G1_uncompress, along with corresponding compression functions. We may wish to rethink this at some point, possibly adding serialisation and deserialisation if the time penalty for uncompression is too great.Our representation is more restrictive than that of
blst:blstallows serialisation of any point on the curveE₁(orE₂), but during uncompression we reject any points which do not lie in the subgroupG₁(orG₂). The subgroups are much smaller than the entire curve group (|E₁|/|G₁| ≈ 7×10^37and|E₂|/|G₂| ≈ 3×10^152) so the chance of being able to successfully decompress a random bytestring to get a point in a subgroup is negligible.Concrete syntax for
G₁andG₂elements.We have also used the compressed representation for the concrete syntax for
G₁andG₂elements. A constant of typeG₁is represented by(con bls12_381_G1_element 0x...)where...consists of 96 hex digits, and aG₂element is similar but with 192 hex digits.Hashing
We also provide two functions
bls12_381_G1_hashToGroupandbls12_381_G2_hashToGroupwhich, as the names suggest, take an arbitrary bytestring and produce an element in the relevant group (not just on the curve). Hashing and uncompression are the only means of obtaining group elements directly. The other group operations preserve the property of being in the subgroup, so none of our functions can can give us arbitrary curve elements.Pairing operations and
bls12_381_mlresultIn the BLS12-381 situation, what one usually wishes to do is to calculate two pairings to get two elements of
G_Tand then compare these results for equality. This is an expensive computation and it is considerably more efficient (though still expensive) to calculate values of a so-called non-reduced pairing in an intermediate group closely related toG_Tand then at the end perform a calculation (final verification) that checks that two elements of the intermediate group become equal when lifted intoG_T.In Plutus Core, the intermediate group is called
bls12_381_mlresult, and the operations we can perform on this type are (deliberately) very limited. The only way to obtain a value of typebls12_381_mlresultis as a result of callingbls12_381_millerLoop(hence the name), or by multiplying together two existing elements of that type. We do not provide any concrete syntax, textual printing method, or serialisation format (includingflat) for values of typebls12_381_mlresult.Note also that the operations on
bls12_381_mlresultare quite expensive: the (constant) CPU costs are (currently) as follows:Fortunately this is in line with the expected on-chain usage patterns: in realistic applications one would tend to perform a few calls to
millerLoop, some multiplications, and one call tobls12_381_finalVerify, and costing benchmarks (cabal run bls-costs) suggest that this can be done in a reasonable time. For example, a realistic zk-snark verification takes about 23% of the on-chain script execution limit (memory costs are negligible).flatserialisationThe canonical storage format for Plutus Core scripts is the
flatformat (see Appendix E of the Plutus Core specification (pdf here)). We use the compression format described earlier for flat serialisation of elements of typebls12_381_G1_elementandbls12_381_G2_element: these are compressed to bytestrings which are then encoded using the normalflatencoding (but with tags 9 and 10 respectively). Deccoding can fail for a number of reasons: the bytestring can be the wrong length (it must be exactly 48 bytes long forbls12_381_G1_elementand 96 forbls12_381_G2_element); the encoding can be invalid because the three format bits are used improperly; the encoding may specify a valid field element but one which is not the x-coordinate of a point on the curve ("off-curve"); and the encoding may specify a genuine point on the curve but one which is not in the relevant subgroup ("out-of-group"). All of these are described in theBLSTErrortype defined in theblstHaskell bindings.There is no means of serialising
bls21_381_mlresultvalues. The tag 11 has been allocated to this type, but if it is encountered while decoding aflatfile then a failure occurs.Costing
Fortunately costing is quite simple for the BLS12-381 operations: costing benchmarks indicate that everything is constant cost apart from the hashing functions and the scalar multiplication functions, which are linear in the size of their input.