CakeML 是 ML 的证明实现,是个函数式编程语言,包含一个编译器和运行时系统。
CakeML 基于 Standard ML 实质性的子集,是高阶逻辑的特定语义。CakeML 的编译器算法也是在高阶逻辑有描述,已经证明可以把 CakeML 程序转换成语义等价的机器代码。CakeML 是自由软件。
CakeML 主页:https://cakeml.org/
GitHub 地址:https://github.com/CakeML/cakeml
-----------------------------------------------------------
CakeML: A Verified Implementation of ML
The CakeML project: https://cakeml.org
CakeML is a verified implementation of a significant subset of Standard ML.
The source and proofs for CakeML are developed in the HOL4 theorem prover. We use the latest development version of HOL4, which we build on PolyML 5.7.1. Example build instructions can be found in build-instructions.sh.
Building all of CakeML (including the bootstrapped compiler and its proofs) requires significant resources. Built copies of the compiler and resource usage for our regression tests are online.
The master branch contains the latest development version of CakeML. See the version2 or version1 branch for previous versions.
Directory structure
COPYING: CakeML Copyright Notice, License, and Disclaimer.
basis: Contains the beginnings of a standard basis library for CakeML, similar to the standard basis library of SML.
build-instructions.sh: This file describes how to install Poly/ML, HOL and CakeML.
candle: Verification of a HOL theorem prover, based on HOL Light (http://www.cl.cam.ac.uk/~jrh13/hol-light/), implemented in CakeML.
characteristic: A verified CakeML adaption of Arthur Charguéraud's "Characteristic Formulae for the Verification of Imperative Programs"
compiler: A verified compiler for CakeML, including:
- lexing and PEG parsing,
- type inference,
- compilation to ASM assembly language, and,
- code generation to x86, ARM, and more.
developers: This directory contains scripts for automating routine tasks, e.g., for generating README.md files.
examples: Examples of verified programs built using CakeML infrastructure.
how-to.md: This document introduces how to use the CakeML compiler, providing in particular:
- a description of how to invoke the CakeML compiler,
- a list of how CakeML differs from SML and OCaml, and,
- a number of small CakeML code examples.
icing: Main implementation directory for the RealCake development, presented in "Verified Compilation and Optimization of Floating-Point Programs"
misc: Auxiliary files providing glue between a standard HOL installation and what we want to use for CakeML development.
pancake: The Pancake compiler, i.e. a C-like compiler built from the lower parts of the CakeML compiler.
semantics: The definition of the CakeML language. The directory includes definitions of:
- the concrete syntax,
- the abstract syntax,
- big step semantics (both functional and relational),
- a small step semantics,
- the semantics of FFI calls, and,
- the type system.
translator: A proof-producing translator from HOL functions to CakeML.
tutorial: An extended worked example on using HOL and CakeML to write verified programs, that was presented as a tutorial on CakeML at PLDI and ICFP in 2017.
unverified: Various unverified tools, e.g. tools for converting OCaml to CakeML and an SML version of the CakeML register allocator.
from https://github.com/CakeML/cakeml
----------------------------------------------------
CakeML How To
This document introduces how to use the CakeML compiler, providing in particular:
- a description of how to invoke the CakeML compiler,
- a list of how CakeML differs from SML and OCaml, and,
- a number of small CakeML code examples.
This document is not meant to be an introduction to how to program in an ML-style language. For such a text, please refer to "ML for the Working Programmer" by L. C. Paulson, University of Cambridge.
This document is about using the verified CakeML compiler outside of the logic of a theorem prover. Readers interested in using CakeML to construct verified programs should develop their programs inside the logic of a theorem prover.
Running the CakeML compiler
The bootstrapped CakeML compiler can be downloaded from
github. Download the
tar.gz
file which contains among other things:
cake.S
— the machine code for the bootstrapped CakeML compilerbasis_ffi.c
— C code connecting the CakeML basis library to the OSMakefile
— for convenience of building binaries
Now let's run the compiler. Suppose you have a file called hello.cml
which contains:
print "Hello, World!\n";
The simplest way to compile and run this CakeML program, on GNU/Linux and
macOS, is to type make hello.cake
and then ./hello.cake
on the
command line as follows. On Windows, one types make hello.cake.exe
.
$ make hello.cake
$ ./hello.cake
The last line will print Hello, World!
on standard output.
By looking at what the make
does, you'll see that on the first run
it builds the CakeML compiler cake
, then it runs the CakeML compiler
on the input program. The CakeML compiler produces .S
files that
consist mostly of hex for machine code but also some wrapper code. We
use the system's C compiler to build basis_ffi.c
and to connect the
CakeML generated machine code with the C code that is accessed through
CakeML's foreign function interface (FFI).
A simple but complete program
The program above is too simple to be interesting. Below is a slightly more interesting program: this produces output based on command-line input, and prints a usage message if invoked incorrectly.
fun fac n = if n = 0 then 1 else fac (n-1) * n;
fun main () =
let
val arg = List.hd (CommandLine.arguments())
val n = Option.valOf (Int.fromString arg)
in
print_int (fac n) ; print "\n"
end
handle _ =>
TextIO.print_err ("usage: " ^ CommandLine.name() ^ " <n>\n");
main ();
If the code above is in a file called fac.cml
, then it can be
compiled and run as follows:
$ make fac.cake
$ ./fac.cake
usage: ./fac.cake <n>
$ ./fac.cake 5
120
$ ./fac.cake 50
30414093201713378043612608166064768844377641568960512000000000000
The last run illustrates that CakeML's integer type is the unbounded mathematical integers (arbitrary precision integers).
Configuring the CakeML compiler
The default configuration of the CakeML compiler is usually the right one to use. However, the CakeML compiler supports a number of command-line options that can be used to tweak the default configuration. Type the following for an explanation of the command-line options.
$ ./cake --help
When using the provided Makefile
, one can specify command-line
options by setting the CAKEFLAGS
variable. The following example
instructs the CakeML compiler to compile the factorial program with
the --gc=gen1000
option. This option tells the compiler to install
the generational version of CakeML's GC with a nursery size of 1000
machine words.
$ rm -f fac.cake ; export CAKEFLAGS='--gc=gen1000' ; make fac.cake
Several command-line options can be given at the same time: one can
specify that we want a 50000-word nursery and to use register
allocator settings 3
as follows.
$ rm -f fac.cake ; export CAKEFLAGS='--gc=gen50000 --reg_alg=3' ; make fac.cake
The default GC configuration is --gc=simple
.
Setting the size of the stack and heap
If program execution aborts with a message saying that the heap or stack space has been exhausted, then it might be worth trying to run the program with more heap or stack space.
The default heap and stack size is set to 1024 MB each. One can run the factorial program from above with 2048 MB of heap space and 512 MB of stack space by invoking it as follows:
$ export CML_HEAP_SIZE=2048 ; export CML_STACK_SIZE=512 ; ./fac.cake 50
Observe that these variables CML_HEAP_SIZE
and CML_STACK_SIZE
are
supplied to the compiled CakeML program fac.cake
rather than the
CakeML compiler. Note that, since the CakeML compiler is just another
CakeML program, the values of these environment variables also affect
the compiler's execution.
Alternatively, the allocated heap and stack size can be set in basis_ffi.c
by modifying these lines in the file.
Note that cml_heap_sz
and cml_stack_sz
are given in bytes here.
unsigned long sz = 1024*1024; // 1 MB unit
unsigned long cml_heap_sz = 1024 * sz; // Default: 1 GB heap
unsigned long cml_stack_sz = 1024 * sz; // Default: 1 GB stack
Basic profiling
On Linux systems, CakeML executables can be profiled using perf
.
As an example, we may profile the factorial program by running:
$ perf record ./fac.cake 30000
The report generated by perf
shows that most of the time is spent in
the bignum library (LongDiv
is used internally there):
$ perf report
Samples: 20K of event 'cycles', Event count (approx.): 15828778212
Overhead Command Shared Object Symbol
52.88% fac.cake fac.cake [.] cml__LongDiv_21
16.19% fac.cake fac.cake [.] cml__Bignum_50
12.10% fac.cake fac.cake [.] cml__Bignum_47
6.31% fac.cake fac.cake [.] cml__Replicate_9
...
How CakeML differs from SML and OCaml
The CakeML language is heavily based on Standard ML (SML), but CakeML differs in some aspects and takes inspiration from OCaml and Haskell. Below is a list of differences between CakeML and SML.
Syntactic differences
- CakeML has curried Haskell-style constructor syntax (see below)
- constructors in CakeML must begin with an uppercase letter
- constructors must be fully applied
- alpha-numeric variable and function names begin with a lowercase letter
- CakeML lacks SML's records, functors, open and (at present) signatures
- CakeML capitalises
True
,False
andRef
Semantic differences
- CakeML has right-to-left evaluation order
- CakeML has no equality types
- the semantics of equality in CakeML differs from those in SML and OCaml
- CakeML does not support let-polymorphism
Differences in conventions
- CakeML programmers should curry multi-argument functions
Basis library
The CakeML basis library is still developing and not aligned with SML or OCaml's standard libraries. To list the contents of the CakeML basis library, execute the following on the command line:
$ echo "" | ./cake --types
By invoking the compiler using ./cake --types
, one makes it run type
inference and then print the name and type of every top-level binding.
By supplying the compiler with the empty program, the top-level
bindings are only those from the basis library.
from https://github.com/CakeML/cakeml/blob/master/how-to.md#cakeml-how-to
No comments:
Post a Comment