author Ori Bernstein <ori@eigenstate.org> 2017-01-21 16:05:56 -0800
committer Ori Bernstein <ori@eigenstate.org> 2017-01-21 16:05:56 -0800
commit 21eb277c5a5d3178eed36c23466cd4abdede5122
tree 616f22909cda7dfc134859bc6f32eaec9e2e12db /doc
parent 8775024dce46b5ba730b8f892111d9d7b7c56a66
Shuffle around documentation for leaf types.
Diffstat (limited to 'doc')
-rw-r--r-- doc/lang.txt 407
1 files changed, 183 insertions, 224 deletions
diff --git a/doc/lang.txt b/doc/lang.txt
index f1cedf4..ce60ec6 100644
--- a/doc/lang.txt
+++ b/doc/lang.txt
@@ -362,132 +362,131 @@ TABLE OF CONTENTS:
must be explicitly cast if you want to convert, and the casts must
be of compatible types, as will be described later.
- 4.1.1. Primitive types:
+ 4.1.1. Primitive types:
- void
- bool char
- int8 uint8
- int16 uint16
- int32 uint32
- int64 uint64
- int uint
- long ulong
- float32 float64
-
- 'void' is a type and a value although for the sake of
- genericity, you can assign between void types, return values
- of void, and so on. This allows generics to not have to
- somehow work around void being a toxic type. The void value is
- named `void`.
-
- It is interesting to note that these types are not keywords,
- but are instead merely predefined identifiers in the type
- namespace.
-
- bool is a type that can only hold true and false. It can be
- assigned, tested for equality, and used in the various boolean
- operators.
+ void
+ bool char
+ int8 uint8
+ int16 uint16
+ int32 uint32
+ int64 uint64
+ int uint
+ long ulong
+ flt32 flt64
- char is a 32 bit integer type, and is guaranteed to hold
- exactly one Unicode codepoint. It can be assigned integer
- literals, tested against, compared, and all the other usual
- numeric types.
+ 'void' is both a type and a value. For the sake of
+ genericity, you can assign between void types, return values
+ of void, and so on. This allows generics to not have to
+ somehow work around void being a toxic type. The void value is
+ named `void`.
+
+ It is interesting to note that these types are not keywords,
+ but are instead merely predefined identifiers in the type
+ namespace.
- The various [u]intXX types hold, as expected, signed and
- unsigned integers of the named sizes respectively.
- Similarly, floats hold floating point types with the
- indicated precision.
+ bool is a type that can only hold true and false. It can be
+ assigned, tested for equality, and used in the various boolean
+ operators.
- var x : int declare x as an int
- var y : float32 declare y as a 32 bit float
+ char is a 32 bit integer type, and is guaranteed to hold
+ exactly one Unicode codepoint. It can be assigned integer
+ literals, tested against, compared, and used in all the other
+ usual numeric operations.
+ The various [u]intXX types hold, as expected, signed and
+ unsigned integers of the named sizes respectively.
+ Similarly, floats hold floating point values with the
+ indicated precision.
- 4.1.2. Composite types:
+ var x : int declare x as an int
+ var y : flt32 declare y as a 32 bit float
- pointer
- slice array
- Pointers are, as expected, values that hold the address of
- the pointed to value. They are declared by appending a '#'
- to the type. Pointer arithmetic is not allowed. They are
- declared by appending a '#' to the base type
+ 4.1.2. Composite types:
- Arrays are a group of N values, where N is part of the type.
- Arrays of different sizes are incompatible. Arrays in
- Myrddin, unlike many other languages, are passed by value.
- They are declared by appending a '[SIZE]' to the base type.
+ pointer
+ slice array
- Slices are similar to arrays in many contemporary languages.
- They are reference types that store the length of their
- contents. They are declared by appending a '[,]' to the base
- type.
+ Pointers are, as expected, values that hold the address of
+ the pointed to value. They are declared by appending a '#'
+ to the base type. Pointer arithmetic is not allowed.
- foo# type: pointer to foo
- foo[123] type: array of 123 foo
- foo[,] type: slice of foo
+ Arrays are a group of N values, where N is part of the type,
+ meaning that different sizes are incompatible. They are
+ passed by value. Their size must be a compile time constant.
- 4.1.3. Aggregate types:
+ Slices are similar to arrays in many contemporary languages.
+ They are reference types that store the length of their
+ contents. They are declared by appending a '[:]' to the base
+ type.
- tuple struct
- union
+ foo# type: pointer to foo
+ foo[N] type: array size N of foo
+ foo[:] type: slice of foo
- Tuples are the traditional product type. They are declared
- by putting the comma separated list of types within square
- brackets.
+ 4.1.3. Aggregate types:
- Structs are aggregations of types with named members. They
- are declared by putting the word 'struct' before a block of
- declaration cores (ie, declarations without the storage type
- specifier).
+ tuple struct
+ union
- Unions are the traditional sum type. They consist of a tag
- (a keyword prefixed with a '`' (backtick)) indicating their
- current contents, and a type to hold. They are declared by
- placing the keyword 'union' before a list of tag-type pairs.
- They may also omit the type, in which case, the tag is
- sufficient to determine which option was selected.
+ Tuples are the traditional product type. They are declared
+ by putting the comma separated list of types within square
+ brackets.
- [int, int, char] a tuple of 2 ints and a char
+ Structs are aggregations of types with named members. They
+ are declared by putting the word 'struct' before a block of
+ declaration cores (ie, declarations without the storage type
+ specifier).
- struct a struct containing an int named
- a : int 'a', and a char named 'b'.
- b : char
- ;;
+ Unions are the traditional sum type. They consist of a tag
+ (a keyword prefixed with a '`' (backtick)) indicating their
+ current contents, and a type to hold. They are declared by
+ placing the keyword 'union' before a list of tag-type pairs.
+ They may also omit the type, in which case, the tag is
+ sufficient to determine which option was selected.
- union a union containing one of
- `Thing int int or char. The values are not
- `Other float32 named, but they are tagged.
- ;;
+ [int, int, char] a tuple of 2 ints and a char
+ struct a struct containing an int named
+ a : int 'a', and a char named 'b'.
+ b : char
+ ;;
- 4.1.4. Generic types:
+ union a union containing one of
+ `Thing int int or flt32. The values are not
+ `Other flt32 named, but they are tagged.
+ ;;
- tyvar typaram
- tyname
-
- A tyname is a named type, similar to a typedef in C, however
- it genuinely creates a new type, and not an alias. There are
- no implicit conversions, but a tyname will inherit all
- constraints of its underlying type.
- A typaram is a parametric type. It is used in generics as
- a placeholder for a type that will be substituted in later.
- It is an identifier prefixed with '@'. These are only valid
- within generic contexts, and may not appear elsewhere.
+ 4.1.4. Generic types:
- A tyvar is an internal implementation detail that currently
- leaks in error messages out during type inference, and is a
- major cause of confusing error messages. It should not be in
- this manual, except that the current incarnation of the
- compiler will make you aware of it. It looks like '@$type',
- and is a variable that holds an incompletely inferred type.
+ tyvar typaram
+ tyname
- type mine = int creates a tyname named
- 'mine', equivalent to int.
+ A tyname is a named type, similar to a typedef in C, however
+ it genuinely creates a new type, and not an alias. There are
+ no implicit conversions, but a tyname will inherit all
+ constraints of its underlying type.
+ A typaram is a parametric type. It is used in generics as
+ a placeholder for a type that will be substituted in later.
+ It is an identifier prefixed with '@'. These are only valid
+ within generic contexts, and may not appear elsewhere.
- @foo creates a type parameter
- named '@foo'.
+ A tyvar is an internal implementation detail that currently
+ leaks out in error messages during type inference, and is a
+ major cause of confusing error messages. It should not be in
+ this manual, except that the current incarnation of the
+ compiler will make you aware of it. It looks like '@$type',
+ and is a variable that holds an incompletely inferred type.
+
+ type mine = int creates a tyname named
+ 'mine', equivalent to int.
+
+
+ @foo creates a type parameter
+ named '@foo'.
4.2. Type Inference:
The myrddin type system is a system similar to the Hindley Milner
@@ -503,161 +502,113 @@ TABLE OF CONTENTS:
It begins by initializing all leaf nodes with the most specific
known type for them as follows:
- 4.6.1 Types for leaf nodes:
-
- Variable Type
- ----------------------
- var foo $t
-
- A type variable is the most specific type for a declaration
- or function without any specified type
-
- var foo : t t
-
- If a type is specified, that type is taken for the
- declaration.
-
- "asdf" byte[:]
-
- String literals are byte arrays.
-
-
- 'a' char
-
- Char literals are of type 'char'
-
- void void
-
- void is a literal value of type void.
-
- true bool
- false bool
-
- true/false are boolean literals
-
- 123 $t::(integral,numeric)
-
- Integer literals get a fresh type variable of type with
- the constraints for int-like types.
-
- 123.1 $t::(floating,numeric)
-
- Float literals get a fresh type variable of type with
- the constraints for float-like types.
-
- {a,b:t; } ($a,t -> $b)
-
- Function literals get the most specific type that can
- be determined by their signature.
-
-
- num-binop:
-
- + - * / %
- += -= *= /= %
-
- Number binops require the constraint 'numeric' for both the
-
- num-unary:
- - +
- Number binops require the constraint 'numeric'.
-
- int-binop:
- | & ^ << >>
- |= &= ^= <<= >>
- int-unary:
- ~ ++ --
-
- bool-binop:
- || && == !=
- < <= > >=
-
-5. VALUES AND EXPRESSIONS:
-
5.2. Literal Values
5.1.1. Atomic Literals:
- literal: strlit | chrlit | floatlit |
- boollit | voidlit | intlit |
+ literal: strlit | chrlit | intlit |
+ boollit | voidlit | floatlit |
funclit | seqlit | tuplit
strlit: \"(byte|escape)*\"
chrlit: \'(utf8seq|escape)\'
char: <any byte value>
+ boollit: "true"|"false"
+ voidlit: "void"
escape: <any escape sequence>
intlit: "0x" digits | "0o" digits | "0b" digits | digits
floatlit: digit+"."digit+["e" digit+]
- boollit: "true"|"false"
- voidlit: "void"
- Integers literals are a sequence of digits, beginning with a digit
- and possibly separated by underscores. They may be prefixed with
- "0x" to indicate that the following number is a hexadecimal value,
- 0o to indicate an octal value, or 0b to indicate a binary value.
- Decimal values are not prefixed.
+ 5.1.1.1. String Literals:
+
+ String literals represent a compact method of representing a
+ byte array. Any byte values are allowed in a string literal,
+ and will be spit out again by the compiler unmodified, with
+ the exception of escape sequences.
+
+ There are a number of escape sequences supported for both character
+ and string literals:
+ \n newline
+ \r carriage return
+ \t tab
+ \b backspace
+ \" double quote
+ \' single quote
+ \v vertical tab
+ \\ single backslash
+ \0 nul character
+ \xDD single byte value, where DD are two hex digits.
+ \u{xxx} unicode escape, emitted as utf8.
+
+ String literals begin with a ", and continue to the next
+ unescaped ".
+
+ eg: "foo\"bar"
+
+ Multiple consecutive string literals are implicitly merged to create
+ a single combined string literal. To allow a string literal to span
+ across multiple lines, the new line characters must be escaped.
+
+ eg: "foo" \
+ "bar"
+
+ They have the type `byte[:]`
- eg: 0x123_fff, 0b1111, 0o777, 1234
+ 5.1.1.2. Character Literals:
- Integer literals have the type `@a::(numeric,integral)`.
+ Character literals represent a single codepoint in the
+ character set. A character starts with a single quote,
+ contains a single codepoint worth of text, encoded either as
+ an escape sequence or in the input character set for the
+ compiler (generally UTF8). They share the same set of escape
+ sequences as string literals.
- Floating-point literals are also a sequence of digits beginning with a
- digit and possibly separated by underscores. Floating point
- literals are always in decimal.
+ eg: 'א', '\n', '\u{1234}'
- eg: 123.456, 10.0e7, 1_000.
+ They have the type `char`.
- Floating point literals have the type `@a::(numeric,floating)`.
+ 5.1.1.3. Integer Literals
- String literals represent a compact method of representing a byte
- array. Any byte values are allowed in a string literal, and will be
- spit out again by the compiler unmodified, with the exception of
- escape sequences.
+ Integer literals are a sequence of digits, beginning with a digit
+ and possibly separated by underscores. They may be prefixed with
+ "0x" to indicate that the following number is a hexadecimal value,
+ 0o to indicate an octal value, or 0b to indicate a binary value.
+ Decimal values are not prefixed.
- There are a number of escape sequences supported for both character
- and string literals:
- \n newline
- \r carriage return
- \t tab
- \b backspace
- \" double quote
- \' single quote
- \v vertical tab
- \\ single slash
- \0 nul character
- \xDD single byte value, where DD are two hex digits.
- \u{xxx} unicode escape, emitted as utf8.
+ eg: 0x123_fff, 0b1111, 0o777, 1234
- String literals begin with a ", and continue to the next
- unescaped ".
+ They have the type `@a::(numeric,integral)`
- eg: "foo\"bar"
+ 5.1.1.4: Boolean Literals:
- Multiple consecutive string literals are implicitly merged to create
- a single combined string literal. To allow a string literal to span
- across multiple lines, the new line characters must be escaped.
-
- eg: "foo" \
- "bar"
+ Boolean literals are spelled `true` or `false`.
+ Unsurprisingly, they evaluate to `true` or `false`
+ respectively.
+
+ eg: true, false
- String literals have the type `byte[:]`
+ They have the type `bool`
- Character literals represent a single codepoint in the character
- set. A character starts with a single quote, contains a single
- codepoint worth of text, encoded either as an escape sequence
- or in the input character set for the compiler (generally UTF8).
- They share the same set of escape sequences as string literals.
+ 5.1.1.5: Void Literals:
- eg: 'א', '\n', '\u{1234}'
+ Void literals are spelled `void`. They evaluate to the void
+ value, a value that takes zero bytes of storage, and contains
+ only the value `void`. Like my soul.
- Character literals have the type `char`
+ eg: void
- Boolean literals are either the keyword "true" or the keyword
- "false".
+ They have type `void`.
- eg: true, false
+ 5.1.1.6: Floating point literals:
+
+ Floating-point literals are also a sequence of digits beginning with a
+ digit and possibly separated by underscores. Floating point
+ literals are always in decimal.
+
+ eg: 123.456, 10.0e7, 1_000.
+
+ They have type `@a::(numeric,floating)`
- Boolean literals have the type `bool`
5.1.2. Sequence and Tuple Literals:
@@ -707,6 +658,11 @@ TABLE OF CONTENTS:
Example: Tuple literals:
(1,), (1,'b',"three")
+ A tuple has the type of its constituent values grouped
+ into a tuple:
+
+ (@a, @b, @c, ..., @z)
+
5.1.3. Function Literals:
@@ -748,6 +704,9 @@ TABLE OF CONTENTS:
var b = {; a + 1}
}
+ A function literal has the arity of its argument list.
+ Each argument takes its declared type if one is provided;
+ otherwise, it is left generic. The same applies to the return type.
5.1.4: Labels: