Diffstat (limited to 'doc')
-rw-r--r-- doc/compiler.txt 44
-rw-r--r-- doc/lang.txt 38
2 files changed, 39 insertions, 43 deletions
diff --git a/doc/compiler.txt b/doc/compiler.txt
index 20593ca..e296c0e 100644
--- a/doc/compiler.txt
+++ b/doc/compiler.txt
@@ -48,23 +48,23 @@ TABLE OF CONTENTS:
The compilation is divided into a small number of phases. The first phase
is parsing, where the source code is first tokenized, the abstract syntax
tree (AST) is generated, and semantically checked. The second phase is the
- machine dependent tree flattening. In this phase, the tree is decomposed
+ machine-dependent tree flattening. In this phase, the tree is decomposed
function by function into simple operations that are relatively close to
- the machine. Sizes are fixed, and all loops, if statements, etc are
- replaced with gotos. The next phase is a machine independent optimizer,
+ the machine. Sizes are fixed, and all loops, if statements, etc. are
+ replaced with gotos. The next phase is a machine-independent optimizer,
which currently does nothing other than simply folding trees. In the final
phase, the instructions are selected and the registers are allocated.
So, to recap, the phases are as follows:
- parse Tokenize, parse and analyze the source.
+ parse Tokenize, parse, and analyze the source
flatten Rewrite the complex nodes into simple ones
opt Optimize the flattened source trees
gen Generate the assembly code
- 1.1. Tree Structure.
+ 1.1. Tree Structure:
- File nodes (n->type == Nfile) represents the being compiled. The current
+ File nodes (n->type == Nfile) represent the files being compiled. The current
node is held in a global variable called, unsurprisingly, 'file'. The
global symbol table, export table, uses, and other compilation-specific
information is stored in this node. This implies that the compiler can
@@ -113,7 +113,7 @@ TABLE OF CONTENTS:
2.1. Lexing:
- Lexing occurs in parse/tok.c. Because we desire to use this lexer from
+ Lexing occurs in parse/tok.c. Because we want to use this lexer from
within yacc, the entry point to this code is in 'yylex()'. As required
by yacc, 'yylex()' returns an integer defining the token type, and
sets the 'tok' member of yylval to the token that was taken from the
@@ -122,7 +122,7 @@ TABLE OF CONTENTS:
allows yyerror to print the last token that was seen.
The tokens that are allowable are generated by Yacc from the '%token'
- definiitions in parse/gram.y, and are placed into the file
+ definitions in parse/gram.y, and are placed into the file
'parse/gram.h'. The lexer and parser code is the only code that
depends on these token constants.
@@ -142,7 +142,7 @@ TABLE OF CONTENTS:
2.2. AST Creation:
- The parser used is a traditional Yacc based parser. It is generated
+ The parser used is a traditional Yacc-based parser. It is generated
from the source in parse/gram.y. The starting production is 'file',
which fills in a global 'file' tree node. This 'file' tree node must
be initialized before yyparse() is called.
@@ -167,7 +167,7 @@ TABLE OF CONTENTS:
complete as possible, and making sure that the types of globals
actually match up with the exported types.
- The next step is the actual type inference. We do a bottom up walk of
+ The next step is the actual type inference. We do a bottom-up walk of
the tree, unifying types as we go. There are subtleties with the
member operator, however. Because the '.' operator is used for both
member lookups and namespace lookups, before we descend into a node
@@ -203,7 +203,7 @@ TABLE OF CONTENTS:
So, in the 'typesub()' function, we iterate over the entire tree,
replacing every instance of a non-concrete type with the final
mapped type. If a type does not map to a fully concrete type,
- this is where we error.
+ this is where we flag an error.
FIXME: DESCRIBE HOW YOU FIXED GENERICS ONCE YOU FIX GENERICS.
@@ -232,15 +232,15 @@ TABLE OF CONTENTS:
Usefiles are more or less files that consist of a single character tag
that tells us what type of tree to deserialize. Because serialized
- trees are compiler version dependant, so are usefiles.
+ trees are compiler version dependent, so are usefiles.
3. FLATTENING:
- This phase is invoked repeatedly on each top level declaration that we
+ This phase is invoked repeatedly on each top-level declaration that we
want to generate code for. There is a good chance that this flattening
- phase should be made machine independent, and passed as a parameter
+ phase should be made machine-independent, and passed as a parameter
a machine description describing known integer and pointer sizes, among
- other machine attributes. However, for now, it is machine dependent,
+ other machine attributes. However, for now, it is machine-dependent,
and lives in 6/simp.c.
The goal of flattening a tree is to take semantically involved constructs
@@ -277,7 +277,7 @@ TABLE OF CONTENTS:
3.2. Complex Expressions:
Complex expressions such as copying types larger than a single machine
- word, pulling members out of structures, emulated multiplication and
+ word, pulling members out of structures, emulating multiplication and
division for larger integer sizes, and similar operations are reduced
to trees that are expressible in terms of simple machine operations.
@@ -298,7 +298,7 @@ TABLE OF CONTENTS:
4.1. Constant Folding:
Expressions with constant values are simplified algebraically. For
- example, the expression 'x*1' is simplified to simply 'x', '0/n' is
+ example, the expression 'x*1' is simplified to 'x', '0/n' is
simplified to '0', and so on.
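The two identities named above can be sketched as a bottom-up rewrite over a small expression tree. This is only an illustration: the 'Expr' type, 'islit()', and 'fold()' are hypothetical names invented here, not the compiler's actual node type or folding code.

```c
#include <stddef.h>

/* Hypothetical expression node; NOT the compiler's actual Node type. */
typedef enum { EXPR_LIT, EXPR_VAR, EXPR_MUL, EXPR_DIV } ExprKind;

typedef struct Expr {
    ExprKind kind;
    long val;                /* used when kind == EXPR_LIT */
    const char *name;        /* used when kind == EXPR_VAR */
    struct Expr *lhs, *rhs;  /* used for EXPR_MUL and EXPR_DIV */
} Expr;

/* True if e is the integer literal v. */
static int islit(const Expr *e, long v)
{
    return e->kind == EXPR_LIT && e->val == v;
}

/* Fold bottom-up: simplify the children first, then this node. */
Expr *fold(Expr *e)
{
    if (e->kind != EXPR_MUL && e->kind != EXPR_DIV)
        return e;
    e->lhs = fold(e->lhs);
    e->rhs = fold(e->rhs);
    if (e->kind == EXPR_MUL) {
        if (islit(e->rhs, 1))
            return e->lhs;   /* x*1 => x */
        if (islit(e->lhs, 1))
            return e->rhs;   /* 1*x => x */
    } else if (islit(e->lhs, 0)) {
        return e->lhs;       /* 0/n => 0, as stated in the text */
    }
    return e;
}
```

Because children are folded before their parent, a tree such as '(0/n)*1' collapses to the literal '0' in a single call.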
@@ -306,18 +306,18 @@ TABLE OF CONTENTS:
5.1. Instruction Selection:
- Instruction selection is done via a simple hand written bottom up pass
+ Instruction selection is done via a simple handwritten bottom-up pass
over the tree. Common patterns such as scaled or offset indexing are
- recognized by the patterns, but no attempts at finding an optimal
- tiling are made.
+ recognized by the patterns, but no attempts are made at finding an
+ optimal tiling.
5.2. Register Allocation:
Register allocation is done via the algorithm described in "Iterated
- Regster Coalescing", by Appel and George. As of the time of this
+ Register Coalescing" by Appel and George. As of the time of this
writing, the register allocator does not yet implement overlapping
register classes. This will be done as described in "A generalized
- algorithm for graph-coloring register allocation", by Smith, Ramsey,
+ algorithm for graph-coloring register allocation" by Smith, Ramsey,
and Holloway.
6: TUTORIAL: ADDING A STATEMENT:
diff --git a/doc/lang.txt b/doc/lang.txt
index cab2b95..711d9b9 100644
--- a/doc/lang.txt
+++ b/doc/lang.txt
@@ -22,14 +22,14 @@ TABLE OF CONTENTS:
1. ABOUT:
- Myrddin is designed to be a simple, low level programming
+ Myrddin is designed to be a simple, low-level programming
language. It is designed to provide the programmer with
predictable behavior and a transparent compilation model,
while at the same time providing the benefits of strong
type checking, generics, type inference, and similar.
Myrddin is not a language designed to explore the forefront
- of type theory, or compiler technology. It is not a language
- that is focused on guaranteeing perfect safety. It's focus
+ of type theory or compiler technology. It is not a language
+ that is focused on guaranteeing perfect safety. Its focus
is on being a practical, small, fairly well defined, and
easy to understand language for work that needs to be close
to the hardware.
@@ -41,10 +41,10 @@ TABLE OF CONTENTS:
2. LEXICAL CONVENTIONS:
- The language is composed of several classes of token. There
+ The language is composed of several classes of tokens. There
are comments, identifiers, keywords, punctuation, and whitespace.
- Comments, begin with "/*" and end with "*/". They may nest.
+ Comments begin with "/*" and end with "*/". They may nest.
/* this is a comment /* with another inside */ */
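Nesting means a lexer cannot stop at the first closing delimiter; it has to count depth. A minimal sketch of that idea follows. The 'skipcomment()' helper is hypothetical and is not taken from parse/tok.c.

```c
#include <stddef.h>

/*
 * Hypothetical helper: s must point at a comment opener. Returns the
 * index just past the matching closer, or -1 if the comment is never
 * terminated.
 */
long skipcomment(const char *s)
{
    size_t i = 2;   /* step over the initial opener */
    int depth = 1;  /* number of comments currently open */

    while (depth > 0) {
        if (s[i] == '\0')
            return -1;
        if (s[i] == '/' && s[i + 1] == '*') {
            depth++;    /* a nested comment opens */
            i += 2;
        } else if (s[i] == '*' && s[i + 1] == '/') {
            depth--;    /* the innermost comment closes */
            i += 2;
        } else {
            i++;
        }
    }
    return (long)i;
}
```

On the example comment above, the depth goes 1, 2, 1, 0, so the scan ends only at the final closer.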
@@ -80,32 +80,28 @@ TABLE OF CONTENTS:
the program. There are several literals implemented within the language.
These are fully described in section 3.2 of this manual.
- In the compiler, single semicolons (';') , semicolons and newlines (\x10)
+ In the compiler, single semicolons (';') and newline (\x0a)
characters are treated identically, and are therefore interchangeable.
They will both be referred to as "endline"s throughout this manual.
3. SYNTAX OVERVIEW:
- Myrddin syntax will likely have a familiar-but-strange taste
- to many people. Many of the concepts and constructions will be
- similar to those present in C, but different.
-
3.1. Declarations:
- A declaration consists of a declaration class (ie, one
+ A declaration consists of a declaration class (i.e., one
of 'const', 'var', or 'generic'), followed by a declaration
name, optionally followed by a type and assignment. One thing
you may note is that unlike most other languages, there is no
special function declaration syntax. Instead, a function is
- declared like any other value: By assigning its name to a
+ declared like any other value: by assigning its name to a
constant or variable.
const: Declares a constant value, which may not be
modified at run time. Constants must have
initializers defined.
var: Declares a variable value. This value may be
- assigned to, copied from, and
+ assigned to, copied from, and modified.
generic: Declares a specializable value. This value
has the same restrictions as a const, but
taking its address is not defined. The type
@@ -132,13 +128,13 @@ TABLE OF CONTENTS:
var y
- Declares a generic with type '@a', and assigns it the value
+ Declare a generic with type '@a', and assign it the value
'blah'. Every place that 'z' is used, it will be specialized,
and the type parameter '@a' will be substituted.
generic z : @a = blah
- Declares a function f with and without type inference. Both
+ Declare a function f with and without type inference. Both
forms are equivalent. 'f' takes two parameters, both of type
int, and returns their sum as an int
@@ -164,9 +160,9 @@ TABLE OF CONTENTS:
eg: 0x123_fff, 0b1111, 1234
- Float literals are also a sequence of digits beginning with a
- digit and possibly separated by underscores. They are also of a
- generic type, and may be used whenever a floating point type is
+ Floating-point literals are also a sequence of digits beginning with
+ a digit and possibly separated by underscores. They are also of a
+ generic type, and may be used whenever a floating-point type is
expected. Floating point literals are always in decimal, and
as of this writing, exponential notation is not supported[2]
@@ -396,7 +392,7 @@ TABLE OF CONTENTS:
`Name x Union construction
Precedence 5:
- x casttto(type) Cast expression
+ x castto(type) Cast expression
Precedence 4:
x == x Equality
@@ -425,10 +421,10 @@ TABLE OF CONTENTS:
x <<= x Fused shl/assign Right assoc
x >>= x Fused shr/assign Right assoc
- Precedence 14:
+ Precedence 0:
-> x Return expression
- All expressions on integers act on complement-two values which wrap
+ All expressions on integers act on two's complement values which wrap
on overflow. Right shift expressions fill with the sign bit on
signed types, and fill with zeros on unsigned types.
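The wrapping and shift rules above can be modeled in portable C for a 32-bit width. This is a sketch under assumptions: the 32-bit width and the names 'tosigned()', 'wrapadd32()', 'sar32()', and 'shr32()' are chosen here for illustration and are not part of Myrddin or its compiler. The arithmetic is done on unsigned values because signed overflow is undefined in C.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of u as a two's complement int32_t. */
static int32_t tosigned(uint32_t u)
{
    int32_t s;
    memcpy(&s, &u, sizeof s);
    return s;
}

/* Signed add that wraps on overflow. */
int32_t wrapadd32(int32_t a, int32_t b)
{
    return tosigned((uint32_t)((uint32_t)a + (uint32_t)b));
}

/* Right shift on a signed type: vacated bits take the sign bit (n in 0..31). */
int32_t sar32(int32_t x, unsigned n)
{
    uint32_t r = (uint32_t)x >> n;

    if (x < 0 && n > 0)
        r |= (uint32_t)~(UINT32_MAX >> n);  /* set the top n bits */
    return tosigned(r);
}

/* Right shift on an unsigned type: vacated bits are zero (n in 0..31). */
uint32_t shr32(uint32_t x, unsigned n)
{
    return x >> n;
}
```

For example, 'wrapadd32(INT32_MAX, 1)' yields 'INT32_MIN', and 'sar32(-8, 1)' yields '-4', while 'shr32()' on the same bit pattern would bring in a zero at the top.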