
Compiler1

compiler1 by Tibor 'Igor2' Palinkas on 2020-02-27

Tags: 2010, ec, 2010ec, solver, tools, generator, eval, compiler, vm, lang, scaled



Datasheet

Task:        Compiler (compiler1)
When:        2010 EC G
Description: html
Input pack:  zip

Compiler1 background

We already had some background in low-level (assembly) programming of various small CPUs/MCUs. We knew that one of the interesting questions there is register allocation, especially when writing a compiler for an architecture that has only a few registers available.

Our original idea was to present a task focusing on register allocation. However, this required some additional layers around it, so that there is something the registers are allocated for. The task writer team agreed that it would be a welcome change to introduce such not-strictly-algorithmic tasks to the EC, to make the EC more diverse and a bit more similar to the finals. In the end, solving this task required teams to:

- parse the input expression (or use the precompiled postfix form),
- allocate registers,
- emit assembly code that runs on the provided virtual machine.

This problem is scaled, which means solutions are compared among teams. It is not very hard to write a trivial solver that blindly translates the postfix form to verbose asm code. The really hard part is to generate code that is a bit more optimal than the other teams' code.
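
To make the trivial approach concrete, here is a minimal sketch in python of a blind postfix-to-asm translation. The mnemonics (LOAD, ADD, MUL, PRINT) and the r0, r1, ... register names are made up for illustration; the task's actual VM instruction set is defined in the html description.

# Minimal sketch of a "trivial solver": translate postfix to verbose
# assembly, one fresh register per value, no optimization at all.
# The mnemonics and register names below are hypothetical.
OPS = {'+': 'ADD', '-': 'SUB', '*': 'MUL', '/': 'DIV'}

def compile_postfix(tokens):
    asm = []
    stack = []      # registers holding the values of the postfix stack
    next_reg = 0
    for tok in tokens:
        if tok in OPS:
            b = stack.pop()
            a = stack.pop()
            asm.append('%s %s, %s' % (OPS[tok], a, b))  # result kept in a
            stack.append(a)
        else:
            reg = 'r%d' % next_reg   # naive: never reuse freed registers
            next_reg += 1
            asm.append('LOAD %s, %s' % (reg, tok))
            stack.append(reg)
    asm.append('PRINT %s' % stack.pop())
    return asm

# the postfix form of (3 + 4) * 5:
for line in compile_postfix('3 4 + 5 *'.split()):
    print(line)

An optimizing compiler starts where this sketch stops: reusing registers that are no longer live, folding constants, choosing cheaper instruction sequences. That is the part that separated teams.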

This is one of the tasks where the setup cost is rather high (in time): although it's relatively easy to develop the tooling, it does take time. We designed the scoring function to reflect this: if a team wrote the trivial solver, with no optimization whatsoever, and ran it successfully for all 10 inputs, that was worth 20% of the score. Note: strategy matters a lot here. It may be cheaper to hand-optimize the code for the first few inputs (which are small) than to spend time on implementing an optimizing compiler.

Our tooling

Our tooling is available as a tarball that requires Linux. It depends on make, a C compiler and python. Once unpacked, run make to get the basic tools compiled.
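
For example (the tarball name is illustrative):

tar -xf compiler1.tar.gz
cd compiler1
make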

Input files

We wrote our input files manually in infix form: inputs/*.prg. We used the precompiler to produce the problem input files. Run make in inputs/ to get the problem files compiled.
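
Usage:

cd inputs
make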

There is one input file that was not hand-written: 10.asm is generated by a Haskell script (inputs/nfa/).

Tools

There are three tools in tools/.

The precompiler reads a problem input file on its standard input, parses the infix format and prints a simple postfix expression to stdout. This can be used as the input of a compiler, which then doesn't need to implement an infix parser. Usage:


cd tools
./precompiler < ../inputs/1.prg
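
For instance, with the usual postfix conventions, an infix expression like (3 + 4) * 5 becomes 3 4 + 5 *: operands come first and the operator follows, so a compiler can process it with a simple stack instead of a parser. (This is just an illustration; the exact .prg and postfix syntax is defined in the task description.)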

The assembler is a python script that reads the asm file on stdin and prints "machine language" code to stdout that is suitable for the virtual machine (VM) to run. Usage:


cd tools
python asm.py < example.asm > example.vm

Finally, the Virtual Machine (VM) reads the "machine language" code on stdin, executes it and prints the result. Optionally, if the -t argument is passed, it also prints trace messages to stdout. Usage:


cd tools
./vm -t < example.vm
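
Since both tools read stdin and write stdout, the assembler and the VM chain naturally; for example:

cd tools
python asm.py < example.asm | ./vm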

Reference solver

The reference solver lives in solver2/. It takes the path of a problem input file and prints the solution to stdout (and writes some local temp files). The solution is rather dumb, with hardly any optimization. Usage:


cd solver2
./solver.sh ../inputs/1.prg

The reference solver uses the precompiler so that it doesn't need to parse infix. The resulting asm file is what could have been submitted.

In the same directory, run.sh can be used to execute the solver's output asm with tracing. It depends on asm.py and the vm. Usage:


cd solver2
./solver.sh ../inputs/1.prg
./run.sh 1.asm

Eval

The scripts we used for the automated evaluation are in eval/: one for the normal, for-score submissions (evalreal.sh) and one for the debug trace service (evaltrace.sh). They call inputs/test.sh and inputs/testtrace.sh, which in turn depend on the asm and vm tools.

Since the input of an eval script must follow the eval protocol, an asm file cannot be fed into the eval script directly: two lines need to be inserted at the top. The first line is a single integer, the number of the input file (1..10 inclusive); the second line must be 0.
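
For example, to wrap 1.asm as a submission for input 1 (a sketch, assuming the eval script reads the submission on stdin):

cd eval
( echo 1; echo 0; cat 1.asm ) | ./evalreal.sh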

Notes

We used completely different input values for tracing and for scoring submissions. That was because we did not want teams to use the tracer to figure out the input values, do the calculations offline and then write a very short program that just printed the right output. The input values used are in inputs/*.real.in and inputs/*.trace.in.

All code is Copyright (C) 2010 Mate Nagy and Tibor 'Igor2' Palinkas and is licensed under the GNU GPL version 2.