Behind the scenes: state.txt
by Tibor 'Igor2' Palinkas on 2020-02-12 | Tags: behind, 2013, management
Pulling off an event was a big feat on all levels. Preparations normally started in September. The first target was preparing for the pre-EC and the EC, which around 2013 were held in early February. Then came an even more intense few weeks preparing for the finals, usually held in early May.
On the task writer side, we had a core team of about 6 programmers doing the bulk of the work, plus a variable number of helpers, usually 1..3 more programmers. With this team we had to come up with tasks that were interesting, solvable, and neither too easy nor too difficult. But having the initial idea was only the first step.
A task we felt confident enough about to put in the problem set of an EC or the finals had to be "flawless". Of course a few bugs slipped through from time to time, but from 2010 on we employed the following QA procedure to make sure a task wouldn't break:
- We had 1 or 2 task writers responsible for the task specification; this included making sure the task really worked within the Challenge24 format and that the input/output format was fully specified.
- We had 1 or 2 task writers to create the input files; this often meant writing an input generator.
- We had at least one solution (solver), but we tried to go for at least 2 independently developed solvers. We furthermore tried to allocate task writers so that the solvers were not written by the same people who were responsible for the task specification. This helped a lot in figuring out whether the specification was weak or even broken.
- An important aspect of Challenge24 is that teams submit solutions as output, not as source code. The output is then evaluated objectively. For the vast majority of the tasks this means an evaluator program takes the submission and decides if it is valid and/or awards the score.
- The desc column (in the state.txt shown below) tracks the task description; this is the problem text that ends up in the problem set.
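None of our real evaluators appear in this post, but the shape such a program takes is simple: read the official input and the team's submission, validate the latter, and emit a score. Here is a minimal hypothetical sketch in Python; the add-two-numbers "task" and every name in it are invented purely for illustration:

```python
def evaluate(input_lines, submission_lines):
    """Toy evaluator for a made-up task: each input line holds two
    integers, and the matching submission line must hold their sum.
    Returns (valid, score): one point per correct line; a malformed
    submission line makes the whole submission invalid."""
    score = 0
    for inp, sub in zip(input_lines, submission_lines):
        a, b = map(int, inp.split())
        try:
            if int(sub) == a + b:
                score += 1
        except ValueError:
            return False, 0  # unparsable line -> reject the submission
    return True, score
```

The real evaluators were of course task-specific and far more involved, but they followed this same contract: a yes/no validity verdict plus a score.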
We kept track of the progress in a plain text file called state.txt placed
in our VCS. Below is our state.txt for EC 2013.
############################### -[ E C ]- #################################
# problem      task           inputs     solvers         eval       desc   sum
#   type       resp           resp       resp            resp       resp
###########################################################################
middle         100%           100%       100%            cmp        100%   100%  RDY
  graph        ngg            csirke     csirke/ngg      -          ?
offspring      100%           100%       200%            cmp        100%   125%  RDY
  math         ngg            ngg        ngg/csirke      -          ?
oor            100%           100%       200%            kb wscmp   100%   125%  RDY
  img/coding   igor2          igor2      igor2/sbagyi    -          igor2
cannon         100%           100%       100%            cmpfloats  100%   100%  RDY
  math         nsz/ngg        nsz        nsz/ngg         -          ?
connect pts    100%           100%       200%            90%        100%   122%
  opt          mnagy/estrica  mnagy      mnagy/ngg       mnagy      ?
stack          100%           100%       150%            50%        100%   100%
  compiler     ngg            nsz/igor2  mnagy/ngg/nsz   ngg        ?
trains         100%           100%       100%            cmp        100%   100%  RDY
  sound        igor2          igor2      fulitomi/igor2  -          ?
Our task writer nicknames in this file were: ngg, csirke, igor2, sbagyi, nsz, mnagy, estrica and fulitomi. Each task spans two lines: the first line starts with the name of the task, the second with its type. We kept track of the type in state.txt to make sure the selected tasks were diverse and that we covered all major fields (at least math/graph, opt(imization) and coding/img/sound).
For each job (column) of a task we tracked the progress (first line, in percent) and the nicknames of the task writers responsible for that specific job. 100% meant we had one valid implementation fully finished. For solvers we aimed at 200%, which meant two full implementations. If a task got an RDY mark on the right side, that meant the task was ready for the event.
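The format is regular enough that a few lines of script could extract the status, e.g. to list tasks not yet ready. We never published our actual tooling, so the following Python sketch merely infers the field handling from the layout above (comment lines start with #, the task name runs up to the first percentage column, RDY is the optional last token):

```python
def parse_state(text):
    """Parse the two-line-per-task state.txt layout into dicts with
    the task name, its type, and whether it is marked RDY.
    A sketch based on the published snapshot, not the real tooling."""
    tasks = []
    lines = [l for l in text.splitlines() if l.strip() and not l.startswith("#")]
    for head, detail in zip(lines[0::2], lines[1::2]):
        fields = head.split()
        # the task name is everything before the first percentage column
        # (handles multi-word names like "connect pts")
        first_pct = next(i for i, f in enumerate(fields) if f.endswith("%"))
        tasks.append({
            "name": " ".join(fields[:first_pct]),
            "type": detail.split()[0],
            "ready": fields[-1] == "RDY",
        })
    return tasks
```

Running this over the snapshot above would show, for instance, that "stack" (type compiler) was still missing its RDY mark.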
There were a few generic eval programs; for example "cmp" means the output is a plain text file that can simply be compared to a reference output, almost byte-by-byte. "Almost" is the keyword here: we wanted to make the contestants' life easier, so we wrote the comparison to ignore some whitespace differences. In some cases we even ignored numeric format details (cmpfloats).
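A whitespace-tolerant comparison along these lines might look like the Python sketch below. The real cmp/wscmp tools were never published, so the exact normalization rules here (collapsing runs of spaces/tabs, dropping trailing blanks and trailing empty lines) are an assumption about what "almost byte-by-byte" covered:

```python
def ws_tolerant_equal(submission, reference):
    """Compare two output texts while ignoring some whitespace
    differences: runs of spaces/tabs collapse to one space, and
    leading/trailing blank lines are dropped. A guess at the kind
    of normalization cmp/wscmp performed, not their actual code."""
    def norm(text):
        return [" ".join(line.split()) for line in text.strip().splitlines()]
    return norm(submission) == norm(reference)
```

A stricter or looser variant is a one-line change, which is presumably why several flavors (cmp, wscmp, cmpfloats) existed side by side.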
The above file is a typical snapshot a few days before the event.