Behind the scenes: state.txt
by Tibor 'Igor2' Palinkas on 2020-02-12 | Tags: behind, 2013, management
Pulling off an event was a big feat on all levels. Preparations normally started in September. The first target was preparing for the pre-EC and the EC, which around 2013 were held in early February. Then came an even more intense few weeks preparing for the finals, usually held in early May.
On the task writer side, we had a core team of about 6 programmers doing the bulk of the work, plus a variable number of helpers, usually 1..3 more programmers. With this team we had to come up with tasks that were interesting, solvable, and neither too easy nor too difficult. But having the initial idea was only the first step.
A task we felt confident enough about to put in the problem set of an EC or the finals had to be "flawless". Of course a few bugs slipped through from time to time, but from 2010 on we employed the following QA procedure to make sure a task wouldn't break:
- We had 1 or 2 task writers responsible for the task specification; this included making sure the task really worked within the Challenge24 format and that the input/output format was fully specified.
- We had 1 or 2 task writers to create the input files; this often meant writing an input generator.
- We had at least one solution (solver), but we tried to go for at least 2 independently developed solvers. We furthermore tried to allocate task writers so that the solvers were not written by the same people who were responsible for the task specification. This helped a lot in figuring out whether the specification was weak or even broken.
- An important aspect of Challenge24 is that teams submit solutions as output, not as source code. The output is then evaluated objectively. For the vast majority of the tasks this means an evaluator program takes the submission and decides if it is valid and/or awards the score.
- The desc column (in the state.txt shown below) tracks the task description; this is the problem text that ends up in the problem set.
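None of our real evaluators appear in this post, but the shape such a program takes is simple: read the official input and the team's submission, validate the latter, and emit a score. Here is a minimal hypothetical sketch in Python; the add-two-numbers "task" and every name in it are invented purely for illustration:

```python
def evaluate(input_lines, submission_lines):
    """Toy evaluator for a made-up task: each input line holds two
    integers, and the matching submission line must hold their sum.
    Returns (valid, score): one point per correct line; a malformed
    submission line makes the whole submission invalid."""
    score = 0
    for inp, sub in zip(input_lines, submission_lines):
        a, b = map(int, inp.split())
        try:
            if int(sub) == a + b:
                score += 1
        except ValueError:
            return False, 0  # unparsable line -> reject the submission
    return True, score
```

The real evaluators were of course task-specific and far more involved, but they followed this same contract: a yes/no validity verdict plus a score.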
We kept track of the progress in a plain text file called state.txt placed
in our VCS. Below is our state.txt for EC 2013.
############################### -[ E C ]- #################################
# problem      task           inputs     solvers         eval       desc   sum
#   type       resp           resp       resp            resp       resp
###########################################################################
middle         100%           100%       100%            cmp        100%   100%  RDY
  graph        ngg            csirke     csirke/ngg      -          ?
offspring      100%           100%       200%            cmp        100%   125%  RDY
  math         ngg            ngg        ngg/csirke      -          ?
oor            100%           100%       200%            kb wscmp   100%   125%  RDY
  img/coding   igor2          igor2      igor2/sbagyi    -          igor2
cannon         100%           100%       100%            cmpfloats  100%   100%  RDY
  math         nsz/ngg        nsz        nsz/ngg         -          ?
connect pts    100%           100%       200%            90%        100%   122%
  opt          mnagy/estrica  mnagy      mnagy/ngg       mnagy      ?
stack          100%           100%       150%            50%        100%   100%
  compiler     ngg            nsz/igor2  mnagy/ngg/nsz   ngg        ?
trains         100%           100%       100%            cmp        100%   100%  RDY
  sound        igor2          igor2      fulitomi/igor2  -          ?
Our task writer nicknames in this file were: ngg, csirke, igor2, sbagyi, nsz, mnagy, estrica and fulitomi. Each task spans two lines: the first line starts with the name of the task, the second with its type. We kept track of the type in state.txt to make sure the selected tasks were diverse and that we covered all major fields (at least math/graph, opt(imization) and coding/img/sound).
For each job (column) of a task we tracked the progress (first line, in percent) and the nicknames of the task writers responsible for that specific job. 100% meant we had one valid implementation fully finished. For solvers we aimed at 200%, which meant two full implementations. If a task got an RDY mark on the right side, that meant the task was ready for the event.
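The format is regular enough that a few lines of script could extract the status, e.g. to list tasks not yet ready. We never published our actual tooling, so the following Python sketch merely infers the field handling from the layout above (comment lines start with #, the task name runs up to the first percentage column, RDY is the optional last token):

```python
def parse_state(text):
    """Parse the two-line-per-task state.txt layout into dicts with
    the task name, its type, and whether it is marked RDY.
    A sketch based on the published snapshot, not the real tooling."""
    tasks = []
    lines = [l for l in text.splitlines() if l.strip() and not l.startswith("#")]
    for head, detail in zip(lines[0::2], lines[1::2]):
        fields = head.split()
        # the task name is everything before the first percentage column
        # (handles multi-word names like "connect pts")
        first_pct = next(i for i, f in enumerate(fields) if f.endswith("%"))
        tasks.append({
            "name": " ".join(fields[:first_pct]),
            "type": detail.split()[0],
            "ready": fields[-1] == "RDY",
        })
    return tasks
```

Running this over the snapshot above would show, for instance, that "stack" (type compiler) was still missing its RDY mark.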
There were a few generic eval programs; for example "cmp" means the output is a plain text file that can simply be compared to a reference output, almost byte-by-byte. "Almost" is the keyword here: we wanted to make the contestants' life easier, so we wrote the comparison to ignore some whitespace differences. In some cases we even ignored numeric format details (cmpfloats).
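A whitespace-tolerant comparison along these lines might look like the Python sketch below. The real cmp/wscmp tools were never published, so the exact normalization rules here (collapsing runs of spaces/tabs, dropping trailing blanks and trailing empty lines) are an assumption about what "almost byte-by-byte" covered:

```python
def ws_tolerant_equal(submission, reference):
    """Compare two output texts while ignoring some whitespace
    differences: runs of spaces/tabs collapse to one space, and
    leading/trailing blank lines are dropped. A guess at the kind
    of normalization cmp/wscmp performed, not their actual code."""
    def norm(text):
        return [" ".join(line.split()) for line in text.strip().splitlines()]
    return norm(submission) == norm(reference)
```

A stricter or looser variant is a one-line change, which is presumably why several flavors (cmp, wscmp, cmpfloats) existed side by side.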
The above file is a typical snapshot a few days before the event.