Proposed buffer format
Note
For this document, the word "block" refers only to disk blocks or
buffer file blocks. CSC/RC5-64 blocks of keys are referred to as
"work units". I have also tried to use the word
"project" instead of "contest", because not every project we
take on will be a contest.
Overall Design
- The buffer file will contain incoming and outgoing work units for all
projects.
- Every block of the buffer file, including the header record, will be
exactly the same length. This will permit random-access file opens on
platforms which support it.
- The block size will be 512 bytes for the initial implementation, but
enough details will be present that a larger block size could be adopted in
the future. If the size grows in the future, it should continue to be
a multiple of 512 bytes so it will fall on even sector/cluster/track
boundaries.
- The header record will contain enough information that a client can
recognize if it can safely process the rest of the file. If the block
size increases, the initial 512 bytes of the header must still contain
enough information for all previous versions to make this determination.
- The header block will be the only block which does not follow the common
block layout (see Block Common Information).
- The buffer format will allow an older client to send and receive blocks
for all projects, if it can minimally determine it is safe to process the
file.
- Every block begins (or ends?) with information which allows a client to
identify which project it belongs to.
- The buffer file can be processed (uploaded, downloaded, repaired) without
specific knowledge of the projects it contains.
- No project depends on the physical ordering of blocks within the
file. When a block is made available, a block from the end of the
buffer file may be moved into that location.
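Because every block has the same length and no project depends on block order, a block can be located by simple multiplication and released by moving the last block into its place. The sketch below illustrates that idea only; the names `block_offset` and `free_block` are hypothetical, and truncating the file after the move is an assumption rather than something this proposal specifies.

```python
import io

BLOCK_SIZE = 512  # initial block size from this proposal


def block_offset(index, block_size=BLOCK_SIZE):
    """Byte offset of block `index`; block 0 is the header block."""
    return index * block_size


def free_block(f, index, block_size=BLOCK_SIZE):
    """Release data block `index` in the seekable binary file `f`.

    The last block is copied over the freed slot, then the file is
    shortened by one block (the truncation is an assumption).
    """
    f.seek(0, io.SEEK_END)
    total = f.tell() // block_size
    if not 1 <= index < total:
        raise IndexError("index does not name a data block")
    last = total - 1
    if index != last:
        f.seek(block_offset(last, block_size))
        data = f.read(block_size)
        f.seek(block_offset(index, block_size))
        f.write(data)
    f.truncate(block_offset(last, block_size))
```

The same function works on a real file opened in `r+b` mode or on an `io.BytesIO` object, which is convenient for testing.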
Header block information
- Bytes 0-31: Literal text: "DCTI Universal Buffer File\26\00"
- Bytes 32-33: Major buffer file format ID
- Byte 34: Block size in units of 512 bytes (01 = 512, 04 = 2048)
- Byte 35: File Lock
- Bytes 36-43: computer id of locking client (may need to be longer)
- Bytes 44-51: Minimum client build id to access this file
- This is different from the version/build info used today.
- Each build of the client (beta or full) would have its own build id,
which is a monotonically increasing number. This number wouldn't
even be shared across platforms.
- A client with a build-id smaller than this should refuse to process
the file and throw an error.
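As an illustration of the header layout above, here is a minimal parsing sketch. Several details are assumptions, not part of the proposal: multi-byte fields are treated as big-endian, the build id is treated as an 8-byte unsigned integer, "\26\00" is read as the byte values 26 (decimal) and 0, and the remainder of the 32-byte text field is zero-padded. The names `parse_header` and `BufferFileError` are hypothetical.

```python
import struct

# Bytes 0-31 text, 32-33 format id, 34 size units, 35 file lock,
# 36-43 locking computer id, 44-51 minimum build id.
HEADER_FMT = ">32sHBB8sQ"  # big-endian is an assumption
HEADER_LEN = struct.calcsize(HEADER_FMT)  # 52 bytes
# Assumption: \26 is decimal 26 and the field is zero-padded to 32 bytes.
MAGIC = (b"DCTI Universal Buffer File" + bytes([26, 0])).ljust(32, b"\x00")


class BufferFileError(Exception):
    """Raised when the client must refuse to process the file."""


def parse_header(raw, my_build_id):
    """Decode the fixed header fields from the first bytes of the file."""
    if len(raw) < HEADER_LEN:
        raise BufferFileError("header too short")
    magic, format_id, size_units, lock, locker, min_build = \
        struct.unpack_from(HEADER_FMT, raw)
    if magic != MAGIC:
        raise BufferFileError("not a universal buffer file")
    if size_units == 0:
        raise BufferFileError("invalid block size")
    if my_build_id < min_build:
        raise BufferFileError("client build too old for this file")
    return {
        "format_id": format_id,
        "block_size": size_units * 512,
        "locked": lock != 0,
        "locking_computer_id": locker,
        "min_build_id": min_build,
    }
```

Only the first 52 bytes are examined, so the check still works for a client that predates a larger block size.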
Block Common Information
- Byte 0: Block Type (values shown in hex)
- 00 = Unused
- 01/81 = RC5-64 incoming/outgoing
- 02/82 = CSC incoming/outgoing
- 03/83 = OGR-22 incoming/outgoing
- 04/84 = OGR-23 incoming/outgoing
- 70-7F = Reserved
- 80 = Reserved for possible extended header or project information
- F0-FF = Reserved
- Byte 1: Record Lock
- Reserved for possible future use.
- cyp: I know we talked about file-seek and performance. Doesn't
random-access file access get around that? If so, I think we
should build in the ability to lock blocks in the future.
- Bytes 2-9: ID of client locking this block
- Reserved for possible future use
- May need to be longer depending on length of the computer id
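The block type table suggests that the high bit of byte 0 marks an outgoing block and the low seven bits name the project. That reading is inferred from the listed values (01/81, 02/82, and so on), not stated by the proposal, and `classify_block` is a hypothetical name. A small sketch under that assumption:

```python
# Project codes taken from the block type table above.
PROJECTS = {0x01: "RC5-64", 0x02: "CSC", 0x03: "OGR-22", 0x04: "OGR-23"}


def classify_block(block):
    """Return (project, direction) for a block, based only on byte 0."""
    block_type = block[0]
    if block_type == 0x00:
        return ("unused", None)
    if (block_type == 0x80
            or 0x70 <= block_type <= 0x7F
            or 0xF0 <= block_type <= 0xFF):
        return ("reserved", None)
    # Assumption: the high bit distinguishes outgoing from incoming.
    direction = "outgoing" if block_type & 0x80 else "incoming"
    # A code this client does not know is reported as "unknown"; the
    # block can still be moved around without knowing the project.
    project = PROJECTS.get(block_type & 0x7F, "unknown")
    return (project, direction)
```

Returning "unknown" instead of failing reflects the design goal that an older client can still send and receive blocks for projects it does not recognize.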
Stuff from cyp's doc which needs to be included in specific-project block
definitions
- time - absolute UTC block expiration date, after which time the block
should be discarded by the client.
- computer id (user defined) - for use in stats to cluster/group statistics
by individual machine.
- computer guid (machine computed) - unique identifier generated from
semi-persistent entropy data specific to the machine so as to reduce the
frequency with which this computed value changes. This value would allow
unique identification of machines, allowing for more accurate estimates of
the total number of machines working, as well as isolation of invalid work
computed by the same machine.
- ticket - secure hash value computed by the keymaster on block creation
(see opcodeauth.html) from the block specifics (keystart, length), a
keymaster secret, and a salt value.
- core id - which core was used to process a block, for client platforms
where alternate optimized cores are provided.
- client cpu - identifier of the platform on which block processing actually
occurs (for machines that have alternate secondary processor boards of a
different type than the compile-time processor).
- test flag - Boolean indicator specifying if the block is a
"test" block.
- stuff from Dan
- hardware detect id - raw hardware detection flags, very CPU-specific
(vendor and processor stepping).
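To make the ticket field in the list above more concrete, the sketch below shows one way a keyed hash over the block specifics could be computed and checked. It is not the scheme from opcodeauth.html, which is not reproduced here: the use of HMAC-SHA-256, the field widths, the byte order, and the names `make_ticket` and `verify_ticket` are all assumptions chosen for illustration.

```python
import hashlib
import hmac
import struct


def make_ticket(secret, salt, keystart, length):
    """Keyed hash over the salt and the block specifics.

    Assumptions: keystart is a 64-bit and length a 32-bit unsigned
    integer, both packed big-endian; HMAC-SHA-256 is the hash.
    """
    message = salt + struct.pack(">QI", keystart, length)
    return hmac.new(secret, message, hashlib.sha256).digest()


def verify_ticket(secret, salt, keystart, length, ticket):
    """Recompute the ticket and compare in constant time."""
    expected = make_ticket(secret, salt, keystart, length)
    return hmac.compare_digest(expected, ticket)
```

Only the keymaster, which holds the secret, can produce or verify a ticket in this sketch; a client would simply carry the value along with the work unit.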