Proposed buffer format

Note

For this document, the word "block" refers only to disk blocks or buffer file blocks. CSC/RC5-64 blocks of keys are referred to as "work units". I have also tried to use the word "project" instead of "contest", because not every project we take on will be a contest.

Overall Design

The buffer file will contain incoming and outgoing work units for all projects.
Every block of the buffer file, including the header record, will be exactly the same length. This will permit random-access file opens on platforms which support it.
The block size will be 512 bytes for the initial implementation, but enough details will be present that a larger block size could be adopted in the future. If the size grows in the future, it should continue to be a multiple of 512 bytes so it will fall on even sector/cluster/track boundaries.
The header record will contain enough information that a client can recognize if it can safely process the rest of the file. If the block size increases, the initial 512 bytes of the header must still contain enough information for all previous versions to make this determination.
The header block will be the only block which does not begin with the common block information described under "Block Common Information".
The buffer format will allow an older client to send and receive blocks for all projects, if it can minimally determine it is safe to process the file.
Every block begins (or ends?) with information which allows a client to identify which project it belongs to.
The buffer file can be processed (uploaded, downloaded, repaired) without specific knowledge of the projects it contains.
No project depends on the physical ordering of blocks within the file. When a block is made available, a block from the end of the buffer file may be moved into that location.
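Because every block has the same length and no project depends on block order, locating and releasing a block reduces to simple arithmetic. The following is a minimal sketch of that idea, not the client's actual code. The function names are hypothetical, and truncating the file after the move is an assumption; the proposal only says the last block "may be moved" into the freed location.

```python
import io

BLOCK_SIZE = 512  # initial block size from the proposal


def block_offset(index, block_size=BLOCK_SIZE):
    """Byte offset of block `index`; block 0 is the header block."""
    return index * block_size


def read_block(f, index, block_size=BLOCK_SIZE):
    """Seek straight to a block and read it (random access)."""
    f.seek(block_offset(index, block_size))
    data = f.read(block_size)
    if len(data) != block_size:
        raise ValueError("block %d is missing or truncated" % index)
    return data


def release_block(f, index, block_size=BLOCK_SIZE):
    """Free block `index` by moving the last block into its slot.

    Assumption: the file is then shortened by one block.
    """
    if index < 1:
        raise ValueError("block 0 is the header and cannot be released")
    f.seek(0, io.SEEK_END)
    last = f.tell() // block_size - 1
    if index > last:
        raise IndexError("no such block: %d" % index)
    if index != last:
        tail = read_block(f, last, block_size)
        f.seek(block_offset(index, block_size))
        f.write(tail)
    f.truncate(block_offset(last, block_size))
```

Releasing a block this way costs at most one block read and one block write regardless of file size, which is only safe because no project relies on physical ordering.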

Header block information

Bytes 0-31: Literal text: "DCTI Universal Buffer File\26\00"
Bytes 32-33: Major buffer file format ID
Byte 34: Block size in units of 512 bytes (01 = 512, 04 = 2048)
Byte 35: File Lock
Bytes 36-43: computer id of locking client (may need to be longer)
Bytes 44-51: Minimum client build id to access this file
- This is different than the version/build info used today.
- Each build of the client (beta or full) would have its own build id, which is a monotonically increasing number. This number wouldn't even be shared across platforms.
- A client with a build-id smaller than this should refuse to process the file and throw an error.
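The header layout above can be sketched with Python's struct module. Several details here are assumptions, since the proposal does not state them: multi-byte fields are taken as big-endian, "\26" is read as decimal 26 (0x1A, the DOS end-of-file character), the unused tail of bytes 0-31 is zero-filled, and the names HEADER, BufferFileError, pack_header and parse_header are hypothetical.

```python
import struct

# Assumption: "\26\00" means byte value 26 (0x1A) followed by a NUL byte.
MAGIC = b"DCTI Universal Buffer File\x1a\x00"

# Bytes 0-31 magic, 32-33 format id, 34 size units, 35 file lock,
# 36-43 locking computer id, 44-51 minimum build id.
# Assumption: big-endian byte order; ">" also disables padding.
HEADER = struct.Struct(">32sHBB8sQ")


class BufferFileError(Exception):
    """Raised when a client must not process the buffer file."""


def pack_header(major, size_units, lock, locker_id, min_build):
    """Build a full header block of 512 * size_units bytes."""
    fixed = HEADER.pack(MAGIC, major, size_units, lock, locker_id, min_build)
    return fixed.ljust(512 * size_units, b"\x00")


def parse_header(first_512, my_build_id):
    """Decide from the first 512 bytes whether the file is safe to process."""
    if len(first_512) < HEADER.size:
        raise BufferFileError("header too short")
    magic, major, size_units, lock, locker_id, min_build = \
        HEADER.unpack_from(first_512)
    if not magic.startswith(MAGIC):
        raise BufferFileError("not a universal buffer file")
    if size_units == 0:
        raise BufferFileError("invalid block size")
    if my_build_id < min_build:
        raise BufferFileError("client build is too old for this file")
    return {
        "major": major,
        "block_size": 512 * size_units,
        "locked": lock != 0,
        "locker_id": locker_id,
        "min_build": min_build,
    }
```

Only the first 52 bytes are interpreted, so a client written for 512-byte blocks can still make the safety decision on a file whose block size has grown.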

Block Common Information

Byte 0: Block Type (values shown in hex)
- 00 = Unused
- 01/81 = RC5-64 incoming/outgoing
- 02/82 = CSC incoming/outgoing
- 03/83 = OGR-22 incoming/outgoing
- 04/84 = OGR-23 incoming/outgoing
- 70-7F = Reserved
- 80 = Reserved for possible extended header or project information
- F0-FF = Reserved
Byte 1: Record Lock
- Reserved for possible future use.
- cyp: I know we talked about file-seek and performance. Doesn't random-access file access get around that? If so, I think we should build in the ability to lock blocks in the future.
Bytes 2-9: ID of client locking this block
- Reserved for possible future use
- May need to be longer depending on length of the computer id
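The common block prefix can be decoded without knowing anything about the project inside. The sketch below assumes, from the 01/81, 02/82 pairing in the table, that the high bit (hex 80) marks an outgoing block and the low seven bits identify the project; the proposal does not say this explicitly. The names COMMON, PROJECTS, classify_block_type and parse_block_common are hypothetical.

```python
import struct

# Project ids taken from the block type table above.
PROJECTS = {0x01: "RC5-64", 0x02: "CSC", 0x03: "OGR-22", 0x04: "OGR-23"}

# Byte 0 block type, byte 1 record lock, bytes 2-9 locking client id.
COMMON = struct.Struct(">BB8s")


def classify_block_type(t):
    """Return (state, project) for a block type byte.

    state is "unused", "reserved", "incoming" or "outgoing".
    project is None when unused, reserved, or unknown to this client.
    """
    if t == 0x00:
        return ("unused", None)
    if t == 0x80 or 0x70 <= t <= 0x7F or 0xF0 <= t <= 0xFF:
        return ("reserved", None)
    # Assumption: high bit set means outgoing, clear means incoming.
    direction = "outgoing" if t & 0x80 else "incoming"
    return (direction, PROJECTS.get(t & 0x7F))


def parse_block_common(block):
    """Decode the 10-byte common prefix of a non-header block."""
    block_type, record_lock, locker_id = COMMON.unpack_from(block)
    state, project = classify_block_type(block_type)
    return {
        "type": block_type,
        "state": state,
        "project": project,
        "record_lock": record_lock,
        "locker_id": locker_id,
    }
```

A block whose project id is not in the table still gets a direction, which is what lets an older client upload or download it without understanding its contents.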

Stuff from cyp's doc which needs to be included in specific-project block definitions

time - absolute UTC block expiration date, after which time the block should be discarded by the client.
computer id (user defined) - for use in stats to cluster/group statistics by individual machine.
computer guid (machine computed) - unique identifier generated from semi-persistent entropy data specific to the machine, so as to reduce how often this computed value changes. This value would allow unique identification of machines, allowing for more accurate estimates of the total number of machines working, as well as isolation of invalid work computed by the same machine.
ticket - secure hash value computed by the keymaster on block creation (see opcodeauth.html) from the block specifics (keystart, length), a keymaster secret, and a salt value.
core id - which core was used to process a block, for client platforms where alternate optimized cores are provided.
client cpu - platform identifier that the block processing actually occurs on (for machines that have alternate secondary processor boards of a different type than the compile-time processor).
test flag - Boolean indicator specifying if the block is a "test" block.
stuff from Dan
hardware detect id - raw hardware detection flags, very cpu specific (vendor and processor stepping).

