*** CPK *** Document revision: 1.3 *** Last updated: March 11, 2004 *** Contributors/sources: Andre Fachat This format, created by Andre Fachat, was not designed for the emulators specifically, but was made primarily for Andre's own purposes. It is a very basic format using simple RLE compression, with each file following in sequential order (as Andre put it, "its similar to a UNIX TAR file"). There is no central directory, none of the files are byte aligned, and it uses compression so every file will be different. 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F ASCII ----------------------------------------------- ---------------- 0000: 01 40 41 2E 41 4E 4C 2C 50 00 01 08 24 08 64 00 .@A.ANL,P?..$.d? 0010: 99 22 93 20 20 20 41 4E 4C 45 49 54 55 4E 47 20 ?"????ANLEITUNG? 0020: 5A 55 4D 20 40 41 53 53 45 4D 42 4C 45 52 00 4E ZUM?@ASSEMBLER?N 0030: 08 6E 00 99 22 11 40 41 53 53 20 49 53 54 20 45 .n??".@ASS?IST?E 0040: 49 4E 20 32 2D 50 41 53 53 2D 41 53 53 45 4D 42 IN?2-PASS-ASSEMB 0050: 4C 45 52 2E 20 44 45 52 00 78 08 78 00 99 22 11 LER.?DER?x.x??". The first byte of the file is the version byte. Presently, only $01 is supported. 0000: 01 .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ................ The filename follows, stored in standard PETASCII, and no padding characters ($A0) are included. 0000: .. 40 41 2E 41 4E 4C .. .. .. .. .. .. .. .. .. .@A.ANL......... The filetype is attached to the end of the filename in the form of ',x', where x is the filetype used (P,S,U), and it is in PETASCII upper case. The filename ends with a $00 (null terminated). REL files are *not* supported as there is no provision made for the RECORD size byte. Note that not *all* CPK files will have the ",x" extension added on. If it doesn't exist, assume that the file is a "PRG" type. 0000: .. .. .. .. .. .. .. 2C 50 00 .. .. .. .. .. .. .......,P?...... Following the filename, we get program data. 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F ASCII ----------------------------------------------- ---------------- 0000: .. .. .. .. .. .. .. .. .. .. 01 08 24 08 64 00 ............$.d? 0010: 99 22 93 20 20 20 41 4E 4C 45 49 54 55 4E 47 20 ?"????ANLEITUNG? ... 0270: 00 83 0A E6 00 99 22 11 20 20 31 32 33 F7 08 20 ??????".??123?.? 0280: 2D 44 45 5A 49 4D 41 4C 00 A4 0A F0 00 99 22 11 -DEZIMAL??????". 0290: 20 20 24 33 34 35 F7 07 20 2D 48 45 58 41 44 45 ??$345?.?-HEXADE The data requires some explanation as it uses RLE (Run Length Encoding) compression. When creating CPK files, data in the file to be compressed is scanned for runs of repeating bytes, and when a string of 3 or more (up to 255) is found, then the following sequence of bytes is output... $F7 $xx $yy - where F7 is the code used for "encoded sequence follows", $xx is the number of times to repeat the byte and $yy is the byte to repeat. Using the sample below, we see the F7 code, then a "repeat 7 times the number $20" 0290: .. .. .. .. .. .. F7 07 20 .. .. .. .. .. .. .. ......?.?....... Using $F7 as the encoder byte presents one problem: When encoding a file, and we encounter an $F7, what does the packer do? Simple, it gets encoded into $F7 $xx $F7 meaning repeat $F7 for as many times as is needed (if its only 1 $F7, then the value for $xx is $01). The code 'F7' was chosen because it is not a 6502 opcode, a BASIC token, or any commonly used byte, but *not* because it has the least statistical probability of occuring. The stored program ends when the string $F7 $00 is encountered, since this sequence can not occur in the file naturally. If you need to search through a CPK file for the filenames, do a hex search for all $F7 $00 sequences, since they preceed all filenames except the first. The end of a CPK file can be found two different ways: 1. When an EOF (end of file) occurs, after an $F7 $00 byte sequence. This is the normal method. 2. When a filename of $00 occurs, meaning there is no filename, just a null termination. This is not much used anymore. Using method #1 for ending the file is more common because it makes adding files to the CPK file very easy. All you have to do as append the new filename/data to the container. Using method #2 means you have to check and see if the last three characters are $F7 $00 $00, and start writing the new file into the container starting after the first $00. In order to extract *one* specific file, you would need to read the whole file until you find the filename you want, then output that file only. As this format has no central directory and no file location references, there is no other way to do it. This format has not been used for some time now, as when it came out D64 and T64 were also being developed and accepted into common use. It is unlikely you will find *any* files in this format. 64COPY V3.2 (and up) does support extraction of these files just in case any are encountered. |