*** Introduction to the various Emulator File Formats
*** Compiled by: Peter Schepers
*** Started: August 24, 1996
*** Last updated: Nov 27, 2005
---------------------------------------------------------------------------
There are always questions asked regarding the various file formats which
are commonly used on either the emulators or the real C64. Most often the
question involves conversion... "What do I do with LNX files?" or "How do I
make these files work on the C64s emulator?". These documents attempt to
explain their internal structure, what to do with them, and some of their
respective strengths and weaknesses.
These documents were compiled and written in an attempt to unify all the
other smaller files dealing with Commodore file types that are floating
about the net, or that exist with other programs. They are by no means
exhaustive (even though they look like it), but attempts will be made to
keep them up-to-date, and correct anything which is wrong. If you spot
something that needs correcting please make sure to email the author so
that corrections can be made for future releases... the address is
contained later in this document.
Some of the information contained in these documents may not be accurate
as it could have been taken from inaccurate sources, and I have no
first-hand experience with said format. However, use these, pass them
around, upload them, whatever. Just be sure to leave them INTACT, don't
remove bits.
I have attempted to categorize the filetypes involved using three basic
categories: IMAGES, ARCHIVES and CONTAINERS. The definitions for each of
these categories can be found at the bottom of this document.
Also, plenty of good information can be gleaned from the source code
contained in the archive CBMConvert, which is on the FTP.FUNET.FI FTP site.
Contained in it are the sources for UnZipCode, UnLNX, Ark, some LHA info,
etc. It is an invaluable set of utilities put together by both Marko Makela
and Paul Doherty.
So far, this document covers the following files:
* D64 images (1541 disks and some variants)
* X64 images (for the X64/Vice emulator)
* T64 containers (for the C64s emulator)
* T64 .FRZ (FRoZen Files, saved emulator sessions for C64s)
* PC64 containers (P/S/U/Rxx)
* PC64 .C64 (saved emulator sessions for PC64)
* D71 images (1571 disks)
* D81 images (1581 disks)
* D80 (8050) & D82 (8250) floppy images
* G64 images (GCR copy of a 1541 disk)
* D2M images (FD2000 disks)
* DNP images (CMD hard disk native partitions)
* F64 (not an image file, but a companion file to D64's)
* N64 (64NET's custom files)
* L64 (64LAN's custom files)
* C64 (PCLINK's custom files)
* CRT images (CCS64 ROM cartridges)
* 64x (PC64 ROM files)
* TAP images (for CCS64, sampled cassette tapes)
* VSF VICE snapshots (saved-emulator sessions for VICE)
* WAV Audio RIFF files for the PC
...as well as the following native C64 types, some of which are also
supported on the various emulators:
* Extensive disk file layout (how files are stored on 1541/71/81 disks)
* 4-file diskpacked ZipCode archives (or .Z64, 4 or 5 files, #!xxxxx)
* 6-file SixPack ZipCode images (or .Z64, #!!xxxx)
* Filepacked ZipCode archives (or .Z64, x files, x!xxxxx)
* LNX containers (LyNX)
* ARK containers & SRK archives (ARKive & compressed ARKive)
* LHA & LZH archives (header description only)
* SFX archives (SelF-eXtracting LHA/LZH)
* SDA archives (Self-Dissolving Archive)
* ARC archives (ARChive)
* ZIP archives (PKZIP)
* CKIT archives (Compression KIT)
* CPK containers
* WRA & WR3 archives (Wraptor, version 1 to 3)
* LBR containers (LiBRary, C64 only, not the C128 CP/M .LBR files)
* GEOS VLIR files (Variable Length Index Record)
* REL files (RELative)
* CVT files (GEOS ConVerT)
* SPY containers (SPYne)
* C128 Boot Sector layout
* Binary & PRG (ProGram, with load address)
Also included is a very basic look at some C64 graphic bitmap formats (in
BITMAP.TXT), and the saved session layout of the Macintosh-based C64
emulator "Power64" (in POWER64.TXT). Thanks to Peter Weighill for the above
info.
Joe Forster/STA has written up a description of how the various Commodore
drives (1541/1571/1581) allocate sectors and directory entries when saving
files (under normal mode and under GEOS). It is included as DISK.TXT
Right now there are several good utilities available to work with most of
the mentioned formats. The first is 64COPY, my own conversion program. The
second is Star Commander, by Joe Forster/STA. Included with his program are
many smaller utilities such as Star ARK, Star LHA and Star ZIP, which will
convert specific formats to D64 images.
Peter Schepers,
University of Waterloo.
Email: schepers@ist.uwaterloo.ca
---------------------------------------------------------------------------
Most recent changes:
Mar 9/04 - Updated D64.TXT with info on interleaves and average track size
in bytes.
Mar 10/04 - Updated the F64 document with a better description of the error
codes, and some of the limitations that FCOPY-PC has.
Mar 11/04 - Added a "Contributors/sources:" line to all documents. Changed
the version number on all documents to reflect this change.
Mar 19/04 - Finalized the F64.TXT document, especially regarding the 'F'
error code.
Mar 25/04 - Updated G64.TXT with accurate values on track sizes, and what
size values to use as a benchmark for creating 1541-compatible
G64 tracks.
Jun 2/04 - Changes to the D2M.TXT document.
Nov 27/05 - Renamed D2M to D2M-DNP.TXT & extensively updated
- Updated GEOS.TXT
- Updated D80-D82.TXT
- Updated CRT.TXT with new CRT types 19-22
---------------------------------------------------------------------------
*** Terms and acronyms
Many strange terms have come along with computers in general, and I will
not attempt to explain them all, but some of the ones in this document may
not be entirely clear. I will attempt to make things a little easier by
explaining some of the more common ones.
<CR> - Short form for a Carriage Return ($0D) symbol.
<LF> - Short form for a Line Feed ($0A) symbol.
ARCHIVE - A file format which contains other files, and in which
compression is an integral part of the design. Some examples
are ZIP, SRK, SDA, ARC, LHA, WRA.
ASCII - This is an acronym for "American Standard Code for Information
Interchange". The standard is a 7-bit code covering control
codes, punctuation, alphanumeric (A to Z, 0 to 9) as well as math
and a few other symbols. Since it is a 7-bit code, it ranges from
$00 to $7F (0-127). This leaves the top 128-255 definable by the
vendor. The PC world has corrupted this standard making it 8-bit.
BAM - An acronym for "Block Availability Map". Here is where the disk
operating system keeps track of what sectors are allocated (or
available) for each track.
BLOCK - This refers to sectors which on a logical level are grouped
together. On a 1541 disk, it could be a series of sectors linked
together in a file, or a partition on a 1581 disk. In the PC
world it represents a "cluster" of sectors. Generally if I'm
referring to a grouping of sectors thats *not* 256 bytes large,
then I talk in blocks.
BYTE - A group of 8 bits, the contents of a memory location.
CHAIN - A series of sectors linked together. One sector will have a
pointer to another, and that sector will point to another, until
the chain has no more forward pointers. A file stored on a 1541
disk would be considered a chain of sectors, but it also has a
directory entry explaining what the chain is for.
CONTAINER - A file format which simply contains other files, and no
compression takes place. Some examples are T64, P00, SPY,
ARK, LNX.
FILETYPE - In the Commodore world, this would be the kind of file, be it
SEQ, REL, PRG, USR, GEOS etc. In the DOS world, this would
possibly be the file extension, be it EXE, TXT, DOC. It tells
the user what file it is, making usage easier.
GCR - An acronym for "Group Code Recording". This is the encoding method
Commodore uses to physically store the information on most of the
5.25" disks (i.e. 1541). It encodes an 8 bit sequence (2 4-bit
nybbles) into a 10 bit sequence (2 5-bit nybbles) so that long
repeated sequences of 1's or 0's are avoided. These must be avoided
so that the timing of reading/writing to the disk won't become "out
of sync". As a user, you would not normally see the GCR information
since the drive does all the conversion to normal HEX data before
it gives it to you.
HIGH/LOW - The bytes here are stored backwards compared to the LOW/HIGH
method. See LOW/HIGH for more information.
IMAGE - A file format which is a PC equivalent of a physical Commodore
media. Some examples are D64 (1541), D71 (1571), D81 (1581), D2M
(FD2000), X64.
LINK - This is the track/sector values, stored in the first two bytes of
a sector, which point to (or "link" to) the t/s location of the
next sector. A series of these links comprise a "chain" of
sectors.
LOW/HIGH - This is how values are stored when they exceed one byte. A
good example of this is the sector count of a D64 file. To
calculate the actual value, take the second value, multiply it
by 256 and add the low value. You will now get the real
decimal value. i.e. (HIGH*256)+LOW=result.
If you look at is as a HEX value, swap the bytes around and
put them together for the 16-bit HEX value. i.e. $FE $03 would
be $03FE as a 16-bit HEX value.
LSB/MSB - See LOW/HIGH.
LSU - This is my own acronym meaning "last sector useage". It is the
value stored in byte position $01 (the "sector" value of the t/s
link) of the last sector of a file. This value is the offset into
the sector where the last byte is stored. It also represents the
byte count + 1, since a value of 255 would actually mean only 254
bytes of file data exists (full sector less the 2 bytes for t/s
chain). Without reasonable knowledge of the disk layout, this byte
can be confusing, and hard to explain.
NYBBLE - A grouping of 4 bits (half a byte), either the first or last 4
bits of an 8-bit binary number, or one half of a two-digit
hexadecimal number.
Typically, a byte will be broken down into two parts, the top 4
bits and the bottom 4 bits. These are referred to as the upper
and lower nybble respectively, and are represented by two
hexadecimal digits in base 16.
PETASCII - (or PETSCII) This is Commodore's version of ASCII (the PET
part of the name comes from the first computer to use the
code, the PET or Personal Electronic Transactor).
Most of the codes from 0-127 are the same as ASCII, but there
are differences, especially noticible when converting text
from a C64 to a DOS machine. Where ASCII has uppercase
characters, PETASCII has lower case ones, and vice versa.
Also, the top 128 characters (128 to 255) are quite different
from the PC "standard".
RLE - An acronym for "Run Length Encoding". This is a simple compression
method, employed by most compression programs, and also used by
some file formats (ZipCode, CPK). It encodes sequences (or "runs",
hence the name "RUN length...") of the same byte (i.e. 00 00 00 00
00 00) into a smaller string using a shorter code sequence, making
the resultant file smaller than the original. This is the simplest
form of file compression.
SECTOR - It is best described as the method that the drive uses to store
the smallest group of bytes physically on the disk. On the 1541
this refers to a group of 256 bytes stored together in a single
sector. On a PC disk, this is typically 512 bytes.
SIGNATURE - A group of bytes, usually near or at the front of the file,
which are used to identify the type of file. i.e. a PC64 file
will always have the signature string "C64file" contained at
the beginning of the file.
TAR - An acronym for "Tape ARchiver", a UNIX application, and method of
backing up information.
|