*** Binary/Raw files
*** Document revision: 1.3
*** Last updated: March 11, 2004
*** Contributors/sources: none

  This is really a non-format, as it has no specific layout whatsoever, but
instead is simply a data file. This does make it difficult to describe  any
general layout, as there are many  possibilities.  It  is  assumed  that  a
binary is  a  single  file,  usually  having  no  special  extension  (like
LNX/SFX/SDA, etc), sitting on a PC hard disk, not  inside  of  *any*  other
file (like in a D64/T64/LNX  etc).  By  looking  at  the  whole  file  (but
especially the beginning) with a HEX editor, and a  well-trained  eye,  you
can usually determine what file type it is by the BASIC code, load address,
and text strings contained within.

  Two possible file extensions that binary files  can  use  are  "BIN"  and
"PRG". There is a distinction in the 64 community between  these  two  file
types. A "BIN" (binary) file is one without a load address  preceeding  the
file data, and a "PRG" (program) file has a load address in the  first  two
bytes of the file.

  If the file is a normal C64 program, then the first  two  bytes  are  the
load address, stored in low/high byte format. If it is a BASIC program  (or
has a BASIC header), then the first two bytes might be $01 $08 (or $01  $04
for PET, $01 $1C for C128, the numbers  vary  for  different  systems  like
VIC-20, Plus4, C61, etc), and  the  rest  is  line  number  pointers,  line
numbers and BASIC code.

  If it is not a BASIC file (like part of a game loader), then it could  be
a machine language file, requiring a SYS call to  execute.  Many  secondary
files with multi-file games have load  addresses  other  than  $0801  (like
$C000, or $8000 for cartridges), and some  have  none  at  all,  where  the
loader program will actually specify where to load the  file.  Below  is  a
typical header for a game, using a SYS call to start  (in  this  case,  SYS
2064).

      00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F        ASCII
      -----------------------------------------------   ----------------
0000: 01 08 0E 08 E9 03 9E 20 28 32 30 36 34 29 00 00   ....?.??(2064)??
0010: 00 78 A2 FF 9A A0 00 84 01 A2 CC BD 57 08 9D 33   ?x??????.???W.?3

  One reason this  "non-image"  format  is  important  is  because  of  the
filename limitations imposed by  the  PC  architecture,  and  some  of  the
problems that it can cause. The PC has an 8.3 filename convention (assuming
DOS, and not OS/2 or Windows 95/NT long filenames), allowing for, at  most,
a filename of 11 characters. The  PC  also  has  character  limitations  on
filenames, most of which don't apply to the C64. When converting a C64 file
from any image/archive, you *cannot* use the C64 filename directly  on  the
PC since, first of all its likely too long, and  secondly  it  may  contain
characters illegal in  the  DOS  world  (i.e.  ASCII  46,  the  'extension'
separator).

  One other reason that binary is not a good way to keep C64 files is  that
you lose the C64 filetype. You *can* set the file  extension  to  show  the
type (xxx.PRG, xxx.SEQ), but  this  is  not  typically  done.  If  the  DOS
extension is not set, it is much more difficult to  know  what  format  the
file originally was.

  In order to convert a filename from C64 to  DOS,  the  use  of  translate
tables (translating illegal  characters  to  legal  ones)  as  well  as  an
algorithm to reduce the filesize down to useable  size  (typically  only  8
characters from 16) must be used. Included  within  the  PC64  distribution
archive is a 'C' program which does just what I have described. It contains
an algorithm which will remove illegal characters, then reduce the filename
down using several different rules, all to make the filename conform to the
DOS 8.3 naming standards.

  There are some people who feel that with Windows95 (or OS/2,  or  Windows
NT) long filename support, we should be able to have  C64  files  with  the
full 16 character filename on the PC hard disk. This is a  fallacy  as  the
useable character limitations of the C64 and those  of  any  OS  supporting
long filenames are not the same.  If  you  convert  a  C64  filename,  many
characters must be removed and or changed to allow it to be used in DOS (or
Win95).

  There is very little benefit or disadvantage to using binary.  About  the
only thing which is important to remember is you could  lose  most  of  the
original C64  filename,  which  *will*  cause  problems  when  you  try  to
reconstruct them. Binaries have no way of preserving the actual long names.

---------------------------------------------------------------------------

Overall Good/Bad of BINARY Files:

  Good
  ----
  * Easily determined filesize (the DOS filesize is the C64 filesize)

  * It is the *native* C64 file, so emulators can *easily* support it



  Bad
  ---
  * No 16 char filename, no real C64 filename at all 

  * No filetype, or special attribute bits 

  * No REL, GEOS or FRZ file support 

  * No signature or file description 

  * Difficult to support multi-file, as  original  C64  filenames  are  not
    retained

  * Since there is no filetype, you may not always be able to determine  if
    the file is a SEQ, USR or really a PRG without knowledge  of  the  file
    contents