🗐 Euphoria Database System (EDS)


 Introduction
 =============

 Many people have expressed an interest in accessing databases using Euphoria
 programs. Those people have either wanted to access a name-brand database
 management system from Euphoria, or they've wanted a simple, easy-to-use,
 Euphoria-oriented database for storing data. EDS is the latter. It provides a
 simple, extremely flexible, database system for use by Euphoria programs.


 Structure of an EDS database
 ============================

 In EDS, a database is a single file with ".edb" file type. An EDS database
 contains 0 or more tables. Each table has a name, and contains 0 or more
 records. Each record consists of a key part, and a data part. The key can be
 any Euphoria object - an atom, a sequence, a deeply-nested sequence, whatever.
 Similarly the data can be any Euphoria object. There are no constraints on the
 size or structure of the key or data. Within a given table, the keys are all
 unique. That is, no two records in the same table can have the same key part.

 The records of a table are stored in ascending order of key value. An
 efficient binary search is used when you refer to a record by key. You can
 also access a record directly, with no search, if you know its current record
 number within the table. Record numbers are integers from 1 to the length
 (current number of records) of the table. By incrementing the record number,
 you can efficiently step through all the records, in order of key. Note
 however that a record's number can change whenever a new record is inserted,
 or an existing record is deleted.

 The keys and data parts are stored in a compact form, but no accuracy is lost
 when saving or restoring floating-point numbers or any other Euphoria data.

 database.e will work as is, on Windows, DOS, Linux or FreeBSD. EDS database
 files can be copied and shared between programs running on Linux/FreeBSD and
 DOS/Windows. Be sure to make an exact byte for byte copy using "binary" mode
 copying, rather than "text" or "ASCII" mode which could change the line
 terminators.

 Example:
    database: "mydata.edb"
          first table: "passwords"
               record #1:  key: "jones"   data: "euphor123"
               record #2:  key: "smith"   data: "billgates"

          second table: "parts"
               record #1:  key: 134525    data: {"hammer", 15.95, 500}
               record #2:  key: 134526    data: {"saw", 25.95, 100}
               record #3:  key: 134530    data: {"screw driver", 5.50, 1500}


 It's up to you to interpret the meaning of the key and data. In keeping with
 the spirit of Euphoria, you have total flexibility. Unlike most other database
 systems, an EDS record is *not* required to have either a fixed number of
 fields, or fields with a preset maximum length.

 In many cases there will not be any natural key value for your records. In
 those cases you should simply create a meaningless, but unique, integer to be
 the key. Remember that you can always access the data by record number. It's
 easy to loop through the records looking for a particular field value.


 How to access the data
 ======================

 To reduce the number of parameters that you have to pass, there is a notion of
 the current database, and current table. Most routines use these "current"
 values automatically. You normally start by opening (or creating) a database
 file, then selecting the table that you want to work with.

 You can map a key to a record number using db_find_key(). It uses an efficient
 binary search. Most of the other record-level routines expect the record
 number as an argument. You can very quickly access any record, given it's
 number. You can access all the records by starting at record number 1 and
 looping through to the record number returned by db_table_size().


 How does storage get recycled?
 ==============================

 When you delete something, such as a record, the space for that item gets put
 on a free list, for future use. Adjacent free areas are combined into larger
 free areas. When more space is needed, and no suitable space is found on the
 free list, the file will grow in size. Currently there is no automatic way
 that a file will shrink in size, but you can use db_compress() to completely
 rewrite a database, removing the unused spaces.


 Security / Multi-user Access
 ============================

 This release provides a simple way to lock an entire database to prevent
 unsafe access by other processes.


 Scalability
 ===========

 Internal pointers are 4 bytes. In theory that limits the size of a database
 file to 4 Gb. In practice, the limit is 2 Gb because of limitations in various
 C file functions used by Euphoria. Given enough user demand, EDS databases
 could be expanded well beyond 2 Gb in the future.

 The current algorithm allocates 4 bytes of memory per record in the current
 table. So you'll need at least 4Mb RAM per million records on disk.

 The binary search for keys should work reasonably well for large tables.

 Inserts and deletes take slightly longer as a table gets larger.

 At the low end of the scale, it's possible to create extremely small databases
 without incurring much disk space overhead.


 Disclaimer
 ==========

 Do not store valuable data without a backup. RDS will not be responsible for
 any damage or data loss.


 Warning
 =======

 .edb files are binary files, not text files. You *must* use BINARY mode when
 transferring a .edb file via FTP from one machine to another. You must also
 avoid loading a .edb file into an editor and saving it. If you open a .edb
 file directly using Euphoria's open(), which is not recommended, you must use
 binary mode, not text mode. Failure to follow these rules could result in 10
 (line-feed) and 13 (carriage-return) bytes being changed, leading to subtle
 and not-so-subtle forms of corruption in your database.


 Database Routines
 =================

 In the descriptions below, to indicate what kind of object may be passed in
 and returned, the following prefixes are used:
 x     - a general object (atom or sequence)

 s     - a sequence

 a     - an atom

 i     - an integer

 fn    - an integer used as a file number

 st    - a string sequence, or single-character atom


 db_create       - create a new database

 db_open         - open an existing database

 db_select       - select a database to be the current one

 db_close        - close a database

 db_create_table - create a new table within a database

 db_select_table - select a table to be the current one

 db_rename_table - rename a table

 db_delete_table - delete a table

 db_table_list   - get a list of all the table names in a database

 db_table_size   - get the number of records in the current table

 db_find_key     - quickly find the record with a certain key value

 db_record_key   - get the key portion of a record

 db_record_data  - get the data portion of a record

 db_insert       - insert a new record into the current table

 db_delete_record- delete a record from the current table

 db_replace_data - replace the data portion of a record

 db_compress     - compress a database

 db_dump         - print the contents of a database

 db_fatal_id     - handle fatal database errors



 ----------------------------------<db_create>----------------------------------

 Syntax:      include database.e
              i1 = db_create(s, i2)

 Description: Create a new database. A new database will be created in the file
              with path given by s. i2 indicates the type of lock that will be
              applied to the file as it is created. i1 is an error code that
              indicates success or failure. The values for i2 can be either
              DB_LOCK_NO (no lock) or DB_LOCK_EXCLUSIVE (exclusive lock). i1 is
              DB_OK if the new database is successfully created. This database
              becomes the current database to which all other database
              operations will apply.

 Comments:    If the path, s, does not end in .edb, it will be added
              automatically.

              If the database already exists, it will not be overwritten.
              db_create() will return DB_EXISTS_ALREADY.

              A version number is stored in the database file so future
              versions of the database software can recognize the format, and
              possibly read it and deal with it in some way.

 Example:

              if db_create("mydata", DB_LOCK_NO) != DB_OK then
                  puts(2, "Couldn't create the database!\n")
                  abort(1)
              end if

 See Also:    db_open, db_close


 -----------------------------------<db_open>-----------------------------------

 Syntax:      include database.e
              i1 = db_open(s, i2)

 Description: Open an existing Euphoria database. The file containing the
              database is given by s. i1 is a return code indicating success or
              failure. i2 indicates the type of lock that you want to place on
              the database file while you have it open. This database becomes
              the current database to which all other database operations will
              apply.

              The return codes are:

                  global constant DB_OK = 0   -- success
                           DB_OPEN_FAIL = -1  -- couldn't open the file
                           DB_LOCK_FAIL = -3  -- couldn't lock the file in the
                                              --     manner requested

 Comments:    The types of lock that you can use are: DB_LOCK_NO (no lock),
              DB_LOCK_SHARED (shared lock for read-only access) and
              DB_LOCK_EXCLUSIVE (for read/write access). DB_LOCK_SHARED is only
              supported on Linux/FreeBSD. It allows you to read the database,
              but not write anything to it. If you request DB_LOCK_SHARED on
              WIN32 or DOS32 it will be treated as if you had asked for
              DB_LOCK_EXCLUSIVE.

              If the lock fails, your program should wait a few seconds and try
              again. Another process might be currently accessing the database.

              DOS programs will typically get a "critical error" message if
              they try to access a database that is currently locked.

 Example:

              tries = 0
              while 1 do
                  err = db_open("mydata", DB_LOCK_SHARED)
                  if err = DB_OK then
                      exit
                  elsif err = DB_LOCK_FAIL then
                        tries += 1
                        if tries > 10 then
                          puts(2, "too many tries, giving up\n")
                          abort(1)
                        else
                            sleep(5)
                        end if
                  else
                        puts(2, "Couldn't open the database!\n")
                        abort(1)
                  end if
              end while


 See Also:    db_create, db_close


 ----------------------------------<db_select>----------------------------------

 Syntax:      include database.e
              i = db_select(s)

 Description: Choose a new, already open, database to be the current database.
              Subsequent database operations will apply to this database. s is
              the path of the database file as it was originally opened with
              db_open() or db_create(). i is a return code indicating success
              (DB_OK) or failure.

 Comments:    When you create (db_create) or open (db_open) a database, it
              automatically becomes the current database. Use db_select() when
              you want to switch back and forth between open databases, perhaps
              to copy records from one to the other.

              After selecting a new database, you should select a table within
              that database using db_select_table().

 Example:

              if db_select("employees") != DB_OK then
                  puts(2, "Couldn't select employees database\n")
              end if

 See Also:    db_open


 ----------------------------------<db_close>-----------------------------------

 Syntax:      include database.e
              db_close()

 Description: Unlock and close the current database.

 Comments:    Call this procedure when you are finished with the current
              database. Any lock will be removed, allowing other processes to
              access the database file.

 See Also:    db_open


 -------------------------------<db_create_table>-------------------------------

 Syntax:      include database.e
              i = db_create_table(s)

 Description: Create a new table within the current database. The name of the
              table is given by the sequence of characters, s, and may not be
              the same as any existing table in the current database.

 Comments:    The table that you create will initially have 0 records. It
              becomes the current table.

 Example:

              if db_create_table("my_new_table") != DB_OK then
                  puts(2, "Couldn't create my_new_table!\n")
              end if

 See Also:    db_delete_table


 -------------------------------<db_select_table>-------------------------------

 Syntax:      include database.e
              i = db_select_table(s)

 Description: The table with name given by s, becomes the current table. The
              return code, i, will be DB_OK if the table exists in the current
              database, otherwise you'll get DB_OPEN_FAIL.

 Comments:    All record-level database operations apply automatically to the
              current table.

 Example:

              if db_select_table("salary") != DB_OK then
                  puts(2, "Couldn't find salary table!\n")
                  abort(1)
              end if

 See Also:    db_create_table, db_delete_table


 -------------------------------<db_rename_table>-------------------------------

 Syntax:      include database.e
              db_rename_table(s1, s2)

 Description: Rename a table in the current database. The current name of the
              table is given by s1. The new name of the table is s2.

 Comments:    The table to be renamed can be the current table, or some other
              table in the current database. An error will occur if s1 is not
              the name of a table in the current database, or if s2 is the name
              of an existing table in the current database.

 See Also:    db_create_table db_select_table db_delete_table


 -------------------------------<db_delete_table>-------------------------------

 Syntax:      include database.e
              db_delete_table(s)

 Description: Delete a table in the current database. The name of the table is
              given by s.

 Comments:    All records are deleted and all space used by the table is freed
              up. If the table is the current table, the current table becomes
              undefined.

              If there is no table with the name given by s, then nothing
              happens.

 See Also:    db_create_table db_select_table


 --------------------------------<db_table_list>--------------------------------

 Syntax:      s = db_table_list()

 Description: Return a sequence of all the table names in the current database.
              Each element of s is a sequence of characters containing the name
              of a table.

 Example:

              sequence names

              names = db_table_list()
              for i = 1 to length(names) do
                  puts(1, names[i] & '\n')
              end for

 See Also:    db_create_table


 --------------------------------<db_table_size>--------------------------------

 Syntax:      include database.e
              i = db_table_size()

 Description: Return the current number of records in the current table.

 Example:

              -- look at all records in the current table
              for i = 1 to db_table_size() do
                  if db_record_key(i) = 0 then
                        puts(1, "0 key found\n")
                        exit
                  end if
              end for

 See Also:    db_select_table


 ---------------------------------<db_find_key>---------------------------------

 Syntax:      include database.e
              i = db_find_key(x)

 Description: Find the record in the current table with key value x. If found,
              the record number will be returned. If not found, the record
              number that key would occupy, if inserted, is returned as a
              negative number.

 Comments:    A fast binary search is used to find the key in the current
              table. The number of comparisons is proportional to the log of
              the number of records in the table.

              You can select a range of records by searching for the first and
              last key values in the range. If those key values don't exist,
              you'll at least get a negative value showing where they would be,
              if they existed. e.g. Suppose you want to know which records have
              keys greater than "GGG" and less than "MMM". If -5 is returned
              for key "GGG", it means a record with "GGG" as a key would be
              inserted as record number 5. -27 for "MMM" means a record with
              "MMM" as its key would be inserted as record number 27. This
              quickly tells you that all records, >= 5 and < 27 qualify.

 Example:

              rec_num = db_find_key("Millennium")
              if rec_num > 0 then
                  ? db_record_key(rec_num)
                  ? db_record_data(rec_num)
              else
                  puts(2, "Not found, but if you insert it,\n")
                  printf(2, "it will be #%d\n", -rec_num)
              end if

 See Also:    db_record_key, db_record_data, db_insert


 --------------------------------<db_record_key>--------------------------------

 Syntax:      include database.e
              x = db_record_key(i)

 Description: Return the key portion of record number i in the current table.

 Comments:    Each record in a Euphoria database consists of a key portion and
              a data portion. Each of these can be any Euphoria atom or
              sequence.

 Example:

              puts(1, "The 6th record has key value: ")
              ? db_record_key(6)

 See Also:    db_record_data


 -------------------------------<db_record_data>--------------------------------

 Syntax:      include database.e
              x = db_record_data(i)

 Description: Return the data portion of record number i in the current table.

 Comments:    Each record in a Euphoria database consists of a key portion and
              a data portion. Each of these can be any Euphoria atom or
              sequence.

 Example:

              puts(1, "The 6th record has data value: ")
              ? db_record_data(6)

 See Also:    db_record_key


 ----------------------------------<db_insert>----------------------------------

 Syntax:      include database.e
              i = db_insert(x1, x2)

 Description: Insert a new record into the current table. The record key is x1
              and the record data is x2. Both x1 and x2 can be any Euphoria
              data objects, atoms or sequences. The return code i1 is DB_OK if
              the record is inserted.

 Comments:    Within a table, all keys must be unique. db_insert() will fail
              with DB_EXISTS_ALREADY if a record already exists with the same
              key value.

 Example:

              if db_insert("Smith", {"Peter", 100, 34.5}) != DB_OK then
                  puts(2, "insert failed!\n")
              end if

 See Also:    db_find_key, db_record_key, db_record_data


 ------------------------------<db_delete_record>-------------------------------

 Syntax:      include database.e
              db_delete_record(i)

 Description: Delete record number i from the current table.

 Comments:    The record number, i, must be an integer from 1 to the number of
              records in the current table.

 Example:

              db_delete_record(55)

 See Also:    db_insert, db_table_size


 -------------------------------<db_replace_data>-------------------------------

 Syntax:      include database.e
              db_replace_data(i, x)

 Description: In the current table, replace the data portion of record number
              i, with x. x can be any Euphoria atom or sequence.

 Comments:    The record number, i, must be from 1 to the number of records in
              the current table.

 Example:

              db_replace_data(67, {"Peter", 150, 34.5})

 See Also:    db_delete_record


 ---------------------------------<db_compress>---------------------------------

 Syntax:      include database.e
              i = db_compress()

 Description: Compress the current database. The current database is copied to
              a new file such that any blocks of unused space are eliminated.
              If successful, i will be set to DB_OK, and the new compressed
              database file will retain the same name. The current table will
              be undefined. As a backup, the original, uncompressed file will
              be renamed with an extension of .t0 (or .t1, .t2 ,..., .t99). In
              the highly unusual case that the compression is unsuccessful, the
              database will be left unchanged, and no backup will be made.

 Comments:    When you delete items from a database, you create blocks of free
              space within the database file. The system keeps track of these
              blocks and tries to use them for storing new data that you
              insert. db_compress() will copy the current database without
              copying these free areas. The size of the database file may
              therefore be reduced.

              If the backup filenames reach .t99 you will have to delete some
              of them.

 Example:

              if db_compress() != DB_OK then
                  puts(2, "compress failed!\n")
              end if

 See Also:    db_create


 -----------------------------------<db_dump>-----------------------------------

 Syntax:      include database.e
              db_dump(fn, i)

 Description: Print the contents of an already-open Euphoria database. The
              contents are printed to file or device fn. All records in all
              tables are shown. If i is non-zero, then a low-level byte-by-byte
              dump is also shown. The low-level dump will only be meaningful to
              someone who is familiar with the internal format of a Euphoria
              database.

 Example:

              if db_open("mydata", DB_LOCK_SHARED) != DB_OK then
                  puts(2, "Couldn't open the database!\n")
                  abort(1)
              end if
              fn = open("db.txt", "w")
              db_dump(fn, 0)


 See Also:    db_open


 ---------------------------------<db_fatal_id>---------------------------------

 Syntax:      include database.e
              db_fatal_id = i

 Description: You can catch certain fatal database errors by installing your
              own fatal error handler. Simply overwrite the global variable
              db_fatal_id with the routine id of one of your own procedures.
              The procedure must take a single argument which is a sequence.
              When certain errors occur your procedure will be called with an
              error message string as the argument. Your procedure should end
              by calling abort().

 Example:

              procedure my_fatal(sequence msg)
                  puts(2, "A fatal error occurred - " & msg & '\n')
                  abort(1)
              end procedure

              db_fatal_id = routine_id("my_fatal")

 See Also:    db_close