📄 Euphoria Database System (EDS)
Introduction
=============
Many people have expressed an interest in accessing databases using Euphoria
programs. Those people have either wanted to access a name-brand database
management system from Euphoria, or they've wanted a simple, easy-to-use,
Euphoria-oriented database for storing data. EDS is the latter. It provides a
simple, extremely flexible, database system for use by Euphoria programs.
Structure of an EDS database
============================
In EDS, a database is a single file with ".edb" file type. An EDS database
contains 0 or more tables. Each table has a name, and contains 0 or more
records. Each record consists of a key part, and a data part. The key can be
any Euphoria object - an atom, a sequence, a deeply-nested sequence, whatever.
Similarly the data can be any Euphoria object. There are no constraints on the
size or structure of the key or data. Within a given table, the keys are all
unique. That is, no two records in the same table can have the same key part.
The records of a table are stored in ascending order of key value. An
efficient binary search is used when you refer to a record by key. You can
also access a record directly, with no search, if you know its current record
number within the table. Record numbers are integers from 1 to the length
(current number of records) of the table. By incrementing the record number,
you can efficiently step through all the records, in order of key. Note
however that a record's number can change whenever a new record is inserted,
or an existing record is deleted.
The keys and data parts are stored in a compact form, but no accuracy is lost
when saving or restoring floating-point numbers or any other Euphoria data.
database.e will work as is, on Windows, DOS, Linux or FreeBSD. EDS database
files can be copied and shared between programs running on Linux/FreeBSD and
DOS/Windows. Be sure to make an exact byte for byte copy using "binary" mode
copying, rather than "text" or "ASCII" mode which could change the line
terminators.
Example:
database: "mydata.edb"
first table: "passwords"
record #1: key: "jones" data: "euphor123"
record #2: key: "smith" data: "billgates"
second table: "parts"
record #1: key: 134525 data: {"hammer", 15.95, 500}
record #2: key: 134526 data: {"saw", 25.95, 100}
record #3: key: 134530 data: {"screw driver", 5.50, 1500}
It's up to you to interpret the meaning of the key and data. In keeping with
the spirit of Euphoria, you have total flexibility. Unlike most other database
systems, an EDS record is *not* required to have either a fixed number of
fields, or fields with a preset maximum length.
In many cases there will not be any natural key value for your records. In
those cases you should simply create a meaningless, but unique, integer to be
the key. Remember that you can always access the data by record number. It's
easy to loop through the records looking for a particular field value.
How to access the data
======================
To reduce the number of parameters that you have to pass, there is a notion of
the current database, and current table. Most routines use these "current"
values automatically. You normally start by opening (or creating) a database
file, then selecting the table that you want to work with.
You can map a key to a record number using db_find_key(). It uses an efficient
binary search. Most of the other record-level routines expect the record
number as an argument. You can very quickly access any record, given it's
number. You can access all the records by starting at record number 1 and
looping through to the record number returned by db_table_size().
How does storage get recycled?
==============================
When you delete something, such as a record, the space for that item gets put
on a free list, for future use. Adjacent free areas are combined into larger
free areas. When more space is needed, and no suitable space is found on the
free list, the file will grow in size. Currently there is no automatic way
that a file will shrink in size, but you can use db_compress() to completely
rewrite a database, removing the unused spaces.
Security / Multi-user Access
============================
This release provides a simple way to lock an entire database to prevent
unsafe access by other processes.
Scalability
===========
Internal pointers are 4 bytes. In theory that limits the size of a database
file to 4 Gb. In practice, the limit is 2 Gb because of limitations in various
C file functions used by Euphoria. Given enough user demand, EDS databases
could be expanded well beyond 2 Gb in the future.
The current algorithm allocates 4 bytes of memory per record in the current
table. So you'll need at least 4Mb RAM per million records on disk.
The binary search for keys should work reasonably well for large tables.
Inserts and deletes take slightly longer as a table gets larger.
At the low end of the scale, it's possible to create extremely small databases
without incurring much disk space overhead.
Disclaimer
==========
Do not store valuable data without a backup. RDS will not be responsible for
any damage or data loss.
Warning
=======
.edb files are binary files, not text files. You *must* use BINARY mode when
transferring a .edb file via FTP from one machine to another. You must also
avoid loading a .edb file into an editor and saving it. If you open a .edb
file directly using Euphoria's open(), which is not recommended, you must use
binary mode, not text mode. Failure to follow these rules could result in 10
(line-feed) and 13 (carriage-return) bytes being changed, leading to subtle
and not-so-subtle forms of corruption in your database.
Database Routines
=================
In the descriptions below, to indicate what kind of object may be passed in
and returned, the following prefixes are used:
x - a general object (atom or sequence)
s - a sequence
a - an atom
i - an integer
fn - an integer used as a file number
st - a string sequence, or single-character atom
db_create - create a new database
db_open - open an existing database
db_select - select a database to be the current one
db_close - close a database
db_create_table - create a new table within a database
db_select_table - select a table to be the current one
db_rename_table - rename a table
db_delete_table - delete a table
db_table_list - get a list of all the table names in a database
db_table_size - get the number of records in the current table
db_find_key - quickly find the record with a certain key value
db_record_key - get the key portion of a record
db_record_data - get the data portion of a record
db_insert - insert a new record into the current table
db_delete_record- delete a record from the current table
db_replace_data - replace the data portion of a record
db_compress - compress a database
db_dump - print the contents of a database
db_fatal_id - handle fatal database errors
----------------------------------<db_create>----------------------------------
Syntax: include database.e
i1 = db_create(s, i2)
Description: Create a new database. A new database will be created in the file
with path given by s. i2 indicates the type of lock that will be
applied to the file as it is created. i1 is an error code that
indicates success or failure. The values for i2 can be either
DB_LOCK_NO (no lock) or DB_LOCK_EXCLUSIVE (exclusive lock). i1 is
DB_OK if the new database is successfully created. This database
becomes the current database to which all other database
operations will apply.
Comments: If the path, s, does not end in .edb, it will be added
automatically.
If the database already exists, it will not be overwritten.
db_create() will return DB_EXISTS_ALREADY.
A version number is stored in the database file so future
versions of the database software can recognize the format, and
possibly read it and deal with it in some way.
Example:
if db_create("mydata", DB_LOCK_NO) != DB_OK then
puts(2, "Couldn't create the database!\n")
abort(1)
end if
See Also: db_open, db_close
-----------------------------------<db_open>-----------------------------------
Syntax: include database.e
i1 = db_open(s, i2)
Description: Open an existing Euphoria database. The file containing the
database is given by s. i1 is a return code indicating success or
failure. i2 indicates the type of lock that you want to place on
the database file while you have it open. This database becomes
the current database to which all other database operations will
apply.
The return codes are:
global constant DB_OK = 0 -- success
DB_OPEN_FAIL = -1 -- couldn't open the file
DB_LOCK_FAIL = -3 -- couldn't lock the file in the
-- manner requested
Comments: The types of lock that you can use are: DB_LOCK_NO (no lock),
DB_LOCK_SHARED (shared lock for read-only access) and
DB_LOCK_EXCLUSIVE (for read/write access). DB_LOCK_SHARED is only
supported on Linux/FreeBSD. It allows you to read the database,
but not write anything to it. If you request DB_LOCK_SHARED on
WIN32 or DOS32 it will be treated as if you had asked for
DB_LOCK_EXCLUSIVE.
If the lock fails, your program should wait a few seconds and try
again. Another process might be currently accessing the database.
DOS programs will typically get a "critical error" message if
they try to access a database that is currently locked.
Example:
tries = 0
while 1 do
err = db_open("mydata", DB_LOCK_SHARED)
if err = DB_OK then
exit
elsif err = DB_LOCK_FAIL then
tries += 1
if tries > 10 then
puts(2, "too many tries, giving up\n")
abort(1)
else
sleep(5)
end if
else
puts(2, "Couldn't open the database!\n")
abort(1)
end if
end while
See Also: db_create, db_close
----------------------------------<db_select>----------------------------------
Syntax: include database.e
i = db_select(s)
Description: Choose a new, already open, database to be the current database.
Subsequent database operations will apply to this database. s is
the path of the database file as it was originally opened with
db_open() or db_create(). i is a return code indicating success
(DB_OK) or failure.
Comments: When you create (db_create) or open (db_open) a database, it
automatically becomes the current database. Use db_select() when
you want to switch back and forth between open databases, perhaps
to copy records from one to the other.
After selecting a new database, you should select a table within
that database using db_select_table().
Example:
if db_select("employees") != DB_OK then
puts(2, "Couldn't select employees database\n")
end if
See Also: db_open
----------------------------------<db_close>-----------------------------------
Syntax: include database.e
db_close()
Description: Unlock and close the current database.
Comments: Call this procedure when you are finished with the current
database. Any lock will be removed, allowing other processes to
access the database file.
See Also: db_open
-------------------------------<db_create_table>-------------------------------
Syntax: include database.e
i = db_create_table(s)
Description: Create a new table within the current database. The name of the
table is given by the sequence of characters, s, and may not be
the same as any existing table in the current database.
Comments: The table that you create will initially have 0 records. It
becomes the current table.
Example:
if db_create_table("my_new_table") != DB_OK then
puts(2, "Couldn't create my_new_table!\n")
end if
See Also: db_delete_table
-------------------------------<db_select_table>-------------------------------
Syntax: include database.e
i = db_select_table(s)
Description: The table with name given by s, becomes the current table. The
return code, i, will be DB_OK if the table exists in the current
database, otherwise you'll get DB_OPEN_FAIL.
Comments: All record-level database operations apply automatically to the
current table.
Example:
if db_select_table("salary") != DB_OK then
puts(2, "Couldn't find salary table!\n")
abort(1)
end if
See Also: db_create_table, db_delete_table
-------------------------------<db_rename_table>-------------------------------
Syntax: include database.e
db_rename_table(s1, s2)
Description: Rename a table in the current database. The current name of the
table is given by s1. The new name of the table is s2.
Comments: The table to be renamed can be the current table, or some other
table in the current database. An error will occur if s1 is not
the name of a table in the current database, or if s2 is the name
of an existing table in the current database.
See Also: db_create_table db_select_table db_delete_table
-------------------------------<db_delete_table>-------------------------------
Syntax: include database.e
db_delete_table(s)
Description: Delete a table in the current database. The name of the table is
given by s.
Comments: All records are deleted and all space used by the table is freed
up. If the table is the current table, the current table becomes
undefined.
If there is no table with the name given by s, then nothing
happens.
See Also: db_create_table db_select_table
--------------------------------<db_table_list>--------------------------------
Syntax: s = db_table_list()
Description: Return a sequence of all the table names in the current database.
Each element of s is a sequence of characters containing the name
of a table.
Example:
sequence names
names = db_table_list()
for i = 1 to length(names) do
puts(1, names[i] & '\n')
end for
See Also: db_create_table
--------------------------------<db_table_size>--------------------------------
Syntax: include database.e
i = db_table_size()
Description: Return the current number of records in the current table.
Example:
-- look at all records in the current table
for i = 1 to db_table_size() do
if db_record_key(i) = 0 then
puts(1, "0 key found\n")
exit
end if
end for
See Also: db_select_table
---------------------------------<db_find_key>---------------------------------
Syntax: include database.e
i = db_find_key(x)
Description: Find the record in the current table with key value x. If found,
the record number will be returned. If not found, the record
number that key would occupy, if inserted, is returned as a
negative number.
Comments: A fast binary search is used to find the key in the current
table. The number of comparisons is proportional to the log of
the number of records in the table.
You can select a range of records by searching for the first and
last key values in the range. If those key values don't exist,
you'll at least get a negative value showing where they would be,
if they existed. e.g. Suppose you want to know which records have
keys greater than "GGG" and less than "MMM". If -5 is returned
for key "GGG", it means a record with "GGG" as a key would be
inserted as record number 5. -27 for "MMM" means a record with
"MMM" as its key would be inserted as record number 27. This
quickly tells you that all records, >= 5 and < 27 qualify.
Example:
rec_num = db_find_key("Millennium")
if rec_num > 0 then
? db_record_key(rec_num)
? db_record_data(rec_num)
else
puts(2, "Not found, but if you insert it,\n")
printf(2, "it will be #%d\n", -rec_num)
end if
See Also: db_record_key, db_record_data, db_insert
--------------------------------<db_record_key>--------------------------------
Syntax: include database.e
x = db_record_key(i)
Description: Return the key portion of record number i in the current table.
Comments: Each record in a Euphoria database consists of a key portion and
a data portion. Each of these can be any Euphoria atom or
sequence.
Example:
puts(1, "The 6th record has key value: ")
? db_record_key(6)
See Also: db_record_data
-------------------------------<db_record_data>--------------------------------
Syntax: include database.e
x = db_record_data(i)
Description: Return the data portion of record number i in the current table.
Comments: Each record in a Euphoria database consists of a key portion and
a data portion. Each of these can be any Euphoria atom or
sequence.
Example:
puts(1, "The 6th record has data value: ")
? db_record_data(6)
See Also: db_record_key
----------------------------------<db_insert>----------------------------------
Syntax: include database.e
i = db_insert(x1, x2)
Description: Insert a new record into the current table. The record key is x1
and the record data is x2. Both x1 and x2 can be any Euphoria
data objects, atoms or sequences. The return code i1 is DB_OK if
the record is inserted.
Comments: Within a table, all keys must be unique. db_insert() will fail
with DB_EXISTS_ALREADY if a record already exists with the same
key value.
Example:
if db_insert("Smith", {"Peter", 100, 34.5}) != DB_OK then
puts(2, "insert failed!\n")
end if
See Also: db_find_key, db_record_key, db_record_data
------------------------------<db_delete_record>-------------------------------
Syntax: include database.e
db_delete_record(i)
Description: Delete record number i from the current table.
Comments: The record number, i, must be an integer from 1 to the number of
records in the current table.
Example:
db_delete_record(55)
See Also: db_insert, db_table_size
-------------------------------<db_replace_data>-------------------------------
Syntax: include database.e
db_replace_data(i, x)
Description: In the current table, replace the data portion of record number
i, with x. x can be any Euphoria atom or sequence.
Comments: The record number, i, must be from 1 to the number of records in
the current table.
Example:
db_replace_data(67, {"Peter", 150, 34.5})
See Also: db_delete_record
---------------------------------<db_compress>---------------------------------
Syntax: include database.e
i = db_compress()
Description: Compress the current database. The current database is copied to
a new file such that any blocks of unused space are eliminated.
If successful, i will be set to DB_OK, and the new compressed
database file will retain the same name. The current table will
be undefined. As a backup, the original, uncompressed file will
be renamed with an extension of .t0 (or .t1, .t2 ,..., .t99). In
the highly unusual case that the compression is unsuccessful, the
database will be left unchanged, and no backup will be made.
Comments: When you delete items from a database, you create blocks of free
space within the database file. The system keeps track of these
blocks and tries to use them for storing new data that you
insert. db_compress() will copy the current database without
copying these free areas. The size of the database file may
therefore be reduced.
If the backup filenames reach .t99 you will have to delete some
of them.
Example:
if db_compress() != DB_OK then
puts(2, "compress failed!\n")
end if
See Also: db_create
-----------------------------------<db_dump>-----------------------------------
Syntax: include database.e
db_dump(fn, i)
Description: Print the contents of an already-open Euphoria database. The
contents are printed to file or device fn. All records in all
tables are shown. If i is non-zero, then a low-level byte-by-byte
dump is also shown. The low-level dump will only be meaningful to
someone who is familiar with the internal format of a Euphoria
database.
Example:
if db_open("mydata", DB_LOCK_SHARED) != DB_OK then
puts(2, "Couldn't open the database!\n")
abort(1)
end if
fn = open("db.txt", "w")
db_dump(fn, 0)
See Also: db_open
---------------------------------<db_fatal_id>---------------------------------
Syntax: include database.e
db_fatal_id = i
Description: You can catch certain fatal database errors by installing your
own fatal error handler. Simply overwrite the global variable
db_fatal_id with the routine id of one of your own procedures.
The procedure must take a single argument which is a sequence.
When certain errors occur your procedure will be called with an
error message string as the argument. Your procedure should end
by calling abort().
Example:
procedure my_fatal(sequence msg)
puts(2, "A fatal error occurred - " & msg & '\n')
abort(1)
end procedure
db_fatal_id = routine_id("my_fatal")
See Also: db_close