A Beginners Guide To Euphoria

An Introduction To Bits And Bytes

If you remember from our discussion on variables, we introduced the byte, a stored value between 0 and 255. You may have asked yourself, "Why 255? Why not a much easier to remember range like 0 to 10, or 0 to 100, as in the metric system?" Well, remember that the only language a computer understands is a set of instructions composed of 1's and 0's, or binary language. If so, one would expect that computers would use a similar standard composed of 1's and 0's to represent numbers.

Because computers only understand the values of 0 and 1, they use a system called the "binary system" to represent stored numbers. To understand how it works, we need to go back to school to review how humans represent numbers. Humans use a numbering system called the decimal system, because we can relate to groups of 10 easily (no doubt because we have 10 fingers). The decimal system states each digit is the number of groups of 10, where each group is raised to a power based on the position from right to left:

                           5   4   1   3
5 groups of 1000 (10^3)----^
4 groups of  100 (10^2)--------^
1 group of    10 (10^1)------------^
3 groups of    1 (10^0)----------------^

The exponent that raises 10 to a power starts at 0 from the rightmost digit, and increases by a value of 1 as you go left. With computers using the binary system, a base of 2, not 10, is used. This is because the computer only uses two digits in its numbering system. Yet the representation of a number under the binary system is the same as in the decimal system:

                           1  0  1  1  0
1 group of 16 (or 2^4)-----^
0 groups of 8 (or 2^3)--------^
1 group of  4 (or 2^2)-----------^
1 group of  2 (or 2^1)--------------^
0 groups of 1 (or 2^0)-----------------^

For your information, the value 10110 is pronounced "one-zero-one-one-zero", not "ten-thousand-one-hundred-ten". The terms "thousand", "hundred" and "ten" describe groups in the decimal system. The binary number 10110 is equal to 22. Each binary digit is known as a BIT (BInary digiT), and you can determine the value of any binary number by adding up the powers of 2 whose bit is 1. For example, 10110 is equal to 2 + 4 + 16, which equals 22. You can also convert any decimal number to binary by following these instructions (you will need a calculator and a sheet of paper for this):

   1) Divide number n by 2.
   2) If the result of the division ends in .5, write down the value 1 on the sheet of
      paper, and then change the result to an integer (for example, 12.5 to 12).
      Otherwise, just write down the value 0.
   3) Take the result and make it the next number n to divide by 2, and go to step 1.
      Repeat these steps until you produce a value less than 1. When you are done,
      reverse the digits written on the paper.

For example, the decimal value 23 is 10111 in binary. We work it out by dividing 23 by 2 to get 11.5 (write 1), 11 by 2 to get 5.5 (write 1), 5 by 2 to get 2.5 (write 1), 2 by 2 to get 1 (write 0), and 1 by 2 to get 0.5 (write 1). You then reverse the digits produced to change 11101 to 10111.
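
The paper-and-calculator steps above can also be sketched in Euphoria. This is only an illustration: the function names to_binary() and from_binary() are our own and are not library routines, and remainder() is used in place of checking whether the division ends in .5.

```euphoria
-- Convert a non-negative integer to a string of binary digits by
-- repeated division by 2, collecting the remainders.
function to_binary(integer n)
    sequence digits
    digits = ""
    if n = 0 then
        return "0"
    end if
    while n > 0 do
        -- remainder(n, 2) is 1 exactly when n / 2 would end in .5
        -- prepend() puts each new digit in front, which does the reversing
        digits = prepend(digits, remainder(n, 2) + '0')
        n = floor(n / 2)
    end while
    return digits
end function

-- Convert a string of binary digits back to its value by adding up
-- the power of 2 for every digit that is 1.
function from_binary(sequence digits)
    atom total
    total = 0
    for i = 1 to length(digits) do
        if digits[i] = '1' then
            total = total + power(2, length(digits) - i)
        end if
    end for
    return total
end function

puts(1, to_binary(23) & "\n")            -- prints 10111
printf(1, "%d\n", from_binary("10110"))  -- prints 22
```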

These bits are important when we talk about bytes. A byte, on virtually every modern computer, is made up of 8 bits:

Binary    Decimal    Binary    Decimal    Binary    Decimal
00000000     0       00000101     5       00001010    10
00000001     1       00000110     6          :         :
00000010     2       00000111     7       11111101   253
00000011     3       00001000     8       11111110   254
00000100     4       00001001     9       11111111   255

It is at this point you understand why a byte has a value between 0 and 255: 0 equals 00000000 and 255 equals 11111111. There are two problems with this. The first problem is that the largest value a byte can hold is 255. In order to represent larger values, you have to use more than one byte. Putting two bytes together makes what is called a word. Using 16 bits, a word can represent values between 0 and 65535. There is also a double word, which is made up of four bytes (32 bits) and can represent values between 0 and 4294967295.

The other problem is that the values bytes, words, and double words can represent are unsigned, or positive only. This problem is solved in two parts. First, the leftmost bit position is used only to show if the number is negative or positive. Second, a method called two's complement is used to represent negative numbers. Here is how the binary number 00100010 (34) is made negative using two's complement:

   1) First, every bit of 00100010 is inverted to give 11011101 (called one's complement)
   2) You then add binary 1 to the reversed value:

   11011101
  +00000001
   --------
   11011110  (-34 in decimal)
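
The two steps above can be tried out in Euphoria. This is a minimal sketch: the function name twos_complement_byte() is our own, and it borrows the built-in routines not_bits() and and_bits(), which are covered later in this tutorial. The final and_bits() with #FF simply keeps the lowest 8 bits so the result can be read as a single byte.

```euphoria
-- Negate an 8-bit value using two's complement:
-- invert every bit, add 1, then keep only the lowest 8 bits.
function twos_complement_byte(integer n)
    return and_bits(not_bits(n) + 1, #FF)
end function

-- #22 is 00100010 (34); the result 222 is the bit pattern 11011110
printf(1, "%d\n", twos_complement_byte(#22))   -- prints 222
```

Read as an unsigned byte, 11011110 is 222; read as a signed byte, the same bit pattern stands for -34.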

You'll notice that addition using binary numbers follows the same rules as with decimal numbers: when adding two numbers produces a sum larger than a single digit, you carry left to the next column. Adding 1 and 1 produces 10, where the 1 is carried over to the next column to the left. The use of two's complement to store negative values works with 8 bit, 16 bit, or 32 bit values. The only disadvantage of using two's complement is that the range of positive numbers supported is cut in half. For example, a byte that can handle negative numbers now only represents numbers between -128 and +127, and a word can only represent numbers between -32768 and +32767, both inclusive. However, it is up to the programmer to decide whether or not to have signed values.

The understanding of how bits and bytes work is not mandatory in order to write programs in Euphoria. After all, Euphoria handles the storage and manipulation of binary values behind the scenes so you really do not see any use of bits, bytes, words, and double words, nor do you have to convert numbers to two's complement if you want them negative. So you probably ask, "what is the point of learning about bits and bytes if Euphoria handles all that for me automatically?"

Well, in addition to getting a better feel for how the computer really stores data, there are other benefits. First, because 8 bits make up a byte, you could use a single byte to keep track of 8 different conditions in your program, where each 1 bit means a condition is true. Also, if you are interested in data compression, using less than 8 bits to represent values would be helpful. A good understanding of bits and bytes is also handy in graphics, where you want to merge images together, without any black area around either image being merged. This is rough territory for the person who has never programmed. If you are not ready to learn the Euphoria library routines that handle bits and bytes, use the remote to skip to "Creating Library Routines And Variable Types". Otherwise, just go to the next page!


Working With Bits

In this chapter, we will dig deeper into the theory of bits by actually using them in Euphoria programs. There are library routines that can convert integer values to a sequence representing binary values and back. Also, you will learn how boolean logic works with values at the binary digit level. You became familiar with boolean logic when you learned about logical expressions. This chapter will expand a bit on this in order to handle bits.

If you plan to manipulate values at the bit level, probably for use as condition switches in your program, you need to actually see them. The best way to do this is to have them shown as a sequence, where each element is an atom having a value of 1 or 0. This way, you can perform element indexing of single bits or an entire range of them. Here is the library routine that lets you access bits in integer values:

   include machine.e
   rs = int_to_bits(a,i)

This returns a sequence value containing the rightmost number of bits (i) in integer value a, to be stored in receiving variable rs. The sequence is made up of atom elements representing bit values starting from the right. We use a instead of i as the integer value in case you want to work with integer values outside the range of -1073741824 and +1073741823. Only atom data objects can hold values outside that range. The returned sequence value is actually reversed in appearance, because the rightmost bits start at element 1. For example, bit 2^0 is element 1, bit 2^1 is element 2, bit 2^2 is element 3, and so forth. int_to_bits() will return the rightmost bits of negative numbers too. Just remember that negative numbers always use the two's complement format.
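
To make the reversed order concrete, here is a very small sketch using the value 6 (binary 00000110). The variable name bits is our own choice.

```euphoria
include machine.e

sequence bits
bits = int_to_bits(6, 8)   -- 6 is 00000110 in binary
? bits                     -- prints {0,1,1,0,0,0,0,0}
? bits[2]                  -- prints 1 (bit 2^1 is element 2)
```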

The number of bits parameter depends on the size of the integer value you are accessing for bits. A byte-sized integer only needs to have a maximum of 8 bits returned, while word and double-word sized integers will require you to go as high as 16 or even 32 bits to return. A demo program shows how to return bits from different integer values.

Demo program 76
include machine.e
sequence actual_binary_number, binary_bits, series_of_values
clear_screen()
puts(1,"A Simple Example Of Using int_to_bits()\n")
puts(1,"======================================\n\n")
puts(1,"Decimal                 Binary\n")
puts(1,"=======    ================================\n\n")

series_of_values = {1,-1,500,-500}
for element = 1 to length(series_of_values) do
     binary_bits = int_to_bits(series_of_values[element],32)
     actual_binary_number = {}
     for bits = length(binary_bits) to 1 by -1 do
          actual_binary_number = actual_binary_number &
                                 (binary_bits[bits] + 48)
     end for
     printf(1,"%5d      %32s\n",
     {series_of_values[element],actual_binary_number})
end for
puts(1,"\n\n")
puts(1,"(Note: int_to_bits() returns the bits in reversed sequence.\n")
puts(1," The output displayed has been adjusted to show the bits as they\n")
puts(1," are meant to appear in a binary number)\n")

The opposite of this is to take a binary number and convert it into an integer value. This approach might be taken when you are using bits to represent a list of conditions, and for efficient storage want to bundle them all into a single integer value. To convert a binary number into an integer value, you use the following library routine:

   include machine.e
   ra = bits_to_int(s)

bits_to_int() takes sequence s representing a binary number and converts it to a positive integer value, which is stored in receiving variable ra. s is made up of atom elements that represent the bits of the binary number. Each element is either 0 or 1 in value. The elements in s representing bits appear in reverse order, where element 1 is the rightmost bit. For example, element 1 is bit 2^0, element 2 is bit 2^1, element 3 is bit 2^2 and so forth. The receiving variable is an atom and not an integer for the same reason mentioned in our discussion of int_to_bits(). You may want to produce integer values beyond the range of -1073741824 and +1073741823. Only atom variables can support integers beyond that range.
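
As a quick check of the element order, the following sketch feeds bits_to_int() the binary number 10110 written right to left, and then round-trips a value through int_to_bits(). The variable name n is our own.

```euphoria
include machine.e

atom n
-- element 1 is bit 2^0, so {0,1,1,0,1} reads right-to-left as 10110
n = bits_to_int({0,1,1,0,1})
? n                                    -- prints 22
? bits_to_int(int_to_bits(200, 8))     -- prints 200
```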

You will notice that bits_to_int() only produces positive integer values. This is because the leftmost bit (the last element in the sequence) is assumed to be a part of the integer value, and not a sign bit. This shouldn't be a problem, as there wouldn't be a reason to convert a list of binary digits arranged in two's complement format if you are using each bit as an outcome of a condition test. A demo program is available to show how bits_to_int() is used to store bit patterns into a single byte value.

Demo program 77
include machine.e
sequence bit_patterns
atom integer_value
clear_screen()
puts(1,"How bits_to_int() is used to convert a sequence of 8 bits into\n")
puts(1,"a single byte value.\n")
puts(1,"==============================================================\n\n")


bit_patterns = {{1,1,1,1,0,0,0,0},
                {1,0,1,0,1,0,1,0},
                {1,1,1,0,0,1,1,1},
                {1,0,0,0,0,0,0,1},
                {1,1,1,1,1,1,1,1}}

for bit_groups = 1 to length(bit_patterns) do
    integer_value = bits_to_int(bit_patterns[bit_groups])
    print(1,bit_patterns[bit_groups])
    printf(1," can be stored in a value of %3d\n",integer_value)

end for

You can also reference one or more bits by a process called "masking". Masking involves comparing a value (let's call it "A") against a second value (let's call it "B") in such a way that B's bit pattern either obtains or filters out specific bits in A. B is called the mask value. Boolean logic at the bit level is used in masking. There are three types of masks, with the first two being shown below:

11110000 - value           11010010 - value
00001111 - OR mask         01111110 - AND mask
--------                   --------
11111111 - OR result       01010010 - AND result

In an OR mask, the result bit positions only contain 1 if the value, the mask, or both, have matching bit positions containing 1. In an AND mask, the result bit positions only contain 1 if both the value and the mask have matching bit positions containing 1.

The third type of mask is called XOR (rhymes with "sore" but starting with a "Z" sound). It can best be described as a cross between an OR and something like a backwards AND where two 1 bits result in a 0. Here's how it works below:

11011100 - value
00011100 - XOR mask
--------
11000000 - XOR result

In an XOR mask, the result bit positions only contain 1 when either the value or the mask (but not both!) have matching bit positions containing 1.

With masks now understood, let's show some library routines that perform AND, OR and XOR bit operations in Euphoria.

   ro = and_bits(o1,o2)

and_bits() performs AND operations using values o1 and o2 to create a result that is stored in receiving variable ro. The bits in the result stored in ro will only be 1 if the matching bit positions in o1 and o2 are both 1. o1 and o2 can be atoms or sequences containing atom elements. If o1 is an atom and o2 is a sequence (or vice-versa), then the rule of mixing atoms and sequences in a binary expression (where the atom value is converted to a sequence having the same length as the other sequence value, and made up of elements having the value of the original atom) applies. and_bits() can handle any accepted values up to and including 32 bits in size. The result produced by and_bits() may be a negative value if the process causes the leftmost bit to be set to 1. This is because the leftmost bit is considered a sign bit. A demo program shows how and_bits() works with some atoms and sequences.

Demo program 78
include machine.e

atom single_value, ANDed_atom, work_value, ANDer_atom
sequence bunch_of_values, ANDed_sequence, returned_bits

clear_screen()

ANDer_atom = 484848

bunch_of_values = {222222,333333,444444}
single_value = 123456

printf(1,"ANDing %6d and %6d\n\n",{single_value,ANDer_atom})

ANDed_atom = and_bits(single_value,ANDer_atom)

work_value = single_value
returned_bits = int_to_bits(work_value,32)
printf(1,"%6d ---------> ",work_value)
for bits = 32 to 1 by -1 do
     print(1,returned_bits[bits])
end for
puts(1,"\n")

work_value = ANDer_atom
returned_bits = int_to_bits(work_value,32)
printf(1,"%6d ---------> ",work_value)
for bits = 32 to 1 by -1 do
     print(1,returned_bits[bits])
end for
puts(1,"\n")
puts(1,repeat('-',50) & "\n")

work_value = ANDed_atom
returned_bits = int_to_bits(work_value,32)
printf(1,"%6d ---------> ",work_value)
for bits = 32 to 1 by -1 do
     print(1,returned_bits[bits])
end for
puts(1,"\n\n")

puts(1,"\nPress Any Key To Continue.......\n\n")
while get_key() = -1 do
end while

clear_screen()

ANDed_sequence = and_bits(bunch_of_values,ANDer_atom)
puts(1,"ANDing ")
print(1,bunch_of_values)
printf(1," and %6d\n\n",ANDer_atom)

for element = 1 to length(bunch_of_values) do
     work_value = bunch_of_values[element]
     returned_bits = int_to_bits(work_value,32)
     printf(1,"%6d ---------> ",work_value)
     for bits = 32 to 1 by -1 do
          print(1,returned_bits[bits])
     end for
     puts(1,"\n")

     work_value = ANDer_atom
     returned_bits = int_to_bits(work_value,32)
     printf(1,"%6d ---------> ",work_value)
     for bits = 32 to 1 by -1 do
          print(1,returned_bits[bits])
     end for
     puts(1,"\n")
     puts(1,repeat('-',50) & "\n")

     work_value = ANDed_sequence[element]
     returned_bits = int_to_bits(work_value,32)
     printf(1,"%6d ---------> ",work_value)
     for bits = 32 to 1 by -1 do
          print(1,returned_bits[bits])
     end for
     puts(1,"\n\nPress Any Key To Continue.......\n\n")
     while get_key() = -1 do
     end while
end for
puts(1,"Result is ")
print(1,ANDed_sequence)
puts(1,"\n")
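
For a much shorter illustration of and_bits(), the sketch below repeats the AND mask from the masking diagram, then uses a one-bit mask to test a single condition. The variable names flags and mask are our own, and the hexadecimal values are just the same bit patterns written more compactly.

```euphoria
integer flags, mask
flags = #D2                       -- 11010010
mask  = #7E                       -- 01111110
? and_bits(flags, mask)           -- prints 82 (01010010)

-- Test one bit: is bit 2^4 (value 16) set in flags?
if and_bits(flags, #10) != 0 then
    puts(1, "bit 2^4 is set\n")
end if

-- An atom mask is applied to every element of a sequence
? and_bits({#FF, #0F, #F0}, #3C)  -- prints {60,12,48}
```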

   ro = or_bits(o1,o2)

or_bits() performs OR operations using values o1 and o2 to create a result that is stored in receiving variable ro. The bits in the result stored in ro will only be 1 if the matching bit position in o1, o2, or both, is 1. o1 and o2 can be atoms or sequences containing atom elements. If o1 is an atom and o2 is a sequence (or vice-versa), then the rule of mixing atoms and sequences in a binary expression (where the atom value is converted to a sequence having the same length as the other sequence value, and made up of elements having the value of the original atom) applies. or_bits() can handle any accepted values up to and including 32 bits in size.
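
As a short illustration, the OR mask from the masking diagram can be tried directly. This is only a sketch, and the variable name flags is our own.

```euphoria
integer flags
flags = #F0                      -- 11110000
flags = or_bits(flags, #0F)      -- OR mask 00001111 switches on the low 4 bits
? flags                          -- prints 255 (11111111)

-- An atom mask is applied to every element of a sequence
? or_bits({1, 2, 4}, 8)          -- prints {9,10,12}
```

An OR mask is the usual way to switch a condition bit on without disturbing the other bits.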

The result produced by or_bits() may be a negative value if the process causes the leftmost bit to be set to 1. This is because the leftmost bit is considered a sign bit. A demo program shows how or_bits() works with some atoms and sequences.

Demo program 79
include machine.e

atom single_value, ORed_atom, work_value, ORer_atom
sequence bunch_of_values, ORed_sequence, returned_bits

clear_screen()

ORer_atom = 545454

bunch_of_values = {414141,707070,312312}
single_value = 321321

printf(1,"ORing %6d and %6d\n\n",{single_value,ORer_atom})

ORed_atom = or_bits(single_value,ORer_atom)

work_value = single_value
returned_bits = int_to_bits(work_value,32)
printf(1,"%6d ---------> ",work_value)
for bits = 32 to 1 by -1 do
     print(1,returned_bits[bits])
end for
puts(1,"\n")

work_value = ORer_atom
returned_bits = int_to_bits(work_value,32)
printf(1,"%6d ---------> ",work_value)
for bits = 32 to 1 by -1 do
     print(1,returned_bits[bits])
end for
puts(1,"\n")
puts(1,repeat('-',50) & "\n")

work_value = ORed_atom
returned_bits = int_to_bits(work_value,32)
printf(1,"%6d ---------> ",work_value)
for bits = 32 to 1 by -1 do
     print(1,returned_bits[bits])
end for
puts(1,"\n\n")

puts(1,"\nPress Any Key To Continue.......\n\n")
while get_key() = -1 do
end while

clear_screen()

ORed_sequence = or_bits(bunch_of_values,ORer_atom)
puts(1,"ORing ")
print(1,bunch_of_values)
printf(1," and %6d\n\n",ORer_atom)

for element = 1 to length(bunch_of_values) do
     work_value = bunch_of_values[element]
     returned_bits = int_to_bits(work_value,32)
     printf(1,"%6d ---------> ",work_value)
     for bits = 32 to 1 by -1 do
          print(1,returned_bits[bits])
     end for
     puts(1,"\n")

     work_value = ORer_atom
     returned_bits = int_to_bits(work_value,32)
     printf(1,"%6d ---------> ",work_value)
     for bits = 32 to 1 by -1 do
          print(1,returned_bits[bits])
     end for
     puts(1,"\n")
     puts(1,repeat('-',50) & "\n")

     work_value = ORed_sequence[element]
     returned_bits = int_to_bits(work_value,32)
     printf(1,"%6d ---------> ",work_value)
     for bits = 32 to 1 by -1 do
          print(1,returned_bits[bits])
     end for
     puts(1,"\n\nPress Any Key To Continue.......\n\n")
     while get_key() = -1 do
     end while
end for
puts(1,"Result is ")
print(1,ORed_sequence)
puts(1,"\n")

   ro = xor_bits(o1,o2)

xor_bits() performs XOR operations using values o1 and o2 to create a result that is stored in receiving variable ro. The bits in the result stored in ro will only be 1 if the matching bit position is 1 in o1 or in o2, but not in both. o1 and o2 can be atoms or sequences containing atom elements. If o1 is an atom and o2 is a sequence (or vice-versa), then the rule of mixing atoms and sequences in a binary expression (where the atom value is converted to a sequence having the same length as the other sequence value, and made up of elements having the value of the original atom) applies. xor_bits() can handle any accepted values up to and including 32 bits in size.
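
Here is a brief sketch using the XOR mask from the masking diagram. It also shows a handy property of XOR: applying the same mask twice brings back the original value. The variable name flags is our own.

```euphoria
integer flags
flags = #DC                      -- 11011100
flags = xor_bits(flags, #1C)     -- XOR mask 00011100 flips bits 2^2 to 2^4
? flags                          -- prints 192 (11000000)
flags = xor_bits(flags, #1C)     -- the same mask again flips them back
? flags                          -- prints 220 (11011100)
```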

The result produced by xor_bits() may be a negative value if the process causes the leftmost bit to be set to 1. This is because the leftmost bit is considered a sign bit. We've modified the or_bits() demo program to use xor_bits() instead, to show the one difference between XOR and OR.

Demo program 80
include machine.e

atom single_value, XORed_atom, work_value, XORer_atom
sequence bunch_of_values, XORed_sequence, returned_bits

clear_screen()

XORer_atom = 545454

bunch_of_values = {414141,707070,312312}
single_value = 321321

printf(1,"XORing %6d and %6d\n\n",{single_value,XORer_atom})

XORed_atom = xor_bits(single_value,XORer_atom)

work_value = single_value
returned_bits = int_to_bits(work_value,32)
printf(1,"%6d ---------> ",work_value)
for bits = 32 to 1 by -1 do
     print(1,returned_bits[bits])
end for
puts(1,"\n")

work_value = XORer_atom
returned_bits = int_to_bits(work_value,32)
printf(1,"%6d ---------> ",work_value)
for bits = 32 to 1 by -1 do
     print(1,returned_bits[bits])
end for
puts(1,"\n")
puts(1,repeat('-',50) & "\n")

work_value = XORed_atom
returned_bits = int_to_bits(work_value,32)
printf(1,"%6d ---------> ",work_value)
for bits = 32 to 1 by -1 do
     print(1,returned_bits[bits])
end for
puts(1,"\n\n")

puts(1,"\nPress Any Key To Continue.......\n\n")
while get_key() = -1 do
end while

clear_screen()

XORed_sequence = xor_bits(bunch_of_values,XORer_atom)
puts(1,"XORing ")
print(1,bunch_of_values)
printf(1," and %6d\n\n",XORer_atom)

for element = 1 to length(bunch_of_values) do
     work_value = bunch_of_values[element]
     returned_bits = int_to_bits(work_value,32)
     printf(1,"%6d ---------> ",work_value)
     for bits = 32 to 1 by -1 do
          print(1,returned_bits[bits])
     end for
     puts(1,"\n")

     work_value = XORer_atom
     returned_bits = int_to_bits(work_value,32)
     printf(1,"%6d ---------> ",work_value)
     for bits = 32 to 1 by -1 do
          print(1,returned_bits[bits])
     end for
     puts(1,"\n")
     puts(1,repeat('-',50) & "\n")

     work_value = XORed_sequence[element]
     returned_bits = int_to_bits(work_value,32)
     printf(1,"%6d ---------> ",work_value)
     for bits = 32 to 1 by -1 do
          print(1,returned_bits[bits])
     end for
     puts(1,"\n\nPress Any Key To Continue.......\n\n")
     while get_key() = -1 do
     end while
end for
puts(1,"Result is ")
print(1,XORed_sequence)
puts(1,"\n")

The last bit-handling library routine for this chapter is listed below:

   ro = not_bits(o)

not_bits() reverses each bit in value o to its opposite state (for example, 1 to 0 or 0 to 1). o may be an atom value, or a sequence made up of atom elements. not_bits() can handle accepted values up to and including 32 bits. The inverted result is stored in receiving variable ro. If the leftmost bit is changed to 1, the value placed in ro will be negative because this bit is the sign bit. A demo program is ready to show how not_bits() works with sequence and atom values.

Demo program 81
include machine.e

atom single_value, NOTed_atom, work_value
sequence bunch_of_values, NOTed_sequence, returned_bits

clear_screen()

bunch_of_values = {823123,-907121,621325}
single_value = -1

printf(1,"NOTing %2d\n\n",single_value)

NOTed_atom = not_bits(single_value)

work_value = single_value
returned_bits = int_to_bits(work_value,32)
printf(1,"%7d ---------> ",work_value)
for bits = 32 to 1 by -1 do
     print(1,returned_bits[bits])
end for
puts(1,"\n")
puts(1, repeat('-',51) & "\n")

work_value = NOTed_atom
returned_bits = int_to_bits(work_value,32)
printf(1,"%7d ---------> ",work_value)
for bits = 32 to 1 by -1 do
     print(1,returned_bits[bits])
end for
puts(1,"\n\n")

puts(1,"\nPress Any Key To Continue.......\n\n")
while get_key() = -1 do
end while

clear_screen()

NOTed_sequence = not_bits(bunch_of_values)
puts(1,"NOTing ")
print(1,bunch_of_values)
puts(1,"\n\n")

for element = 1 to length(bunch_of_values) do
     work_value = bunch_of_values[element]
     returned_bits = int_to_bits(work_value,32)
     printf(1,"%7d ---------> ",work_value)
     for bits = 32 to 1 by -1 do
          print(1,returned_bits[bits])
     end for
     puts(1,"\n")
     puts(1, repeat('-',51) & "\n")

     work_value = NOTed_sequence[element]
     returned_bits = int_to_bits(work_value,32)
     printf(1,"%7d ---------> ",work_value)
     for bits = 32 to 1 by -1 do
          print(1,returned_bits[bits])
     end for
     puts(1,"\n\nPress Any Key To Continue.......\n\n")
     while get_key() = -1 do
     end while
end for
puts(1,"Result is ")
print(1,NOTed_sequence)
puts(1,"\n")
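
For a shorter look at not_bits(), the sketch below inverts a few small values. Because all 32 bits are inverted, small positive numbers come back negative; the last line adds an and_bits() with #FF (our own addition) to keep only the lowest 8 bits when you want to invert a single byte.

```euphoria
? not_bits(0)                    -- prints -1 (all 32 bits become 1)
? not_bits(-1)                   -- prints 0
? not_bits({5, 255})             -- prints {-6,-256}

-- Keep only the low 8 bits to invert one byte: 00001111 becomes 11110000
? and_bits(not_bits(#0F), #FF)   -- prints 240
```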

The next chapter will show you how you can save numbers larger than 255 outside your program, whether it is an integer or a floating point number.


Working With Bytes

The puts() library routine is handy for sending character output to files and the screen. But the one drawback it has is that only byte values, or values between 0 and 255, can be used. Any attempt to send a larger value out to a file or screen will result in data loss. This occurs because only the lower 8 bits are sent. However, Euphoria has a set of library routines that convert large integer and even floating point numbers to a series of bytes. One option for sending values larger than 255 to the screen or a file is to use the print() library routine. The end result, in the case of files, is an outputted character string (for example, -2453 is stored as the 5 byte string "-2453"). It works, but it is very wasteful in terms of data storage. So it stands to reason that if a value like 255 can be represented as a single byte, then values like 65535 can be represented using only two bytes rather than the 5 bytes needed when using print(). Here is the library routine that can help you do this:

   include machine.e
   rs = int_to_bytes(a)

int_to_bytes() takes a signed integer value, a, and converts it into a 4 element long sequence representing 4 bytes. The integer is represented as a rather than i because int_to_bytes() works with 32 bit numbers, and only atom data objects can be that large. Integers are only 31 bits long. The 4 element sequence is returned to the receiving variable rs. Each element is a bundle of 8 bits, with the lowermost 8 bits (2 to the power of 0 to 2 to the power of 7) starting in the first element. To clarify, the structure of the 4-element sequence looks like this, with meanings shown for each element:

                     {byte, byte, byte, byte}
bits 2^0 to 2^7 ------^
bits 2^8 to 2^15 -----------^
bits 2^16 to 2^23 ----------------^
bits 2^24 to 2^31 ----------------------^

Once an integer is converted to a series of bytes, you can write each byte out to files using puts() without any risk of data loss. If you want to handle extremely large numbers that require 64 bits instead of 32 bits, then int_to_bytes() will return the lowermost 32 bits of these numbers.
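
A small sketch shows the byte order for two positive values. The variable name four_bytes is our own choice.

```euphoria
include machine.e

sequence four_bytes
four_bytes = int_to_bytes(65535)
? four_bytes                     -- prints {255,255,0,0}
? int_to_bytes(258)              -- prints {2,1,0,0}  (2 + 1*256)
```

The value 65535 fits entirely in the first two elements, which is why it could be stored in just two bytes.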

So now you have a way to convert large integer values into a series of bytes. It would be just as handy to have the ability to take those same bytes and convert them back into the original integer value. Here is a library routine that can do this for positive integers:

   include machine.e
   ra = bytes_to_int(s)

bytes_to_int() takes a sequence value, s, representing a 32 bit number, and converts it to a positive integer. The positive integer value is then stored in receiving variable ra. ra is an atom because bytes_to_int() works with 32 bit long numbers. The sequence passed to bytes_to_int() is made up of 4 atom elements, where each element is a bundle of 8 bits, starting with the lowermost bits (bits 2^0 to 2^7) being the first element. The structure of the sequence is the same as introduced in int_to_bytes(). As a matter of fact, you can use the sequence generated by int_to_bytes() as a parameter for bytes_to_int() if you are working with positive numbers.
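
For positive numbers the round trip is easy to confirm with a short sketch:

```euphoria
include machine.e

? bytes_to_int({2,1,0,0})                 -- prints 258
? bytes_to_int(int_to_bytes(31619125))    -- prints 31619125
```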

However, bytes_to_int() does not convert properly when you are trying to bring back a negative number previously converted into 4 bytes by int_to_bytes(). This does not mean, however, you cannot bring back the negative integer value. It means you will have to do a little extra work in order to bring it back.

To bring back a negative number previously converted by int_to_bytes():

1) Use int_to_bits() to convert the 4th element of the sequence created by
   int_to_bytes() into a 32 element sequence, and look at the 32nd element.
   If it is 1, you have a negative number.

2) Use not_bits() to reverse the bits in the int_to_bytes() sequence.

3) Use the result of not_bits() as the sequence you pass to bytes_to_int().

4) Add 1 to the integer created by bytes_to_int(), then multiply the integer by -1.
   The integer should be the correct value.
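
The steps above can be sketched as a small function. Please treat this as a variant rather than a literal transcription: the function name restore_int() is our own, and as a precaution (our assumption, not part of the steps) it first reduces every element to its lowest 8 bits with and_bits(), so the sign bit is looked for in the 8th bit of the 4th element. That way the sketch should not depend on how int_to_bytes() happens to represent negative numbers in memory.

```euphoria
include machine.e

-- Rebuild a signed 32-bit value from the 4 bytes made by int_to_bytes().
function restore_int(sequence four_bytes)
    sequence sign_bits
    -- keep only the low 8 bits of every element (our own precaution)
    four_bytes = and_bits(four_bytes, #FF)
    -- step 1: the top bit of the 4th byte is the sign bit
    sign_bits = int_to_bits(four_bytes[4], 8)
    if sign_bits[8] = 0 then
        return bytes_to_int(four_bytes)
    end if
    -- step 2: invert the bits, staying within one byte per element
    four_bytes = and_bits(not_bits(four_bytes), #FF)
    -- steps 3 and 4: convert, add 1, and make the result negative
    return -(bytes_to_int(four_bytes) + 1)
end function

? restore_int(int_to_bytes(31619125))    -- prints 31619125
? restore_int(int_to_bytes(-31619125))   -- prints -31619125
```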

Run a demo program now that uses int_to_bytes() and bytes_to_int() to store positive and negative integers to a file on your computer.

Demo program 82
include machine.e
sequence four_bytes, values_to_be_saved, returned_bits
integer demo_file
atom value_to_be_restored

puts(1,"This program will demonstrate how to save negative and positive\n")
puts(1,"integers to file after converting them into 4 bytes, and then \n")
puts(1,"read them back and re-assemble them into their original values.\n")
puts(1,"Because puts() strips off the upper 24 bits (bit 2 to the power\n")
puts(1,"of 8 to bit 2 to the power of 31), the procedure used to convert\n")
puts(1,"negative numbers back from the four bytes made by int_to_bytes()\n")
puts(1,"needs to be modified. First of all, the eighth element, not the\n")
puts(1,"thirty-second element, is treated as the sign bit. Second, you\n")
puts(1,"subtract {255,255,255,255} from the four bytes instead of\n")
puts(1,"using not_bits(). Third, you add -1 after converting the\n")
puts(1,"adjusted four bytes to an integer using bytes_to_int().\n\n")

values_to_be_saved = {31619125,-31619125}

for values = 1 to 2 do
     four_bytes = int_to_bytes(values_to_be_saved[values])
     demo_file = open("demo.fle","wb")
     printf(1,"Saving To File: %9d\n",values_to_be_saved[values])
     for bytes = 1 to 4 do
          puts(demo_file,four_bytes[bytes])
     end for
     close(demo_file)

     four_bytes = {}
     demo_file = open("demo.fle","rb")
     for bytes = 1 to 4 do
          four_bytes = four_bytes & getc(demo_file)
     end for
     close(demo_file)

     returned_bits = int_to_bits(four_bytes[4],8)
     if returned_bits[8] = 1 then
          four_bytes =  four_bytes - 255
          value_to_be_restored = bytes_to_int(four_bytes)
          value_to_be_restored = value_to_be_restored - 1
     else
          value_to_be_restored = bytes_to_int(four_bytes)
     end if

     printf(1,"Retrieved From File: %9d\n\n",value_to_be_restored)

end for


The numbers we have worked with so far have been integer values. Remember that programs work with floating-point numbers (numbers with a decimal point) as well. If you remember from the start of the tutorial, we represent very large and very small numbers using scientific notation:

   6.13451e+009 (meaning 6.13451 × 1000000000, or 6134510000)
   4.52e-005    (meaning 4.52 × .00001, or .0000452)

The decimal number being multiplied by the power of 10 is called the mantissa, and is always at least 1 but less than 10. A series of bits has no built-in decimal point, so a number like 101111.01 cannot be stored directly. Instead, an organization in the U.S.A. called the Institute of Electrical and Electronics Engineers (IEEE) created a floating point standard that addresses this problem nicely.

The floating point standard created by the IEEE comes in two sizes, one using 32 bits, and the other using 64 bits. Please note these formats are being introduced to you for your personal interest only:

1 bit       +   8 bits      +   23 bits     =   32 bits
(sign bit)      (exponent)      (mantissa)

1 bit       +   11 bits     +   52 bits     =   64 bits
(sign bit)      (exponent)      (mantissa)

Because Euphoria automatically handles how the exponent and mantissa portions are created and used in the representation of binary floating point numbers, we will not go any further at this point. All you need to know is what the IEEE 32 bit and 64 bit floating point formats are when they are mentioned in the library routines you will learn next. If you are interested, there are FAQs about IEEE floating point on the internet.

To convert a floating point number to a 4-byte (32 bit) IEEE format, you use the following library routine:

   include machine.e
   rs = atom_to_float32(a)

atom_to_float32() will convert a floating point value, a, to a 4 element long sequence value, each element being an atom. The sequence will be stored in receiving variable rs. The sequence value represents the 32 bit IEEE format introduced previously. a can be a negative or positive value, and can even be an integer value, though it will still be converted to the 32 bit IEEE floating point format. To convert the 4 element sequence back to the original value, you do the following:

   include machine.e
   ra = float32_to_atom(s)
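
As a small illustration of the two routines working together, here is a minimal sketch that converts one value to its four bytes and back again. The variable names rs and ra simply follow the descriptions above, and -33.25 was chosen on the assumption that it fits the 32 bit format exactly, so the same value should come back.

```euphoria
include machine.e

sequence rs
atom ra

rs = atom_to_float32(-33.25)   -- four elements, each a byte value from 0 to 255
? rs                           -- display the four bytes
ra = float32_to_atom(rs)       -- rebuild the atom from the four bytes
? ra                           -- display the restored value
```

The order of the four bytes shown depends on the machine the program runs on, so treat the sequence as something to hand back to float32_to_atom() rather than something to read by eye.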

To convert a floating point number to an eight-byte (64 bit) IEEE format you use the following library routine:

   include machine.e
   rs = atom_to_float64(a)

atom_to_float64() will convert a floating point value, a, to an 8 element long sequence value, each element being an atom. The sequence will be stored in receiving variable rs. The sequence value represents the 64 bit IEEE format introduced previously. a can be a negative or positive value, and can even be an integer value, though it will still be converted to the 64 bit IEEE floating point format. To convert the 8 element sequence back to the original value, you do the following:

   include machine.e
   ra = float64_to_atom(s)

The use of these powerful floating point library routines allows for efficient storage of floating point data in files. Once you use either atom_to_float32() or atom_to_float64(), you can use puts() to write each element of the sequence. When it is time to bring the data back from the file into the program, float32_to_atom() or float64_to_atom() can be used once all the bytes previously written out are obtained using the getc() library routine.
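
The idea extends to more than one value per file. The sketch below is only one possible way to do it: it assumes a file name of floats.dat and three sample values (both chosen just for illustration), writes each value as 8 bytes one after another, and then reads the file back 8 bytes at a time.

```euphoria
include machine.e

integer fn
sequence values, eight_bytes

values = {1023.11, -33.25, 0.5}

-- write each value as 8 bytes, one after another, in the same file
fn = open("floats.dat", "wb")
if fn = -1 then
    puts(1, "Could not open floats.dat for writing\n")
else
    for i = 1 to length(values) do
        -- puts() given a sequence writes every element as one byte
        puts(fn, atom_to_float64(values[i]))
    end for
    close(fn)
end if

-- read the file back, 8 bytes at a time
fn = open("floats.dat", "rb")
if fn = -1 then
    puts(1, "Could not open floats.dat for reading\n")
else
    for i = 1 to length(values) do
        eight_bytes = {}
        for b = 1 to 8 do
            eight_bytes = eight_bytes & getc(fn)
        end for
        ? float64_to_atom(eight_bytes)
    end for
    close(fn)
end if
```

Because every value takes exactly 8 bytes, the reading loop needs no separators between values. It only has to know how many values were written.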

You should be careful not to use atom_to_float32() on floating point numbers that, because of size and accuracy, must use the IEEE 64 bit format. There is a risk you could lose accuracy, or even part of the data value itself, if you are not careful about this. The following demo program uses all four library routines to save data to a file and read it back.

Demo program 83
include machine.e
integer demo_file
atom returned_float
sequence values,IEEE_bytes

clear_screen()
values = {1023.11,-33.25}

for elements = 1 to length(values) do
    printf(1,"Saving To File: %.2f\n",values[elements])
    if elements = 1 then
         IEEE_bytes = atom_to_float64(values[elements])

         demo_file = open("demo.fle","wb")
         for bytes = 1 to length(IEEE_bytes) do
              puts(demo_file,IEEE_bytes[bytes])
         end for
         close(demo_file)

         IEEE_bytes = {}

         demo_file = open("demo.fle","rb")
         for bytes = 1 to 8 do
              IEEE_bytes = IEEE_bytes & getc(demo_file)
         end for
         close(demo_file)

         returned_float = float64_to_atom(IEEE_bytes)
    end if

    if elements = 2 then
         IEEE_bytes = atom_to_float32(values[elements])

         demo_file = open("demo.fle","wb")
         for bytes = 1 to length(IEEE_bytes) do
              puts(demo_file,IEEE_bytes[bytes])
         end for
         close(demo_file)

         IEEE_bytes = {}

         demo_file = open("demo.fle","rb")
         for bytes = 1 to 4 do
              IEEE_bytes = IEEE_bytes & getc(demo_file)
         end for
         close(demo_file)

         returned_float = float32_to_atom(IEEE_bytes)
    end if

    printf(1,"Read From File: %.2f\n\n",returned_float)

end for
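
The earlier warning about atom_to_float32() can be seen directly by passing the same value through both formats and printing more decimal places than the demo does. This is a minimal sketch. The variable names original, via32 and via64 are made up for the illustration, and no file is involved.

```euphoria
include machine.e

atom original, via32, via64

original = 1023.11
via32 = float32_to_atom(atom_to_float32(original))   -- squeezed through 32 bits
via64 = float64_to_atom(atom_to_float64(original))   -- passed through 64 bits

printf(1, "Original:        %.10f\n", original)
printf(1, "Through 32 bits: %.10f\n", via32)
printf(1, "Through 64 bits: %.10f\n", via64)
```

The 64 bit line should match the original, while the 32 bit line can be expected to drift away from it after the first few decimal places. With only two decimal places printed, as in the demo program, that difference would go unnoticed.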


The next chapter of this tutorial will show you how to create your own library routines and variable types!
