Binary I/O

Binary I/O is accomplished through two member functions: read and write. The syntax for read is:

in_file.read(data_ptr, size); 
data_ptr
Pointer to a place to put the data.

size
Number of bytes to be read.

The member function gcount returns the number of bytes actually read by the last read. This may be less than the number of bytes requested. For example, the read might encounter an end-of-file or error:

struct { 
    int     width; 
    int     height; 
} rectangle; 
 
in_file.read(reinterpret_cast<char *>(&rectangle), sizeof(rectangle));
if (in_file.bad()) {
    std::cerr << "Unable to read rectangle\n"; 
    exit(8); 
} 
if (in_file.gcount() != static_cast<std::streamsize>(sizeof(rectangle))) {
    std::cerr << "Error: Unable to read full rectangle\n";
    std::cerr << "I/O error or EOF encountered\n";
}

In this example you are reading in the structure rectangle. The & operator yields a pointer to rectangle. The cast reinterpret_cast<char *> is needed because read expects a pointer to char, and static_cast cannot convert between unrelated pointer types. The sizeof operator is used to determine how many bytes to read as well as to check that read was successful.

The member function write has a calling sequence similar to read:

out_file.write(data_ptr, size); 
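
As a rough illustration of read and write working together, the sketch below saves a structure to a binary file and loads it back. The names Rectangle, save_rectangle, and load_rectangle are hypothetical and not part of the original text. It assumes the file is read back on the same kind of machine that wrote it, since the raw bytes of a structure are not portable.

```cpp
#include <fstream>

// Hypothetical record type; any plain struct of simple members behaves the same.
struct Rectangle {
    int width;
    int height;
};

// Write the raw bytes of rect to a binary file. Returns true on success.
bool save_rectangle(const char *name, const Rectangle &rect)
{
    std::ofstream out_file(name, std::ios::binary);
    if (!out_file)
        return false;
    // write wants a pointer to char, so the struct pointer is reinterpreted.
    out_file.write(reinterpret_cast<const char *>(&rect), sizeof(rect));
    out_file.flush();
    return out_file.good();
}

// Read the bytes back. Returns true only if a complete record was read.
bool load_rectangle(const char *name, Rectangle &rect)
{
    std::ifstream in_file(name, std::ios::binary);
    if (!in_file)
        return false;
    in_file.read(reinterpret_cast<char *>(&rect), sizeof(rect));
    // gcount reports how many bytes the last read actually delivered.
    return in_file.gcount() == static_cast<std::streamsize>(sizeof(rect));
}
```

A caller might write save_rectangle("shape.dat", r) and later load_rectangle("shape.dat", r), checking both return values.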

Buffering Problems

Buffered I/O does not write immediately to the file. Instead, the data is kept in a buffer until there is enough for a big write, or until the buffer is flushed. The following program is designed to print a progress message as each section is finished.

std::cout << "Starting"; 
do_step_1(); 
std::cout << "Step 1 complete"; 
do_step_2(); 
std::cout << "Step 2 complete"; 
do_step_3(); 
std::cout << "Step 3 complete\n"; 

Instead of writing the messages as each step completes, std::cout puts them in a buffer. Only after the program is finished does the buffer get flushed, and all the messages come spilling out at once.

The I/O manipulator std::flush forces the flushing of the buffers. Properly written, the above example should be:

std::cout << "Starting" << std::flush; 
do_step_1(); 
std::cout << "Step 1 complete" << std::flush; 
do_step_2(); 
std::cout << "Step 2 complete" << std::flush; 
do_step_3(); 
std::cout << "Step 3 complete\n" << std::flush; 

Because each output statement ends with a std::flush, the output is displayed immediately. This means that our progress messages come out on time.

TIP:   The C++ I/O classes buffer output. std::cerr is unit buffered: it is flushed after every output operation. std::cout is typically line buffered when it is connected to a terminal (each newline forces a flush), although the exact policy depends on the implementation. Also, std::cin is tied to std::cout, so std::cout is automatically flushed just before a read from std::cin. This makes it possible to write prompts without having to worry about buffering:

std::cout << "Enter a value: ";    // Note: No flush
std::cin >> value;
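
The relationship between the standard streams can be inspected at run time. The following is a small sketch, not part of the original text, with two hypothetical helper functions. It relies on the standard library's tie member function, which returns the output stream that is flushed before each input operation, and on the unitbuf format flag, which requests a flush after every output operation.

```cpp
#include <iostream>

// True if std::cin is tied to std::cout, meaning std::cout is flushed
// before every input operation on std::cin.
bool cin_flushes_cout()
{
    return std::cin.tie() == &std::cout;
}

// True if std::cerr has the unitbuf flag set, meaning it is flushed
// after every output operation.
bool cerr_is_unit_buffered()
{
    return (std::cerr.flags() & std::ios_base::unitbuf) != 0;
}
```

In a program that has not changed the default stream settings, both functions are expected to return true.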

Unbuffered I/O

In buffered I/O, data is buffered and then sent to the file. In unbuffered I/O, the data is immediately sent to the file.

If you drop a number of paperclips on the floor, you can pick them up in buffered or unbuffered mode. In buffered mode, you use your right hand to pick up a paperclip and transfer it to your left hand. The process is repeated until your left hand is full, then you dump a handful of paperclips into the box on your desk.

In unbuffered mode, you pick up a paperclip and dump it into the box. There is no left-hand buffer.

In most cases, buffered I/O should be used instead of unbuffered. In unbuffered I/O, each read or write requires a system call. Any call to the operating system is expensive. Buffered I/O minimizes these calls.

Unbuffered I/O should be used only when reading or writing large amounts of binary data or when direct control of a device or file is required.

Back to the paperclip example: if you were picking up small items like paperclips, you would probably use a left-hand buffer. But if you were picking up cannonballs (which are much larger), no buffer would be used.
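
The left-hand buffer of the paperclip analogy can be sketched in code. The class below is a hypothetical illustration, not how the C++ library implements buffering: it collects bytes in a deliberately tiny array and hands them to the Unix write system call only when the array is full or flush is called. It also counts the system calls so the saving is visible. The name BufferedWriter and the 4-byte buffer size are assumptions made for this sketch.

```cpp
#include <cstddef>
#include <unistd.h>     // write (Unix header; an assumption of this sketch)

// Hypothetical "left-hand buffer": gathers bytes and passes them to
// write() only when the buffer is full or flush() is called.
class BufferedWriter {
public:
    explicit BufferedWriter(int fd) : fd_(fd), used_(0), calls_(0) {}

    // Add one byte, emptying the buffer first if it is already full.
    bool put(char ch)
    {
        if (used_ == sizeof(buffer_) && !flush())
            return false;
        buffer_[used_++] = ch;
        return true;
    }

    // Send everything in the buffer to the file descriptor.
    bool flush()
    {
        std::size_t done = 0;
        while (done < used_) {
            ssize_t n = ::write(fd_, buffer_ + done, used_ - done);
            ++calls_;
            if (n < 0)
                return false;
            done += static_cast<std::size_t>(n);
        }
        used_ = 0;
        return true;
    }

    // Number of write() system calls made so far.
    int system_calls() const { return calls_; }

private:
    int         fd_;            // where the data goes
    char        buffer_[4];     // tiny on purpose, so the effect is easy to see
    std::size_t used_;          // bytes currently held in the buffer
    int         calls_;         // count of write() calls
};
```

Putting ten bytes one at a time into such a writer and then flushing costs three system calls rather than ten. Real stream buffers are far larger, so the saving is correspondingly greater.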

The open system call is used for opening an unbuffered file. The macro definitions used by this call differ from system to system. Since the examples have to work for both Unix and MS-DOS/Windows, conditional compilation (#ifdef/#endif) is used to bring in the correct files:

#include <sys/types.h>  
#include <sys/stat.h>
#include <fcntl.h>
 
#ifdef __MSDOS__        // If we are MS-DOS 
#include <io.h>         // Get the MS-DOS include file for raw I/O
#else /* __MSDOS__ */
#include <unistd.h>     // Get the Unix include file for raw I/O 
#endif /* __MSDOS__ */

The syntax for an open call is:

int file_descriptor = open(name, flags);        // Existing file
file_descriptor = open(name, flags, mode);      // New file
file_descriptor
An integer that is used to identify the file for the read, write, and close calls. If file_descriptor is less than 0, an error occurred.

name
Name of the file.

flags
Defined in the fcntl.h header file. Open flags are described in Table 16-6.

mode
Protection mode for the file. Normally this is 0644.

Table 16-6: Open flags

Flag        Meaning
--------    ------------------------------------------------------------
O_RDONLY    Open for reading only.
O_WRONLY    Open for writing only.
O_RDWR      Open for reading and writing.
O_APPEND    Append new data at the end of the file.
O_CREAT     Create the file if it does not exist (the mode parameter is
            required when this flag is present).
O_TRUNC     If the file exists, truncate it to 0 length.
O_EXCL      Used with O_CREAT: fail if the file already exists.
O_BINARY    Open in binary mode (an MS-DOS/Windows flag; most Unix
            systems do not define it).

For example, to open the existing file data.txt in text mode for reading, you use the following:

    data_fd = open("data.txt", O_RDONLY); 

The next example shows how to create a file called output.dat for writing only:

     out_fd = open("output.dat", O_CREAT|O_WRONLY, 0666); 

Notice that you combined flags using the OR (|) operator. This is a quick and easy way of merging multiple flags.
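
A couple of other flag combinations may help show how the pieces fit together. The two helper functions below are hypothetical names introduced for this sketch, which assumes a Unix system and the mode 0644 mentioned earlier.

```cpp
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>     // Unix header; MS-DOS/Windows would use <io.h> instead

// Hypothetical helper: open a file so that every write lands at the end,
// creating the file (mode 0644) if it does not exist yet.
int open_for_append(const char *name)
{
    return open(name, O_WRONLY | O_APPEND | O_CREAT, 0644);
}

// Hypothetical helper: create a brand-new file for writing, failing
// (negative return value) if a file with that name already exists.
int create_new(const char *name)
{
    return open(name, O_WRONLY | O_CREAT | O_EXCL, 0644);
}
```

Both functions return the file descriptor from open, so the caller still has to check for a negative value.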

When any program is initially run, three files are already opened. These are described in Table 16-7.

Table 16-7: Standard unbuffered files

File number    Description
-----------    --------------
0              Standard in
1              Standard out
2              Standard error
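
Because these three descriptors are already open, a program can use them without calling open. The sketch below is a minimal illustration for a Unix system. The constant names and the write_text helper are hypothetical, and it uses the write call whose format is described later in this section.

```cpp
#include <cstring>
#include <unistd.h>     // write (Unix header; an assumption of this sketch)

// Hypothetical names for the three descriptors that are open at startup.
const int STANDARD_IN    = 0;
const int STANDARD_OUT   = 1;
const int STANDARD_ERROR = 2;

// Write a C string to a file descriptor with no buffering.
// Returns the number of bytes written, or a negative number on error.
long write_text(int fd, const char *text)
{
    return static_cast<long>(write(fd, text, std::strlen(text)));
}
```

For example, write_text(STANDARD_ERROR, "Error: something went wrong\n") sends a message straight to standard error.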

The format of the read call is:

read_size = read(file_descriptor, buffer, size); 
read_size
The actual number of bytes read. A 0 indicates end-of-file, and a negative number indicates an error.

file_descriptor
File descriptor of an open file.

buffer
Pointer to a place to put the data that is read from the file.

size
Size of the data to be read. This is the size of the request. The actual number of bytes read may be less than this. (For example, you may run out of data.)
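
Since a single read may deliver fewer bytes than requested, code that needs a fixed amount of data usually calls read in a loop. The function below is one possible sketch for a Unix system. The name read_full is hypothetical.

```cpp
#include <cstddef>
#include <unistd.h>     // read (Unix header; an assumption of this sketch)

// Keep calling read() until `size` bytes have arrived, end-of-file is
// reached, or an error occurs.
// Returns the number of bytes read (less than `size` only at end-of-file),
// or -1 on error.
long read_full(int fd, char *buffer, std::size_t size)
{
    std::size_t total = 0;
    while (total < size) {
        ssize_t n = read(fd, buffer + total, size - total);
        if (n < 0)
            return -1;          // error
        if (n == 0)
            break;              // end of file
        total += static_cast<std::size_t>(n);
    }
    return static_cast<long>(total);
}
```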

The format of a write call is:

write_size = write(file_descriptor, buffer, size); 
write_size
Actual number of bytes written. A negative number indicates an error.

file_descriptor
File descriptor of an open file.

buffer
Pointer to the data to be written.

size
Size of the data to be written. The system will try to write this many bytes, but if the device is full or there is some other problem, a smaller number of bytes may be written.
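
Because write may accept fewer bytes than it was given, a careful program checks the return value and retries with the remainder. The following is a minimal sketch of that idea for a Unix system. The name write_all is hypothetical.

```cpp
#include <cstddef>
#include <unistd.h>     // write (Unix header; an assumption of this sketch)

// Call write() repeatedly until all `size` bytes have been written.
// Returns true on success, false if write() reports an error or
// makes no progress.
bool write_all(int fd, const char *buffer, std::size_t size)
{
    std::size_t done = 0;
    while (done < size) {
        ssize_t n = write(fd, buffer + done, size - done);
        if (n <= 0)
            return false;
        done += static_cast<std::size_t>(n);
    }
    return true;
}
```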

Finally, the close call closes the file:

flag = close(file_descriptor);
flag
0 for success, negative for error.

file_descriptor
File descriptor of an open file.
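
It is easy to forget a close on an early-exit path. One common C++ approach, sketched here as an assumption rather than something the original text prescribes, is to wrap the descriptor in a small class whose destructor calls close. The class name FileDescriptor is hypothetical, and the sketch assumes a Unix system and a C++11 compiler.

```cpp
#include <fcntl.h>
#include <unistd.h>     // close (Unix header; an assumption of this sketch)

// Hypothetical wrapper that closes its descriptor when it goes out of scope.
class FileDescriptor {
public:
    explicit FileDescriptor(int fd) : fd_(fd) {}

    // Close the file if it was opened successfully.
    ~FileDescriptor()
    {
        if (fd_ >= 0)
            ::close(fd_);
    }

    // Copying is disabled so the same descriptor is never closed twice.
    FileDescriptor(const FileDescriptor &) = delete;
    FileDescriptor &operator=(const FileDescriptor &) = delete;

    bool is_open() const { return fd_ >= 0; }   // did open succeed?
    int  get() const     { return fd_; }        // raw descriptor for read/write

private:
    int fd_;    // descriptor returned by open, negative on failure
};
```

A typical use would be FileDescriptor in_file(open(name, O_RDONLY)); followed by a check of in_file.is_open() before reading through in_file.get().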

Example 16-5 copies a file. Unbuffered I/O is used because of the large buffer size. It makes no sense to use buffered I/O to read 1K of data into a buffer (using a std::ifstream) and then transfer it into a 16K buffer.

Example 16-5: copy2/copy2.cpp

/****************************************
 * copy -- copy one file to another.    *
 *                                      *
 * Usage                                *
 *      copy <from> <to>                *
 *                                      *
 * <from> -- the file to copy from      *
 * <to>   -- the file to copy into      *
 ****************************************/
#include <iostream>
#include <cstdlib>      
 
#include <sys/types.h>  
#include <sys/stat.h>
#include <fcntl.h>
 
#ifdef __WIN32__        // if we are Windows32
#include <io.h>         // Get the Windows32 include file for raw i/o
#else /* __WIN32__ */
#include <unistd.h>     // Get the Unix include file for raw i/o 
#endif /* __WIN32__ */
 
const int BUFFER_SIZE = (16 * 1024); // use 16k buffers 
 
int main(int argc, char *argv[])
{
    char  buffer[BUFFER_SIZE];  // buffer for data 
    int   in_file;              // input file descriptor
    int   out_file;             // output file descriptor 
    int   read_size;            // number of bytes on last read 
 
    if (argc != 3) {
        std::cerr << "Error:Wrong number of arguments\n";
        std::cerr << "Usage is: copy <from> <to>\n";
        exit(8);
    }
    in_file = open(argv[1], O_RDONLY);
    if (in_file < 0) {
        std::cerr << "Error:Unable to open " << argv[1] << '\n';
        exit(8);
    }
    out_file = open(argv[2], O_WRONLY | O_TRUNC | O_CREAT, 0666);
    if (out_file < 0) {
        std::cerr << "Error:Unable to open " << argv[2] << '\n';
        exit(8);
    }
    while (true) {
        read_size = read(in_file, buffer, sizeof(buffer));
 
        if (read_size == 0)
            break;              // end of file 
 
        if (read_size < 0) {
            std::cerr << "Error:Read error\n";
            exit(8);
        }
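        // A short or failed write means the copy is incomplete
        if (write(out_file, buffer, (unsigned int) read_size) != read_size) {
            std::cerr << "Error:Write error\n";
            exit(8);
        }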
    }
    close(in_file);
    close(out_file);
    return (0);
}

Several things should be noted about this program. First of all, the buffer size is defined as a constant, so it is easily modified. Rather than have to remember that 16K is 16,384, the programmer used the expression (16 * 1024). This form of the constant is obviously 16K.

If the user improperly uses the program, an error message results. To help the user get it right, the message tells how to use the program.

You may not read a full buffer for the last read. That is why read_size is used to determine the number of bytes to write.
