Saturday, 4 July 2015

FILES

10.1 Introduction:

Many real-life problems involve large volumes of data. In such situations we need to store the data on a device such as a floppy disk or hard disk. The data is stored on these devices using the concept of files. A file is a collection of related data stored in a particular area on the disk. Programs can be designed to perform read and write operations on these files.

A program typically involves either or both of the following kinds of data communication:
  1. Data transfer between the console unit and the program.
  2. Data transfer between the program and a disk file.

We have already discussed the technique of handling data communication between the console unit and the program. The I/O system of C++ handles file operations in much the same way as console input and output operations. It uses file streams as an interface between the program and the files. The stream that supplies data to the program is known as the input stream, and the one that receives data from the program is known as the output stream. In other words, the input stream extracts (or reads) data from the file and the output stream inserts (or writes) data to the file. The input operation involves creating an input stream and linking it with the program and the input file. Similarly, the output operation involves establishing an output stream with the necessary links to the program and the output file.

10.2 Classes For File Stream Operations:

The I/O system of C++ contains a set of classes that define the file handling methods. These include ifstream, ofstream and fstream. These classes are derived from fstreambase and from the corresponding iostream classes. These classes, designed to manage disk files, are declared in fstream.h (the header is named fstream in standard C++), and therefore we must include this file in any program that uses files. The table below shows the details of the file stream classes. Note that these classes contain many more features. For more details, refer to the manual.

Table: Details of file stream classes

Class           Contents

filebuf         Sets the file buffers to read and write. Contains the openprot constant used in the open( ) of the file stream classes. Also contains close( ) and open( ) as members.

fstreambase     Provides operations common to the file streams. Serves as a base for the fstream, ifstream and ofstream classes. Contains the open( ) and close( ) functions.

ifstream        Provides input operations. Contains open( ) with default input mode. Inherits the functions get( ), getline( ), read( ), seekg( ) and tellg( ) from istream.

ofstream        Provides output operations. Contains open( ) with default output mode. Inherits the functions put( ), seekp( ), tellp( ) and write( ) from ostream.

fstream         Provides support for simultaneous input and output operations. Contains open( ) with default input mode. Inherits all the functions from the istream and ostream classes through iostream.


10.3 Opening Files Using open( ):

The function open( ) can be used to open multiple files that use the same stream object. For example, we may want to process a set of files sequentially. In such cases, we may create a single stream object and use it to open each file in turn. This is done as follows:

file-stream-class stream-object;
stream-object.open("filename");

Example:

ofstream outfile;                       // Create stream (for output)
outfile.open("DATA1");                  // Connect stream to DATA1
………..
………..
outfile.close( );                       // Disconnect stream from DATA1
outfile.open("DATA2");                  // Connect stream to DATA2
………..
………..
outfile.close( );                       // Disconnect stream from DATA2
………..
………..

The above program segment opens two files in sequence for writing the data. Note that the first file is closed before opening the second one. This is necessary because a stream can be connected to only one file at a time.

// Creating files with open( ) function
#include <iostream>
#include <fstream>                      // fstream.h on pre-standard compilers
using namespace std;

int main( )
{
    ofstream fout;                      // create output stream
    fout.open("country");               // connect "country" to it
    fout << "Republic of India\n";
    fout << "United Kingdom\n";
    fout << "Kingdom of Saudi Arabia\n";
    fout.close( );                      // disconnect "country" and
    fout.open("capital");               // connect "capital"
    fout << "New Delhi\n";
    fout << "London\n";
    fout << "Riyadh\n";
    fout.close( );                      // disconnect "capital"

    // Reading the files
    const int N = 80;                   // size of line
    char line[N];
    ifstream fin;                       // create input stream
    fin.open("country");                // connect "country" to it
    cout << "Contents of country file\n";
    while (fin.getline(line, N))        // read a line; stop at end-of-file
        cout << line << "\n";           // display it
    fin.close( );                       // disconnect "country" and
    fin.clear( );                       // reset end-of-file state (needed before C++11)
    fin.open("capital");                // connect "capital"
    cout << "\nContents of capital file\n";
    while (fin.getline(line, N))
        cout << line << "\n";
    fin.close( );                       // disconnect "capital"
    return 0;
}

At times we may need to use two or more files simultaneously. For example, we may need to merge two sorted files into a third sorted file. This means that both sorted files have to be kept open for reading and the third one kept open for writing. In such cases, we need to create two separate input streams for handling the two input files and one output stream for handling the output file.

// Program to demonstrate reading from two files simultaneously
#include <iostream>
#include <fstream>                      // fstream.h on pre-standard compilers
#include <cstdlib>                      // for exit( ) function
using namespace std;

int main( )
{
    const int SIZE = 80;
    char line[SIZE];
    ifstream fin1, fin2;                // create TWO input streams
    fin1.open("country");               // connect country to fin1
    fin2.open("capital");               // connect capital to fin2
    for (int i = 1; i <= 10; i++) {
        if (!fin1.getline(line, SIZE)) {    // read failed: end-of-file (or an error)
            cout << "Exit from country\n";
            exit(1);
        }
        cout << "Capital of " << line << ": ";
        if (!fin2.getline(line, SIZE)) {    // read failed: end-of-file (or an error)
            cout << "Exit from capital\n";
            exit(1);
        }
        cout << line << "\n";
    }
    return 0;
}
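
The merging task mentioned above (two sorted input files, one sorted output file) can be sketched as follows. This is only an illustration under stated assumptions: the helper name mergeSortedFiles and its parameters are hypothetical, each file is assumed to hold one item per line, and the lines are assumed to be already sorted in ascending order.

```cpp
#include <fstream>
#include <string>

// Merge two files whose lines are already in ascending order into a
// third file. Returns false if any of the three files cannot be opened.
// (mergeSortedFiles is a hypothetical helper, not part of the chapter.)
bool mergeSortedFiles(const char* inName1, const char* inName2,
                      const char* outName)
{
    std::ifstream fin1(inName1), fin2(inName2);   // two input streams
    std::ofstream fout(outName);                  // one output stream
    if (!fin1 || !fin2 || !fout)
        return false;

    std::string a, b;
    bool haveA = !std::getline(fin1, a).fail();   // true if a line was read
    bool haveB = !std::getline(fin2, b).fail();

    // While both files still have a pending line, write the smaller one.
    while (haveA && haveB) {
        if (a <= b) {
            fout << a << '\n';
            haveA = !std::getline(fin1, a).fail();
        } else {
            fout << b << '\n';
            haveB = !std::getline(fin2, b).fail();
        }
    }
    // Copy whatever remains in the file that is not yet exhausted.
    while (haveA) {
        fout << a << '\n';
        haveA = !std::getline(fin1, a).fail();
    }
    while (haveB) {
        fout << b << '\n';
        haveB = !std::getline(fin2, b).fail();
    }
    return true;
}
```

All three streams stay open for the whole merge, which is exactly the situation in which one stream object per file is required.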

Closing files: Every file that is opened should be closed. To close a file, use the close( ) member function:
            file.close( );

Here file is the stream object connected to the open file; close( ) is a built-in member function of the file stream classes.

10.4 Detecting End-Of-File:

Detection of the end-of-file condition is necessary to prevent any further attempt to read data from the file. An ifstream object such as fin evaluates to zero (false) when tested in a condition if any error has occurred in a file operation, including the end-of-file condition. Thus a while loop controlled by fin terminates when fin evaluates to zero on reaching the end of the file. Remember that such a loop may terminate due to other failures as well.

eof( ) is a member function of the ios class. It returns a non-zero value if the end-of-file (EOF) condition has been encountered, and zero otherwise. A test such as if(fin1.eof( ) != 0) can therefore be used to stop processing on reaching the end of the file.
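
As a minimal sketch of these two tests working together, the function below counts the lines in a text file. The read itself is the loop condition, so the loop stops at the first failed read, and eof( ) is then used to check whether the loop stopped because the end of the file was reached. The name countLines is an assumption made for this example.

```cpp
#include <fstream>
#include <string>

// Count the lines in a text file. Returns -1 if the file cannot be
// opened or if reading stopped for a reason other than end-of-file.
// (countLines is a hypothetical helper used only for illustration.)
long countLines(const char* filename)
{
    std::ifstream fin(filename);
    if (!fin)
        return -1;                      // open failed

    long count = 0;
    std::string line;
    while (std::getline(fin, line))     // false at end-of-file or on error
        ++count;

    return fin.eof() ? count : -1;      // eof() tells the two cases apart
}
```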

10.5 About Open() : File Modes:

We have used ifstream and ofstream constructors and the function open( ) to create new files as well as to open existing files. Remember that in both these methods we used only one argument, the filename. However, these functions can take two arguments, the second one specifying the file mode. The general form of the function open( ) with two arguments is:

stream-object.open("filename", mode);

The second argument mode (called the file mode parameter) specifies the purpose for which the file is opened. The prototypes of these member functions contain default values for the second argument, which are used when no actual value is supplied. The default values are as follows:

ios::in for ifstream functions, meaning open for reading only.
ios::out for ofstream functions, meaning open for writing only.

The file mode parameter can take one (or more) of the constants defined in the class ios. The table below lists the file mode parameters and their meanings.

Table: File mode parameters and their meanings

Parameter               Meaning

ios::app                Append to end-of-file
ios::ate                Go to end-of-file on opening
ios::binary             Binary file
ios::in                 Open file for reading only
ios::nocreate           Open fails if the file does not exist
ios::noreplace          Open fails if the file already exists
ios::out                Open file for writing only
ios::trunc              Delete contents of the file if it exists

Notes:

  1. Opening a file in ios::out mode also opens it in the ios::trunc mode by default.
  2. Both ios::app and ios::ate take us to the end of the file when it is opened. The difference between the two is that ios::app allows us to add data only at the end of the file, while ios::ate permits us to add data or to modify the existing data anywhere in the file. In both cases, a file with the specified name is created if it does not exist. (With a standard C++ ofstream, ios::ate on its own still truncates the file because of note 1; to keep the existing contents the file must also be opened with ios::in.)
  3. The parameter ios::app can be used only with files capable of output.
  4. Creating a stream using ifstream implies input and creating a stream using ofstream implies output. So in these cases it is not necessary to provide the mode parameters.
  5. In the classic library the fstream class does not provide a mode by default, and therefore we must provide the mode explicitly when using an object of the fstream class. (In standard C++ the default is ios::in | ios::out.)
  6. The mode can combine two or more parameters using the bitwise OR operator (symbol |) as shown below:

fout.open("data", ios::app | ios::nocreate);

This opens the file in the append mode but fails to open the file if it does not exist. Note that ios::nocreate and ios::noreplace come from the pre-standard iostream library: standard C++ has no ios::nocreate, and ios::noreplace is available only from C++23.
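
A short sketch of the append mode in use is given below. It uses only standard mode flags (ios::out | ios::app), so unlike the statement above it creates the file when it does not exist. The helper name appendLine is an assumption for this example.

```cpp
#include <fstream>
#include <string>

// Append one line of text to a file, creating the file if necessary.
// Existing contents are kept because of ios::app.
// (appendLine is a hypothetical helper.)
bool appendLine(const char* filename, const std::string& text)
{
    std::ofstream fout(filename, std::ios::out | std::ios::app);
    if (!fout)
        return false;                   // could not open the file
    fout << text << '\n';
    fout.flush();                       // push the data out before checking
    return !fout.fail();
}
```

Calling appendLine twice on the same file leaves both lines in the file, in the order in which they were written.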

10.6 File Pointers And Their Manipulations:

Each file has two associated pointers known as the file pointers. One of them is called the input pointer (or get pointer) and the other is called the output pointer (or put pointer). We can use these pointers to move through the files while reading or writing. The input pointer is used for reading the contents of a given file location and the output pointer is used for writing to a given file location. Each time an input or output operation takes place, the appropriate pointer is automatically advanced.

Default Actions: When we open a file in read-only mode, the input pointer is automatically set at the beginning so that we can read the file from the start. Similarly, when we open a file in write-only mode, the existing contents are deleted and the output pointer is set at the beginning. This enables us to write to the file from the start. If we want to open an existing file to add more data, the file is opened in 'append' mode. This moves the output pointer to the end of the file (i.e. the end of the existing contents).

Functions for Manipulation of File Pointers: All the actions on the file pointers take place automatically by default. How do we then move a file pointer to any other desired position inside the file? This is possible only if we can take control of the movement of the file pointers ourselves. The file stream classes support the following functions to manage such situations:


seekg ( )           Moves get pointer (input) to a specified location.
seekp ( )           Moves put pointer (output) to a specified location.
tellg ( )             Gives the current position of the get pointer.
tellp ( )             Gives the current position of the put pointer.

For example, the statement
                        infile.seekg(10);

moves the file pointer to the byte number 10. The bytes in a file are numbered beginning from zero. Therefore, the pointer will be pointing to the 11th byte in the file. Consider the following statements:

ofstream fileout;
fileout.open("hello", ios::app);
int p = fileout.tellp( );

On execution of these statements, the output pointer is moved to the end of the file "hello" and the value of p will represent the number of bytes in the file.
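
The same idea can be used to find the size of any file. The sketch below is a variation rather than the statements above: it opens the file for input with ios::ate and calls tellg( ), on the assumption that this is the more portable choice, since the position reported immediately after opening in append mode can differ between library implementations. The name fileSize is hypothetical.

```cpp
#include <fstream>

// Return the size of a file in bytes, or -1 if it cannot be opened.
// (fileSize is a hypothetical helper.)
long fileSize(const char* filename)
{
    // ios::ate places the get pointer at the end of the file on opening.
    std::ifstream fin(filename,
                      std::ios::in | std::ios::binary | std::ios::ate);
    if (!fin)
        return -1;
    return static_cast<long>(fin.tellg());   // position at end == byte count
}
```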

Specifying the Offset: We have just seen how to move a file pointer to a desired location using the 'seek' functions. With one argument, the argument represents the absolute position in the file. The seek functions seekg( ) and seekp( ) can also be used with two arguments as follows:

seekg(offset, refposition);
seekp(offset, refposition);

The parameter offset represents the number of bytes the file pointer is to be moved from the location specified by the parameter refposition. The refposition takes one of the following three constants defined in the ios class:
ios::beg            start of the file
ios::cur             current position of the pointer
ios::end            End of the file

The seekg( ) function moves the associated file's 'get' pointer while the seekp( ) function moves the associated file's 'put' pointer. The table below lists some sample pointer offset calls and their actions. Here fout is an ofstream object, so its put pointer is moved with seekp( ); the same arguments apply to seekg( ) on an input stream.

Table: Pointer offset calls

Seek call                       Action

fout.seekp(0, ios::beg);        Go to the start
fout.seekp(0, ios::cur);        Stay at the current position
fout.seekp(0, ios::end);        Go to the end of the file
fout.seekp(m, ios::beg);        Move to the (m+1)th byte in the file
fout.seekp(m, ios::cur);        Go forward by m bytes from the current position
fout.seekp(-m, ios::cur);       Go backward by m bytes from the current position
fout.seekp(-m, ios::end);       Go backward by m bytes from the end
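
As an illustration of an offset taken from the end of the file, the sketch below reads the last n bytes of a file through an input stream, so it uses seekg( ) and tellg( ). The helper name readTail is an assumption made for this example.

```cpp
#include <fstream>
#include <string>

// Return the last n bytes of a file as a string (the whole file if it
// is shorter than n, an empty string if it cannot be opened).
// (readTail is a hypothetical helper.)
std::string readTail(const char* filename, long n)
{
    std::ifstream fin(filename, std::ios::in | std::ios::binary);
    if (!fin || n <= 0)
        return "";

    fin.seekg(0, std::ios::end);                // go to the end of the file
    long size = static_cast<long>(fin.tellg()); // position there == file size
    if (size < 0)
        return "";
    if (n > size)
        n = size;

    fin.seekg(-n, std::ios::end);               // go back n bytes from the end
    std::string tail(static_cast<std::string::size_type>(n), '\0');
    if (n > 0)
        fin.read(&tail[0], n);                  // read the final n bytes
    return tail;
}
```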


10.7 Disk I/O With Member Functions:

When you use more sophisticated classes, it is natural to include stream I/O operations as member functions of the class. The program below shows how this might be done. The class group contains an array called parr of structures of type person. Each structure holds a person's name and age.

#include <iostream>
#include <fstream>                      // for file streams (fstream.h on pre-standard compilers)
using namespace std;

struct person
{
    char name[40];                      // person's name
    int age;                            // person's age
};

class group                             // group of persons, in array
{
private:
    person parr[20];
    int count;
public:
    group( ) { count = 0; }
    void addData( );
    void showData( );
    void diskIn( );
    void diskOut( );
};

void group::addData( )                  // get one person's data
{
    if (count >= 20) {                  // array is full
        cout << "\n Group is full";
        return;
    }
    cout << "\n Enter name: ";  cin >> parr[count].name;
    cout << " Enter age: ";     cin >> parr[count++].age;
}

void group::showData( )                 // display all persons in group
{
    for (int j = 0; j < count; j++)
    {
        cout << "\nPerson #" << j + 1;
        cout << "\n Name: " << parr[j].name;
        cout << "\n Age: " << parr[j].age;
    }
}

void group::diskIn( )                   // read array from file
{
    ifstream infile;                    // make file
    infile.open("GROUP.DAT", ios::in | ios::binary);    // open it (fails if it does not exist)
    if (infile)                         // if it exists,
        infile.read((char*)this, sizeof(*this));        // read it
}

void group::diskOut( )                  // write array to file
{
    ofstream outfile;                   // make file
    outfile.open("GROUP.DAT", ios::out | ios::binary);  // open it
    outfile.write((char*)this, sizeof(*this));          // write to it
}

int main( )
{
    group grp;                          // make a group
    grp.diskIn( );                      // fill it with disk data (if any)
    grp.showData( );                    // display data for group
    char ch;
    do {                                // get new persons for group
        cout << "\n Enter data for person:";
        grp.addData( );
        cout << "Enter another (y/n)? ";
        cin >> ch;
    } while (ch == 'y');
    grp.showData( );                    // display augmented group
    grp.diskOut( );                     // write augmented group to disk
    return 0;
}

The main( ) program creates a group object called grp. Such an object can hold the data for up to 20 person structures. The program fills grp with any data already on disk, displays all the persons in the group, and asks the user to enter data for an additional person. The user can enter as many persons as desired. Before terminating, the program again displays the data for all persons. Besides the array of persons, group contains four member functions and a count of how many of the array elements are actually in use. The first function, addData( ), is the only one that acts on a single person structure. It prompts the user for data and adds a person to the end of the array.
The other member functions act on the array as a whole. The showData( ) function displays the data for each of the count structures stored in the array in memory. The diskIn( ) function checks whether the GROUP.DAT file exists. If so, it opens the file and reads its entire contents into the array parr in a single statement. The infile object evaluates to zero if the GROUP.DAT file does not exist, so the statement

if( infile )

can be used to determine whether an attempt should be made to read the file. The diskOut( ) function writes the entire array parr from memory to the disk file GROUP.DAT. In the read( ) and write( ) stream functions the address of the object to be read or written is this and its size is sizeof(*this). The this pointer holds the address of the object of which diskOut( ) and diskIn( ) are members, so we can use it to read or write the entire object.

10.8 Sequential Input and Output Operations:

The file stream classes support a number of member functions for performing the input and output operations on files.  One pair of functions, put( ) and get( ) , are designed for handling a single character at a time.  Another pair of functions, write( ) and read( ), are designed to write and read blocks of binary data.

put( ) and get( ) functions:  The function put( ) writes a single character to the associated stream. Similarly, the function get( ) reads a single character from the associated stream.

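#include <iostream>
#include <fstream>                      // fstream.h on pre-standard compilers
#include <cstring>                      // for strlen( )
using namespace std;

int main( )
{
    char text[80];
    cout << "Enter a string \n";
    cin >> text;
    int len = strlen(text);

    fstream file;
    file.open("TEXT", ios::in | ios::out | ios::trunc);  // ios::trunc creates TEXT if it does not exist
    for (int j = 0; j < len; j++)
        file.put(text[j]);              // write one character at a time

    file.seekg(0);                      // go back to the start
    char ch;
    while (file.get(ch))                // read one character; stop at end-of-file
        cout << ch;                     // display it
    return 0;
}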

Note that we have used an fstream object to open the file.  Since an fstream object can handle both the input and output simultaneously, we have opened the file in ios::in|ios::out mode.  After writing the file, we want to read the entire file and display its contents.  Since the file pointer has already moved to the end of the file, we must bring it back to the start of the file.  This is done by the statement

                        file.seekg(0);

write( ) and read( ) functions: The functions write( ) and read( ), unlike the functions put( ) and get( ), handle the data in binary form. This means that the values are stored in the disk file in the same format in which they are stored in the internal memory. An int takes a fixed number of bytes in binary form irrespective of its value (two bytes on a 16-bit compiler, as in the illustration below; four on most current systems), but a 4-digit int takes four bytes in character form, one per digit. For example, the value 2594 is stored as follows:

Binary format (2 bytes)         -------> 00001010 | 00100010

Character format (4 bytes)      -------> 2   5   9   4

The binary format is more accurate for storing numbers, as they are stored in the exact internal representation. There are no conversions while saving the data and therefore saving is much faster. The binary input and output functions take the following form:

infile.read((char *)&v, sizeof(v));
outfile.write((char *)&v, sizeof(v));

These functions take two arguments. The first is the address of the variable v, and the second is the length of that variable in bytes. The address of the variable must be cast to type char* (i.e. pointer to character type). The following program illustrates how these two functions are used to save an array of float numbers and then recover them for display on the screen.

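#include <iostream>
#include <fstream>                      // fstream.h on pre-standard compilers
#include <iomanip>                      // for setw( ) and setprecision( )
using namespace std;

const char* filename = "BINARY";

int main( )
{
    float height[4] = {175.5f, 153.0f, 167.25f, 160.70f};
    ofstream outfile(filename, ios::out | ios::binary);
    outfile.write((char *) &height, sizeof(height));
    outfile.close( );                   // close the file before reading it back

    for (int i = 0; i < 4; i++)         // clear the array in memory
        height[i] = 0;

    ifstream infile(filename, ios::in | ios::binary);
    infile.read((char *) &height, sizeof(height));
    for (int j = 0; j < 4; j++)
    {
        cout.setf(ios::showpoint);
        cout << fixed                   // fixed notation, so precision means decimal places
             << setw(10) << setprecision(2)
             << height[j];
    }
    infile.close( );
    return 0;
}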
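
The difference in size between the character form and the binary form described earlier in this section can be checked with a small experiment. The sketch below writes the same int to one file with << and to another with write( ), then reports both file sizes. The function name compareFormats and its parameters are assumptions made for this example.

```cpp
#include <fstream>

// Write value to textName in character form and to binName in binary
// form, then report the two file sizes in bytes.
// (compareFormats and its parameters are hypothetical.)
bool compareFormats(int value, const char* textName, const char* binName,
                    long& textBytes, long& binBytes)
{
    {
        std::ofstream text(textName);
        std::ofstream bin(binName, std::ios::out | std::ios::binary);
        if (!text || !bin)
            return false;
        text << value;                              // one byte per digit
        bin.write(reinterpret_cast<const char*>(&value),
                  sizeof(value));                   // raw bytes, no conversion
    }   // both files are closed here

    // Reopen each file positioned at its end; tellg() then gives its size.
    std::ifstream t(textName, std::ios::binary | std::ios::ate);
    std::ifstream b(binName, std::ios::binary | std::ios::ate);
    if (!t || !b)
        return false;
    textBytes = static_cast<long>(t.tellg());
    binBytes  = static_cast<long>(b.tellg());
    return true;
}
```

For the value 2594 the text file holds four bytes, while the binary file holds sizeof(int) bytes whatever the value is.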

Reading and writing a class object: Since class objects are the central elements of C++ programming, it is quite natural that the language supports features for writing objects to and reading them from disk files directly. The binary input and output functions read( ) and write( ) are designed to do exactly this job. These functions handle the entire structure of an object as a single unit, using the computer's internal representation of data. For instance, the function write( ) copies a class object from memory byte by byte with no conversion. One important point to remember is that only the data members are written to the disk file; the member functions are not. The following program illustrates how class objects can be written to and read from disk files. The length of the object is obtained using the sizeof operator.


#include <iostream>
#include <fstream>                      // fstream.h on pre-standard compilers
#include <iomanip>
using namespace std;

class inventory
{
    char name[10];
    int code;
    float cost;
public:
    void readdata(void);
    void writedata(void);
};

void inventory::readdata(void)
{
    cout << " Enter name: ";  cin >> name;
    cout << " Enter code: ";  cin >> code;
    cout << " Enter cost: ";  cin >> cost;
}

void inventory::writedata(void)
{
    cout << left  << setw(10) << name
         << right << setw(10) << code
         << fixed << setprecision(2) << setw(10) << cost
         << endl;
}

int main( )
{
    inventory item[3];
    fstream file;
    file.open("stockdat", ios::in | ios::out | ios::binary | ios::trunc);  // ios::trunc creates the file
    cout << "Enter details for three items \n";
    for (int i = 0; i < 3; i++) {
        item[i].readdata( );
        file.write((char *) &item[i], sizeof(item[i]));
    }
    file.seekg(0);                      // go back to the start before reading
    cout << "\nOutput\n";
    for (int j = 0; j < 3; j++) {
        file.read((char *) &item[j], sizeof(item[j]));
        item[j].writedata( );
    }
    file.close( );
    return 0;
}

The program uses a for loop for reading and writing the data. This is possible because we know the exact number of objects in the file. If the length of the file is not known, we can determine the file size in terms of objects with the help of the file pointer functions and use it in the for loop, or we may use the while(file) test to detect the end of the file.
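
The first of these approaches can be sketched as follows: the file size in bytes, obtained with seekg( ) and tellg( ), is divided by the size of one object. The struct Item is a hypothetical stand-in with the same data members as inventory, and the helper names countRecords and readRecord are assumptions made for this example.

```cpp
#include <fstream>

// Hypothetical fixed-size record with the same data members as inventory.
struct Item {
    char  name[10];
    int   code;
    float cost;
};

// Number of Item objects stored in a binary file, or -1 on failure.
long countRecords(const char* filename)
{
    std::ifstream fin(filename, std::ios::in | std::ios::binary);
    if (!fin)
        return -1;
    fin.seekg(0, std::ios::end);                 // go to the end of the file
    long bytes = static_cast<long>(fin.tellg()); // file size in bytes
    return bytes < 0 ? -1 : bytes / static_cast<long>(sizeof(Item));
}

// Read object number index (counting from 0) into item.
// Returns false if the file cannot be opened or the object is not there.
bool readRecord(const char* filename, long index, Item& item)
{
    std::ifstream fin(filename, std::ios::in | std::ios::binary);
    if (!fin)
        return false;
    fin.seekg(index * static_cast<long>(sizeof(Item)), std::ios::beg);
    fin.read(reinterpret_cast<char*>(&item), sizeof(Item));
    return !fin.fail();
}
```

The value returned by countRecords can serve as the upper limit of the for loop, and readRecord shows that the same arithmetic lets us jump directly to any object in the file.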

10.9 Error Handling during File Operations:

So far we have been opening and using files for reading and writing on the assumption that everything is fine with the files. This may not always be true. For instance, one of the following things may happen when dealing with files:
1.      A file which we are attempting to open for reading does not exist.
2.      The file name used for a new file may already exist.
3.      We may attempt an invalid operation such as reading past the end-of-file.
4.      There may not be any space on the disk for storing more data.
5.      We may use an invalid file name.
6.      We may attempt to perform an operation when the file is not opened for that purpose.

The C++ file streams inherit a 'stream-state' member from the class ios. This member records information on the status of the file that is currently being used. The stream-state member uses bit fields to store the status of the error conditions stated above. The class ios supports several member functions that can be used to read the status recorded in a file stream. These functions, along with their meanings, are listed in the table below.

Table: Error handling functions

Function        Return value and meaning

eof( )          Returns true (non-zero value) if end-of-file is encountered while reading; otherwise returns false (zero).

fail( )         Returns true when an input or output operation has failed.

bad( )          Returns true if an invalid operation is attempted or any unrecoverable error has occurred. However, if it is false, it may be possible to recover from any other error reported and continue operation.

good( )         Returns true if no error has occurred. This means that all the above functions are false. For instance, if file.good( ) is true, all is well with the stream file and we can proceed to perform I/O operations. When it returns false, no further operations can be carried out.


These functions may be used at appropriate places in a program to check the status of a file stream and thereby take the necessary corrective measures. Example:

    ………..
    ifstream infile;
    infile.open("ABC");
    while(!infile.fail( ))
    {
        ………..
        ……….. (process the file)
    }
    if(infile.eof( ))
    {
        ……….. (terminate program normally)
    }
    else if(infile.bad( ))
    {
        ……….. (report fatal error)
    }
    else
    {
        infile.clear( );    // clear error state
        ………..
    }
    ………..

The function clear( ) resets the error state so that further operations can be attempted. In statements such as

    while(infile)
    {
        ………..
    }

and

    while(infile.read(...))
    {
        ………..
    }

infile becomes false (zero) when the end of the file is reached (and eof( ) becomes true).
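
A small, concrete instance of this pattern is sketched below. The function adds up the integers in a text file: a failed read is classified with bad( ) and eof( ), and in the remaining (recoverable) case clear( ) is called so that the offending token can be skipped and reading can continue. The name sumIntegers and the policy of skipping non-numeric tokens are assumptions made for this example.

```cpp
#include <fstream>
#include <string>

// Add up the integers in a text file, skipping any token that cannot be
// read as a number. Returns false if the file cannot be opened or an
// unrecoverable error occurs. (sumIntegers is a hypothetical helper.)
bool sumIntegers(const char* filename, long& sum)
{
    std::ifstream infile(filename);
    if (infile.fail())
        return false;                   // file could not be opened

    sum = 0;
    long value;
    while (true) {
        if (infile >> value) {
            sum += value;               // a number was read successfully
        } else if (infile.bad()) {
            return false;               // fatal error: give up
        } else if (infile.eof()) {
            break;                      // normal end of input
        } else {
            infile.clear();             // recoverable: reset the error state
            std::string junk;
            infile >> junk;             // discard the offending token
        }
    }
    return true;
}
```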

10.10 Command-Line Arguments: 

Like C, C++ too supports a feature that facilitates the supply of arguments to the main( ) function. These arguments are supplied at the time of invoking the program. They are typically used to pass the names of data files. Example:
C> exam data results

Here, exam is the name of the file containing the program to be executed, and data and results are the filenames passed to the program as command-line arguments. The command-line arguments are typed by the user and are delimited by a space. The first argument is always the filename (command name) and contains the program to be executed. How do these arguments get into the program? The main( ) function, which we have been using up to now without any arguments, can take two arguments as shown below.
main(int argc, char* argv[])

The first argument argc (known as the argument counter) represents the number of arguments in the command line. The second argument argv (known as the argument vector) is an array of char type pointers that point to the command-line arguments. The size of this array will be equal to the value of argc. For instance, for the command line

C> exam data results
the value of argc would be 3 and argv would be an array of three pointers to strings as shown below:
argv[0] -> exam
argv[1] -> data
argv[2] -> results

Note that argv[0] always represents the command name that invokes the program. The character pointers argv[1] and argv[2] can be used as file names in the file opening statements as shown below:
    ………..
    infile.open(argv[1]);       // open data file for reading
    ………..
    outfile.open(argv[2]);      // open results file for writing
    ………..
The following program illustrates the use of command-line arguments for supplying the file names. The command line is
test ODD EVEN

The program creates two files called ODD and EVEN using the command-line arguments, and a set of numbers stored in an array is written to these files. Note that the odd numbers are written to the file ODD and the even numbers are written to the file EVEN. The program then displays the contents of the files.

#include <iostream>
#include <fstream>                      // fstream.h on pre-standard compilers
#include <cstdlib>                      // for exit( )
using namespace std;

int main(int argc, char* argv[])
{
    int number[9] = {11, 22, 33, 44, 55, 66, 77, 88, 99};
    if (argc != 3) {
        cout << "argc = " << argc << "\n";
        cout << "Error in arguments \n";
        exit(1);
    }
    ofstream fout1, fout2;
    fout1.open(argv[1]);
    if (fout1.fail( )) {
        cout << "could not open the file " << argv[1] << "\n";
        exit(1);
    }
    fout2.open(argv[2]);
    if (fout2.fail( )) {
        cout << "could not open the file " << argv[2] << "\n";
        exit(1);
    }
    for (int i = 0; i < 9; i++) {
        if (number[i] % 2 == 0)
            fout2 << number[i] << " ";  // write to EVEN file
        else
            fout1 << number[i] << " ";  // write to ODD file
    }
    fout1.close( );
    fout2.close( );

    ifstream fin;
    char ch;
    for (int i = 1; i < argc; i++) {
        fin.open(argv[i]);
        cout << "Contents of " << argv[i] << "\n";
        while (fin.get(ch))             // read a character; stop at end-of-file
            cout << ch;                 // display it
        cout << "\n\n";
        fin.close( );
        fin.clear( );                   // reset end-of-file state (needed before C++11)
    }
    return 0;
}






Exercise:

1.      Write a program that reads a text file and creates another file that is identical.

2.      A file contains a list of telephone numbers in the following form:

            Amar            234566
            Atishay         243423
            Rakhee          453455

        Write a program to store this data.

3.      Write a program that returns the size in bytes of a file named as the command-line argument, e.g.

            C> filesize Rdata.txt

4.      Write a program that emulates the DOS copy command.

5.      Add validation to the above program to check the number of arguments.