File handling in C Language
File management is an important aspect of any programming language. C provides a specific file handling mechanism to open, close, read, and update files. Using file handling in C, we can store data in text or binary files for later use.
File handling Methods in C
There are two ways of file handling in C Language :
- Standard I/O
Standard input/output is implemented by the standard I/O library routines.
These routines perform a sophisticated set of operations for file I/O, so the user need not be concerned with complex internal details such as buffering and data conversion.
These operations are performed automatically by the standard (library) I/O functions.
- System-level I/O
System-level input/output is implemented by system calls. This is a less sophisticated I/O service that is specific to the underlying operating system (e.g., DOS and UNIX each have their own set of system calls for file handling).
This service is unsophisticated in the sense that:
(i) Data cannot be written as individual characters or strings, but only as a buffer full of bytes.
(ii) Data cannot be written in a formatted manner; it is handled like record (block) I/O in standard I/O. However, the programmer has full control over the buffer: setting it up, placing data in it before a write operation, and taking data out of it after a read operation.
(Standard I/O) File handling
Difference between an external and an internal file name
Files are the most common form of a stream. We can access files via their internal file name or external file names.
When we create a file using the (operating) system editor and give it a name, the operating system recognises the file by this name whenever we later perform any operation on it. This name is known as the external file name.
These files may be external to a C program but, if we want to access a file for some purpose (e.g., for reading, writing, or updating), using this external file name throughout the program has the following drawbacks :
- The file we are trying to access may not have been created yet.
- If the external name of the file is used, the program can access the file with that particular name only. If it is later found that the data is in some other file (with a different name), the file name has to be changed at every point where it is used in the program.
- The program becomes less portable, as different operating systems have different conventions for naming files. A file name that is valid on one OS may be invalid on another.
For the above mentioned reasons, a file must have an internal name (which should be a pointer to the file), so that even if the file is not created, this internal file name can be used by the program. Then, at some later stage, this internal name would be associated with the actual file (which is the file having external name).
In C, this internal name is better known as a file (or stream) pointer: it points to a chunk of memory containing all the information about the file (known as the file structure). So, in order to work with a file, we must have a file (or stream) pointer that points to a FILE structure. The FILE type is defined in the header file stdio.h and holds the information needed to manage the opened file (e.g., its status, the address of the buffer associated with it, and the position of the next character to be read or written). To declare a file pointer, we write :
FILE *fp ; /* fp is a pointer to the file structure FILE */
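The separation between external and internal names can be sketched as follows. This is a minimal illustration, not part of the original text: the function names count_chars and count_chars_in_file are hypothetical. The point it shows is that the external name appears only in the fopen() call, while the rest of the code works purely through the file pointer.

```c
#include <stdio.h>

/* Counts the characters available on an already opened stream.
   The function only sees the internal name (the FILE pointer);
   it never needs to know the external file name. */
long count_chars(FILE *fp)
{
    long n = 0;
    int c;                      /* int, not char, so EOF can be detected */

    while ((c = fgetc(fp)) != EOF)
        n++;
    return n;
}

/* The external name appears in exactly one place: the fopen() call.
   Returns -1 if the file cannot be opened. */
long count_chars_in_file(const char *external_name)
{
    FILE *fp = fopen(external_name, "r");
    long n;

    if (fp == NULL)
        return -1;
    n = count_chars(fp);
    fclose(fp);
    return n;
}
```

If the data later moves to a differently named file, only the argument passed to count_chars_in_file() has to change.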
After declaring this file pointer, we can do the following things with the file:
- Opening a file (fopen())
- Closing file(s) (fclose(), and the non-standard fcloseall())
- Stream status enquiry (ferror(), feof(), clearerr())
- Flushing the stream content to the file (fflush(), and the non-standard flushall())
- Accessing a file
Sequential access :
- character I/O (getc() , fgetc() , putc() , fputc() )
- string I/O (fgets() , fputs() )
- formatted I/O (fscanf() , fprintf() , sprintf() , sscanf() )
- block I/O (fread() , fwrite() )
Random access:
- positioning the file pointer to the beginning of required data item in the file (fseek())
- positioning the file pointer to the beginning of the file (rewind() )
- finding the current position of the file pointer within the file ( ftell() )
Managing a file (working with existing files) :
- delete a file (remove() )
- rename a file (rename() )
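The two file management functions listed above can be sketched as follows. Both remove() and rename() are standard library functions that return 0 on success; the helper name rename_then_remove and the file names passed to it are assumptions made only for this illustration.

```c
#include <stdio.h>

/* Creates old_name, renames it to new_name, then deletes it.
   Returns 0 on success, -1 if any step fails. */
int rename_then_remove(const char *old_name, const char *new_name)
{
    FILE *fp = fopen(old_name, "w");

    if (fp == NULL)
        return -1;
    fputs("temporary data\n", fp);
    if (fclose(fp) == EOF)
        return -1;

    if (rename(old_name, new_name) != 0)    /* 0 means success */
        return -1;
    if (remove(new_name) != 0)              /* 0 means success */
        return -1;
    return 0;
}
```

Note that the file is closed before it is renamed or removed; operating on a file that is still open behaves differently from one system to another.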
Opening a file
The function fopen() is used to open a file in C. Its prototype is :
FILE *fopen(const char *filename, const char *mode);
fopen() returns a pointer to a FILE. The filename string is the name of the file on disk that we wish to access. The mode string controls the type of access. If the file cannot be opened for any reason, a NULL pointer is returned.
Modes include :
"r" – open a text file for reading; the file must already exist, otherwise fopen() returns NULL.
"w" – create a text file for writing; if the file already exists, its contents are discarded, otherwise it is created.
"a" – open a text file for appending; if the file does not exist, it is created, otherwise data is written at the end of the file.
"r+" – open a text file for both reading and writing; the file must already exist.
"w+" – create a text file for both reading and writing; if the file exists, its contents are discarded, otherwise it is created.
"a+" – open a text file for reading and appending; if the file does not exist, it is created, otherwise data is written at the end of the file (with the previous contents of the file retained).
We see that in certain modes ("w", "a", "w+", "a+") the file may have to be created. If it cannot be created for some reason (e.g., the disk is full or write-protected), fopen() detects the error and returns NULL. NULL is used to indicate failure because no valid file (or stream) pointer will ever have that value. Remember that :
- if an already existing (external) file is opened for writing ("w"), its old contents will be erased.
- if an already existing (external) file is opened with "a" or "a+", all write operations take place at the end of the file, without harming the old contents of the file. This holds true even if the file pointer is repositioned with fseek() or rewind(): when a write operation is about to occur, the file position is moved to the end of the file. This makes sure that the existing data cannot be overwritten, as it is in writing ("w") mode.
- if we add 'b' to each of the above modes ("rb", "wb", "ab", "r+b", "w+b", "a+b"), the files are treated as binary files rather than text files.
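The difference between the "w" and "a" modes can be checked with a small sketch such as the one below. The helper names write_text and file_length are hypothetical; the second one measures the file in binary mode so that the result is a byte count.

```c
#include <stdio.h>

/* Writes text to the named file using the given mode ("w" or "a").
   Returns 0 on success, -1 on failure. */
int write_text(const char *name, const char *mode, const char *text)
{
    FILE *fp = fopen(name, mode);

    if (fp == NULL)
        return -1;
    if (fputs(text, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    return (fclose(fp) == EOF) ? -1 : 0;
}

/* Returns the size of the named file in bytes, or -1 on failure. */
long file_length(const char *name)
{
    FILE *fp = fopen(name, "rb");
    long len;

    if (fp == NULL)
        return -1;
    if (fseek(fp, 0L, SEEK_END) != 0) {
        fclose(fp);
        return -1;
    }
    len = ftell(fp);
    fclose(fp);
    return len;
}
```

Writing "abc" with "w" gives a 3-byte file; writing "de" with "a" extends it to 5 bytes; writing "z" with "w" again leaves only 1 byte, because the old contents are discarded.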
So to open an external file called abc.txt for reading, we would do :
FILE *fp; /* declare a stream */
fp = fopen("abc.txt", "r"); /* if fopen() successfully opens abc.txt, it returns a pointer to it, which is assigned to fp */
This says "open the external file abc.txt for reading". If such a file does not exist, an error results and fopen() returns NULL. If the file exists, fopen() returns the address of the place in memory where information about the file is kept. This address is assigned to the file (or stream) pointer fp. This establishes a link between the external file name abc.txt and the internal file name fp. It is good practice to check that the file was opened correctly, as shown below (exit() requires stdlib.h) :
if ((fp = fopen("abc.txt", "r")) == NULL) {
    printf("Can't open %s\n", "abc.txt");
    exit(1);
}
Closing a file
A file must be closed once the operations on it (i.e., reading, writing, appending, updating, etc.) are complete. For this purpose, the standard I/O library function fclose() is used. fclose() closes a stream that was opened using fopen(). Its prototype is :
int fclose(FILE*stream);
fclose() returns an integer value 0 for successful closing of file, and returns EOF if file is not closed for any reason. Closing a file means:
- Writing any data that still remains in the buffer to the file pointed to by the stream, and doing a formal (operating) system-level close on the file. This buffer is implemented internally and is invisible to the programmer. Flushing it helps prevent loss of data when writing to a disk.
- Breaking the connection between the stream (file pointer) and the external name that was established by fopen().
- Freeing the various system resources.
- Freeing the FILE structure and the buffer.
Freeing the file pointer is necessary because there is a limit on the number of files that a program can have open simultaneously. Another function related to fclose() is fcloseall(), which closes all the open streams at once. It is not part of standard C; it is provided as an extension by some libraries (e.g., Borland C and the GNU C library). Its prototype is :
int fcloseall(void);
In Borland-style libraries this function returns the number of streams closed, and returns EOF if any error is detected (the GNU C library version returns 0 on success instead). It can be used as:
int n_files_closed;
n_files_closed = fcloseall();
if (n_files_closed == EOF)
    printf("Error in closing files…!");
else
    printf("%d files have been closed.", n_files_closed);
…
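Using only standard C, the return value of fclose() can be checked as in the sketch below. The helper name save_line is an assumption made for this illustration. The check matters because buffered data is written out when the stream is closed, so a write failure may only become visible at that point.

```c
#include <stdio.h>

/* Writes one line to the named file and reports whether every step,
   including the final fclose(), succeeded.
   Returns 0 on success, -1 otherwise. */
int save_line(const char *name, const char *line)
{
    FILE *fp = fopen(name, "w");
    int ok;

    if (fp == NULL)
        return -1;

    ok = (fputs(line, fp) != EOF);

    /* Buffered data is flushed here, so a write error may only show up now. */
    if (fclose(fp) == EOF)
        ok = 0;

    return ok ? 0 : -1;
}
```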
Block I/O (Binary file I/O)
Some applications demand the use of data files to store blocks of data, where each block consists of contiguous bytes. Each block usually represents a complex data type, such as an array, a structure, or an array of structures (though a block may represent a basic type of data too, like int, float, double, etc.).
Many applications need a data file consisting of multiple structures of the same layout, or multiple arrays of structures of the same size and type. For such applications it is required to read an entire block from the data file, or write an entire block to it, rather than to read or write the individual members (i.e., individual members of structures, or individual array elements) of each block separately.
So far we have gone through the functions that allow reading/writing of character data (character I/O and string I/O functions) and functions that allow reading/writing of character data as well as numeric data (formatted I/O functions).
However, all the functions discussed so far work on text files, in which numbers are stored as a sequence of characters rather than in the form in which they are held in memory.
For example, the integer 1234 occupies 4 bytes in a text file (one per digit) but only sizeof(int) bytes in a binary file: 2 bytes on a compiler with a 16-bit int, 4 bytes on most current systems. A binary file contains data in a form that matches the way the computer stores data internally, as a sequence of bytes. That is, a float value such as 12345678.912345 is stored in exactly sizeof(float) bytes (typically 4), just as it is represented internally (although a 4-byte float retains only about 7 significant digits of such a value).
But in a text file, each character of the number 12345678.912345 occupies 1 byte, so it takes 15 bytes to store (as it is treated as a 15-character string in a text file).
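The size difference can be measured directly. The sketch below is only an illustration with hypothetical helper names (size_as_text, size_as_binary); it uses a double rather than a float so that all the digits of the value survive, and it opens both files in binary mode so that ftell() reports a plain byte count.

```c
#include <stdio.h>

/* Stores value as decimal text; returns the file size in bytes, or -1. */
long size_as_text(const char *name, double value)
{
    FILE *fp = fopen(name, "wb");
    long size;

    if (fp == NULL)
        return -1;
    fprintf(fp, "%.6f", value);             /* one byte per character */
    size = ftell(fp);
    fclose(fp);
    return size;
}

/* Stores value in its in-memory form; returns the file size in bytes, or -1. */
long size_as_binary(const char *name, double value)
{
    FILE *fp = fopen(name, "wb");
    long size;

    if (fp == NULL)
        return -1;
    fwrite(&value, sizeof value, 1, fp);    /* raw bytes, no conversion */
    size = ftell(fp);
    fclose(fp);
    return size;
}
```

For the value 12345678.912345, the text form takes 15 bytes, while the binary form takes sizeof(double) bytes (8 on most current systems).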
Shortcomings of a binary file as compared to the text file are :
- Usually, a binary file can be created only from within a program (and not with a text editor or from the command line), and its contents can be read meaningfully only from within a program.
- A binary file cannot be listed (i.e., printed), as it will show strange looking characters (called garbage), or may sometimes generate an error.
- A binary file input or output cannot be formatted.
The advantages of a binary file over a text file are :
- Complex data types (such as arrays, structures, etc.) can be written to the file as a single unit with ease.
- Data transfer rate for a binary file is much faster than for a text file as no data conversions are made. The data is read or written as it is. So, we can read from, or write into, a binary file much faster than in a text file.
- Less space is occupied by the data stored in a binary file as compared to the same data stored in a text file, because in a binary file the data is stored in its natural representation.
The library functions used for such type of block I/O (or, binary file I/O) are fread(), and fwrite(), with prototypes as :
- fread()
size_t fread(void *ptr, size_t size, size_t n, FILE *stream);
fread() reads a specified number of equal-sized data items from an input stream into a block, where :
ptr : points to the block (i.e., array or structure) in which the data is stored after reading
size : length of each data item, in bytes
n : number of data items to read
stream : points to the input stream
The total number of bytes requested is (n * size). On success, fread() returns the number of data items (not bytes) actually read. On end-of-file or error, fread() returns a short count (possibly 0).
- fwrite()
size_t fwrite(const void *ptr, size_t size, size_t n, FILE *stream);
fwrite() writes a specified number of equal-sized data items to an output stream at the current file position (the end of the file when writing sequentially), where :
ptr : pointer to the data items to be written, i.e. the address of the array or structure to be written
size : length of each data item, in bytes
n : number of data items to be written
stream : specifies the output stream
The total number of bytes written is (n * size). On success, fwrite() returns the number of items (not bytes) actually written. On error, it returns a short count.
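A round trip with fwrite() and fread() might look like the sketch below. The struct student record type and the helper names are hypothetical and exist only for this illustration. Asking fread() for more records than the file contains shows the short count described above.

```c
#include <stdio.h>

/* Hypothetical record type used only for this illustration. */
struct student {
    int   roll;
    float marks;
};

/* Writes n records as one block; returns the number of records written. */
size_t save_students(const char *name, const struct student *s, size_t n)
{
    FILE *fp = fopen(name, "wb");
    size_t written;

    if (fp == NULL)
        return 0;
    written = fwrite(s, sizeof *s, n, fp);
    if (fclose(fp) == EOF)
        return 0;
    return written;
}

/* Reads up to max records; returns the number of records actually read. */
size_t load_students(const char *name, struct student *s, size_t max)
{
    FILE *fp = fopen(name, "rb");
    size_t got;

    if (fp == NULL)
        return 0;
    got = fread(s, sizeof *s, max, fp);     /* short count at end of file */
    fclose(fp);
    return got;
}
```

Saving 3 records and then asking load_students() for 5 returns 3. A file written this way is only guaranteed to be readable by a program built with the same structure layout and byte order.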
Random access
The library functions studied so far access a file sequentially. We can also access a file randomly, i.e. from any position where the required data item is placed. This sort of access is known as random access or direct access. The following library routines are used for randomly accessing a file :
- ftell() : finding the current position of the file pointer within the file
- rewind() : positioning the file pointer at the beginning of the file
- fseek() : positioning the file pointer at the beginning of the required data item in the file
Functioning of File Pointers
A file pointer, in effect, keeps track of a particular byte position in a file. Each time we write into the file, this position is automatically advanced by the size of the data written, and so points just past the end of the data item written.
For this reason, writing continues from that end-point when the next data item is written. When a file is closed and then opened for reading, the file pointer is (automatically) positioned at the beginning of the file.
After reading one data item, the file pointer moves to the next data item to continue reading. Moreover, when a file is opened in an append mode, every write takes place at the end of the existing file, so that new data items are added from that point onwards.
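The movement of the file pointer described above can be observed with ftell(). The following is a minimal sketch; the function name trace_positions is an assumption, and the file is opened in binary mode so that positions are plain byte offsets.

```c
#include <stdio.h>

/* Records the file position at three moments:
   pos[0] after writing 5 bytes, pos[1] after writing 3 more,
   pos[2] right after reopening the file for reading.
   Returns 0 on success, -1 on failure. */
int trace_positions(const char *name, long pos[3])
{
    FILE *fp = fopen(name, "wb");

    if (fp == NULL)
        return -1;
    fputs("hello", fp);
    pos[0] = ftell(fp);         /* advanced by the 5 bytes written */
    fputs("abc", fp);
    pos[1] = ftell(fp);         /* writing continued from the end-point */
    fclose(fp);

    fp = fopen(name, "rb");
    if (fp == NULL)
        return -1;
    pos[2] = ftell(fp);         /* back at the beginning after reopening */
    fclose(fp);
    return 0;
}
```

The three recorded positions are 5, 8 and 0.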
ftell(): Returns the current position of the file pointer. Its prototype is :
long ftell(FILE*stream);
If the file is binary, the offset is measured in bytes from the beginning of the file. The value returned by ftell() can be used in a subsequent call to fseek(). On success, ftell() returns the current file pointer position. On error, it returns -1L and sets errno to a positive value.
/* fileposition.c */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    FILE *stream;

    if (argc != 2) {
        printf("Usage: fileposition filename\n");
        exit(1);
    }
    stream = fopen(argv[1], "w+");
    if (stream == NULL) {
        printf("Can't open %s\n", argv[1]);
        exit(1);
    }
    fprintf(stream, "This is a ftell testing");
    printf("The file pointer is at byte %ld\n", ftell(stream));
    fclose(stream);
    return 0;
}
The above program is run from the command line (after successful compilation) as :
fileposition filename
where filename is any valid file name. argv[1] refers to this filename in the program.
rewind(): Just as rewinding an audio cassette ultimately brings it to the beginning of the cassette, the rewind() library function positions the file pointer at the beginning of the file it points to. It also clears the end-of-file and error indicators of the stream. Its prototype is :
void rewind(FILE *stream);
The following program demonstrates the working of the rewind() function. It is named rew.c and is run from the command line (after successful compilation) as :
rew filename
where filename is any file (that already exists) whose contents we want to display.
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>

int main(int argc, char *argv[])
{
    FILE *fp;
    int ch, c;

    if (argc != 2) {
        fprintf(stderr, "Usage: rew filename\n");
        exit(1);
    }
    if ((fp = fopen(argv[1], "r")) == NULL) {
        fprintf(stderr, "Error in opening %s\n", argv[1]);
        exit(1);
    }
    /* display the file once for every character typed, until 'q' or 'Q' */
    while ((ch = getchar()) != EOF && tolower(ch) != 'q') {
        if (ch == '\n')
            continue; /* skip the Enter key itself */
        while ((c = fgetc(fp)) != EOF)
            putchar(c);
        rewind(fp); /* set the file pointer to the beginning of the file again */
    }
    fclose(fp);
    return 0;
}
What rewind() does in the above program is that, after the contents of the given file are printed, it positions the file pointer at the beginning of the file again (for the next pass), because the inner while loop leaves the file pointer at the end of the file. The contents of the file are printed each time we type a character and press Enter, until we type 'q' or 'Q'.
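The effect of rewind() can also be shown without keyboard input. In the sketch below (the function names count_to_eof and read_twice are hypothetical), the same stream is read to the end twice; the second pass only sees data because rewind() has moved the file pointer back to the beginning.

```c
#include <stdio.h>

/* Counts the characters from the current position to end of file. */
static long count_to_eof(FILE *fp)
{
    long n = 0;

    while (fgetc(fp) != EOF)
        n++;
    return n;
}

/* Reads the whole file twice, calling rewind() between the passes.
   Stores the two counts in *first and *second.
   Returns 0 on success, -1 if the file cannot be opened. */
int read_twice(const char *name, long *first, long *second)
{
    FILE *fp = fopen(name, "r");

    if (fp == NULL)
        return -1;
    *first = count_to_eof(fp);
    rewind(fp);     /* back to the start; also clears the EOF indicator */
    *second = count_to_eof(fp);
    fclose(fp);
    return 0;
}
```

Both counts come out equal. Without the rewind() call the second count would be 0, since the file pointer would still be at the end of the file.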
fseek(). This function gives us the flexibility and freedom to read a data item from any position of the file, as it positions the file pointer to the beginning of the required data item. Its prototype is :
int fseek (FILE *stream, long int offset, int origin) ;
fseek() repositions the file pointer of a stream. It sets the file pointer associated with a stream to a new position, where :
stream : stream whose file pointer fseek() sets.
offset : difference in bytes between origin (a file pointer position) and the new position. For text mode streams, offset should be 0 or a value returned by ftell().
origin : one of three SEEK_xxx file pointer locations. These symbolic constants are defined in stdio.h and conventionally have the values 0, 1, and 2 :
SEEK_SET or 0 beginning of the file
SEEK_CUR or 1 current position of the file pointer
SEEK_END or 2 end of the file
fseek() discards any character pushed back using ungetc(). fseek() is used with stream I/O. For file handle I/O, we should use lseek(). After fseek(), the next operation on an update file can be either input or output. On success (the pointer is successfully moved), fseek() returns 0. On failure, fseek() returns a non-zero value. On some implementations (notably older DOS compilers), fseek() reports an error only for an unopened file or device.
#include <stdio.h>

long filesize(FILE *stream);

int main(void)
{
    FILE *stream;

    stream = fopen("ABC.TXT", "w+");
    if (stream == NULL) {
        printf("Can't open ABC.TXT\n");
        return 1;
    }
    fprintf(stream, "RDS Computer Institute");
    printf("File size of ABC.TXT is %ld bytes\n", filesize(stream));
    fclose(stream);
    return 0;
}

/* Returns the length of the file in bytes, preserving the current position. */
long filesize(FILE *stream)
{
    long curr_pos, length;

    curr_pos = ftell(stream);
    fseek(stream, 0L, SEEK_END);
    length = ftell(stream);
    fseek(stream, curr_pos, SEEK_SET);
    return length;
}
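Random access to fixed-size records follows directly from the offset arithmetic: record number i starts at byte i * (record size). The sketch below assumes a binary file that contains only int values; the function names read_nth_int and read_last_int are hypothetical.

```c
#include <stdio.h>

/* Reads the int stored at zero-based position index in a binary file of ints.
   Returns 0 on success, -1 on failure (e.g., index past the end of the file). */
int read_nth_int(FILE *fp, long index, int *out)
{
    if (fseek(fp, index * (long)sizeof(int), SEEK_SET) != 0)
        return -1;
    return (fread(out, sizeof *out, 1, fp) == 1) ? 0 : -1;
}

/* Reads the last int in the file by seeking backwards from the end.
   Returns 0 on success, -1 on failure. */
int read_last_int(FILE *fp, int *out)
{
    if (fseek(fp, -(long)sizeof(int), SEEK_END) != 0)
        return -1;
    return (fread(out, sizeof *out, 1, fp) == 1) ? 0 : -1;
}
```

With a file holding the values 10, 20, 30, 40, read_nth_int(fp, 2, &v) stores 30 in v and read_last_int(fp, &v) stores 40, without reading the values in between.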
Low Level I/O
In low-level (or system-level) I/O, the method used for I/O matches the way the underlying operating system itself reads and writes files: as bytes of information. In this type of file handling, therefore,
- The I/O is not buffered by the library, i.e. each read/write request goes directly to the operating system to fetch/put a specific number of bytes from/to the disk (or device).
- There are no formatting facilities with this type of I/O; we are dealing with bytes of information. This means that we are now using binary (and not text) files.
- Instead of file pointers, we use low-level file handles or file descriptors, which are unique integer numbers that identify each open file.
Since this type of file handling deals directly with the system, it is also known as direct file handling. In low-level file handling, we use the system calls of the underlying operating system. On UNIX-like systems, some of the file-related system calls are :
creat() : to create a file
open() : to open a file
close() : to close a file
read() : to read from a file
write() : to write to a file
lseek() : to access a file randomly
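On a UNIX-like (POSIX) system, these calls can be combined as in the sketch below. It is an illustration under that assumption only; other operating systems provide different calls. The function name roundtrip_bytes and the permission value 0644 are choices made for this example.

```c
#include <fcntl.h>      /* open() and the O_* flags */
#include <sys/types.h>  /* ssize_t, off_t */
#include <unistd.h>     /* read(), write(), lseek(), close() */

/* Writes len bytes from buf to the named file through a file descriptor,
   seeks back to the start and reads them into out.
   Returns the number of bytes read back, or -1 on failure. */
long roundtrip_bytes(const char *name, const char *buf, size_t len, char *out)
{
    int fd = open(name, O_RDWR | O_CREAT | O_TRUNC, 0644);
    ssize_t got;

    if (fd == -1)
        return -1;
    if (write(fd, buf, len) != (ssize_t)len) {      /* raw bytes, no formatting */
        close(fd);
        return -1;
    }
    if (lseek(fd, 0, SEEK_SET) == (off_t)-1) {      /* back to the first byte */
        close(fd);
        return -1;
    }
    got = read(fd, out, len);
    close(fd);
    return (long)got;
}
```

Note that the file is identified by the integer fd rather than by a FILE pointer, and that the programmer supplies and manages the buffers (buf and out) directly.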
File handling in C, as covered above, shows how files are opened, accessed, and managed from within a C program.