String Handling in C

A string in C can be defined as a character array terminated by the special character '\0', which marks the end of the string. Unlike some other high-level languages, such as BASIC, C has no built-in string data type. C also has no built-in facilities for manipulating entire arrays (such as copying or comparing them).

It also has very few built-in facilities for handling strings. In fact, C has only one truly built-in string-handling capability: it allows us to use string constants (also known as string literals) in our code. Whenever a string is written in double quotes, C automatically creates an array of characters containing that string, terminated by the '\0' character.

The last character in a character array should be '\0' (called the null character, a character with the value 0) because most code that manipulates character arrays as strings expects it. For example, printf uses the '\0' to detect the end of a character array when printing it with '%s'.
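As a small sketch of the point above, the helper below walks a character array until it meets '\0', which is essentially how a function such as printf finds the end of a string. The name visible_length is hypothetical and chosen only for this illustration; it is not a library function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the characters that come before the
   terminating '\0', scanning the array the way string functions do. */
size_t visible_length(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (s[n] != '\0')   /* stop at the null character */
        n++;
    return n;
}
```

For the literal "kangaroo", visible_length returns 8, while sizeof "kangaroo" is 9, because the compiler stores the '\0' as part of the literal.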

There are two ways to represent a string (of characters) in C:

  • As arrays of type char (as in char str[10] or char str[])
  • As pointers of type char (as in char *str)

Since neither of these approaches provides a complete solution to the representation of strings in C, in practice elements of both are often used together.

String as an array of type char

When a string is represented as an array of type char, the array must also have room for the string-terminating character '\0'. For example, if a string is defined as the character array

     char str[10];

then the longest string this array can contain is 9 characters long, since every string is terminated by the special character '\0'. This '\0' is appended automatically (and implicitly) to strings of characters in most contexts, but if it is missing (or is lost somehow), you may see unpredictable results. For example, in the following code

           char str1[10] = "kangaroo";

           char str2[10] = "australian";

The first string, str1, contains 8 characters, so in memory the ninth character, '\0', is automatically appended to it, as shown below:

k a n g a r o o \0

But the second string, str2, contains 10 characters, so there is no room left for the terminating '\0' and it is lost, as shown below:

a u s t r a l i a n

So any attempt to copy or display this string causes unpredictable results, since the string now has an undefined length: displaying (or copying) it continues through the memory locations after "australian" until a '\0' happens to be encountered.
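A minimal sketch of how a program could check for this situation before treating a buffer as a string. The helper name is_terminated is hypothetical; it relies only on the standard memchr function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if a '\0' occurs within the first
   size bytes of buf, and 0 otherwise. */
int is_terminated(const char *buf, size_t size)
{
    assert(buf != NULL);
    return memchr(buf, '\0', size) != NULL;
}
```

With the two arrays from the text, is_terminated(str1, sizeof str1) yields 1 and is_terminated(str2, sizeof str2) yields 0. If such a buffer has to be printed anyway, a precision such as printf("%.10s", str2) limits printf to at most 10 characters.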

Initializing the (char array) strings

To initialize a character array (or string) with a string literal (or string constant), you have to initialize it in the same statement as the declaration, and not in two statements (i.e., a first statement declaring the character array and a second one initializing it). For example,

       char str[] = "This is ok";    /* declaring and initializing at the same time is valid */

It is valid to declare a character array (string), but assigning a string literal to it afterwards is invalid:

       char str[11];          /* declaring a character array (or string), */

       str = "This is ok";    /* and then trying to assign to it, is invalid */

This is invalid because an array name is not a variable that can be assigned to: in expressions it yields the constant address of the array's first element. So first declaring the array and then assigning it the string constant "This is ok" is illegal. However, we can give a character array its value after declaration in another way: by setting it one character at a time. For example,

       char str[6];          /* 5 characters + '\0' */

       str[0] = 'R';         /* OK, 1st character of str set to R */

       str[1] = 'a';         /* OK, 2nd character of str set to a */

       str[2] = 'm';         /* OK, 3rd character of str set to m */

       str[3] = 'a';         /* OK, 4th character of str set to a */

       str[4] = 'n';         /* OK, 5th character of str set to n */

       str[5] = '\0';        /* OK, terminating '\0' added so that str is a valid string */

strcpy and strncpy

We have just seen that we cannot copy one char array (string) into another by assignment, as shown below:

       char str1[] = "string1";

       char str2[8];

       str2 = str1;                    /* invalid! */

However, C provides two functions, declared in the header file string.h, to do this job. Their syntax is:

strcpy(string1, string2);       :      copies the contents of string2 to string1, including the terminating '\0'

strncpy(string1, string2, n);   :      copies at most n characters of string2 to string1; if string2 has n or more characters, no terminating '\0' is written

The strncpy function takes one more parameter than strcpy: the maximum number of characters to be copied from one string to the other. This makes copying a little safer, since we can avoid putting too many characters into the array. However, care must be taken to reserve the last character position for the terminating '\0'. If required, this '\0' can be added explicitly. For example,

char str1[] = "Avinash";

char str2[8];

strncpy(str2, str1, 7);

str2[7] = '\0';

The following program illustrates the use of strncpy to copy a string safely from one array to another. Again, remember that a character array is treated as a string only if it is terminated with '\0'.

#include <stdio.h>
#include <string.h>

int main(void)
{
    char str1[25];
    char str2[16];

    printf("Enter a string (up to 24 characters): ");
    if (fgets(str1, sizeof str1, stdin) == NULL)   /* fgets replaces the unsafe gets */
        return 1;
    str1[strcspn(str1, "\n")] = '\0';   /* drop the newline kept by fgets */
    puts(str1);
    strncpy(str2, str1, 15);   /* strncpy copies at most 15 characters, truncating longer strings */
    str2[15] = '\0';           /* the last character (i.e. the 16th) is the terminating '\0' */
    puts(str2);
    return 0;
}

 

Output :

       Enter a string (up to 24 characters): I am going deeper in C.

       I am going deeper in C.

       I am going deep
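The truncate-then-terminate pattern used in the program above can be wrapped in a small helper so that the terminator is never forgotten. This is only a sketch; the name copy_truncated and its parameter order are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst (capacity dst_size bytes),
   truncating if necessary, and always terminates dst with '\0'. */
void copy_truncated(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    strncpy(dst, src, dst_size - 1);   /* leave room for the terminator */
    dst[dst_size - 1] = '\0';          /* strncpy may not have written one */
}
```

A call such as copy_truncated(str2, sizeof str2, str1) then replaces the two separate statements in the program, and passing sizeof str2 keeps the limit in step with the array's declaration.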

Most of the time we have to put in our own '\0' at the end of a string that we build character by character. It is necessary if we want to print the line with printf. This code prints the number of characters read, followed by the line itself:

#include <stdio.h>

int main(void)
{
    int i = 0;
    int ch;                    /* int, so that EOF can be detected */
    char line[80];

    /* stop after the newline, at end of input, or when only room for '\0' is left */
    while (i < 79 && (ch = getchar()) != EOF) {
        line[i++] = (char)ch;
        if (ch == '\n')
            break;
    }
    line[i] = '\0';
    printf("%d : \t%s", i, line);
    return 0;
}

Here we increment i in the subscript itself, but only after its previous value has been used: the character is read, placed in line[i], and only then is i incremented. Note that the count printed includes the newline character.
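When a whole line is read with the standard fgets function instead of character by character, fgets writes the terminating '\0' itself but keeps the newline in the buffer. A minimal sketch of removing that newline follows; the helper name strip_newline is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: removes a trailing '\n' (as left by fgets) from
   line and returns the resulting length. */
size_t strip_newline(char *line)
{
    size_t len;

    assert(line != NULL);
    len = strcspn(line, "\n");   /* index of the first '\n', or of the '\0' */
    line[len] = '\0';
    return len;
}
```

A typical use would be to call strip_newline(line) right after a successful fgets(line, sizeof line, stdin).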

String as a pointer to type char

From the above discussion, it is clear that using arrays of type char to represent strings has drawbacks: arrays are hard to manipulate and cannot be used for some purposes (e.g., they cannot be used as the return type of a function). Another way of representing strings is to use a pointer that references a string of characters. To declare a pointer to a string, the pointer must be of type pointer to char, as shown below:

char *str;            /* now, str is a pointer to a string of characters */

The above declaration declares a pointer to char called str, which is able to point to the first character of a string of characters.

The benefit of this approach to representing strings is that we can manipulate the pointer using the arithmetic that applies to pointers. For example, we can use a char pointer (string) to reference an array of characters quite easily, as shown below:

#include <stdio.h>

int main(void)
{
       char str1[10] = "Education";   /* string represented by a char array str1 */
       char *str2;                    /* string represented by char pointer str2 */
       str2 = str1;                   /* ok, put address of str1 in str2, so that now str2 refers to str1 */
       puts(str2);                    /* displays Education */
       return 0;
}

Thus, with the pointer representation of strings, we can use the assignment operator at any time after declaration to make one char pointer equal to another. The following example makes this clearer:

#include <stdio.h>

int main(void)
{
       char str1[10] = "Education";   /* string represented by a char array str1 */
       char str2[10] = "Literacy";    /* string represented by a char array str2 */
       char *str3 = "Institute";      /* string represented by char pointer str3 */
       char *str4 = "Schooling";      /* string represented by char pointer str4 */
       /* str2 = str1; */             /* wrong! Won't compile if enabled: cannot assign to a char array */
       str4 = str3;                   /* valid. Will compile. */
       str4 = "Graduation";           /* valid. Will compile. */
       puts(str4);                    /* displays Graduation */
       return 0;
}

No strong medicine comes without side effects! There is a potential problem with the char pointer representation of strings too. Using the assignment operator to make two pointers equal copies only the address, so both pointers end up pointing to the same memory location (this is known as shallow copying).

#include <stdio.h>

int main(void)
{
       char *str1 = "string1";   /* declare 1st string with initial value */
       char *str2 = "string2";   /* declare 2nd string with initial value */
       str1 = str2;              /* shallow copy: str1 now points to "string2" as well */
       puts(str1);               /* displays string2 */
       return 0;
}

Thus, assigning one char pointer to another results in both pointers pointing to the same string, so that a change made through one is visible through the other (and nothing refers to "string1" any longer). This is not safe, and can lead to strange behavior in many cases (e.g., when memory is allocated dynamically). So, for safe storage of strings, char arrays are the more reliable option.
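The effect of a shallow copy can be observed with a writable array, as in the sketch below. The function name write_through_alias is hypothetical and exists only to package the demonstration.

```c
#include <assert.h>

/* Hypothetical demonstration: alias and text refer to the same array,
   so a change made through alias is visible through text as well. */
char write_through_alias(void)
{
    char text[] = "string1";   /* writable array, not a string literal */
    char *alias = text;        /* shallow copy: only the address is copied */

    alias[0] = 'S';            /* write through the second name */
    assert(alias == text);     /* both names refer to the same memory */
    return text[0];            /* the change shows up in text */
}
```

The function returns 'S', although text itself was never written to directly.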

  1. Can't we use the strcpy (or strncpy) function with char pointers (strings) to copy data (i.e., to make a deep copy) and not addresses?

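It is possible to use the strcpy (or strncpy) function with char pointers to make a deep copy (i.e., to copy the data in one string to another string, and not the address of one string to another), provided that the destination pointer refers to writable memory that is large enough. A string literal must not be used as the destination, because modifying a string literal has undefined behavior. This is shown below:

       char buffer[] = "Avinash";   /* writable array that the destination pointer refers to */

       char *str1 = buffer;

       char *str2 = "Vikas";

       strcpy(str1, str2);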

Now the situation of str1 and str2, after using strcpy(), is shown below:

str1 → V i k a s \0

str2 → V i k a s \0

Both char pointers (strings) have the same data, even though they are pointing to different addresses.

But the use of strcpy can still lead to problems with char pointers (strings), because strcpy writes into the memory that the destination pointer refers to, whatever that memory is. This creates a problem when we declare a char pointer to point to a string of one length, and then

  • copy a longer string to it, or
  • try to copy a string to a char pointer that has been declared but not made to point to any storage.

Running the following program, for example,

#include <stdio.h>
#include <string.h>

int main(void)
{
   char *str1 = "Ram";
   char *str2 = "Kumar";
   char *str3;
   strcpy(str1, str2);   /* wrong: str2 is longer than str1, and str1 points to a string literal */
   puts(str2);           /* after the bad copy above, this may print an unpredictable string */
   strcpy(str3, str2);   /* wrong: str3 does not point to any storage yet */
   return 0;
}

may or may not give the desired output, since its behavior is unpredictable (undefined). The first strcpy tries to copy "Kumar" (6 bytes including the '\0') over "Ram" (4 bytes), so it writes past the end of the destination, which is moreover a string constant that the program is not allowed to modify. The second strcpy copies into str3, a char pointer that has not been initialized, so the characters are written to whatever address str3 happens to contain. In both cases the program may crash, may corrupt neighbouring data (so that printing str2 shows unexpected characters), or may appear to work.
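One way to avoid both problems is to give the copy storage of its own that is exactly large enough. The sketch below does this with malloc; the helper name duplicate_string is hypothetical, and the caller is assumed to release the result with free().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated copy of src, or NULL
   if the allocation fails. The caller releases it with free(). */
char *duplicate_string(const char *src)
{
    size_t size;
    char *copy;

    assert(src != NULL);
    size = strlen(src) + 1;        /* + 1 for the terminating '\0' */
    copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, src, size);   /* copies the characters and the '\0' */
    return copy;
}
```

Because the copy lives in its own memory, changing it does not affect the original string, which makes this a deep copy rather than a shallow one.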

Similarities and differences between strings represented by char arrays and char pointers

      The similarities and differences between the character array and character pointer representations of strings are:

  • With strings, the definitions char str[] = "My string"; and char *str1 = "My string"; have a similar effect. In both cases a string is created, and its starting address is what str or str1 yields in an expression. (The array holds its own modifiable copy of the characters, whereas the pointer refers to a string constant.)
  • Each individual character, in both representations of the string, can be accessed by any of the following expressions:
      • str[index] or str1[index]
      • *(str+index) or *(str1+index)
  • We cannot assign a value to a char array string after it is declared.

 char str[25];                          /* declaring a character array string, and */

 str = "This cannot be done";           /* then assigning it a value, is invalid. */

We can initialize it only at the time of declaration, as shown below:

char str[25] = "This can be done";   /* declaring and initializing in a single statement is valid */

On the other hand, a string represented as a char pointer can be assigned a value at the time of declaration as well as after being declared. For example,

char *str1;                       /* declaring a character pointer string, and */

str1 = "This can also be done";   /* then assigning it a value, is also valid. */

  • We cannot assign one char array string to another to copy the contents of one to the other. For example,

char str1[] = "First";

char str2[6];

str2 = str1;        /* invalid, cannot copy one char array string into another */

But we can assign one char pointer string to another. This copies only the address, not the characters, so both pointers then refer to the same string. For example,

char *str1 = "First";

char *str2;

str2 = str1;        /* valid, now str2 also points to "First" */

  • The most significant difference between the string representations char str[] and char *str1 is that the latter is a pointer variable, and so we can modify it. So str1=str and str1++ are both legal. But str=str1 and str++ are illegal, because str is an array and not a pointer. When we say str, we produce the starting address of the array, but str is not a variable, and therefore we cannot say str=str1 or str++.
  • Char arrays are suitable for storing strings, but are less convenient for manipulating them. A char pointer is suitable for manipulating strings, but can lead to unpredictable results in contexts where one char pointer is copied to another.
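The points above about pointer variables can be illustrated with a short sketch. Inside the function, p is a pointer variable, so p++ is legal even when the caller passes a char array. The name count_char is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts how often ch occurs in the string that
   p points to, advancing the pointer one character at a time. */
size_t count_char(const char *p, char ch)
{
    size_t count = 0;

    assert(p != NULL);
    for (; *p != '\0'; p++)   /* p is a pointer variable, so p++ is legal */
        if (*p == ch)
            count++;
    return count;
}
```

Both representations can be passed to it: an array such as char str[] = "My string"; is converted to the address of its first character in the call, and a char pointer is passed as it is.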

How do we perform operations on strings?

We have a variety of standard library string-handling functions, declared in the header file string.h, readily available for use. Some of these library functions are:

  1. strlen()

          Syntax : size_t strlen (const char *string) ;

          Finds the length of the string given by string. The number of characters before the terminating character '\0' is returned. The string may be a string constant or a string variable. size_t is an unsigned integer type whose exact size depends on the implementation; for example, it may be equivalent to unsigned int on one system and to unsigned long on another.

  1. strchr()

          Syntax : char *strchr ( const char *string, int ch) ;

          Returns a pointer to the first occurrence of ch in string. Returns NULL if ch is not in string.
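Since strchr returns a pointer rather than a position, subtracting the start of the string from the result gives the index of the character. A minimal sketch follows; the helper name index_of is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: index of the first occurrence of ch in s,
   or -1 if ch does not occur. */
long index_of(const char *s, char ch)
{
    const char *hit;

    assert(s != NULL);
    hit = strchr(s, ch);                          /* NULL when ch is absent */
    return hit != NULL ? (long)(hit - s) : -1L;   /* pointer difference = index */
}
```

For example, index_of("Education", 'c') is 3, and index_of("Education", 'z') is -1.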

  1. strcpy()

          Syntax  : char *strcpy(char *string1, const char *string2) ;

          Copies string2 to string1, where string2 may be a string constant or a string variable. This function effectively assigns one string to another. The characters in string2 are copied into string1 up to and including the terminating '\0'. It returns string1.

  1. strncpy()

          Syntax : char *strncpy (char *string1, const char * string2, size_t n) ;

          Copies at most n characters of string2 into string1, where string2 may be a string constant or a string variable. It returns string1. Also,

If n <= strlen (string2), then exactly n characters are copied and no terminating '\0' is written, so the caller must terminate string1, but

If n > strlen (string2), then all of string2 is copied and the rest of the n characters are filled with '\0', so string1 ends up holding the same string as after strcpy (string1, string2).

  1. strcmp()

          Syntax :  int   strcmp (const char *string1, const char *string2);

This function takes 2 strings (string1 and string2) as its arguments and compares string1 with string2, where either string may be a string constant or a string variable. It returns an integer value that depends on the relative order of the two strings (compared character by character, by character code), as follows:

  • a negative value, if string1 is alphabetically less than string2
  • a positive value, if string1 is alphabetically greater than string2
  • a zero value, if string1 is identical to string2
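Only the sign of the value returned by strcmp is meaningful, so code should test it with < 0, == 0 or > 0 rather than compare it with -1 or 1. The sketch below reduces the result to one of three fixed values; the helper name compare_sign is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: reduces strcmp's result to -1, 0 or 1. */
int compare_sign(const char *a, const char *b)
{
    int r;

    assert(a != NULL && b != NULL);
    r = strcmp(a, b);
    return (r > 0) - (r < 0);   /* 1 if positive, -1 if negative, else 0 */
}
```

Because the comparison uses character codes, on ASCII systems every uppercase letter sorts before every lowercase letter, so "Zebra" compares as less than "apple".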
  1. strncmp()

          Syntax : int strncmp (const  char * string1, const char *string2, size_t n) ;

This function takes 2 strings (string1 and string2) and an integer (n) as its arguments to compare first n characters of string1 with first n characters of string2, where any of these strings may be a string constant or a string variable. This function returns an integer value, which can be a negative value, a positive value, or a zero, depending on whether the first substring is alphabetically less than, greater than, or identical to second substring.

  1. strcat()

          Syntax : char *strcat (char *string1, const char *string2) ;

This function takes 2 strings (string1 and string2) as arguments and concatenates string2 to string1. That is, it appends string2 to the end of string1. It returns string1. The programmer must ensure that string1 points to enough space to hold the result.

  1. strncat()

          Syntax : char *strncat (char *string1, const char * string2, size_t n) ;

This function takes 2 strings (string1 and string2) and an integer (n) as arguments, and appends at most the first n characters of string2 to string1, followed by a terminating '\0'. It returns string1. If n>=strlen (string2), this function has the same effect as strcat (string1,string2).
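Since neither strcat nor strncat knows the size of the destination array, the caller has to work out how much room is left. A minimal sketch of doing so with strncat follows; the helper name append_bounded is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends as much of src to dst as fits in a
   destination array of dst_size bytes, keeping dst terminated. */
void append_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t used;

    assert(dst != NULL && src != NULL);
    used = strlen(dst);
    assert(used < dst_size);                  /* dst must already hold a valid string */
    strncat(dst, src, dst_size - used - 1);   /* strncat writes the '\0' itself */
}
```

With char buf[10] = "Hello";, the call append_bounded(buf, sizeof buf, ", world") leaves "Hello, wo" in buf, which is the 9 characters that fit plus the terminator.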

  1. strstr()

          Syntax : char * strstr (const char *string1, const char * string2);

Returns the address of the first occurrence of string2 as a substring of string1. Returns NULL if string2 is not in string1.
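Because strstr returns the address of the match, the search can be resumed from just past that address to find later occurrences. The sketch below counts non-overlapping occurrences this way; the helper name count_occurrences is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts non-overlapping occurrences of word
   in text by calling strstr repeatedly. */
size_t count_occurrences(const char *text, const char *word)
{
    size_t count = 0;
    size_t step;

    assert(text != NULL && word != NULL);
    step = strlen(word);
    assert(step > 0);                 /* an empty word would match forever */
    while ((text = strstr(text, word)) != NULL) {
        count++;
        text += step;                 /* continue after this match */
    }
    return count;
}
```

For example, count_occurrences("banana", "an") is 2.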

  1. strtok()

Syntax : char * strtok (char  *string1, const char * string2) ;

Splits string1 into tokens delimited by the characters found in string2. The initial call strtok (string1, string2) returns a pointer to the first token, and each successive call strtok (NULL, string2) returns a pointer to the next token found in string1, or NULL when no tokens remain. These calls change string1, replacing the delimiter that ends each token with the null character '\0'.
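The calling pattern described above, one initial call with the string and then repeated calls with NULL, can be sketched as a loop. The helper name count_tokens is hypothetical. Note that the text passed in must be a writable array, not a string constant, because strtok modifies it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the tokens in text, which is modified
   in place because strtok overwrites delimiters with '\0'. */
size_t count_tokens(char *text, const char *delims)
{
    size_t count = 0;
    char *token;

    assert(text != NULL && delims != NULL);
    for (token = strtok(text, delims);    /* first call: pass the string */
         token != NULL;
         token = strtok(NULL, delims))    /* later calls: pass NULL */
        count++;
    return count;
}
```

With char text[] = "red, green,blue"; and the delimiters ", ", the function finds 3 tokens, and afterwards text reads as just "red" because the first comma has been replaced by '\0'.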

  1. memcpy()

     Syntax : void *memcpy (void *string1, const void * string2, size_t n) ;

Copies the n bytes of memory beginning at string2 into the memory beginning at string1, and returns string1. The two memory areas must not overlap.

  1. memmove()

          Syntax : void *memmove (void *string1, const void * string2, size_t n) ;

Same as memcpy ( ), except that the two memory areas may overlap.
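A typical case of overlapping areas is shifting the contents of a string within its own array. The sketch below removes the first n characters of a string in place; the helper name drop_prefix is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: removes the first n characters of s in place.
   Source and destination overlap, so memmove is used, not memcpy. */
void drop_prefix(char *s, size_t n)
{
    size_t len;

    assert(s != NULL);
    len = strlen(s);
    assert(n <= len);
    memmove(s, s + n, len - n + 1);   /* + 1 moves the '\0' as well */
}
```

With char s[] = "Education";, the call drop_prefix(s, 3) leaves "cation" in s.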

  1. memcmp()

          Syntax : int memcmp (const void *string1, const void* string2, size_t n) ;

Compares the n bytes of memory beginning at string1 with the n bytes of memory beginning at string2. Returns a negative, zero, or positive integer according to whether the first block is less than, equal to, or greater than the second, comparing byte by byte (each byte treated as an unsigned char).

  1. memchr()

          Syntax : void *memchr (const void * string, int ch, size_t n);

     Searches the n bytes of memory beginning at string for the character ch. If ch is found, the address of its first occurrence is returned; otherwise NULL is returned.
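Unlike the str functions, the mem functions do not stop at '\0', which matters when a buffer contains a null character in the middle. A minimal sketch of the difference, using a hypothetical helper named buffers_equal:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the first n bytes of a and b are
   identical, 0 otherwise. Bytes after a '\0' are compared too. */
int buffers_equal(const void *a, const void *b, size_t n)
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, n) == 0;
}
```

For two 4-byte buffers holding 'a', '\0', 'b', 'c' and 'a', '\0', 'x', 'y', strcmp reports them as equal because it stops at the first '\0', whereas buffers_equal with n set to 4 reports a difference.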
