Lecture 8 Strings
1. Concepts of strings
2. String pointers
3. String operations
4. String library functions
1. Concepts of Strings
• Concepts of strings in C
1. A string is a sequence of characters terminated with the null
character stored in a char array.
2. A char array is used to hold a string. The characters are stored
in consecutive array positions, and the end of the string is
marked by the null character '\0'.
Note: the value of '\0' is 0. It is a char and should not be confused with NULL, the null pointer constant.
3. The length of a string is the number of non-null characters
in the string.
• A string is represented and accessed by a char pointer
pointing to the memory location of the first character.
Example
char str[10]; // declare a char array of 10 bytes
str[0] = 'H';
str[1] = 'e';
str[2] = 'l';
str[3] = 'l';
str[4] = 'o';
str[5] = '\0'; // this is the terminator
printf("%s", str); // output: Hello
The char array str holds the string "Hello".
str represents the address of the string.
The first five elements of array str hold the characters 'H', 'e', 'l', 'l', 'o'.
The sixth element str[5] holds the string terminator '\0'.
The array elements str[6], str[7], str[8], str[9] are not used.
In expressions, the array name str converts to a char pointer to the first character.
The length of the string str is 5.
The array str has 10 elements.
The size of the array str, sizeof(str), is 10 bytes.
char array initialization by string
char name[] = "cp264";
// is equivalent to
char name[6] = {'c', 'p', '2', '6', '4', '\0'};
• The compiler allocates the char array name with 6 elements (6 bytes).
• It assigns the values 'c', 'p', '2', '6', '4', '\0' to the array elements.
• The length of the string that array name holds is 5.
• The size of the array name is 6.
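The difference between string length and array size can be checked in code. The following is a minimal sketch, assuming the library function strlen() from <string.h>; the helper name unused_bytes is hypothetical and only used for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of array elements left unused after the
   string and its terminator. capacity is the array size in bytes,
   normally obtained with sizeof on the array itself. */
size_t unused_bytes(const char *s, size_t capacity) {
    size_t used = strlen(s) + 1;   /* non-null characters plus '\0' */
    return capacity > used ? capacity - used : 0;
}
```

For char str[10] holding "Hello", unused_bytes(str, sizeof(str)) gives 4, which corresponds to the unused elements str[6] to str[9]. For char name[] = "cp264" the result is 0, because the compiler sizes the array to fit exactly.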
2. String pointers
• Since a string is stored in a char array, char pointers and
pointer operations can be used to process strings.
char name[20] = "cp264";
char *p = &name[0]; // or char *p = name;
printf("%s\n", p); // output: cp264
printf("%c\n", *p); // output: c
printf("%c\n", *(p+3)); // output: 6
printf("%s\n", p+2); // output: 264
*p = 'C';
*(p+1) = 'P';
printf("%s\n", p); // output: CP264
p is a char pointer and a char occupies 1 byte, so p+1 increases the
address held in p by 1.
String pointer
char *p = "cp264";
printf("%s\n", p); // output: cp264
printf("%c\n", *(p+2)); // output: 2
• When a string is declared like
char *p = "cp264";
the compiler stores the string literal "cp264" in a char
array, typically in read-only memory, and the char pointer p
points to that array. Operations that write through p, like
*(p+1) = 'P';, are not allowed (the behavior is undefined).
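When the characters of a string literal need to be changed, one common approach is to copy the literal into a writable char array first. The sketch below assumes the caller supplies the array and its size; the name make_writable_copy is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: copy a string (for example a string literal) into a
   caller-supplied writable array. Returns 0 on success, or -1 if an argument
   is NULL or the array is too small to hold the string and its '\0'. */
int make_writable_copy(char *dest, size_t capacity, const char *literal) {
    if (dest == NULL || literal == NULL) return -1;
    size_t needed = strlen(literal) + 1;   /* include the terminator */
    if (needed > capacity) return -1;
    memcpy(dest, literal, needed);
    return 0;
}
```

After make_writable_copy(buf, sizeof(buf), "cp264") succeeds on a char buf[20], an assignment such as *(buf+1) = 'P' is valid because buf is an ordinary array.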
String traversal
void display_string(char s[]) {
char *p = s;
while( *p != '\0') { printf("%c", *p); p++; }
}
char str[] = "cp264";
display_string(str) ;
This prints the string: cp264
Since a string has a terminator, there is no need to pass the
length of a string to a function.
Other implementations of the display function
void display_string(char *s) {
char *p = s;
while( *p != '\0') { printf("%c", *p); p++; }
}
void display_string(char *s) {
while( *s != '\0') { printf("%c", *s); s++; }
}
void display_string(char *s) {
while( *s ) { printf("%c", *s++); }
}
Array of strings
• An array of strings is a sequence of strings stored in a 2D char
array, each row holds one string.
char names[20][40] = {
"cp264",
"data structures II",
"C programming language"
};
– This declares a 2D char array that can hold 20 strings, each of maximum
length 39. It also initializes the first three rows with the given strings.
– names[i] refers to row i, the ith string in names; in expressions it converts to a char pointer to the first character of that row.
printf("%s", names[0]); // output: cp264
printf("%s", names[1]); // output: data structures II
printf("%s", names[2]); // output: C programming language
for (int i=0;i<2;i++) printf("%s ", names[i]); // output: cp264 data structures II
printf("%s", names[1]); // output: data structures II
printf("%c", *(names[1]+5)); // output: s
*(names[1]+5) = 'S';
printf("%c", *(names[1]+5)); // output: S
printf("%s", names[1]); // output: data Structures II
Array of char pointers
Use an array of char pointers (char pointer array) to represent a list of
strings.
char *sp[2];
char *s0 = "cp264";
char *s1 = "data structures II";
sp[0] = s0;
sp[1] = s1;
for (int i =0; i<2;i++) printf("%s ", sp[i]);
// output: cp264 data structures II
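A char pointer array can be passed to a function together with its element count, in the same way as any other array. The following sketch, which assumes strlen() from <string.h>, scans such an array for its longest string; the name longest_index is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the index of the longest string in an array
   of n char pointers, or -1 if n is 0. On ties the first one wins. */
int longest_index(const char *list[], int n) {
    int best = -1;
    size_t best_len = 0;
    for (int i = 0; i < n; i++) {
        size_t len = strlen(list[i]);
        if (best == -1 || len > best_len) {
            best = i;
            best_len = len;
        }
    }
    return best;
}
```

With an array holding "cp264" and "data structures II", longest_index returns 1.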
Command line arguments
-- an application of char pointer arrays
int main( int argc, char *argv[] )
• Parameter argc is the number of command line
arguments. It counts every item on the command line separated by spaces, including the program name.
• Parameter argv is an array of char pointers; argv[i] points to
the ith argument string.
Command:
a.out argument1 argument2
argc will have value 3
argv[0] points to string "a.out"
argv[1] points to string "argument1"
argv[2] points to string "argument2"
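A program usually walks through argv with a loop from index 1 to argc-1, skipping the program name in argv[0]. The following is a minimal sketch of that loop, assuming strcmp() from <string.h>; the helper name find_argument is hypothetical and would be called from main with its own argc and argv.

```c
#include <string.h>

/* Hypothetical helper: return the index of the first command line argument
   equal to target, searching argv[1] .. argv[argc-1] (the program name in
   argv[0] is skipped). Returns -1 if no argument matches. */
int find_argument(int argc, char *argv[], const char *target) {
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], target) == 0) return i;
    }
    return -1;
}
```

For the command a.out argument1 argument2, the call find_argument(argc, argv, "argument2") returns 2.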
3. String operations
1. Read string from stdin
2. Write string to stdout
3. Length of a string
4. Change case
5. Copy string
6. Concatenate string
7. Compare string
8. Reverse string
General algorithm of string processing
Input: string str[] or char *str
Step 1: Set char *ptr = str
Step 2:
while (*ptr) { // traversal / scan the string by loop
If pattern is matched
take action
ptr++;
}
Step 3: stop
The same pattern applies to processing data in arrays in general.
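As one concrete instance of the general algorithm, the sketch below counts how often a given character occurs in a string. The pattern is "the current character equals target" and the action is "increment the counter". The function name count_char is hypothetical.

```c
#include <stddef.h>

/* Hypothetical example of the pattern/action loop: count the occurrences
   of target in str. Returns -1 if str is NULL. */
int count_char(const char *str, char target) {
    if (str == NULL) return -1;
    int count = 0;
    const char *ptr = str;        /* Step 1: set ptr to the start */
    while (*ptr) {                /* Step 2: scan until '\0' */
        if (*ptr == target)       /* pattern */
            count++;              /* action */
        ptr++;
    }
    return count;                 /* Step 3: stop */
}
```

For example, count_char("data structures", 't') returns 3.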
Reading strings from stdin (keyboard)
The stdio library has three functions for getting input from stdin (keyboard).
char str[100];
getchar() – gets and returns one character from the keyboard.
gets(char *p) – gets a line from the keyboard and stores it in p.
gets(str); waits for the user to type a string and hit the enter key to
terminate the input. Note: gets() does not check the array size and was removed from the language in C11; use fgets() instead.
scanf("%s", char *p) – gets a string from the keyboard.
scanf("%s", str); waits for the user to type a string and hit the enter
key to terminate the input. It stops reading at the first whitespace character.
getchar() reads and returns only one character from the stdin buffer
at a time. To get a string with getchar(), call it
repeatedly until the stop character '\n' is
encountered, then insert the null character at the end.
Note: the enter key character is translated to '\n' before
the program reads it.
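int i=0;
int ch = getchar (); // int, so that EOF can be detected
while(ch != '\n' && ch != EOF && i < 99) // leave room for '\0' in str[100]
{ str[i] = (char) ch;
i++;
ch = getchar();
}
str[i] = '\0';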
Both gets() and scanf() functions insert the typed
characters one after another into the array str starting
from location str[0], and insert '\0' at the end.
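char name[20]; // fgets example
fgets(name, sizeof(name), stdin); // reads the whole line, including '\n'
cp264 2024<enter>
printf("%s", name); // output: cp264 2024 (and the newline stored by fgets)
char name[20]; // scanf example
scanf("%19s", name); // reads up to the first whitespace character
cp264 2024<enter>
printf("%s", name); // output: cp264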
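Because fgets() keeps the newline character in the array, programs often remove it after reading. The sketch below shows one way to do that, assuming strcspn() from <string.h>; the helper name read_line and its stream parameter are hypothetical (pass stdin to read from the keyboard).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from the stream in into buf, which has
   capacity bytes, and drop the trailing '\n' if one was stored.
   Returns the length of the resulting string, or -1 at end of input or on
   error. */
int read_line(FILE *in, char *buf, int capacity) {
    if (in == NULL || buf == NULL || capacity <= 0) return -1;
    if (fgets(buf, capacity, in) == NULL) return -1;
    size_t len = strcspn(buf, "\n");  /* index of '\n', or of '\0' if none */
    buf[len] = '\0';
    return (int) len;
}
```

With the input line cp264 2024 and a char name[20], read_line(stdin, name, sizeof(name)) stores "cp264 2024" without the newline and returns 10.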
Output string to stdout (screen)
• stdio has three functions for stdout (screen) output
putchar(ch) - print a character
puts(str) - print a string followed by a newline
printf("%s", str) - formatted printing
Example
char str[] = "data structures";
char *p=str;
while(*p) putchar(*p++);
puts(str);
printf("%s", str);
Compute the length of a string
• The length of a string is the number of non-null characters.
Example: the length of "cp264" is 5, length of "hello c" is 7.
int str_length(char *s) {
if (s == NULL) return -1;
int counter = 0;
while (*s) { // pattern: *s != '\0'
counter++; // action
s++;
}
return counter;
}
char a[50] = "C language";
printf("%d", str_length(a)); // output 10
String copy
char a[] = "C programming language";
char c[30];
int i;
for (i = 0; *(a+i) !='\0'; i++) {
*(c+i) = *(a+i);
}
*(c+i) = '\0'; // add null to the end
char *p1, *p2;
p1 = a;
p2 = c;
for (; *p1!='\0'; p1++,p2++) {
*p2 = *p1;
}
*p2 = '\0'; // add null to the end
void copy_string(char *from, char *to) { // version 1
for (; *from!='\0'; from++, to++) { // pattern: *from!= 0
*to = *from; // action
}
*to = '\0';
}
void copy_string(char *from, char *to){ // version 2
while ((*to = *from) != '\0') { to++; from++; }
}
void copy_string(char *from, char *to){ // version 3
while ((*to++ = *from++) != '\0') ;
}
Converting lower case to upper case
void upper_case(char *s) {
if (s == NULL) return;
while (*s) {
if (*s >= 'a' && *s <= 'z' ) // pattern
*s -= 32; // action
s++; // move to next character
}
}
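The subtraction of 32 relies on the ASCII layout, where each lower case letter is 32 above its upper case form. As an alternative sketch, the standard function toupper() from <ctype.h> performs the same conversion without that assumption; the function name upper_case_ctype is hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical variant of upper_case that uses the library function
   toupper() instead of subtracting 32. The cast to unsigned char is needed
   because toupper() expects a value in the unsigned char range (or EOF). */
void upper_case_ctype(char *s) {
    if (s == NULL) return;
    while (*s) {
        *s = (char) toupper((unsigned char) *s);
        s++;   /* move to next character */
    }
}
```

Applied to a char array holding "cp264 Data", the array then holds "CP264 DATA"; digits and spaces are left unchanged.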
Concatenate string
• Appending a source string to the end of a destination string
is like copying the source string to the end position of the
destination string.
• The algorithm first traverses the destination string to the null
position, then starts to copy characters from the source
string.
void append_string(char *from, char *to) {
if (from == NULL || to == NULL) return;
while (*to) to++; // traverse to the null position of to string
while ((*to++ = *from++) != '\0') ; // copy string
}
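append_string assumes that the array holding the to string has enough room for the combined string. The sketch below is a hypothetical bounded variant that takes the array size as an extra parameter and refuses to append when the result would not fit; the name append_string_bounded and the capacity parameter are assumptions for illustration.

```c
#include <stddef.h>

/* Hypothetical bounded variant of append_string. capacity is the total size
   in bytes of the array that holds the to string. Returns 0 on success and
   -1 if an argument is NULL or the combined string does not fit; on failure
   the to string is left unchanged. */
int append_string_bounded(const char *from, char *to, size_t capacity) {
    if (from == NULL || to == NULL) return -1;
    size_t to_len = 0, from_len = 0;
    while (to_len < capacity && to[to_len]) to_len++;   /* find the null position */
    if (to_len == capacity) return -1;                  /* no terminator within capacity */
    while (from[from_len]) from_len++;                  /* length of the source */
    if (to_len + from_len + 1 > capacity) return -1;    /* +1 for '\0' */
    char *dst = to + to_len;
    while ((*dst++ = *from++) != '\0') ;                /* copy string */
    return 0;
}
```

With char s1[20] = "hello ", appending "world." succeeds and s1 holds "hello world."; with an 8-byte array the same call returns -1.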
Comparing two strings in dictionary (lexicographic) order
Input: two strings s1 and s2.
Output: 0 if s1 and s2 are exactly the same (s1 == s2).
Otherwise, let i be the first index such that s1[i] != s2[i]:
if s1[i] > s2[i] return 1 (s1 > s2),
else return -1 (s1 < s2).
int compare_string(char *s1, char *s2) {
while (*s1 || *s2) {
if (*s1 > *s2) // pattern
return 1; // action
if (*s1 < *s2) // pattern
return -1; // action
s1++;
s2++;
}
return 0;
}
Reversing a string in place
• If string s1 is "HELLO", the reverse of s1 is "OLLEH".
• To reverse a string in place we just need to swap the first
character with the last, the second character with the second last,
and so on until the middle is reached.
void reverse_string(char *s) {
if (s == NULL || *s == '\0') return; // nothing to reverse
char *p = s, temp;
while (*p) p++; // traverse p to the null position
p--; // now p points to the last non-null character
while (s < p) { // s and p point to symmetric positions
temp = *s; // swap
*s = *p;
*p = temp;
s++; // move forward
p--; // move backward
}
}
4. String library functions
#include <string.h>
1. String copy function
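char *strcpy( char *destination, const char *source );
Copy the source string, including the null terminator, to the destination location. The destination array must be large enough to hold it.
char str1[30], str2[30]; // "cp264 data structures II" needs 25 bytes
strcpy( str1, "cp264 data structures II" );
strcpy( str2, str1);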
2. Memory copy function
void *memcpy(void *destination, const void *source, size_t n);
Copy n bytes from the source location to the destination location. The two memory areas must not overlap.
char src[50] = "cp264 data structures II";
char dest[50];
memcpy(dest, src, strlen(src)+1);
3. String length function
size_t strlen( const char *str ); /* returns string length */
e.g.
char str1[20];
strcpy( str1, "cp264" );
printf("%zu", sizeof(str1)); // output 20
printf("%zu", strlen(str1)); // output 5
4. Concatenate two strings
char *strcat( char *destination, const char *source );
– Source is appended to destination; the destination array must have room for the combined string.
char s1[20] = "hello ";
char s2[20] = "world.";
strcat(s1, s2); // s1 holds hello world.
5. Compare two strings
int strcmp( const char *first, const char *second );
returns a negative value if first < second
0 if first == second
a positive value if first > second
char s1[20] = "hello ";
char s2[20] = "world.";
printf("%d", strcmp(s1, s2)); // output: a negative value (-1 on many implementations)
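The C standard only guarantees the sign of the strcmp() result, not the exact values -1 and 1, so portable code should test the result with < 0, == 0, or > 0. When exactly -1, 0, or 1 is wanted, the result can be normalized, as in this sketch; the helper name compare_sign is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: reduce the strcmp() result to exactly -1, 0, or 1.
   (r > 0) and (r < 0) each evaluate to 0 or 1, so their difference is the
   sign of r. */
int compare_sign(const char *first, const char *second) {
    int r = strcmp(first, second);
    return (r > 0) - (r < 0);
}
```

For example, compare_sign("hello ", "world.") returns -1 on every conforming implementation, because 'h' comes before 'w'.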
6. Finds the first occurrence of substring
char *strstr(const char *s1, const char *s2);
returns a NULL pointer if s2 does not occur in s1, else
a pointer to the first substring of s1 that matches s2;
returns s1 if s2 is the empty string.
char s1[] = "CP264 Data Structures II";
char s2[] = "Data";
char *p = strstr(s1, s2); // returns the address of the 'D' in s1
printf("%s", p); // output: Data Structures II
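Since strstr() returns a pointer into the searched string, it can be called again from a later position to find further matches. The sketch below counts non-overlapping occurrences this way; the function name count_occurrences is hypothetical, and treating an empty search string as zero matches is a choice made for this example.

```c
#include <string.h>

/* Hypothetical helper: count the non-overlapping occurrences of s2 in s1.
   Returns 0 if either argument is NULL or s2 is the empty string. */
int count_occurrences(const char *s1, const char *s2) {
    if (s1 == NULL || s2 == NULL || *s2 == '\0') return 0;
    int count = 0;
    size_t step = strlen(s2);
    /* after each match, continue searching just past the matched substring */
    for (const char *p = strstr(s1, s2); p != NULL; p = strstr(p + step, s2)) {
        count++;
    }
    return count;
}
```

For example, count_occurrences("CP264 Data Structures II", "Data") returns 1.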
7. String token function
char *strtok(char *str, const char *delim);
returns a pointer to the next token (word) in str, where tokens are separated by
the delimiter characters in delim, or NULL when no tokens remain. The first call passes str; later calls pass NULL to continue with the same string. strtok modifies str by writing '\0' over delimiters.
Example
const char *delim = ",. "; // comma, period, space as delimiters
char str[30] = "cp264, data structures.";
char *token = strtok(str, delim); // get the first word
printf("%s", token); // output cp264
token = strtok(NULL, delim); // get the next word
printf("%s", token); // output data
token = strtok(NULL, delim); // get the next word
printf("%s", token); // output structures
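When the number of words is not known in advance, the repeated strtok() calls are usually written as a loop that stops when NULL is returned. A minimal sketch follows; the function name print_tokens is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: print every token of str on its own line and return
   the number of tokens. str must be a writable array, because strtok()
   overwrites delimiters with '\0'. */
int print_tokens(char *str, const char *delim) {
    int count = 0;
    for (char *token = strtok(str, delim); token != NULL;
         token = strtok(NULL, delim)) {
        printf("%s\n", token);
        count++;
    }
    return count;
}
```

For char str[30] = "cp264, data structures."; and the delimiters ",. ", the function prints cp264, data, and structures on separate lines and returns 3. Afterwards str itself reads as "cp264", since the first delimiter has been replaced by '\0'.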