

Case Study

User-Defined String I/O Class

Strings are lists of characters; they represent text in a program. Strings are used extensively in all kinds of programs. They label output, they are read from a file and sent to an output stream, or they can be the data that a program processes. Many languages have strings as a built-in data type. The C++ standard library provides a string class, whose declarations are available in the header file <string>. The operations given in the string class include concatenation of two strings using the + operator, searching a string for a substring, determining the number of characters in a string, and some input/output operations. C++ also inherits a primitive string type from C, which is simply an array of type char, with the null character (\0) used to signal the end of the string. C++ also inherits from C a set of string-handling functions in <cstring>, and it provides input/output functions in <iostream> and <fstream>. The input/output functions are the same in both cases and are rather restrictive.

Let's create our own abstract data type String that has general-purpose input and output functions and encapsulate it into a class. We will call our class StrType so as not to confuse it with the library class string.

Logical Level

At the logical level, a string is a finite sequence of alphanumeric characters, characterized by a property called length. The length is the number of characters in the string.

What operations do we want defined on strings? At a minimum we need a primitive constructor that creates a string by initializing it to the empty state, a transformer that reads values from a file or the keyboard into a string, and an observer that sends a copy of the string to the output stream or a file.

If a string has the property of length, we can define initializing a string as setting the length of the string to zero. Reading values into a string is more difficult. We must decide what we mean by a string in an input stream or a file. Do we mean the characters from the current point in the stream until the next whitespace character is encountered (blank, tab, newline, and so forth)? Until the next end-of-line is encountered? Until a special character is encountered? What do we do if the character at the current position of a stream is a white-space character?

Let's examine the operations for string handling available in <iostream> before we make our decision. We assume that string in the following examples is an array of characters.

cin >> string: Whitespace is skipped and characters are collected and stored in string until whitespace is encountered. The stream is left positioned at the first whitespace character encountered. Using the extraction operator (>>) is not appropriate if the string you are trying to enter contains blanks (which are whitespace).

cin.getline(string, max): Whitespace is not skipped; the characters from the current position of the stream up to end-of-line are read and stored into string; the newline character is read but not stored in string. If max-1 characters are read and stored before encountering the newline character, the processing stops with the stream positioned at the next character.

Intermingling these two methods of inputting strings can cause serious problems. The >> operator leaves the stream positioned at whitespace. If that whitespace character is the new-line character and the next operation is cin.getline(string, max), no characters are input because the newline character stops the reading. Therefore, string is the empty string. We want our string-input operation to avoid this problem and to work consistently in all situations.
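The problem can be reproduced without a keyboard by reading from an in-memory stream. The following is a minimal sketch, not part of StrType; the function name LineLengthAfterWord and the buffer size of 32 are assumptions made only for this illustration.

```cpp
#include <cassert>
#include <cstring>
#include <iomanip>
#include <istream>
#include <sstream>   // std::istringstream, for trying the function on in-memory text

// Hypothetical helper: extracts one word with >>, then one line with
// getline, and returns the length of the line that getline stored.
int LineLengthAfterWord(std::istream& in)
{
  const int MAX = 32;            // assumed buffer size for this sketch
  char word[MAX];
  char line[MAX];

  in >> std::setw(MAX) >> word;  // stops at whitespace and leaves it in the stream
  in.getline(line, MAX);         // reads whatever remains of the current line
  return static_cast<int>(std::strlen(line));
}
```

If the word is followed directly by a newline, getline consumes only that newline and stores the empty string, so the function returns 0. If the word is followed by more text on the same line, getline returns that text, including its leading blank.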

Let's allow the user of our string class to decide what is meant by an input string by providing two parameters: a Boolean flag that determines if inappropriate characters should be skipped before inputting the string and a parameter that specifies which characters are legal in the string (anything else ends the input process). If no appropriate characters are found, the string should be set to the empty string.

The only question concerning output is whether the user wants the output to begin on a new line. We can provide a parameter for the user that indicates which format is desired.

Let's summarize our observations in a CRC card before we write a formal specification for our ADT.


Before we present our first ADT specification, a word about notation is in order. Because we want the specification to be as programming language independent as possible, we use the general word "Boolean" for the type name of Boolean variables rather than the C++ word bool. On the other hand, there is no general word for input and output file types, so we use the C++ terms ifstream and ofstream, respectively. We also use the C++ symbol ampersand (&) to indicate reference parameters.

Recall that to distinguish between the logical level and the implementation level, we put logical-level identifiers in handwriting font and implementation-level identifiers in monospaced font. In an ADT specification we use regular paragraph font throughout. In the specifications we also convert the phrases used in the CRC card to operation identifiers.

Application Level

Let's use this very simple set of operations to read words from a file, store them into an array, and print them on the screen, one per line. The input is ordinary text; only alphanumeric characters are allowed in the string. Thus, any nonalphanumeric character acts as a word (string) delimiter.

#include <fstream>
#include "StrType.h"
#include <iostream>
const int MAX_WORDS = 10;
int main()
{
  using namespace std;
  StrType word;
  ifstream inFile;
  StrType words[MAX_WORDS];
  int numWords = 0;
  inFile.open("words.in");

  word.MakeEmpty();
  word.GetStringFile(true, ALPHA_NUM, inFile);
  while (inFile && numWords < MAX_WORDS)
  {
    word.CopyString(words[numWords]);
    numWords++;
    word.GetStringFile(true, ALPHA_NUM, inFile);
  }
  if (inFile)
    cout << "First " << MAX_WORDS << " words on the file: ";
  else
    cout << " Words on the file: ";
  for (int index = 0; index < numWords; index++)
    words[index].PrintToScreen(true);
  return 0;
}

A slight change in the parameters allows us to treat only alphabetic characters as making up a word, with all other characters acting as delimiters: word.GetStringFile(true, ALPHA, inFile). We could add a function that would check whether word was present in the words array and add it only if it were absent, giving us a list of unique words.
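The unique-words idea can be sketched as follows. StrType as specified offers no comparison operation, so this sketch stands in plain null-terminated arrays and std::strcmp for it; the names AddIfAbsent and WORD_SIZE are hypothetical and do not appear in the class.

```cpp
#include <cassert>
#include <cstring>

const int MAX_WORDS = 10;    // as in the program above
const int WORD_SIZE = 101;   // assumed: room for 100 characters plus '\0'

// Adds word to words[0..numWords-1] only if it is not already there
// and room remains. Returns true if the word was added.
bool AddIfAbsent(char words[][WORD_SIZE], int& numWords, const char word[])
{
  assert(static_cast<int>(std::strlen(word)) < WORD_SIZE);

  for (int index = 0; index < numWords; index++)
    if (std::strcmp(words[index], word) == 0)
      return false;                      // already in the list

  if (numWords == MAX_WORDS)
    return false;                        // no room left

  std::strcpy(words[numWords], word);
  numWords++;
  return true;
}
```

A version that works directly on StrType objects would first need an operation that compares two strings, which the specification does not yet provide.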

We use the String ADT (class StrType) as defined by the previous specification many times throughout the rest of this book.

Implementation Level

Now we must determine how we will represent our strings. Recall that C++ implements strings as one-dimensional char arrays with the null character (\0) signaling the end of the string. Another way of implementing a string would be a struct or a class with two data members: an array of type char and an integer variable representing the length. The string characters would be located between position 0 and position length - 1 in the array.
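The two candidate representations can be put side by side in a short sketch. The struct names, the Store and Length helpers, and the capacity MAX_LEN are assumptions made for illustration only; they are not part of StrType.

```cpp
#include <cassert>
#include <cstring>

const int MAX_LEN = 100;   // assumed capacity for this sketch

// Representation 1: null-terminated array; the length is implicit.
struct NullTerminated
{
  char letters[MAX_LEN + 1];   // one extra position for '\0'
};

// Representation 2: array plus an explicit length; no terminator.
struct Counted
{
  char letters[MAX_LEN];
  int length;                  // characters occupy positions 0..length-1
};

// The length must be found by scanning for the terminator.
int Length(const NullTerminated& s)
{
  return static_cast<int>(std::strlen(s.letters));
}

// The length is simply read from the data member.
int Length(const Counted& s)
{
  return s.length;
}

// Stores the same text in both representations.
void Store(const char* text, NullTerminated& a, Counted& b)
{
  int size = static_cast<int>(std::strlen(text));
  assert(size <= MAX_LEN);
  std::strcpy(a.letters, text);          // copies the terminator as well
  b.length = size;
  std::memcpy(b.letters, text, size);    // terminator is not copied
}
```

The storage differs by one char versus one int, and the main processing difference is that the counted form answers a length query without scanning the array.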

Which approach shall we use? The amount of storage required for both array-based designs is nearly the same, and the amount of processing is approximately the same although the algorithms differ somewhat. Let's use the null-terminated method here. To accommodate the null character, we must remember to allocate one more position than the maximum number of characters expected. The maximum number of characters: where does this number come from? Nothing in the specification hints at a limit on the number of characters allowed in a string, but our array-based implementation requires us to specify an array size.

Let's arbitrarily choose a reasonably large number, say 100, for the maximum string length. In the specification file, StrType.h, we define a constant MAX_CHARS to be 100, letting the user know that it is the maximum length allowed. In Chapter 4, we look at a more flexible technique that lets us specify an array size dynamically (at run time) rather than statically (at compile time).

Should the client be responsible for making sure that the string is within the allowable length, or should the code of GetString and GetStringFile check for this problem and discard any characters that cannot be stored? Both approaches have merit. We choose the latter and do the checking within StrType. The specifications need to be changed to reflect this decision.

The specification for the class StrType is contained in the following header file. Note that the postconditions for GetString and GetStringFile have been expanded to describe what happens if the number of characters is too large. Here is file StrType.h.[5]

// Header file for class StrType, a specification for the
//  String ADT

#include <fstream>
#include <iostream>
const int MAX_CHARS = 100;
enum InType {ALPHA_NUM, ALPHA, NON_WHITE, NOT_NEW};
class StrType
{
public:
// Assumptions:
// InType is a data type consisting of the following constants:
//  ALPHA: only alphabetic characters are stored;
//  ALPHA_NUM: only alphanumeric characters are stored;
//  NON_WHITE: all nonwhitespace characters are stored;
//  NOT_NEW: all characters excluding the newline character
//           are stored.
// If skip is true, characters not allowed are skipped until the
//  first allowed character is found. Reading and storing
//  begins with this character and continues until a character
//  not allowed is encountered. This character is read but not
//  stored. If skip is false, reading and storing begins with
//  the current character in the stream.

  void MakeEmpty();
  void GetString(bool skip, InType charsAllowed);
  // Post: If the number of allowable characters exceeds
  //       MAX_CHARS, the remaining allowable characters have
  //       been read and discarded.

  void GetStringFile(bool skip, InType charsAllowed,
    std::ifstream& inFile);
  // Post: If the number of allowable characters exceeds
  //       MAX_CHARS, the remaining allowable characters have been
  //       read and discarded.

  void PrintToScreen(bool newLine);
  void PrintToFile(bool newLine, std::ofstream& outFile);
  int LengthIs();
  void CopyString(StrType& newString);
private:
  char letters[MAX_CHARS + 1];
};

Now we must design the algorithms for our member functions and code them. In Chapter 1, we discussed the testing process and suggested that planning for testing should occur in parallel with the design. Let's practice what we preach, and consider testing as we code the member functions. Our strategy is clear-box testing, because we are planning our testing as we design and code the algorithms.

MakeEmpty When called prior to any other processing, MakeEmpty serves as a primitive constructor that takes the storage structure assigned to a variable of the class type and initializes any data members as necessary. We also can use MakeEmpty to return a structure to the empty state after it has been used. In the case of the null-terminated implementation, storing '\0' in letters[0] changes the instance of StrType from undefined to the empty string. To test this function, we must take a variable of type StrType, apply the function to it, and determine whether the string is empty.

void StrType::MakeEmpty()
// Post:  letters is empty string.
{
  letters[0] = '\0';
}

GetStringFile If skip is true, then characters are read and discarded until one is encountered that is found in the set of allowed characters. This character becomes the first character in the data member letters. Characters are read and stored in letters until a character is read that is not allowed. That character is then discarded. If MAX_CHARS characters are read and stored before a character not allowed is encountered, characters are read and discarded until such a character is encountered or end-of-file is encountered. The last step is to store the null-terminator following the last character stored in letters.

If skip is false, no characters are skipped before reading and storing characters.

How do we determine what to skip and what to store? The constants of InType tell us. We use them as labels on a switch statement.

GetStringFile(Boolean skip, InType charsAllowed, ifstream& inFile)

switch (charsAllowed)
    case ALPHA_NUM : GetAlphaNum(skip, letters)
    case ALPHA     : GetAlpha(skip, letters)
    case NON_WHITE : GetNonWhite(skip, letters)
    case NOT_NEW   : GetTilNew(skip, letters)

We can use the functions available in <cctype> to control our reading in each of the functions. If charsAllowed is ALPHA_NUM, we skip characters until the function isalnum returns true, and store them until isalnum returns false or inFile goes into the fail state. If charsAllowed is ALPHA, we skip characters until the function isalpha returns true, and store them until isalpha returns false or inFile goes into the fail state. If charsAllowed is NON_WHITE, we skip characters until the function isspace returns false, and store them until isspace returns true or inFile goes into the fail state. If charsAllowed is NOT_NEW, we skip characters until the character is not '\n', and store them until the character is '\n' or inFile goes into the fail state.
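All four cases share one skip, store, and discard pattern and differ only in the test applied to each character. The sketch below factors that pattern into a single helper driven by a predicate. It is an alternative arrangement, not the one coded in this section: the names GetChars, AllowedFn, and the four Is... predicates are hypothetical, and the helper takes a std::istream& rather than a std::ifstream& so that it can be tried on in-memory text. The casts to unsigned char are there because the <cctype> functions require an argument representable as unsigned char.

```cpp
#include <cassert>
#include <cctype>
#include <istream>
#include <sstream>   // std::istringstream, for trying the helper on in-memory text

const int MAX_CHARS = 100;

// Hypothetical predicate type: returns true if ch may be stored.
typedef bool (*AllowedFn)(char ch);

bool IsAlphaNum(char ch)   { return std::isalnum(static_cast<unsigned char>(ch)) != 0; }
bool IsAlpha(char ch)      { return std::isalpha(static_cast<unsigned char>(ch)) != 0; }
bool IsNonWhite(char ch)   { return std::isspace(static_cast<unsigned char>(ch)) == 0; }
bool IsNotNewline(char ch) { return ch != '\n'; }

// letters must have room for MAX_CHARS + 1 characters.
void GetChars(bool skip, char letters[], std::istream& in, AllowedFn allowed)
{
  assert(letters != 0 && allowed != 0);
  char letter = '\0';
  int count = 0;

  in.get(letter);
  if (skip)
    // Read and discard characters that are not allowed.
    while (in && !allowed(letter))
      in.get(letter);

  // Store allowed characters, up to MAX_CHARS of them.
  while (in && allowed(letter) && count < MAX_CHARS)
  {
    letters[count] = letter;
    count++;
    in.get(letter);
  }
  letters[count] = '\0';

  // Read and discard any allowed characters that did not fit.
  while (in && allowed(letter))
    in.get(letter);
}
```

With this arrangement each case of the switch would reduce to one call, for example GetChars(skip, letters, inFile, IsAlphaNum). As in the description above, the character that ends the string is read but not stored.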

Each of the four cases has a Boolean parameter that controls processing. Our test driver must call each case with skip set to true and with skip set to false. In addition, each alternative must be examined to determine what characters should appear within the test data to test that alternative. We must be sure that each behaves properly when encountering end-of-file within the skip phase, and we must be sure that our test data include words that are longer than the maximum length MAX_CHARS.

We code GetAlphaNum and GetTilNew here, leaving GetAlpha and GetNonWhite as exercises.

#include <cctype>
// Prototypes of auxiliary functions.
// Note: If skip is true, nonallowable leading characters are
//  skipped. If end-of-file is encountered while skipping
//  characters, the empty string is returned. If the number
//  of allowable characters exceeds MAX_CHARS, the rest are
//  read and discarded.

void GetAlphaNum(bool skip, char letters[], std::ifstream& inFile);
// Post: letters array contains only alphanumeric characters.

void GetAlpha(bool skip, char letters[], std::ifstream& inFile);
// Post: letters array contains only alphabetic characters.

void GetNonWhite(bool skip, char letters[], std::ifstream& inFile);
// Post: letters array contains only nonwhitespace characters.

void GetTilNew(bool skip, char letters[], std::ifstream& inFile);
// Post: letters array contains everything up to newline character.

void StrType::GetStringFile(bool skip, InType charsAllowed,
  std::ifstream& inFile)
{
  switch (charsAllowed)
  {
    case ALPHA_NUM : GetAlphaNum(skip, letters, inFile);
                     break;
    case ALPHA     : GetAlpha(skip, letters, inFile);
                     break;
    case NON_WHITE : GetNonWhite(skip, letters, inFile);
                     break;
    case NOT_NEW   : GetTilNew(skip, letters, inFile);
                     break;
  }
}
void GetAlphaNum(bool skip, char letters [], std::ifstream& inFile)
// Post: If skip is true, non-alphanumeric letters are skipped.
//       Alphanumeric characters are read and stored until a
//       non-alphanumeric character is read or MAX_CHARS characters
//       have been stored. If the stream is not in the fail state,
//       the last character read was a non-alphanumeric character.
{
  using namespace std;
  char letter;
  int count = 0;

  if (skip)
  {// Skip non-alphanumeric characters.
    inFile.get(letter);
    while (!isalnum(letter) && inFile)
      inFile.get(letter);
  }
  else
    inFile.get(letter);
  if (!inFile || !isalnum(letter))
  // No legal character found; empty string returned.
    letters[0] = '\0';
  else
  {// Read and collect characters.
    do
    {
      letters[count] = letter;
      count++;
      inFile.get(letter);
    } while (isalnum(letter) && inFile && (count < MAX_CHARS));

    letters [count] = '\0';
    // Skip extra characters if necessary.
    if (count == MAX_CHARS && isalnum(letter))
      do
      {
        inFile.get(letter);
      } while (isalnum(letter) && inFile);
  }

}

void GetTilNew(bool skip, char letters[], std::ifstream& inFile)
// Post: If skip is true, newline characters are skipped.
//       All characters are read and stored until a newline
//       character is read or MAX_CHARS characters have been
//       stored. If the stream is not in the fail state, the
//       last character read was a newline character.
{
  using namespace std;
  char letter;
  int count = 0;

  if (skip)
  {// Skip newlines.
    inFile.get(letter);
    while ((letter == '\n') && inFile)
      inFile.get(letter);
  }
  else
    inFile.get(letter);
  if (!inFile || letter == '\n')
  // No legal character found; empty string returned.
    letters[0] = '\0';
  else
  {// Read and collect characters.
    do
    {
      letters[count] = letter;
      count++;
      inFile.get(letter);
    } while ((letter != '\n') && inFile && (count < MAX_CHARS));

    letters[count] = '\0';
    // Skip extra characters if necessary.
    if (count == MAX_CHARS && letter != '\n')
      do
      {
        inFile.get(letter);
      } while ((letter != '\n') && inFile);
  }
}

GetString This operation is nearly identical to GetStringFile, with inFile changed to cin. We must write new auxiliary functions like those for GetStringFile, but replacing inFile with cin and removing the file name as a parameter. The same test cases shown later apply to this operation as to GetStringFile. We leave the coding of this function for you.

PrintToScreen and PrintToFile Because we have implemented our string using the same technique employed by C++, we can use cout to print to the screen. If newLine is true, we print a newline character before printing letters. We must test this function with newLine both true and false.

void StrType::PrintToScreen(bool newLine)
// Post:  letters has been sent to the output stream.
{
  using namespace std;
  if (newLine)
    cout << endl;
  cout << letters;
}

PrintToFile is nearly identical to PrintToScreen, with cout replaced with outFile.
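Because cout and an ofstream object are both output streams, the shared logic can be sketched as one function that takes a std::ostream&. This is only an illustration of the similarity between the two operations; the free function PrintLetters is hypothetical and is not a member of StrType.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>   // std::ostringstream, for trying the function without a file

// Hypothetical helper: sends letters to out, starting on a new line
// first if newLine is true.
void PrintLetters(bool newLine, const char letters[], std::ostream& out)
{
  assert(letters != 0);
  if (newLine)
    out << '\n';
  out << letters;
}
```

Under this assumption, PrintToScreen would pass std::cout and PrintToFile would pass outFile. The sketch writes '\n' rather than endl, so it does not flush the stream after the newline.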

LengthIs and CopyString Because our implementation of a string is the same as that found in C++, we can use the strcpy and strlen functions provided by the standard library for these operations. Alternatively, we could write loops to count characters until the null terminator is found for LengthIs and to copy characters from self to newString until the null terminator has been copied for CopyString. We use strcpy here and leave the other implementation as an exercise.

#include <cstring>
void StrType::CopyString(StrType& newString)
// Post: letters has been copied into newString.letters.
{
  std::strcpy(newString.letters, letters);
}

int StrType::LengthIs()
// Post: Function value = length of letters string
{
  return std::strlen(letters);
}
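The loop-based alternative mentioned above might look like the following sketch. It is written as two free functions on character arrays so that it can be tried in isolation; the names LengthOf and CopyLetters are hypothetical, and adapting them into member functions is still left to you.

```cpp
#include <cassert>

// Counts characters until the null terminator is found.
int LengthOf(const char letters[])
{
  assert(letters != 0);
  int count = 0;
  while (letters[count] != '\0')
    count++;
  return count;
}

// Copies characters from source to destination, then terminates the
// copy. destination must have room for the characters plus '\0'.
void CopyLetters(const char source[], char destination[])
{
  assert(source != 0 && destination != 0);
  int index = 0;
  while (source[index] != '\0')
  {
    destination[index] = source[index];
    index++;
  }
  destination[index] = '\0';
}
```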

Test Plan

To test the String ADT, we create a test driver program similar to the one we created at the end of Chapter 1 to test the class FractionType. That test driver accepted a sequence of instructions from an input file that indicated which member function of FractionType to invoke next. The test input also included any parameter values required by the FractionType functions. Results of the function invocations were printed to an output file. Meanwhile, a final count of the number of test cases was indicated in an output window.

Thanks to our planning when we created that test driver, it is not difficult to transform it into a test driver for a different class. To use it to test StrType, we simply change the declarations of the variables to appropriate ones for testing StrType, and rewrite the sequence of if-else statements to invoke and report on the string functions instead of the fraction functions. Because the member functions to be tested involve reading from a file, we use the name of the data file as input to the test driver as well as the name of the input file with the operation names and the output file. Creating the test driver is easy; the difficult part is designing the test cases to use as input data to the test driver.

We must unit test each member function in the class representing the String ADT. We need to go back through our design and collect the tests outlined during the design process. Following is a portion of the test plan that covers the parts of the ADT that we have implemented in this chapter: MakeEmpty, PrintToFile, LengthIs, CopyString, and GetStringFile with ALPHA_NUM and NOT_NEW.

Member Function/Reason for Test Case (parameters)   Input Values                      Expected Output (one word per line) (| stands for newline)

MakeEmpty                                           none                              empty string
PrintToFile                                         none                              blank line
GetStringFile true, ALPHA_NUM                       now is a1,a3## ABCE               now|is|a1|a3|ABCE
GetStringFile false, ALPHA_NUM                      • now is a1,a3## ABCE             ||now|is|a1|a3|||ABCE ||.
GetStringFile true, NOT_NEW                         now is the time a1,a3 ##, ABCE,   now is the time a1,a3 ##, ABCE,
GetStringFile false, NOT_NEW                        now is the time a1,a3 ##, ABCE,   now is the time a1,a3 ##, ABCE,
CopyString                                          ABCE,                             ABCE,
LengthIs                                            ABCE,                             5
GetStringFile                                       empty file                        empty string
GetStringFile                                       string longer than MAX_CHARS      string with first MAX_CHARS characters

Here is a copy of the test driver and the input file. These files and the output files (strType.out and strTest.screen) are available on the Web site.

// Test driver
#include <iostream>
#include <fstream>
#include <string>
#include <cctype>
#include <cstring>
#include "StrType.h"
InType Allowed(std::string& inString);
bool Skip(std::string& inString);
int main()
{
  using namespace std;
  ifstream inFile;       // File containing operations
  ifstream inData;       // Input data file
  ofstream outFile;      // File containing output
  string inFileName;     // Input file external name
  string outFileName;    // Output file external name
  string inDataName;
  string outputLabel;
  string command;        // Operation to be executed
  string skip;
  string allowed;
  StrType inputString;
  int numCommands;

  // Prompt for file names, read file names, and prepare files
  cout << "Enter name of input command file; press return." << endl;
  cin >> inFileName;
  inFile.open(inFileName.c_str());

  cout << "Enter name of output file; press return." << endl;
  cin  >> outFileName;
  outFile.open(outFileName.c_str());

  cout << "Enter name of input data file; press return." << endl;
  cin  >> inDataName;
  inData.open(inDataName.c_str());

  cout << "Enter name of test run; press return." << endl;
  cin  >> outputLabel;
  outFile << outputLabel << endl;

  inFile >> command;
  numCommands = 0;
  while (command != "Quit")
  {
    if (command == "GetString")
    {
      inFile >> skip >> allowed;
      inputString.GetStringFile(Skip(skip),
        Allowed(allowed), inData);
    }
    else if (command == "MakeEmpty")
      inputString.MakeEmpty();
    else if (command == "PrintToFile")
      inputString.PrintToFile(true, outFile);
    else if (command == "PrintToScreen")
      inputString.PrintToScreen(true);
    else if (command == "CopyString")
    {
      StrType secondString;
      inputString.CopyString(secondString);
      outFile << "String to copy: ";
      inputString.PrintToFile(false, outFile);
      outFile << " Copy of string: ";
      secondString.PrintToFile(false, outFile);
    }
    else
    {
      outFile << endl << "length of string " ;
      inputString.PrintToFile(false, outFile);
      outFile << " is "  << inputString.LengthIs()  << endl;
    }
    numCommands++;
    cout <<  " Command number " << numCommands << " completed."
         << endl;
    inFile >> command;
  }

  cout << "Testing completed." << endl;
  return 0;
}

InType Allowed(std::string& inString)
{
  if (inString == "ALPHA_NUM")
    return ALPHA_NUM;
  else if (inString == "ALPHA")
    return ALPHA;
  else if (inString == "NON_WHITE")
    return NON_WHITE;
  else return NOT_NEW;
}

bool Skip(std::string& inString)
{
  if (inString == "true")
    return true;
  else return false;
}

MakeEmpty
PrintToFile
GetString true ALPHA_NUM
PrintToFile
GetString true ALPHA_NUM
PrintToFile
GetString true ALPHA_NUM
PrintToFile
GetString true ALPHA_NUM
PrintToFile
GetString true ALPHA_NUM
PrintToFile
GetString false ALPHA_NUM
PrintToFile
GetString false ALPHA_NUM
PrintToFile
GetString false ALPHA_NUM
PrintToFile
GetString false ALPHA_NUM
PrintToFile
GetString false ALPHA_NUM
PrintToFile
GetString false ALPHA_NUM
PrintToFile
GetString false ALPHA_NUM
PrintToFile
GetString false ALPHA_NUM
PrintToFile
GetString false ALPHA_NUM
PrintToFile
GetString true NOT_NEW
PrintToFile
GetString true NOT_NEW
PrintToFile
GetString true NOT_NEW
PrintToFile
GetString true NOT_NEW
PrintToFile
GetString false NOT_NEW
PrintToFile
GetString false NOT_NEW
PrintToFile
GetString false NOT_NEW
PrintToFile
GetString false NOT_NEW
PrintToFile
LengthIs
CopyString
PrintToScreen
Quit

[5]In the interest of brevity, we do not repeat the preconditions and postconditions on the member function prototypes unless they have changed from those listed in the specification of the ADT. The code available on the Web is completely documented.


