Question

A friend of mine gave me a task to write a program to

replace "a", "an", "the" with blank space in a text file in C.

I wrote that program, but that went too lengthy as I checked "a", "an", "the" individually.

For example, I replaced "a" like this:

NOTE: fs is a pointer to the source file and ft is a pointer to the target file.

while(fgets(str, 100, fs) != NULL)
{
    for(i = 0; str[i] != '\0'; i++)
    {
        if (str[i] ==  ' '  ||
            str[i] ==  '.'  ||
            str[i] ==  ','  ||
            str[i] ==  '\n' ||
            str[i] ==  '\t')
        {
            if (str[i+1] == 'a' || str[i+1] == 'A')
            {
                if (str[i+2] == ' '  ||
                    str[i+2] == '.'  ||
                    str[i+2] == ','  ||
                    str[i+2] == EOF  ||
                    str[i+2] == '\0' ||
                    str[i+2]==  '\n' ||
                    str[i+2]==  '\t')
                {
                    str[i+1]=' ';
                }
            }
        }
    }
    fputs(str,ft);
}

Is there a shorter way to do the same?

Keep in mind that "a", "an", or "the" can be the first word in the source file.

Solution

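Use fscanf to read the file one whitespace-delimited word at a time and fprintf to write each word out, so that checking for "a", "an", and "the" becomes a simple strcmp. Note that this only matches lowercase words that are not attached to punctuation: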

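char s[50];
/* %49s keeps the read inside the buffer. %s skips all whitespace,
   so the original spacing and line breaks are not preserved. */
while (fscanf(fs, "%49s", s) == 1)
{
    if (strcmp(s, "a") == 0 || strcmp(s, "an") == 0 || strcmp(s, "the") == 0)
    {
        fputc(' ', ft);       /* replace the article with a blank */
    }
    else
    {
        fprintf(ft, "%s", s);
    }
    fputc(' ', ft);           /* separator, since %s discarded the whitespace */
}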

OTHER TIPS

You can read the input character by character, for example with getchar, or otherwise keep track of the last few characters even when they came from the previous buffer. Either way, you only need the previous two characters and the current one in a small "rolling array" that you reset at each word boundary.

Using a fixed-size buffer with fgets or fscanf, you need a lot of code to handle special cases. For example, a line need not start with a space or tab; it may start with "the", in which case no separator character precedes the word. The same is true of the whitespace following a word. You can work around this by allocating a little extra space for the buffer, setting its first char to ' ', and calling fgets this way:

 fgets(str + 1, 99, fs)

But you still have the problem of words split across buffer boundaries, where one buffer ends with "... t" and the next fgets gives you "he ...". Just keep an array of 3 chars and its current length, resetting the length to zero at each word boundary.
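The rolling-array idea above can be sketched roughly as follows. This is only an illustration under some assumptions: the names strip_articles, is_article and pend are made up for this example, a "word" is taken to be a run of isalpha characters, and each article is replaced by a single space. Because at most three letters are ever held back, buffer boundaries stop being an issue.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the len letters in w spell "a", "an" or "the" (any case). */
static int is_article(const char *w, size_t len)
{
    char low[4];
    if (len == 0 || len > 3)
        return 0;
    for (size_t i = 0; i < len; i++)
        low[i] = (char)tolower((unsigned char)w[i]);
    low[len] = '\0';
    return strcmp(low, "a") == 0 || strcmp(low, "an") == 0 ||
           strcmp(low, "the") == 0;
}

/* Copies in to out, writing one space in place of each article.
   (Hypothetical helper, not part of any library.) */
void strip_articles(FILE *in, FILE *out)
{
    char pend[3];       /* letters of the current word not yet written */
    size_t len = 0;     /* letters seen so far in the current word */
    int c;

    for (;;)
    {
        c = getc(in);
        if (c != EOF && isalpha(c))
        {
            if (len < 3)
                pend[len] = (char)c;          /* still a candidate: hold it back */
            else
            {
                if (len == 3)
                    fwrite(pend, 1, 3, out);  /* 4th letter: not an article */
                putc(c, out);
            }
            len++;
            continue;
        }
        /* Word boundary (or end of input): resolve the pending letters. */
        if (len > 0 && len <= 3)
        {
            if (is_article(pend, len))
                putc(' ', out);
            else
                fwrite(pend, 1, len, out);
        }
        len = 0;
        if (c == EOF)
            break;
        putc(c, out);
    }
}
```

Calling strip_articles(fs, ft) with the question's file pointers, or strip_articles(stdin, stdout) from main, would turn this into a filter. Words such as "than" pass through untouched because the fourth letter flushes the three held-back letters.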

I think this code works for a reasonably plausible definition of the problem:

#include <ctype.h>
#include <stdio.h>
#include <string.h>

static char const *words[] = { "a", "the", "an" };
enum { NUM_WORDS = sizeof(words) / sizeof(words[0]) };

static void mapword(char *word, int len)
{
    char lower[256];
    word[len] = '\0';
    for (int i = 0; i <= len; i++)
        lower[i] = tolower((unsigned char)word[i]);
    for (int i = 0; i < NUM_WORDS; i++)
    {
        if (strcmp(words[i], lower) == 0)
        {
            putchar(' ');
            return;
        }
    }
    fputs(word, stdout);
}

int main(void)
{
    char word[256];
    int c;
    size_t nletters = 0;

    while ((c = getchar()) != EOF)
    {
        /*
        ** A word longer than 255 letters is split: the first 255 are
        ** flushed, the 256th is written through unchanged, and the
        ** rest starts a new word.  A 257-letter word ending in 'a'
        ** would therefore lose its final 'a', which is an awfully
        ** improbable event.
        */
        if (!isalpha(c) || nletters >= sizeof(word)-1)
        {
            if (nletters > 0)
            {
                mapword(word, nletters);
                nletters = 0;
            }
            putchar(c);
        }
        else
            word[nletters++] = c;
    }

    if (nletters > 0)
    {
        /*
        ** Since a text file should end with a newline, the program
        ** should not get here!
        */
        mapword(word, nletters);
    }

    return 0;
}

For example, given the first three lines of the question as input:

A friend of mine gave me a task to write a program to
replace "a", "an", "the" with blank space in a text file in c.
I wrote that program but that went too lengthy as I checked "a", "an", "the" individually.

the output from the program is:

  friend of mine gave me   task to write   program to
replace " ", " ", " " with blank space in   text file in c.
I wrote that program but that went too lengthy as I checked " ", " ", " " individually.

If you are allowed to call a system command, life is easy: sed is a Linux command that does this kind of substitution.

You can do it as follows:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char * argv[]) 
{
    system("sed 's/an//g;s/a//g;s/the//g' file");
}

If file contains

replace “a”, “an”, “the” with blank space in a text file

the output is

replce “”, “”, “” with blk spce in  text file

Caution: this command deletes the pattern wherever it occurs, including inside other words. It does not check for whole-word matches, and it removes the text rather than replacing it with a space.
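For comparison, a whole-word check can also be done in plain C on a line that is already in memory. The following is a rough sketch, not the answer's method: blank_articles and same_word are hypothetical names, a word is assumed to be a run of isalpha characters, and the line is assumed to fit in one buffer. Unlike the sed command, it overwrites each article with spaces instead of deleting it.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive comparison of exactly len characters of s against word. */
static int same_word(const char *s, size_t len, const char *word)
{
    if (strlen(word) != len)
        return 0;
    for (size_t i = 0; i < len; i++)
        if (tolower((unsigned char)s[i]) != word[i])
            return 0;
    return 1;
}

/* Overwrites every whole-word "a", "an" or "the" in line with spaces.
   (Hypothetical helper, not part of any library.) */
void blank_articles(char *line)
{
    static const char *const articles[] = { "a", "an", "the" };
    size_t i = 0;

    while (line[i] != '\0')
    {
        if (!isalpha((unsigned char)line[i]))
        {
            i++;                        /* skip separators */
            continue;
        }
        size_t start = i;               /* first letter of a word */
        while (isalpha((unsigned char)line[i]))
            i++;                        /* i now points just past the word */
        for (size_t k = 0; k < sizeof articles / sizeof articles[0]; k++)
        {
            if (same_word(line + start, i - start, articles[k]))
            {
                memset(line + start, ' ', i - start);
                break;
            }
        }
    }
}
```

Because only complete words are compared, "blank" and "space" keep their letters, and a word at the very start of the line is handled without a special case.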

Licensed under: CC-BY-SA with attribution