Question

I can't think of a way to implement pipelining in C that would actually work, which is why I've decided to ask here. I should say that I do understand how pipe/fork/mkfifo work, and I've seen plenty of examples implementing 2-3 pipes; that part is easy. My problem starts when I have to implement a shell where the number of pipes is unknown.

What I've got now, e.g.:

ls -al | tr a-z A-Z | tr A-Z a-z | tr a-z A-Z

I transform such a line into something like this:

array[0] = {"ls", "-al", NULL"}
array[1] = {"tr", "a-z", "A-Z", NULL"}
array[2] = {"tr", "A-Z", "a-z", NULL"}
array[3] = {"tr", "a-z", "A-Z", NULL"}

So later on I can call

execvp(array[i][0], array[i])

for each command.
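
Just to be concrete, in C I mean a structure roughly like this (the fixed sizes are only for the example; the real thing is built by my parser):

    char *array[4][5] = {
        { "ls", "-al", NULL },
        { "tr", "a-z", "A-Z", NULL },
        { "tr", "A-Z", "a-z", NULL },
        { "tr", "a-z", "A-Z", NULL },
    };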

Until now, I believe everything is OK. The problem starts when I try to redirect those commands' input/output to each other.

Here's how I'm doing that:

    mkfifo("queue", 0777);

    for (i = 0; i<= pipelines_count; i++) // eg. if there's 3 pipelines, there's 4 functions to execvp
    {
    int b = fork();             
    if (b == 0) // child
        {           
        int c = fork();

        if (c == 0) 
        // baby (younger than child) 
        // I use c process, to unblock desc_read and desc_writ for b process only
        // nothing executes in here
            {       
            if (i == 0) // 1st pipeline
                {
                int desc_read = open("queue", O_RDONLY);
                // dup2 here, so after closing there's still something that can read from 
                // from desc_read
                dup2(desc_read, 0); 
                close(desc_read);           
                }

            if (i == pipelines_count) // last pipeline
                {
                int desc_write = open("queue", O_WRONLY);
                dup2(desc_write, 0);
                close(desc_write);                              
                }

            if (i > 0 && i < pipelines_count) // pipeline somewhere inside
                {
                int desc_read = open("queue", O_RDONLY);
                int desc_write = open("queue", O_WRONLY);
                dup2(desc_write, 1);
                dup2(desc_read, 0);
                close(desc_write);
                close(desc_read);
                }               
            exit(0); // closing every connection between process c and pipeline             
            }
        else
        // b process here
        // in b process, i execvp commands
        {                       
        if (i == 0) // 1st pipeline (changing stdout only)
            {   
            int desc_write = open("queue", O_WRONLY);               
            dup2(desc_write, 1); // changing stdout -> pdesc[1]
            close(desc_write);                  
            }

        if (i == pipelines_count) // last pipeline (changing stdin only)
            {   
            int desc_read = open("queue", O_RDONLY);                                    
            dup2(desc_read, 0); // changing stdin -> pdesc[0]   
            close(desc_read);           
            }

        if (i > 0 && i < pipelines_count) // pipeline somewhere inside
            {               
            int desc_write = open("queue", O_WRONLY);       
            dup2(desc_write, 1); // changing stdout -> pdesc[1]
            int desc_read = open("queue", O_RDONLY);                            
            dup2(desc_read, 0); // changing stdin -> pdesc[0]
            close(desc_write);
            close(desc_read);                               
            }

        wait(NULL); // it wait's until, process c is death                      
        execvp(array[0],array);         
        }
        }
    else // parent (waits for 1 sub command to be finished)
        {       
        wait(NULL);
        }       
    }

Thanks.

Solution

Patryk, why are you using a fifo, and moreover the same fifo for each stage of the pipeline?

It seems to me that you need a pipe between each stage. So the flow would be something like:

Shell             ls               tr                tr
-----             ----             ----              ----
pipe(fds);
fork();  
close(fds[0]);    close(fds[1]);
                  dup2(fds[0],0); 
                  pipe(fds);
                  fork();         
                  close(fds[0]);   close(fds[1]);  
                  dup2(fds[1],1);  dup2(fds[0],0);
                  exec(...);       pipe(fds);
                                   fork();     
                                   close(fds[0]);     etc
                                   dup2(fds[1],1);
                                   exec(...);

The sequence that runs in each forked shell (close, dup2, pipe, etc.) is a natural candidate for a function, taking the name and parameters of the desired process. Note that up until the exec call, each of these is a forked copy of the shell.
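
To make that concrete, here is a minimal sketch of one way to code it. It loops in the shell rather than chaining the forks as in the diagram, but the per-process steps (pipe, fork, close, dup2, exec) are the same. The names run_pipeline, cmds and cmd_count are invented for the example, the argument vectors are assumed to have the same shape as the array[i] arrays in the question, and error handling is kept to a minimum:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Spawn every stage of the pipeline, connecting stage i's stdout to
     * stage i+1's stdin through a fresh pipe for every link.
     * cmds[i] is a NULL-terminated argument vector, e.g.
     * cmds[0] = {"ls", "-al", NULL}, cmds[1] = {"tr", "a-z", "A-Z", NULL}, ... */
    static void run_pipeline(char **cmds[], int cmd_count)
    {
        int prev_read = -1;                  /* read end of the previous pipe */

        for (int i = 0; i < cmd_count; i++) {
            int fds[2] = { -1, -1 };

            if (i < cmd_count - 1 && pipe(fds) == -1) {
                perror("pipe");
                exit(1);
            }

            pid_t pid = fork();
            if (pid == -1) {
                perror("fork");
                exit(1);
            }

            if (pid == 0) {                  /* child: wire up stdin/stdout, then exec */
                if (prev_read != -1) {       /* not the first command: read from the previous pipe */
                    dup2(prev_read, 0);
                    close(prev_read);
                }
                if (i < cmd_count - 1) {     /* not the last command: write into the new pipe */
                    dup2(fds[1], 1);
                    close(fds[0]);
                    close(fds[1]);
                }
                execvp(cmds[i][0], cmds[i]);
                perror("execvp");            /* only reached if exec fails */
                _exit(127);
            }

            /* parent (the shell): close the ends it no longer needs
             * and keep the read end for the next stage */
            if (prev_read != -1)
                close(prev_read);
            if (i < cmd_count - 1) {
                close(fds[1]);
                prev_read = fds[0];
            }
        }

        while (wait(NULL) > 0)               /* reap all stages once they have been started */
            ;
    }

The important detail is that every process closes every pipe descriptor it does not use; otherwise a downstream command never sees end-of-file on its stdin, because some copy of the write end is still open somewhere.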

Edit:

Patryk:

Also, is my thinking correct? Shall it work like that? (pseudocode): 
start_fork(ls) -> end_fork(ls) -> start_fork(tr) -> end_fork(tr) -> 
start_fork(tr) -> end_fork(tr) 

I'm not sure what you mean by start_fork and end_fork. Are you implying that ls runs to completion before tr starts? That isn't really what the diagram above means. Your shell does not wait for ls to complete before starting tr. It starts all of the processes in the pipeline in sequence, setting up stdin and stdout for each one so that the processes are linked together: stdout of ls to stdin of the first tr, stdout of that tr to stdin of the next tr. That is what the dup2 calls are doing.

The order in which the processes run is determined by the operating system (the scheduler), but clearly if tr runs and reads from an empty stdin, it has to wait (block) until the preceding process writes something to the pipe. It is quite possible that ls runs to completion before tr even reads from its stdin, but it is equally possible that it won't. For example, if the first command in the chain were something that ran continuously and produced output along the way, the second command in the pipeline would get scheduled from time to time to process whatever the first sends along the pipe.

Hope that clarifies things a little :-)

OTHER TIPS

It might be worth using libpipeline. It takes care of all of this effort for you, and you can even include functions in your pipeline.

The problem is that you're trying to do everything at once. Break it into smaller steps instead:

1) Parse your input to get ls -al | out of it.

1a) From this you know that you need to create a pipe, move it to stdout, and start ls -al. Then move the pipe to stdin. There's more coming, of course, but you don't worry about that in code yet.

2) Parse the next segment to get tr a-z A-Z |. Go back to step 1a as long as the next-to-spawn command's output is being piped somewhere. (A rough sketch of the parsing step follows below.)
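
One way the parsing half might look is sketched here. It is only an illustration: parse_pipeline, MAX_CMDS and MAX_ARGS are made-up names and limits, it uses strtok_r to split the line in place, and a real shell would handle quoting and allocate dynamically:

    #include <string.h>

    #define MAX_CMDS 16
    #define MAX_ARGS 16

    /* Split "ls -al | tr a-z A-Z | ..." in place into one argument
     * vector per pipeline segment. Returns the number of segments. */
    static int parse_pipeline(char *line, char *argv_out[MAX_CMDS][MAX_ARGS])
    {
        int ncmds = 0;
        char *seg_save = NULL;

        for (char *seg = strtok_r(line, "|", &seg_save);
             seg != NULL && ncmds < MAX_CMDS;
             seg = strtok_r(NULL, "|", &seg_save)) {

            int argc = 0;
            char *tok_save = NULL;

            for (char *tok = strtok_r(seg, " \t\n", &tok_save);
                 tok != NULL && argc < MAX_ARGS - 1;
                 tok = strtok_r(NULL, " \t\n", &tok_save))
                argv_out[ncmds][argc++] = tok;

            argv_out[ncmds][argc] = NULL;   /* execvp() needs a NULL terminator */
            if (argc > 0)
                ncmds++;
        }
        return ncmds;
    }

Each resulting argv_out[i] can then be handed to the pipe/fork/dup2/execvp sequence described in the accepted answer.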

This question is a bit old, but here's an answer that was never provided: use libpipeline. libpipeline is a pipeline manipulation library. It grew out of the needs of one of the man-page maintainers, who regularly had to run a command like the following (and work around associated OS bugs):

zsoelim < input-file | tbl | nroff -mandoc -Tutf8

Here's the libpipeline way:

pipeline *p;
int status;

p = pipeline_new ();
pipeline_want_infile (p, "input-file");
pipeline_command_args (p, "zsoelim", NULL);
pipeline_command_args (p, "tbl", NULL);
pipeline_command_args (p, "nroff", "-mandoc", "-Tutf8", NULL);
status = pipeline_run (p);

The libpipeline homepage has more examples. The library is also included in many distros, including Arch, Debian, Fedora, Linux from Scratch and Ubuntu.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow