Question

I believe that no question is silly if it is bugging you, so here is mine about pipelining.

What is pipelining?

Theory says: "With pipelining, the CPU begins executing a second instruction before the first instruction is completed. Pipelining results in faster processing because the CPU does not have to wait for one instruction to complete the machine cycle."

My question is: considering that I am working on a uniprocessor system, where only one instruction can be executed at a time, how is it possible for the next instruction to be fetched simultaneously while my CPU is busy? If I am lacking conceptual clarity, please shed some light on this. If there is separate hardware that makes this simultaneous processing happen, what is it? Kindly explain.


Solution

There is indeed separate hardware for fetching. There is a whole collection of separate hardware units, arranged in a pipeline. Each unit executes one part of a different instruction simultaneously. On every clock edge, the results of one stage are passed down to the next.

OTHER TIPS

Pipelining has nothing to do with uni- versus multi-processor systems. It has to do with thinking hard about the steps taken in executing a single instruction on a machine, in hardware.

Imagine you want to implement the MIPS "add-immediate" instruction, addi $t, $s, imm, which adds the integer stored in the register named by $s to an integer imm encoded directly in the instruction, and stores the result in the register named by $t. Think about the steps you'd need to take to do that. Here's one way of breaking it down (for example only; this doesn't necessarily correspond to real hardware):

  1. Parse out the (binary-encoded) instruction to find out which instruction it is.
  2. Once you recognize that it is an addi instruction, parse out the source and destination registers and the literal integer to add.
  3. Read the appropriate register, and compute the sum of its value and the immediate integer.
  4. Write the result into the named result register.
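The four steps above can be sketched in software. This is a toy model, not real hardware: the register file, tuple-encoded instructions, and function names are all invented for illustration.

```python
regs = [0] * 32  # toy register file: 32 integer registers

def decode(instr):
    # Steps 1-2: recognize the opcode, then parse out the
    # destination register, source register, and the immediate.
    op, dest, src, imm = instr
    assert op == "addi"
    return dest, src, imm

def execute(src, imm):
    # Step 3: read the source register and add the immediate.
    return regs[src] + imm

def writeback(dest, value):
    # Step 4: write the result into the destination register.
    regs[dest] = value

regs[2] = 40
dest, src, imm = decode(("addi", 1, 2, 2))  # addi $1, $2, 2
writeback(dest, execute(src, imm))
print(regs[1])  # 42
```

In hardware each of these functions corresponds to a separate physical circuit, which is exactly why they can all be busy at once.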

Now remember, all of this needs to be built in hardware, meaning there are physical circuits associated with each of these steps. And if you executed one instruction at a time, three quarters of those circuits would sit idle at any given moment. Pipelining takes advantage of this observation: if the processor needs to execute two addi instructions in a row, then it can:

  1. Identify the first one
  2. Parse the first one, and identify the second one with circuits that would otherwise be idle
  3. Add the first one, and parse the second
  4. Write out the first one, and add the second
  5. Write out the second one

So now, even though each instruction takes four processing rounds, the processor finishes two instructions in just five rounds total.
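The overlap in that numbered list can be sketched as a schedule: instruction i occupies pipeline stage s during round i + s + 1, so with S stages and N instructions the last one finishes in round N - 1 + S. The stage names below are the ones from the list, not real pipeline terminology.

```python
STAGES = ["identify", "parse", "add", "write"]

def schedule(n_instructions):
    # Map each round number to the work done in that round.
    rounds = {}
    for i in range(n_instructions):
        for s, stage in enumerate(STAGES):
            rounds.setdefault(i + s + 1, []).append(f"instr{i}:{stage}")
    return rounds

for rnd, work in sorted(schedule(2).items()):
    print(rnd, work)
# Two 4-round instructions complete in max round 2 - 1 + 4 = 5.
```

Round 2, for example, shows the key trick: the first instruction is being parsed while the second is being identified by circuits that would otherwise be idle.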

This gets complicated due to the fact that sometimes you've got to wait for one instruction to finish before you know what to do in the next one (or even what the next one is), but that's the basic idea.
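The cost of that waiting can be sketched with some invented numbers: when an instruction depends on the result of the one before it, the pipeline inserts "bubble" rounds in which that instruction makes no progress.

```python
STAGES = 4  # identify, parse, add, write, as in the list above

def total_rounds(n_instructions, bubbles=0):
    # Ideal pipelined time, plus any bubble rounds inserted
    # while an instruction waits on an earlier result.
    return STAGES + (n_instructions - 1) + bubbles

print(total_rounds(2))             # 5: the ideal two-instruction case
print(total_rounds(2, bubbles=2))  # 7: the second instruction waited
```

The two-bubble figure is arbitrary; the real penalty depends on which stages produce and consume the value.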

Rather than try to cram a year-long university course into this text box, I'll point you at a textbook that explains this whole subject in clear detail:

Hennessy, John L.; Patterson, David A. Computer Architecture: A Quantitative Approach, Fifth Edition. Morgan Kaufmann.

Think about those How It's Made or similar TV shows where you see a factory in action. Think about what you may have read or seen about a car factory. The car moves through the factory, starting as a frame or body, and things are added to it as it moves. If you sat outside the building you would see tires, paint cans, rolls of wire, and steel go in, and a steady stream of cars come out. Just because it is a single factory (a uniprocessor) doesn't mean it can't have an assembly line (a pipeline). A uniprocessor with a pipeline is not necessarily executing one instruction at a time, any more than the factory builds one car at a time. A little bit of the construction of each car happens at every station it passes through; likewise, the execution of your program happens a little bit at each station in the pipeline.

The typical simple pipeline has three stages: fetch, decode, and execute. It takes at least three clocks to execute one instruction (usually many more, because I/O is slow). While instruction A is in the execute stage, instruction B is being decoded and instruction C is being fetched. Back at the auto factory, they might produce "one car every 7 minutes." That doesn't mean it takes 7 minutes to make a car; it might take a week to make a car, but they start a new one every 7 minutes, and the average time at each station is such that one rolls out the door every 7 minutes. The same holds here: a pipeline doesn't mean you can fetch, decode, and execute all three steps within a single processor clock. Like the factory, it is more of an average. If you can feed each stage of the pipeline at the processor clock rate, then it will complete one instruction per clock (if designed to do so). These days you can't feed the data and instructions that fast, and there are pipeline stalls and the like, which force you to start over or discard some of the progress and back up.
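The factory arithmetic in that analogy can be made concrete: latency (how long one car takes to build) and throughput (how often a car rolls out the door) are independent. The numbers below are the ones from the analogy, a 7-minute station time and roughly a week per car, not real hardware figures.

```python
STATION_MINUTES = 7    # one car leaves every 7 minutes
STATIONS = 1440        # enough stations that one car takes about a week

# Latency: total time for a single car to pass through every station.
latency_minutes = STATION_MINUTES * STATIONS

# Throughput: cars finished per hour, set only by the station time.
throughput_per_hour = 60 / STATION_MINUTES

print(latency_minutes / 60 / 24)      # days to build one car: 7.0
print(round(throughput_per_hour, 2))  # about 8.57 cars per hour
```

Adding more stations (a deeper pipeline) makes the latency of one car worse while leaving throughput untouched, which is the same trade-off deep processor pipelines make.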

Pipelining is simply taking an assembly line approach to executing instructions in a processor.

I thought it was used when there are branches in the code: the logic predicts which branch will be taken and preloads the instructions for that branch into a cache. If the prediction proves false, it has to throw away those instructions and load the alternative, resulting in a performance loss. But I believe there are patterns in code that make the prediction true more often than not, especially with modern compilers that repeat patterns over and over.

I'm not up on the actual implementation, but I don't really think that additional hardware is necessarily required, although it is useful for optimum speed.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow