Question

I would like to share my analysis of bucket sort below. There are many ways bucket sort can be implemented; some of them are as follows.
Type 1:
If we know the range of the elements to be sorted, we can set up a bucket for each possible element and just toss each element into its corresponding bucket. We then empty the buckets in order, and the result is a sorted list. In implementing this algorithm, we can easily use an array to represent our buckets, where the value at each array index represents the number of elements in the corresponding bucket. If we have integers in the range [0..max], we set up an array of (max + 1) integers and initialize all the values to zero. We then proceed sequentially through the unsorted array, reading the value of each element, going to the corresponding index in the buckets array, and incrementing the value there.

Time : O(N)
Space: O(1)
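A minimal Python sketch of Type 1 as described above (function and variable names are mine):

```python
def counting_sort_keys(a, max_value):
    """Sort non-negative integers in [0, max_value] by counting occurrences."""
    counts = [0] * (max_value + 1)        # one counter ("bucket") per possible value
    for x in a:
        counts[x] += 1                    # toss the element into its bucket
    result = []
    for value, count in enumerate(counts):
        result.extend([value] * count)    # empty the buckets in order
    return result
```

Note this only reconstructs the keys; it doesn't carry any associated data with them, which is one reason it's better thought of as counting sort (see below).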
Type 2:

Example:to sort an array of people by age
Age is different from arbitrary integers for sorting because it has a small range [0-150] (all people's ages are between 0 and 150). So the quickest way to sort it would be to allocate 151 linked lists (let's call them buckets) and put each person's data structure in the bucket according to his/her age:

Time : O(N+K)
Space: O(N+K)
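A sketch of the age example in Python, assuming each person is a dict with an `"age"` key (plain Python lists stand in for the linked lists):

```python
def sort_people_by_age(people):
    """Bucket people (dicts with an 'age' key in [0, 150]) by exact age."""
    buckets = [[] for _ in range(151)]    # one bucket per possible age
    for person in people:
        buckets[person["age"]].append(person)
    # Concatenating the buckets in order yields a sort that is stable,
    # because appends preserve the original order within each bucket.
    return [p for bucket in buckets for p in bucket]
```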

Type 3 (variation of Type 2, as shown on Wikipedia)

The function nextSort is a sorting function used to sort each bucket. If insertion sort is used, the worst case will be O(n^2); if merge sort is used (so that stability is kept), it will be O(n lg n).
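A sketch of this Wikipedia-style variant in Python, using insertion sort as `nextSort` and assuming the input is floats in [0, 1); the bucket count `k` is a parameter of my own choosing:

```python
def insertion_sort(bucket):
    """Sort a bucket in place; fast when the bucket is small."""
    for i in range(1, len(bucket)):
        key = bucket[i]
        j = i - 1
        while j >= 0 and bucket[j] > key:
            bucket[j + 1] = bucket[j]
            j -= 1
        bucket[j + 1] = key

def bucket_sort(a, k=10):
    """Distribute floats in [0, 1) into k buckets, then nextSort each bucket."""
    buckets = [[] for _ in range(k)]
    for x in a:
        buckets[int(x * k)].append(x)     # scatter by value
    for bucket in buckets:
        insertion_sort(bucket)            # the 'nextSort' step
    return [x for bucket in buckets for x in bucket]
```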

  • Question:
    1> How is it considered a linear sort: is it because of Type 1 or Type 2?
    2> If I use Type 3 as on Wikipedia, which sort is efficient for sorting each bucket?
    I know the reason insertion sort is used in practice is that we expect the buckets to be small, and for small lists insertion sort is much faster than anything else. Even when implementing merge sort or quicksort, insertion sort is used when the list gets small enough (say, below 20 items or so).
    3> For Type 3, on what basis can I decide the number and range of the buckets?
    This is important because if you try doing a bucket sort with a huge number of buckets (for example, much greater than n), the runtime might be dominated by the time required to scan over all the buckets looking for the ones you actually used, even if most of them are empty.

I did my analysis based on:
Wikipedia
How could the complexity of bucket sort is O(n+k)?
Design and Analysis of Algorithms Lecture notes for January 23, 1996
http://www1bpt.bridgeport.edu/~dichter/lilly/bucketsort.htm
http://cs.nyu.edu/courses/fall02/V22.0310-002/lectures/lecture-23.html
How is the complexity of bucket sort is O(n+k) if we implement buckets using linked lists?
What is the worst case complexity for bucket sort?

No correct solution

OTHER TIPS

Type 1:
The first type you describe is not really bucket sort; it's actually counting sort (also called key-indexed counting), although it's considered a variant of bucket sort. The reason is that you're just counting the occurrences of each key rather than storing the keys themselves in the buckets.

Ref: http://en.wikipedia.org/wiki/Counting_sort
Ref: http://www.cs.princeton.edu/courses/archive/spr13/cos226/demo/51DemoKeyIndexedCounting.pdf

Space: O(1)
we can set up buckets for each possible element,

Isn't this contradictory? You're going to declare buckets for every possible element and still claim O(1) space? ;)

If you want the algorithm to be stable, you cannot overwrite the input array either. So in practice you need space requirement n + k for:

  • An output array of the length 'n' (same size as your input array basically)
  • 'k' buckets

If you check the pseudocode for counting sort, you will notice the last loop goes over the input array again to see where every element needs to go. By doing this in the order the elements appear in the input array, you get a stable sort.

PS: Keep in mind that you're not necessarily sorting integers. If the input is an array of characters between A-Z you can also use this algorithm.
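A stable counting-sort sketch in Python along those lines; the `key` function is my addition and maps each element to an integer in [0, k), e.g. a character A-Z to 0-25:

```python
def counting_sort_stable(a, key, k):
    """Stable counting sort: key(x) must be an int in [0, k)."""
    counts = [0] * (k + 1)
    for x in a:
        counts[key(x) + 1] += 1
    for i in range(k):                    # prefix sums: start index of each key
        counts[i + 1] += counts[i]
    output = [None] * len(a)              # the extra O(n) output array
    for x in a:                           # final pass in input order => stable
        output[counts[key(x)]] = x
        counts[key(x)] += 1
    return output
```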

Type 2:

So the quickest way to sort it would be to allocate 151 linked lists (let's call them buckets) and put each person's data structure in the bucket according to his/her age:

It might be the easiest way because you can find the required bucket rather easily, but it's not necessarily the quickest way. Another possibility, for example, is to create buckets for every 10 years:

00 - 09
10 - 19
20 - 29
...

And when you want to insert something into a bucket, you can do:

  • Binary search on the bucket (LinkedList for example) to find the right position
  • Insert the element

This way you also don't need to sort the buckets afterwards because everything is already sorted. Not saying it is a good idea, just pointing out the possibility.
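A sketch of that insert step in Python, with plain lists standing in for the linked lists. (Caveat: the binary search finds the position in O(log b), but a list insert is still O(b) in the bucket size, and a real linked list can't be binary-searched by index at all, so this is illustrative only.)

```python
import bisect

def insert_sorted(decade_buckets, person):
    """Insert a person (dict with an 'age' key) into the bucket for their
    decade, keeping each bucket sorted by age. Hypothetical helper names."""
    bucket = decade_buckets[person["age"] // 10]   # 00-09 -> 0, 10-19 -> 1, ...
    ages = [p["age"] for p in bucket]
    pos = bisect.bisect_right(ages, person["age"])  # binary search for position
    bucket.insert(pos, person)
```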

Questions:

  1. Simply put, it's a linear sort because it takes linear time to sort: both Type 1 and Type 2 take O(n + k). Another important factor is the subalgorithm bucket sort uses to sort each individual bucket. Choosing quicksort, for example, gives a different overall bound than choosing bubble sort, and it's also possible to choose a non-comparison subalgorithm with different bounds. A good choice of subalgorithm, together with a good distribution of elements over the buckets, is what lets bucket sort escape the O(n log n) lower bound of comparison sorting. Keep in mind that O-notation doesn't guarantee speed; it guarantees a growth rate. If your input size doubles from N to 2N, a linear-time algorithm will cope with it better than, for example, an O(n^2) worst-case algorithm like bubble sort.

  2. Insertion sort is indeed efficient for small arrays, and that's mainly why it's chosen, plus the fact that it's stable: if you don't use a stable algorithm to sort the buckets themselves, the whole algorithm (bucket sort) won't be stable.

  3. Hard to say; it depends on the data, in my opinion. If you have to sort 1 million 32-bit integers, you're not going to create 2^32 buckets for them. In that case it might be worth looking at other algorithms (for example LSD radix sort), which would basically use 10 buckets (one for every digit 0-9) per pass.
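To illustrate that last point, a sketch of LSD radix sort in Python for non-negative integers, bucketing by one decimal digit per pass:

```python
def lsd_radix_sort(a, base=10):
    """LSD radix sort for non-negative integers: one stable pass per digit."""
    if not a:
        return a
    digit = 1
    while digit <= max(a):
        buckets = [[] for _ in range(base)]   # one bucket per digit value
        for x in a:
            buckets[(x // digit) % base].append(x)
        a = [x for bucket in buckets for x in bucket]  # stable concatenation
        digit *= base
    return a
```

Each pass is itself a stable bucket sort on one digit, which is why the passes compose into a full sort.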

Bucket sorting is linear-time when each bucket is sorted in linear time. "Type 1" and "Type 2" are both linear-time because all of the values in each bucket compare pairwise equal and need no further sorting.

The answer to your latter two questions is whatever works in practice. Usually the writer of your standard-library sort has determined the appropriate cutoff for insertion sort. I imagine that the performance of bucket sort depends heavily on the data and the memory subsystem in question.

Type 1 and Type 2 as you described them are essentially the same: in both, you know the range of the keys in advance. Yes, in that case it's linear time complexity, since no further sorting is required within each bucket; each bucket contains only a single key value.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow