Gnuplot multiple boxplots from samples

Question 1

The examples use a fixed x-value to position each boxplot:

plot 'data.txt' using (0):1 with boxplot

plots the data from the first column as a boxplot at the x-value 0. For two boxplots, the second column is placed at the x-value 1 accordingly:

set style data boxplot
plot 'data.txt' using (0):1, '' using (1):2
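
For what it's worth, here is a minimal sketch of the same two-column plot with labelled tics. The labels 'A' and 'B', the box width and the fill settings are my own assumptions and have nothing to do with your data, so adjust them as needed:

set style data boxplot
set style fill solid 0.25 border -1   # lightly shaded boxes with a black outline
set boxwidth 0.5                      # box width in x-axis units
set xtics ('A' 0, 'B' 1)              # hypothetical labels at the fixed x-positions
unset key
plot 'data.txt' using (0):1, '' using (1):2

The x-values in parentheses only decide where each box is drawn, so the tic labels have to be placed at those same positions.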

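Gnuplot cannot determine the number of columns automatically, but you can automate this to some extent by reading the header line with a shell command. This assumes that the first line of the file holds one whitespace-separated label per column: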

file = 'data.txt'
header = system('head -n 1 '.file)
N = words(header)

set xtics ('' 1)
set for [i=1:N] xtics add (word(header, i) i)

set style data boxplot
unset key
plot for [i=1:N] file using (i):i

If I duplicate the two columns you showed and label them A, B, C and D, I get the following plot with gnuplot 4.6.3:

[Image: boxplots of the four columns, labelled A, B, C and D]

As you can see, outliers aren't taken into account. To hide the outliers, use the command set style boxplot nooutliers.
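
A minimal sketch of how that fits into the script above. Here N = 4 is an assumption that matches the four-column A B C D example; use the value computed from the header instead if you have one:

file = 'data.txt'
N = 4                            # assumed number of columns (A, B, C, D)

set style data boxplot
set style boxplot nooutliers     # draw only the box and whiskers, no outlier points
unset key
plot for [i=1:N] file using (i):i

To get the points back later, set style boxplot outliers switches them on again.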

Question 2

I had the same issue and found the reason for it. If the same outlier value occurs several times in your data set, gnuplot draws every occurrence as a separate point, and these points form a line, resulting in a graph similar to the one you have shown.

Apparently you can't avoid this or suppress the duplicate points. What you can do is tell gnuplot to extend the whiskers so that they reach the minimum and maximum values. According to Wikipedia, this is one of the accepted conventions for drawing whiskers. I don't know whether it suits your plots, but it sidesteps the issue.
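
As far as I know, the relevant setting is the fraction option of set style boxplot, which tells gnuplot what share of the points the whiskers should span. The following is only a sketch: I am assuming that a fraction of 1.0 is accepted by your gnuplot version, and the file name and column are placeholders:

set style data boxplot
set style boxplot fraction 1.0   # whiskers span 100% of the points, i.e. min to max
unset key
plot 'data.txt' using (0):1

With the whiskers covering every point, nothing is left to be classified as an outlier, so the line of duplicate points should disappear.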

I'm not sure whether this helps you, but maybe someone who comes across it will find it useful, or can even suggest a way to remove the duplicate points for an outlier.