Some refs:
Wikipedia
Linear classifier and
Support vector machine (SVM),
scikit-learn SVM,
an example with 3 classes,
questions/tagged/classification on SO,
3000 more questions/tagged/classification on stats.stackexchange,
400 more questions/tagged/classification on datascience.stackexchange.
For your 2-class problem, do these steps:
find the midpoints (centroids) Rmid of the red points, Bmid of the black points, and Mid of all the points together
draw the line L from Rmid to Bmid
the (hyper)plane through Mid, perpendicular to line L, is what you want: a linear classifier.
Or you can just compare the distances |x - Rmid| and |x - Bmid|: call x red if it is nearer Rmid, black if it is nearer Bmid.
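The steps above can be sketched in a few lines of NumPy. This is a minimal sketch, not a library routine: the array names and the toy 2d clouds are made up for illustration, and the red and black points are assumed to be rows of arrays.

```python
import numpy as np

def fit_midpoint_classifier(red, black):
    """Centroids of each class. The decision boundary is the hyperplane
    through their midpoint, perpendicular to the line L joining them."""
    rmid = red.mean(axis=0)
    bmid = black.mean(axis=0)
    return rmid, bmid

def predict(x, rmid, bmid):
    """Call x red if it is nearer Rmid than Bmid, else black."""
    if np.linalg.norm(x - rmid) <= np.linalg.norm(x - bmid):
        return "red"
    return "black"

# toy data: two well-separated 2d clouds (made up for illustration)
rng = np.random.default_rng(0)
red = rng.normal([0, 0], 1, size=(100, 2))
black = rng.normal([5, 5], 1, size=(100, 2))

rmid, bmid = fit_midpoint_classifier(red, black)
print(predict(np.array([0.5, 0.2]), rmid, bmid))  # a point near the red cloud
```

Comparing distances to the two centroids is exactly the same rule as the perpendicular hyperplane through the midpoint of Rmid and Bmid, just without writing the plane down explicitly.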
But there's more to be said. Projecting all the data points onto line L gives a 1-dimensional problem:
rrrrrrrrrrbrrrrrrrrbbrrr | rrbbbbbbbbbbbbbbb
It's a good idea to plot all the points on this line,
to see and better understand the data.
(For point clouds in say 5 or 10 dimensions, it might be fun and/or informative
to look at 2d or 3d slices from different angles.)
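A possible way to get that 1-dimensional picture: project each point onto the line L through Rmid and Bmid, so each point gets a single coordinate t along L (t = 0 at Rmid, t = 1 at Bmid). The data and names below are assumptions for illustration.

```python
import numpy as np

# toy 2d data, one cloud per class (made up)
rng = np.random.default_rng(1)
red = rng.normal(0, 1, size=(50, 2))
black = rng.normal(4, 1, size=(50, 2))

rmid, bmid = red.mean(axis=0), black.mean(axis=0)
direction = bmid - rmid  # the line L runs along this vector

# signed 1-d coordinate of each point along L: 0 at Rmid, 1 at Bmid
t_red = (red - rmid) @ direction / (direction @ direction)
t_black = (black - rmid) @ direction / (direction @ direction)

# print one line of r's and b's in sorted order, like the picture above
labels = ["r"] * len(t_red) + ["b"] * len(t_black)
order = np.argsort(np.concatenate([t_red, t_black]))
print("".join(labels[i] for i in order))
```

Plotting t_red and t_black as two overlaid histograms (or just printing the sorted r/b string as above) shows how much the classes overlap along L.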
Each cut, the "|" above, gives a "confusion matrix" of 4 numbers:

    R-correct    R-called-B        e.g.   490    10
    B-called-R   B-correct                 50   450
This gives a rough idea of the error rates of your red / black predictions; print it and discuss it.
The best cut depends on costs,
e.g. if calling an R a B is 10 times or 100 times worse than calling a B an R.
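One way to pick the cut under such costs: sweep candidate cut positions along the projected 1-d coordinates, count the two kinds of error at each cut, and keep the cheapest. A small sketch, with made-up projected coordinates and made-up costs (cost_rb = cost of calling a true R a B, cost_br the reverse); the convention here is that red lies on the low-t side:

```python
import numpy as np

def best_cut(t_red, t_black, cost_rb=10.0, cost_br=1.0):
    """Try every data point as a cut position; predict R for t <= cut,
    B for t > cut; return the (cut, cost) with minimal total cost."""
    cuts = np.sort(np.concatenate([t_red, t_black]))
    best = None
    for c in cuts:
        r_called_b = np.sum(t_red > c)     # true R predicted B
        b_called_r = np.sum(t_black <= c)  # true B predicted R
        cost = cost_rb * r_called_b + cost_br * b_called_r
        if best is None or cost < best[1]:
            best = (float(c), float(cost))
    return best

# made-up 1-d coordinates with one point of each color on the wrong side
t_red = np.array([0.1, 0.2, 0.3, 0.9])
t_black = np.array([0.7, 1.1, 1.2, 1.3])
print(best_cut(t_red, t_black))  # (0.9, 1.0)
```

With cost_rb = 10, losing a red is 10 times worse, so the best cut moves right to keep the stray red at 0.9, accepting one black error instead.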
If the red points and the black points have different scatter / covariance, see Fisher's linear discriminant.
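scikit-learn has this built in as LinearDiscriminantAnalysis; it tilts the separating direction to account for the within-class scatter instead of just joining the centroids. A sketch with assumed toy data (class 0 = red, class 1 = black, different spread along the two axes):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# made-up 2d data: two elongated clouds, elliptical rather than round
rng = np.random.default_rng(2)
X = np.vstack([rng.normal([0, 0], [1.0, 0.2], size=(100, 2)),
               rng.normal([3, 0], [1.0, 0.2], size=(100, 2))])
y = np.array([0] * 100 + [1] * 100)  # 0 = red, 1 = black

lda = LinearDiscriminantAnalysis().fit(X, y)
print(lda.coef_, lda.intercept_)     # the separating hyperplane
print(lda.predict([[0, 0], [3, 0]]))  # classify the two class centers
```

When both classes really do have the same round scatter, this reduces to the midpoint rule above.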
("SVM" is jargon for a class of methods for "good" separating hyperplanes / hypersurfaces -- there's no "machine".)