Linear regression: Substituting the non-numerical discrete domain of a predictor with a numerical one

Question 1

Best practice is to use a one-hot (one-of-K) encoding: for each value that A can take on, define a separate indicator feature. So with five "types", A = type1 would be

[1, 0, 0, 0, 0]

and A = type3 is

[0, 0, 1, 0, 0]

Then concatenate these vectors with your other features so that your hypothesis becomes

H = w[Atype1] * [A=type1] + ... + w[Atype5] * [A=type5] + w[B] * B + ...

using [] to denote indicator functions.
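A minimal sketch of this encoding in plain Python follows. The category names, the helper functions `one_hot`, `encode_row` and `hypothesis`, and the weight values are all made up for illustration; in practice the weights would come from fitting the regression.

```python
# Hypothetical category names for predictor A.
CATEGORIES = ["type1", "type2", "type3", "type4", "type5"]


def one_hot(value, categories):
    """Return a list with 1 at the position of `value` and 0 elsewhere."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories]


def encode_row(a, b):
    """Concatenate the indicator vector for A with the numeric feature B."""
    return one_hot(a, CATEGORIES) + [b]


def hypothesis(weights, a, b):
    """Linear hypothesis: sum of weight * feature over the encoded row."""
    return sum(w * x for w, x in zip(weights, encode_row(a, b)))


print(one_hot("type1", CATEGORIES))  # [1, 0, 0, 0, 0]
print(one_hot("type3", CATEGORIES))  # [0, 0, 1, 0, 0]
print(encode_row("type3", 2.5))      # [0, 0, 1, 0, 0, 2.5]

# Made-up weights: one per type, plus one for B.
weights = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
print(hypothesis(weights, "type3", 0.5))  # 8.0
```

Because exactly one indicator is 1 for any row, each type gets its own independent weight, and no arithmetic relationship between types is implied.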

This avoids the main problem with your approach, which is that you're introducing a number of (probably incorrect) biases, e.g. that type5 = type2 + type3. For further intuition on why this is better than your encoding, see this answer of mine.

Question 2

In general this won't work, because an average of nominal attributes usually doesn't make sense. For example, if you assign Apple = 1, Banana = 2, Orange = 3, then in the model Banana would appear as the average of an Apple and an Orange. For classification tasks, consider a perceptron, a neural network (using the winner-take-all (WTA) paradigm eliminates the problem of averaging nominal attributes), a decision tree, or a similar classifier. As larsmans correctly pointed out, a typical model for your case is logistic regression.
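The averaging problem can be checked with a few lines of Python. This is only an illustration: the `predict` helper and the weight/intercept pairs are arbitrary values chosen for the example, not fitted parameters.

```python
# Integer codes from the example above.
CODES = {"Apple": 1, "Banana": 2, "Orange": 3}


def predict(w, b, fruit):
    """Prediction of a linear model that sees the fruit only as its code."""
    return w * CODES[fruit] + b


# Arbitrary (weight, intercept) pairs: the relationship holds for all of them.
for w, b in [(2.0, 1.0), (-3.0, 4.0), (0.5, 0.0)]:
    apple = predict(w, b, "Apple")
    banana = predict(w, b, "Banana")
    orange = predict(w, b, "Orange")
    # Banana is always forced to the midpoint of Apple and Orange.
    print(banana == (apple + orange) / 2)  # True
```

Whatever weight the model learns, it cannot give Banana an effect that is not exactly halfway between Apple and Orange.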

You could possibly also use the WTA paradigm with linear regression, by building a separate regression model for each dimension of the output vector.

Clarification: WTA is the same as one-hot in larsmans's answer.
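As a rough sketch of the "one regression model per output dimension" idea, assuming a single numeric feature and a made-up two-class dataset: each class gets its own least-squares line fitted to a 0/1 target, and the class whose line gives the highest score wins. The function names `fit_line`, `fit_wta` and `predict_wta` are hypothetical helpers for this example.

```python
def fit_line(xs, ts):
    """Closed-form least squares for t ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(ts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, ts))
    slope = sxt / sxx
    return slope, mean_t - slope * mean_x


def fit_wta(xs, labels):
    """Fit one regression line per class against its 0/1 indicator target."""
    classes = sorted(set(labels))
    return {
        c: fit_line(xs, [1.0 if label == c else 0.0 for label in labels])
        for c in classes
    }


def predict_wta(models, x):
    """Winner-take-all: return the class whose model scores highest at x."""
    return max(models, key=lambda c: models[c][0] * x + models[c][1])


# Made-up data: one numeric feature, two nominal output classes.
xs = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels = ["Apple", "Apple", "Apple", "Orange", "Orange", "Orange"]

models = fit_wta(xs, labels)
print(predict_wta(models, 1.1))  # Apple
print(predict_wta(models, 5.1))  # Orange
```

The output classes are never placed on a numeric scale here, so no class is treated as an average of the others.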