I think part of the confusion stems from the fact that "symbol table" means different things to different people, and potentially at different stages in the compilation process.
It is generally agreed that the lexer splits the input stream into tokens (sometimes referred to as lexemes or terminals). These, as you say, can be categorized into different types: numbers, keywords, identifiers, punctuation symbols, and so on.
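To make that concrete, here is a minimal lexer sketch. The token categories and regular expressions are illustrative, not taken from any particular compiler:

```python
import re

# Illustrative token categories; a real lexer would have many more.
# Order matters: KEYWORD must be tried before IDENT so that "int"
# is not classified as an identifier.
TOKEN_SPEC = [
    ("NUMBER",  r"\d+"),
    ("KEYWORD", r"\b(?:int|return)\b"),
    ("IDENT",   r"[A-Za-z_]\w*"),
    ("PUNCT",   r"[=;(){}+\-*/]"),
    ("SKIP",    r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    """Split the input stream into (type, value) token pairs."""
    tokens = []
    for match in MASTER.finditer(source):
        kind = match.lastgroup
        if kind != "SKIP":          # discard whitespace
            tokens.append((kind, match.group()))
    return tokens
```

For example, `tokenize("int x = 42;")` classifies `int` as a keyword, `x` as an identifier, and `42` as a number.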
The lexer may store the recognized identifier tokens in a symbol table. But since the lexer typically does not know what an identifier represents, and since the same identifier can mean different things in different compilation scopes, it is often the parser, which has more contextual knowledge, that is responsible for building the symbol table.
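A scope-aware symbol table of the kind a parser might maintain can be sketched as a stack of dictionaries. The class and method names here are illustrative, not from any specific compiler:

```python
class SymbolTable:
    """A stack of scopes; the parser pushes and pops as it enters
    and leaves blocks, so the same name can mean different things
    in different scopes."""

    def __init__(self):
        self.scopes = [{}]          # index 0 is the global scope

    def enter_scope(self):
        self.scopes.append({})

    def leave_scope(self):
        self.scopes.pop()

    def define(self, name, info):
        self.scopes[-1][name] = info

    def lookup(self, name):
        # Search innermost scope outward, so an inner name shadows an outer one.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None
```

With this, defining `x` globally and again inside a nested scope gives two distinct entries, and `lookup("x")` resolves to whichever is innermost at that point.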
However, in some compiler designs the lexer simply builds a list of tokens and passes it on to the parser (or the parser requests tokens from the input stream on demand). The parser in turn produces a parse tree (or sometimes an abstract syntax tree) as its output, and the symbol table is built only after parsing has completed for a given compilation unit, by traversing that tree.
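That "build the table after parsing" design can be sketched as a separate pass over an already-constructed tree. The node shapes below are illustrative tuples, not a real compiler's AST:

```python
def collect_symbols(node, table=None):
    """Walk a parse tree after parsing has finished and record
    declarations in a (deliberately simplified, flat) symbol table."""
    if table is None:
        table = {}
    kind = node[0]
    if kind == "decl":              # ("decl", name, type)
        _, name, type_ = node
        table[name] = type_
    elif kind == "block":           # ("block", child, child, ...)
        for child in node[1:]:
            collect_symbols(child, table)
    return table
```

A production compiler would of course track scopes during the traversal rather than flattening everything into one dictionary, but the separation of concerns is the same: parsing first, symbol collection as a later pass.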
Many different designs are possible.