Finiteness of Regular Language

Question 1

Finite automaton can be constructed for any regular language, and regular language can be a finite or an infinite set. Of-course there are infinite sets those are not regular sets. Check the Venn diagram below:

_{Notes:

     1. every finite set is a regular set.

     2. any dfa for an infinite set will always contains loop (or dfa without loop is not possible for infinite set).

     3. every non-regular language is an infinite set.}

The word "finite" in finite automata significance the presence of 'finite amount of memory' in automata for the class of regular languages, hence only 'finite' (or says bounded) amount of information can be stored at any instance of time while processing a string of language.

In finite automata, memory is present in the form of states only (whereas in the other class of automata like Pda, Turing Machines external memory are used to store unbounded information). You can think a finite automata as a CPU without explicit memory; that can only store recent results in its registers.

So, we can define "regular language" as — a class of languages for which only bounded (finite) information is required to stored at any instance of time while processing language strings.

Further read (for infinite languages):

What is regular language: What is basically a regular language? And Why a*b* is regular? But language { aⁿbⁿ | n > 0 } is not a regular language
To understand how states are uses as memory element read this answer: How to write regular expression for a DFA
And difference between automate for finite ans infinite regular language: To make sure: Pumping lemma for infinite regular languages only?

Question 2

Each word in the language (a+b) is of finite length. The same way as there are infinitely many integers, but each of them finite.

Yes, the language itself is an infinite set. Most languages are. But a finite automaton (NB: automata is plural) works just fine for them, provided each word is of finite length.

As an aside: This type of question probably should go to cs.stackexchange.com.

Question 3

But (a + b)* is a string of infinite length

No, (a + b)* is a finite way to express an infinite set (language) of finite strings.

Question 4

1. A regular expression describes the string generated by some language. Applying that regular expression gives you all the strings that can be described by that language.

2. When you convert that regular expression to a finite automaton (automata with finite states) , it means that those same strings can also be generated by traversing from state-to-state on that automaton. Now, intuitively, each state here represents a group of strings belonging to that language. It says, after having "absorbed" some input, the string is now in state X.

Example:

If you want a regex to accept strings with even numbers of 0 , then you'll have one state (group) which indicates that even number of 0 has been observed in the input so far. And another state (group) for odd numbers --> this state would be your non-accepting state in the FA.

As shown here, you just needed 2 (finite) states to generate an infinite number of strings, because of the grouping of odd and even we did.

And that is why it is regular.

Question 5

It just means there exists a finite regular expression for the specified language and is no where related to no of strings generated from the expression. For many regular languages we can generate infinite number of strings which follow that language but to that language is regular to prove that we need a regular expression which must be finite. So here the expression (a+b)* is finite way of expressing 0-n number of a's or b's or combination of that but n can take any value which results in infinite no. of strings.