In SQL, what's the difference between JOIN and CROSS JOIN?

Question 1

MySQL doesn't offer a distinction between JOIN and CROSS JOIN. They are the same.

In both your examples the clause

WHERE t1.a3 = t2.a1

converts any sort of join into an inner join. The standard way of expressing this query is

SELECT t1.a1, t1.a2, t1.a3 
  FROM t1 
  JOIN t2 ON t1.a3 = t2.a1

Question 2

SQL has the following types of joins, all of which come straight from set theory:

Inner join.
From A inner join B is the equivalent of A ∩ B, providing the set of elements common to both sets.
Left outer join.
From A left outer join B is the equivalent of (A − B) ∪ (A ∩ B). Each A will appear at least once; if there are multiple matching Bs, the A will be repeated once per matching B.
Right outer join.
From A right outer join B is the equivalent of (A ∩ B) ∪ (B − A). It is identical to a left join with the tables trading places. Each B will appear at least once; if there are multiple matching As, each B will be repeated once per matching B.
Full outer join.
From A full outer join B is the equivalent of (A − B) ∪ (A ∩ B) ∪ (B − A). Each A and each B will appear at least once. If an A matches multiple Bs it will be repeated once per match; if a B matches multiple As it will be repeated once per match.
Cross Join.
From A cross join B is produces the cartesian product A × B. Each A will be repeated once for every B. If A has 100 rows and B has 100 rows, the result set will consist of 10,000 rows.

It should be noted that the theoretical execution of a select query consists of the following steps performed in this order:

Compute the full cartesian product of the source set(s) in the from clause to prime the candidate result set.
Apply the join criteria in the from clause and reduce the candidate result set.
Apply the criteria in the where clause to further reduce the candidate result set.
partition the candidate result set into groups based on the criteria in the group by clause.
Remove from the candidate result set any columns other than those involved in the group by clause or involved in the evaluation of an aggregate function.
Compute the value of any such aggregate functions for each group in the candidate result set.
Collapse each group in the candidate result set into a single row consisting of the grouping columns and the computed values for each aggregate function. The candidate result set now consists of one row for each group, with all columns other than the group by columns or the compute values of aggregate functions for the group are eliminated.
Apply the criteria in the having clause to reduce the candidate result set and produce the final result set.
Order the final result set by the criteria in order by clause and emit it.

There are more steps, having to do with things like compute and compute by clauses, but this is sufficient to get the theoretical notion of how it works.

It should also be noted that nothing but the most naïve implementation would actually evaluate a select statement this way, but the results produced must be the same as if the above steps were performed in full.

Question 3

There's no difference, although one can smile about it a little. You make a cross join first and with the where clause you convert it to an inner join (at least with the first one). This way of writing joins is more than 20 years old and kind of error prone in complex statements. I'd recommend to use the new funky way of writing it like this:

select t1.a1, t1.a2, t1.a3 
from t1 
inner join t2 on t1.a3 = t2.a1