The main reason to keep your cache on your app server is cost. The same trade-off applies when deciding whether to put your DB on the same server as your web or app server.
If you have a small-scale application, you can probably squeeze all your resources onto the same machine, but that hurts your ability to recover from any type of failure (and "everything fails"): when that machine goes down, you will either lose data or part of your service will be down for some of your users.
Once you have enough app servers, the cost of the cache cluster per app server becomes small.
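As a rough illustration of that amortization, here is a minimal sketch in Python. The function name and the cost figure of 100 are made up for the example, and it assumes the cache cluster stays the same size as the fleet grows, which is a simplification.

```python
def cache_cost_per_app_server(cluster_cost, app_servers):
    """Share of a fixed cache-cluster cost carried by each app server.

    Assumption: the cluster cost does not grow with the fleet.
    """
    return cluster_cost / app_servers


# Hypothetical cluster cost of 100 (any currency, any period).
for n in (1, 4, 20):
    print(n, "app servers:", cache_cost_per_app_server(100.0, n), "per server")
```

In practice the cluster would also have to grow at some point, but usually much more slowly than the number of app servers.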
From an architecture point of view, when scale and high availability are important, you should prefer many small components over a few more complex ones.
For example, if you want to add another app server to your fleet because you have more users, adding it will be faster, since there are fewer software components to install on it, and the new server can immediately serve existing users because their sessions are stored centrally. If you want to remove an app server (or when you lose one), the users that were served by that server can easily move to the other servers, because their session state is stored in the cache cluster.
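The idea can be sketched in a few lines of Python. This is only an illustration: `SessionStore` is an in-memory stand-in for a real cache cluster, and `AppServer`, the server names and the `visits` counter are hypothetical names invented for the example.

```python
class SessionStore:
    """In-memory stand-in for a central cache cluster (hypothetical)."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        # Return a copy so callers must write back explicitly.
        return dict(self._data.get(session_id, {}))

    def put(self, session_id, session):
        self._data[session_id] = dict(session)


class AppServer:
    """Stateless app server: all session state lives in the shared store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle(self, session_id):
        session = self.store.get(session_id)
        session["visits"] = session.get("visits", 0) + 1
        self.store.put(session_id, session)
        return f"{self.name}: visit {session['visits']}"


store = SessionStore()
fleet = [AppServer("app-1", store), AppServer("app-2", store)]

first = fleet[0].handle("alice")         # served by app-1
fleet.pop(0)                             # app-1 is removed or lost
fleet.append(AppServer("app-3", store))  # a new server joins the fleet
second = fleet[0].handle("alice")        # app-2 picks up the same session
third = fleet[-1].handle("alice")        # so does the brand-new app-3

print(first, second, third, sep="\n")
```

Because no server keeps session state locally, losing `app-1` does not reset the user's counter, and `app-3` needs nothing but a reference to the shared store to start serving existing users.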