@minhaz's link got me started on a long quest to understand the network-layer side of cellular networking. So here goes:
This question turns out to be mainly about how 3G networking is implemented. Answering the central question, "So how can GCM receive messages while in the 'Idle State'?", answers all the questions above.
Short Answer
Yes, while in Idle mode the radio can still receive limited "control signals". Basically, the network operator asks the device to switch energy states so that it can receive the actual payloads. The mechanism used for this is similar to how you receive phone calls or SMS.
Long Answer
It turns out that the state transition is controlled by the network operator, not by the cell phone itself. From Radio Resource Control (RRC); Protocol specification:
8.6.3.3 Generic state transition rules depending on received information elements
The IE (Information Element) "RRC State Indicator" indicates the state the UE (User Equipment) shall enter. The UE shall enter the state indicated by the IE "RRC State Indicator" ...
And how does the network operator do that when the UE is in RRC Idle mode? From the book 3G, 4G and Beyond: Bringing Networks, Devices and the Web Together:
2.2.3.3 Radio Resource Control States
... Idle state -- devices not actively communicating with the network are in this state. Here, they periodically listen to the paging channel for incoming voice or video calls and SMS messages.
From LTE in Bullets:
23.1 The RRC connection establishment procedure is always initiated by the UE but can be triggered by either the UE or the network. ... The network triggers the RRC connection establishment procedure by sending a Paging message. ...
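Putting the quotes above together, the idle-mode wake-up can be sketched as a toy state machine. This is only an illustration under simplifying assumptions: every class and method name here is made up (it is not a modem or Android API), and real paging involves paging occasions, DRX cycles and identities that are left out.

```python
from enum import Enum


class RrcState(Enum):
    IDLE = "idle"
    CONNECTED = "connected"


class UserEquipment:
    """Toy UE model; all names are hypothetical."""

    def __init__(self, ue_id):
        self.ue_id = ue_id
        self.state = RrcState.IDLE
        self.log = []

    def on_paging_occasion(self, paged_ids):
        # In idle, the UE only checks the paging channel periodically.
        if self.state is RrcState.IDLE and self.ue_id in paged_ids:
            self.log.append("paged")
            self.request_connection(cause="network-triggered")

    def request_connection(self, cause):
        # Establishment is always initiated by the UE,
        # even when the network triggered it with a Paging message.
        self.log.append(f"rrc-connection-request ({cause})")
        self.state = RrcState.CONNECTED

    def on_state_indicator(self, new_state):
        # The network dictates the state; the UE enters whatever it is told.
        self.log.append(f"network-ordered {new_state.value}")
        self.state = new_state


ue = UserEquipment("ue-1")
ue.on_paging_occasion({"ue-7"})       # page for another device: stay idle
ue.on_paging_occasion({"ue-1"})       # e.g. a push message is waiting
ue.on_state_indicator(RrcState.IDLE)  # network later sends the UE back to idle
print(ue.log)
```

The point of the sketch is the division of labour: the UE sends the connection request, but both the trigger (the page) and the return to idle come from the network.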
So there we have it. It is now fairly easy to see how these pieces tie together. To answer the original questions:
Connection means an RRC connection. Since an RRC connection is Layer 3, all (normal) kinds of network activity, including TCP and UDP, will create an RRC connection (i.e. "wake up the radio").
Since the radio still needs to listen to the paging channel, it is not completely shut off. Empirically it still uses energy, as measured in the XMPP experiment linked by @minhaz. In those results, Idle consumes about two orders of magnitude less energy than the other states.
As mentioned in various places on Stack Overflow, TCP connections are apparently maintained in memory and do not care whether the underlying layers have gone through an RRC reconnection procedure. If the TCP connection is idle, the RRC connection can be released (i.e. the UE can become idle). If it keeps receiving data, the network will not instruct the UE to release the RRC connection, so it won't go Idle.
Since RRC belongs to UMTS WCDMA, which underlies 3G, it is likely that the iPhone operates similarly.
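To make the TCP point concrete, here is a minimal sketch of a long-lived connection whose state lives in memory while the radio underneath is released and re-established. It is a toy model, not real networking code: the class names, the `inactivity_timeout` value and the time units are all hypothetical, and in a real network the release decision is made by the operator's equipment, not by the phone.

```python
class Radio:
    """Toy RRC connection; release is modelled as a network-side inactivity rule."""

    def __init__(self, inactivity_timeout):
        self.inactivity_timeout = inactivity_timeout  # hypothetical value
        self.connected = False
        self.last_activity = None
        self.wakeups = 0

    def transfer(self, now):
        # Any traffic needs an RRC connection; create one if there is none.
        if not self.connected:
            self.connected = True
            self.wakeups += 1
        self.last_activity = now

    def tick(self, now):
        # With no traffic for long enough, the connection is released.
        if self.connected and now - self.last_activity >= self.inactivity_timeout:
            self.connected = False


class TcpConnection:
    """Toy TCP connection: just state held in memory above the radio."""

    def __init__(self, radio):
        self.radio = radio
        self.established = True
        self.bytes_received = 0

    def receive(self, payload, now):
        self.radio.transfer(now)
        self.bytes_received += len(payload)


radio = Radio(inactivity_timeout=10)
conn = TcpConnection(radio)

conn.receive(b"hello", now=0)  # first traffic wakes the radio
radio.tick(now=5)              # still within the timeout: stays connected
radio.tick(now=12)             # idle long enough: RRC connection released
print(radio.connected, conn.established)

conn.receive(b"push", now=60)  # later data wakes the radio again
print(radio.wakeups, conn.bytes_received)
```

In this model the TCP connection object never changes state when the radio is released; only the number of radio wake-ups grows.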
Notes:
- The resources I linked to mix 3G and 4G. I suspect 4G is an incremental improvement, so the main concepts carry over between them.
Resource dump for a more in-depth understanding (i.e. sources for the above digest):
- Radio Resource Control - Wikipedia
- 3G, 4G and Beyond: Bringing Networks, Devices and the Web Together
- Introduction to 3G Mobile Communications
- WCDMA Design Handbook
- LTE for 4G Mobile Broadband: Air Interface Technologies and Performance
- Convergence Technologies for 3G Networks: IP, UMTS, EGPRS and ATM
- LTE in Bullets
- Let's Learn LTE: RRC States in LTE
- Cell PCH State
- 3GPP 25.331 Radio Resource Control (RRC); Protocol specification
- 3GPP 36.201: Evolved Universal Terrestrial Radio Access (E-UTRA); LTE physical layer; General description
- High Performance Browser Networking - Chapter 7. Mobile Networks