Have you noticed that raw_input writes the prompt string into stderr if stdout is terminal (isatty); if stdout is not a terminal, then the prompt too is written to stdout, but stdout will be in fully buffered mode.
With stdout on a tty
write(1, "Hello.\n", 7) = 7
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
write(2, "Type your name: ", 16) = 16
fstat(0, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 3), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb114059000
read(0, "abc\n", 1024) = 4
write(1, "Nice to meet you, abc!\n", 23) = 23
With stdout not on a tty
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff8d9d3410) = -1 ENOTTY (Inappropriate ioctl for device)
# oops, python noticed that stdout is NOTTY.
fstat(0, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 3), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f29895f0000
read(0, "abc\n", 1024) = 4
rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x7f29891c4bd0}, {0x451f62, [], SA_RESTORER, 0x7f29891c4bd0}, 8) = 0
write(1, "Hello.\nType your name: Nice to m"..., 46) = 46
# squeeze all output at the same time into stdout... pfft.
Thus all writes are squeezed into stdout all at the same time; and what is worse, after the input is read.
The real solution is thus to use the pty. However you are doing it wrong. For the pty to work, you must use the pty.fork() command, not subprocess. (This will be very tricky). I have some working code that goes like this:
import os
import tty
import pty
program = "python"
# command name in argv[0]
argv = [ "python", "foo.py" ]
pid, master_fd = pty.fork()
# we are in the child process
if pid == pty.CHILD:
# execute the program
os.execlp(program, *argv)
# else we are still in the parent, and pty.fork returned the pid of
# the child. Now you can read, write in master_fd, or use select:
# rfds, wfds, xfds = select.select([master_fd], [], [], timeout)
Notice that depending on the terminal mode set by the child program there might be different kinds of linefeeds coming out, etc.
Now about the "waiting for input" problem, that cannot be really helped as one can always write to a pseudoterminal; the characters will be put to wait in the buffer. Likewise, a pipe always allows one to write up to 4K or 32K or some other implementation defined amount, before blocking. One ugly way is to strace the program and notice whenever it enters the read system call, with fd = 0; the other would be to make a C module with a replacement "read()" system call and link it in before glibc for the dynamic linker (fails if the executable is statically linked or uses system calls directly with assembler...), and then would signal python whenever the read(0, ...) system call is executed. All in all, probably not worth the trouble exactly.