Question

Here's a simple Python 3.x TCP server:

import socketserver

class MyTCPHandler(socketserver.BaseRequestHandler):

    def handle(self):
        self.data = self.request.recv(1024).strip()
        print(str(self.client_address[0]) + " wrote: " + str(self.data.decode()))

if __name__ == "__main__":
    HOST, PORT = "localhost", 9999

    server = socketserver.TCPServer((HOST, PORT), MyTCPHandler)
    server.serve_forever()

and client:

import socket
import sys

HOST, PORT = "localhost", 9999

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect((HOST, PORT))

while True:
    data = input("Msg: ")

    if data == "exit()":
        print("Exiting...")
        sock.close()
        sys.exit()

    sock.sendall(bytes(data, "utf-8"))

#numBytes = ....?
#print("Sent: " + str( numBytes ) + " bytes\n")

I can't figure out how to view the exact number of bytes that I send in a message. I can use len(data), but it doesn't account for the null terminator and such. Is a null terminator being sent as well, or is it irrelevant? I tried researching how to get an exact byte count for a sent/received message, but I couldn't find any Python-specific documentation, and I have only seen examples of people using len(), which I don't think is exact.

Any ideas?


Solution

There is no null terminator in Python strings. If you want to send one, you have to do it explicitly: sock.sendall(bytes(data, "utf-8") + b'\0').

However, there's no good reason to add a null terminator in the first place, unless you're planning to use it as a delimiter between messages. (Note that this won't work for general Python strings, because they're allowed to include null bytes in the middle… but it will work fine for real human-readable text, of course.)

Using null bytes as a delimiter is not a bad idea… but your existing code needs to actually handle that. You can't just call recv(1024) and assume it's a whole message; you have to keep calling recv(1024) in a loop and appending to a buffer until you find a null—and then save everything after that null for the next time through the loop.
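The buffering loop described above can be sketched without any sockets. This is only an illustration: feed is a hypothetical helper, not part of the standard library, and the hard-coded chunks stand in for whatever successive recv(1024) calls happen to return.

```python
def feed(buf, chunk, delimiter=b"\0"):
    """Append chunk to buf and return (complete_messages, leftover)."""
    buf += chunk
    # Everything before the last delimiter is complete; the final piece
    # is a partial message (possibly empty) to keep for the next chunk.
    *messages, leftover = buf.split(delimiter)
    return messages, leftover

buf = b""
received = []
# Simulated recv() results: message boundaries do not line up with chunks.
for chunk in [b"hel", b"lo\0wor", b"ld\0par"]:
    messages, buf = feed(buf, chunk)
    received.extend(messages)

print(received)  # [b'hello', b'world']
print(buf)       # b'par'
```

The leftover b'par' stays in the buffer until a later chunk supplies its terminating null.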


Anyway, the sendall method doesn't return the number of bytes sent because it always sends exactly the bytes you gave it (unless there's an error, in which case it raises). So:

buf = bytes(data, "utf-8") + b'\0'
sock.sendall(buf)
bytes_sent = len(buf)
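To see why the length of the encoded bytes is the number that matters, rather than len(data) on the string, here is a minimal sketch. It assumes a connected pair of local sockets from socket.socketpair() in place of the client and server above, and send_message is a hypothetical helper name.

```python
import socket

def send_message(sock, text):
    """Encode text, append a null delimiter, send it all, return the byte count."""
    buf = text.encode("utf-8") + b"\0"
    result = sock.sendall(buf)  # None on success; raises on error
    assert result is None
    return len(buf)

# A connected pair of sockets, used here only so the example is self-contained.
a, b = socket.socketpair()
try:
    text = "héllo"
    bytes_sent = send_message(a, text)
finally:
    a.close()
    b.close()

print(len(text))   # 5 (characters)
print(bytes_sent)  # 7 (é is two bytes in UTF-8, plus one delimiter byte)
```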

And on the server side, you might want to write a NullTerminatedHandler class like this:

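class NullTerminatedHandler(socketserver.BaseRequestHandler):
    def setup(self):
        # BaseRequestHandler.__init__ calls setup() before handle(),
        # so initialize the buffer here instead of overriding __init__.
        self.buf = b''
    def handle(self):
        # handle() runs once per connection: loop until the client closes.
        while True:
            chunk = self.request.recv(1024)
            if not chunk:
                break
            self.buf += chunk
            messages = self.buf.split(b'\0')
            for message in messages[:-1]:
                self.handle_message(message)
            # Keep the trailing partial message for the next recv.
            self.buf = messages[-1]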

Then you can use it like this:

class MyTCPHandler(NullTerminatedHandler):
    def handle_message(self, message):
        print(str(self.client_address[0]) + " wrote: " + str(message.decode()))

While we're at it, you've got some Unicode/string issues. From most serious to least:

  • You should almost never just call decode with no argument. If you're sending UTF-8 data on one side, always explicitly decode('utf-8') on the other.
  • The decode method is guaranteed to return a str, so writing str(message.decode()) just makes your code confusing.
  • There's a reason the sample code uses format instead of calling str on a bunch of objects and concatenating them—it's usually a lot easier to read.
  • It's generally more readable to say data.encode('utf-8') than bytes(data, 'utf-8').
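Applied to the handler's print line, those suggestions might look like the following sketch. The address and payload are made-up stand-ins for self.client_address[0] and the bytes received from the socket.

```python
client_ip = "127.0.0.1"            # stand-in for self.client_address[0]
payload = "héllo".encode("utf-8")  # rather than bytes("héllo", "utf-8")

message = payload.decode("utf-8")  # explicit codec, no extra str() call
line = "{} wrote: {}".format(client_ip, message)
print(line)  # 127.0.0.1 wrote: héllo
```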
Licensed under: CC-BY-SA with attribution