Suppose you have:
>>> s = StringIO("cat\u2014jack")
>>> s.seek(0, os.SEEK_END)
>>> print(s.tell())
8
This however is incorrect. \u2014 is an emdash character - a single character but it is 3 bytes long.
>>> len("\u2014")
1
>>> len("\u2014".encode("utf-8"))
3
StringIO may not use utf-8 for storage either. It used to use UCS-2 or UCS-4 but since pep 393 may be using utf-8 or some other encoding depending on the circumstance.
What ultimately matters is the binary representation you ultimately go with. If you're encoding text, and you'll eventually write the file out encoded as utf-8 then you have to encode the value in it's entirety to know how many bytes it will take up. This is because as a variable-length encoding, utf-8 characters may require multiple bytes to encode.
You could do something like:
>>> s = StringIO("cat\u2014jack")
>>> size = len(s.getvalue().encode('utf-8'))
10