Are devices connected through PCI guaranteed to deliver data at a certain speed?
No. To my understanding, PCI is a shared bus with bus mastering, a feature that lets a single device take control of the bus and temporarily use its full bandwidth. In non-server environments, that usually means a 32-bit bus at 33 MHz, or about 133 MB/s of peak bandwidth shared by every device on the bus. Once you have a few I/O cards on the PCI bus, you can see how bandwidth might become scarce under load. Misbehaving devices can also adversely affect latency; some controllers hang until their drive responds to I/O, but granted, that's an abnormal situation.
When communicating with the device, I would like my UI to stay responsive. Is it a good idea to kick off communication with the device in a (threadpool) thread, or is the overhead associated with this too large compared to how fast accessing the PCI bus is?
When writing user-facing software, it is always a good idea to separate the UI from I/O and other potentially long-running operations. For example, modern versions of Android enforce this for network access: the platform throws a NetworkOnMainThreadException if an HTTP request is executed on the UI thread.
Depending on the version of the .NET Framework you are targeting, you have several options for separating the device communication from the UI. Take a look at Parallel Processing and Concurrency in the .NET Framework for a high-level overview of the various options. Threading is a low-level abstraction, and you may find it easier and less error-prone to use a higher-level abstraction such as Tasks or C#'s async and await keywords.
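As a rough illustration of the Task-based approach, here is a minimal sketch that assumes your device exposes some blocking read call. FakePciDevice and ReadBlock are hypothetical stand-ins I made up for this example (the transfer is simulated with a sleep), so substitute whatever your driver or vendor library actually provides.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the real device API; not part of any library.
class FakePciDevice
{
    public byte[] ReadBlock(int length)
    {
        Thread.Sleep(200);          // simulate a slow, blocking transfer
        return new byte[length];    // pretend this buffer came from the device
    }
}

class Program
{
    static async Task Main()
    {
        var device = new FakePciDevice();

        // Task.Run queues the blocking call on a thread-pool thread,
        // so the calling thread is free while the read is in flight.
        Task<byte[]> read = Task.Run(() => device.ReadBlock(4096));

        Console.WriteLine("Caller is not blocked while the read runs.");

        // await yields until the read finishes, then continues here.
        byte[] data = await read;
        Console.WriteLine($"Read {data.Length} bytes.");
    }
}
```

In a WinForms or WPF event handler marked async, the code after the await normally resumes on the UI thread, so you can update controls with the result directly. The async Main used here needs C# 7.1 or later; on older compilers you would call the same code from an async event handler instead.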