Short answer: **Yes**, in theory, an I/O device *could* cause the CPU to "block" on an I/O read (`in` instruction).
However, I'm not aware of any memory or I/O devices that actually stalled for any significant period of time, causing CPU execution to "block".
---
Long Answer:
The `in` and `out` instructions perform an I/O read or write, which is almost identical to a typical memory bus cycle. The only difference is that a different control signal (or signals) is asserted to indicate I/O vs. memory access.
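To make the "same cycle, different select signal" idea concrete, here is a toy sketch in Python. It is not a model of any real chipset: the `ToyBus` class, the `is_io` flag, and the `0xFF` value returned for unmapped addresses are all assumptions made up for illustration.

```python
class ToyBus:
    """Toy model: one shared set of address/data lines, two address spaces."""

    def __init__(self) -> None:
        self.memory = {}    # address -> byte, selected when is_io is False
        self.io_ports = {}  # port -> byte, selected when is_io is True

    def read(self, address: int, is_io: bool) -> int:
        # The read itself is the same in both cases; only the
        # space-select signal (is_io) decides which device answers.
        space = self.io_ports if is_io else self.memory
        # Returning 0xFF for unmapped addresses is an assumption of this toy model.
        return space.get(address, 0xFF)


bus = ToyBus()
bus.memory[0x60] = 0x12    # a byte in memory at address 0x60
bus.io_ports[0x60] = 0x34  # a device register at I/O port 0x60

memory_value = bus.read(0x60, is_io=False)  # like a `mov` from memory
port_value = bus.read(0x60, is_io=True)     # like an `in` from port 0x60
```

The same numeric address returns different data depending only on the select flag, which is the role the memory/I/O signal plays on the real bus.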
Now this gets pretty low-level, and the details get more complex with later CPUs. I'm referencing [this presentation][1], which goes into signal-level detail about the x86 bus cycles, starting with the 8086/8088.
**8086/8088 Read Cycle with 1 Wait State**
![8086/8088 Read Cycle with 1 Wait State][2]
<sup>https://web.archive.org/web/20130319052544/http://www.ece.msstate.edu/~reese/EE3724/lectures/bustran/bustran.pdf</sup>
We see here that there is a `READY` signal, which is asserted by the memory or I/O device to indicate that it has presented its data on the bus and that the data is *ready* for the CPU to latch. That PDF states:
> x86 has a READY Input Line on Control Bus
> – READY Input “Checked” During T3
> – If READY is Inactive (LOW), Additional T3 States are Added
> – These Additional T3 States are Called “Wait States”
So it is possible, with these older CPUs at least, that a device could wait many cycles before asserting `READY`, causing the CPU to "block" on the memory or I/O instruction.
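The wait-state rule quoted above can be sketched as a small simulation. This is a simplified model under stated assumptions, not a cycle-accurate 8086: `read_cycle`, `device_ready_after`, and the `max_clocks` guard are hypothetical names invented here, and the guard exists only so the simulation terminates (the behaviour described above has no such timeout).

```python
from typing import Callable, List


def read_cycle(ready_at: Callable[[int], bool], max_clocks: int = 1000) -> List[str]:
    """Return the sequence of T-states for one simplified read bus cycle.

    ready_at(clock) models the device's READY line at a given clock:
    True means READY is active, False means the device needs more time.
    """
    states = ["T1", "T2", "T3"]
    clock = 3  # READY is first checked during T3, the third clock
    while not ready_at(clock):
        if clock >= max_clocks:
            # Simulation-only guard so a never-ready device does not loop forever.
            raise TimeoutError("device never asserted READY")
        states.append("Tw")  # one wait state per clock in which READY is inactive
        clock += 1
    states.append("T4")  # READY seen active: the cycle completes
    return states


def device_ready_after(wait_states: int) -> Callable[[int], bool]:
    """Hypothetical device that holds READY inactive for `wait_states` clocks."""
    first_ready_clock = 3 + wait_states
    return lambda clock: clock >= first_ready_clock


print(read_cycle(device_ready_after(1)))  # ['T1', 'T2', 'T3', 'Tw', 'T4']
```

A device that never asserts `READY` (for example `lambda clock: False`) keeps the loop adding `Tw` states until the artificial guard trips, which is the "blocking" case described above.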
I believe this is still valid, at least through the [Pentium 4][3], which has a `DRDY#` (Data Ready) pin that "is asserted by the data driver on each data transfer, indicating valid data on the data bus. In a multi-common clock data transfer, `DRDY#` may be de-asserted to insert idle clocks."
---
Longer Answer:
With the early systems, I believe many of the system devices were connected directly to the address, data, and control lines and communicated directly with the CPU. So some custom or rogue device could probably "stall" on a bus cycle.
Nowadays, the architecture is much different. Modern x86 processors don't even have "address" and "data" pins per se, but instead have links like [DMI][4] and [QPI][5], which communicate with a northbridge/southbridge (or [Platform Controller Hub][6]) setup. These devices then forward the memory and I/O requests to the appropriate devices. With this setup, I doubt that the PCH would allow an outgoing I/O read to stall a processor request over the QPI link.
[1]:
[2]:
[3]:
[4]:
[5]:
[6]: