-
Shaohua Li authored
Herbert Poetzl reported a performance regression since 2.6.39. The test is a simple dd read, but with big block size. The reason is: T1: ra (A, A+128k), (A+128k, A+256k) T2: lock_page for page A, submit the 256k T3: hit page A+128K, ra (A+256k, A+384). the range isn't submitted because of plug and there isn't any lock_page till we hit page A+256k because all pages from A to A+256k is in memory T4: hit page A+256k, ra (A+384, A+ 512). Because of plug, the range isn't submitted again. T5: lock_page A+256k, so (A+256k, A+512k) will be submitted. The task is waitting for (A+256k, A+512k) finish. There is no request to disk in T3 and T4, so readahead pipeline breaks. We really don't need block plug for generic_file_aio_read() for buffered I/O. The readahead already has plug and has fine grained control when I/O should be submitted. Deleting plug for buffered I/O fixes the regression. One side effect is plug makes the request size 256k, the size is 128k...
3deaa719