- 02 Dec, 2022 1 commit
-
-
Jens Axboe authored
commit 7cfe7a09 upstream. With how task_work is added and signaled, we can have TIF_NOTIFY_SIGNAL set and no task_work pending as it got run in a previous loop. Treat TIF_NOTIFY_SIGNAL like get_signal(), always clear it if set regardless of whether or not task_work is pending to run. Cc: stable@vger.kernel.org Fixes: 46a525e1 ("io_uring: don't gate task_work run on TIF_NOTIFY_SIGNAL") Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 26 Nov, 2022 1 commit
-
-
Pavel Begunkov authored
commit 91482864 upstream. Having REQ_F_POLLED set doesn't guarantee that the request is executed as a multishot from the polling path. Fortunately for us, if the code thinks it's multishot issue when it's not, it can only ask to skip completion so leaking the request. Use issue_flags to mark multipoll issues. Cc: stable@vger.kernel.org Fixes: 390ed29b ("io_uring: add IORING_ACCEPT_MULTISHOT for accept") Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/7700ac57653f2823e30b34dc74da68678c0c5f13.1668710222.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 29 Oct, 2022 1 commit
-
-
Jens Axboe authored
[ Upstream commit 46a525e1 ] This isn't a reliable mechanism to tell if we have task_work pending, we really should be looking at whether we have any items queued. This is problematic if forward progress is gated on running said task_work. One such example is reading from a pipe, where the write side has been closed right before the read is started. The fput() of the file queues TWA_RESUME task_work, and we need that task_work to be run before ->release() is called for the pipe. If ->release() isn't called, then the read will sit forever waiting on data that will never arise. Fix this by io_run_task_work() so it checks if we have task_work pending rather than rely on TIF_NOTIFY_SIGNAL for that. The latter obviously doesn't work for task_work that is queued without TWA_SIGNAL. Reported-by:
Christiano Haesbaert <haesbaert@haesbaert.org> Cc: stable@vger.kernel.org Link: https://github.com/axboe/liburing/issues/665 Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
- 21 Oct, 2022 1 commit
-
-
Pavel Begunkov authored
[ Upstream commit aa1df3a3 ] Overflowing CQEs may result in reordering, which is buggy in case of links, F_MORE and so on. If we guarantee that we don't reorder for the unlikely event of a CQ ring overflow, then we can further extend this to not have to terminate multishot requests if it happens. For other operations, like zerocopy sends, we have no choice but to honor CQE ordering. Reported-by:
Dylan Yudaken <dylany@fb.com> Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/ec3bc55687b0768bbe20fb62d7d06cfced7d7e70.1663892031.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
- 27 Jul, 2022 1 commit
-
-
Pavel Begunkov authored
We want to do request allocation out of the core io_uring code, make the allocation functions public for other io_uring parts. Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0314fedd3a02a514210ba42d4720332538c65956.1658913593.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- 25 Jul, 2022 35 commits
-
-
Pavel Begunkov authored
Allow to flush notifiers as a part of sendzc request by setting IORING_SENDZC_FLUSH flag. When the sendzc request succeedes it will flush the used [active] notifier. Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/e0b4d9a6797e2fd6092824fe42953db7a519bbc8.1657643355.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Add internal part of send zerocopy notifications. There are two main structures, the first one is struct io_notif, which carries inside struct ubuf_info and maps 1:1 to it. io_uring will be binding a number of zerocopy send requests to it and ask to complete (aka flush) it. When flushed and all attached requests and skbs complete, it'll generate one and only one CQE. There are intended to be passed into the network layer as struct msghdr::msg_ubuf. The second concept is notification slots. The userspace will be able to register an array of slots and subsequently addressing them by the index in the array. Slots are independent of each other. Each slot can have only one notifier at a time (called active notifier) but many notifiers during the lifetime. When active, a notifier not going to post any completion but the userspace can attach requests to it by specifying the corresponding slot while issueing send zc requests. Eventually, the userspace will want to "flush" the notifier losing any way to attach new requests to it, however it can use the next atomatically added notifier of this slot or of any other slot. When the network layer is done with all enqueued skbs attached to a notifier and doesn't need the specified in them user data, the flushed notifier will post a CQE. Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/3ecf54c31a85762bf679b0a432c9f43ecf7e61cc.1657643355.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Make io_put_task() available to non-core parts of io_uring, we'll need it for notification infrastructure. Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/3686807d4c03b72e389947b0e8692d4d44334ef0.1657643355.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
If we're offloading requests directly to io-wq because IOSQE_ASYNC was set in the sqe, we can miss hashing writes appropriately because we haven't set REQ_F_ISREG yet. This can cause a performance regression with buffered writes, as io-wq then no longer correctly serializes writes to that file. Ensure that we set the flags in io_prep_async_work(), which will cause the io-wq work item to be hashed appropriately. Fixes: 584b0180 ("io_uring: move read/write file prep state into actual opcode handler") Link: https://lore.kernel.org/io-uring/20220608080054.GB22428@xsang-OptiPlex-9020/ Reported-by:
kernel test robot <oliver.sang@intel.com> Tested-by:
Yin Fengwei <fengwei.yin@intel.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Dylan Yudaken authored
In overflow we see a duplcate line in the trace, and in some cases 3 lines (if initial io_post_aux_cqe fails). Instead just trace once for each CQE Signed-off-by:
Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220630091231.1456789-13-dylany@fb.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Dylan Yudaken authored
Some use cases of io_post_aux_cqe would not want to overflow as is, but might want to change the flags/result. For example multishot receive requires in order CQE, and so if there is an overflow it would need to stop receiving until the overflow is taken care of. Signed-off-by:
Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220630091231.1456789-8-dylany@fb.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Dylan Yudaken authored
For multishot we want a way to signal the caller that multishot has ended but also this might not be an error return. For example sockets return 0 when closed, which should end a multishot recv, but still have a CQE with result 0 Introduce IOU_STOP_MULTISHOT which does this and indicates that the return code is stored inside req->cqe Signed-off-by:
Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220630091231.1456789-7-dylany@fb.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Dylan Yudaken authored
This optimisation has some built in assumptions that make it easy to introduce bugs. It also does not have clear wins that make it worth keeping. Signed-off-by:
Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220622134028.2013417-2-dylany@fb.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
It's annoying to have io-wq.h as a dependency every time we want some of struct io_wq_work_list helpers, move them into a separate file. Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/c1d891ce12b30767d1d2a3b7db2ca3abc1ecc4a2.1655802465.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Since SQPOLL now uses TWA_SIGNAL_NO_IPI, there won't be task work items without TIF_NOTIFY_SIGNAL. Simplify io_run_task_work() by removing task->task_works check. Even though looks it doesn't cause extra cache bouncing, it's still nice to not touch it an extra time when it might be not cached. Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/75d4f34b0c671075892821a409e28da6cb1d64fe.1655802465.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Improve naming of the inline/deferred completion helper so it's consistent with it's *_post counterpart. Add some comments and extra lockdeps to ensure the locking is done right. Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/797c619943dac06529e9d3fcb16e4c3cde6ad1a3.1655684496.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Since __io_commit_cqring_flush users moved to different files, introduce io_commit_cqring_flush() helper and encapsulate all flags testing details inside. Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0da03887435dd9869ffe46dcd3962bf104afcca3.1655684496.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
spin_lock(&ctx->completion_lock); /* post CQEs */ io_commit_cqring(ctx); spin_unlock(&ctx->completion_lock); io_cqring_ev_posted(ctx); We have many places repeating this sequence, and the three function unlock section is not perfect from the maintainance perspective and also makes it harder to add new locking/sync trick. Introduce two helpers. io_cq_lock(), which is simple and only grabs ->completion_lock, and io_cq_unlock_post() encapsulating the three call section. Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/fe0c682bf7f7b55d9be55b0d034be9c1949277dc.1655684496.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
It's not clear how widely used IOSQE_CQE_SKIP_SUCCESS is, and how often ->flush_cqes flag prevents from completion being flushed. Sometimes it's high level of concurrency that enables it at least for one CQE, but sometimes it doesn't save much because nobody waiting on the CQ. Remove ->flush_cqes flag and the optimisation, it should benefit the normal use case. Note, that there is no spurious eventfd problem with that as checks for spuriousness were incorporated into io_eventfd_signal(). Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/692e81eeddccc096f449a7960365fa7b4a18f8e6.1655637157.git.asml.silence@gmail.com [axboe: remove now dead state->flush_cqes variable] Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
It's a good idea to first do forward declarations and then inline helpers, otherwise there will be keep stumbling on dependencies between them. Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1d7fa6672ed43f20ccc0c54ae201369ebc3ebfab.1655637157.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Move io_uring types to linux/include, need them public so tracing can see the definitions and we can clean trace/events/io_uring.h Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a15f12e8cb7289b2de0deaddcc7518d98a132d17.1655384063.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
With IORING_SETUP_CQE32 ->cqe_cached doesn't store a real address but rather an implicit offset into cqes. Store the real cqe pointer and increment it accordingly if CQE32. Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1ee1838cba16bed96381a006950b36ba640d998c.1655455613.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Deduplicate calls to io_get_cqe() from __io_fill_cqe_req(). Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/4fa077986cc3abab7c59ff4e7c390c783885465f.1655455613.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Deduplicate two trace_io_uring_complete() calls in __io_fill_cqe_req(). Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/277ed85dba5189ab7d932164b314013a0f0b0fdc.1655455613.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
__io_fill_cqe_req() is hot and inlined, we want it to be as small as possible. Add io_req_cqe_overflow() accepting only a request and doing all overflow accounting, and replace with it two calls to 6 argument io_cqring_event_overflow(). Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/048b9fbcce56814d77a1a540409c98c3d383edcb.1655455613.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
__io_get_cqe() is not as hot as io_get_cqe(), no need to inline it, it sheds ~500B from the binary. Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/c1ac829198a881b7af8710926f99a3559b9f24c0.1655455613.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Deduplicate some code and add a helper for filling an aux CQE, locking and notification. Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b7c6557c8f9dc5c4cfb01292116c682a0ff61081.1655455613.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
REQ_F_COMPLETE_INLINE is only needed to delay queueing into the completion list to io_queue_sqe() as __io_req_complete() is inlined and we don't want to bloat the kernel. As now we complete in a more centralised fashion in io_issue_sqe() we can get rid of the flag and queue to the list directly. Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/600ba20a9338b8a39b249b23d3d177803613dde4.1655371007.git.asml.silence@gmail.com Reviewed-by:
Hao Xu <howeyxu@tencent.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
There is a bunch of inline helpers that will be useful not only to the core of io_uring, move them to headers. Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/22df99c83723e44cba7e945e8519e64e3642c064.1655310733.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
Move both the opcodes related to it, and the internals code dealing with it. Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
This also helps cleanup the io_uring.h cancel parts, as we can make things static in the cancel.c file, mostly. Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
Add a io_poll_issue() rather than export the general task_work locking and io_issue_sqe(), and put the io_op_defs definition and structure into a separate header file so that poll can use it. Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
While at it, convert the handlers to just use io_eopnotsupp_prep() if CONFIG_NET isn't set. Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-