Commits · v3.0-rc1 · Upstream / linux-stable

27 May, 2011 1 commit

block: export blk_{get,put}_queue() · d86e0e83

Jens Axboe authored 14 years ago


We need them in SCSI to fix a bug, but currently they are not
exported to modules. Export them.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

d86e0e83

26 May, 2011 1 commit

block: remove unused variable in bio_attempt_front_merge() · 700c4f33

Luca Tettamanti authored 14 years ago


sector is never read inside the function.
Signed-off-by: Luca Tettamanti <kronos.it@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

700c4f33

23 May, 2011 1 commit

block: call elv_bio_merged() when merged · 95cf3dd9

Vivek Goyal authored 14 years ago

Commit 73c10101 ("block: initial patch for on-stack per-task plugging")
removed calls to elv_bio_merged() when @bio merged with @req. Re-add them.

This in turn will update merged stats in associated group. That
should be safe as long as request has got reference to the blkio_group.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Divyesh Shah <dpshah@google.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

95cf3dd9

20 May, 2011 2 commits

block: get rid of on-stack plugging debug checks · 771949d0

Jens Axboe authored 14 years ago


We don't need them anymore, so kill:

- REQ_ON_PLUG checks in various places
- !rq_mergeable() check in plug merging
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

771949d0

blk-cgroup: Allow sleeping while dynamically allocating a group · f469a7b4

Vivek Goyal authored 14 years ago


Currently, all the cfq_group or throtl_group allocations happen while
we are holding ->queue_lock and sleeping is not allowed.

Soon, we will move to per cpu stats and also need to allocate the
per group stats. As one can not call alloc_percpu() from atomic
context as it can sleep, we need to drop ->queue_lock, allocate the
group, retake the lock and continue processing.

In the throttling code, I check the queue DEAD flag again to make sure
that the driver did not call blk_cleanup_queue() in the meantime.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

f469a7b4
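
The drop-allocate-retake pattern described in this commit can be sketched in plain userspace C. This is a single-threaded model under stated assumptions, not the kernel code: `model_queue`, `model_lock()`, `model_unlock()`, and `find_or_alloc_group()` are hypothetical names, `calloc()` stands in for a sleeping allocator such as alloc_percpu(), and a boolean stands in for the queue DEAD flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the request queue and a per-cgroup group. */
struct model_group { int id; };

struct model_queue {
    bool lock_held;             /* models ->queue_lock */
    bool dead;                  /* models the queue DEAD flag */
    struct model_group *group;  /* the single cached group, if any */
};

static void model_lock(struct model_queue *q)
{
    assert(!q->lock_held);
    q->lock_held = true;
}

static void model_unlock(struct model_queue *q)
{
    assert(q->lock_held);
    q->lock_held = false;
}

/*
 * Called with the lock held; returns with the lock held.
 * Returns NULL if allocation fails or the queue died while unlocked.
 */
static struct model_group *find_or_alloc_group(struct model_queue *q, int id)
{
    struct model_group *new_group;

    assert(q->lock_held);
    if (q->group)
        return q->group;

    /* The allocator may sleep, so it must not run under the lock. */
    model_unlock(q);
    new_group = calloc(1, sizeof(*new_group));
    model_lock(q);

    if (!new_group)
        return NULL;

    /* State may have changed while the lock was dropped: check again. */
    if (q->dead) {
        free(new_group);
        return NULL;
    }
    if (q->group) {             /* another context installed a group first */
        free(new_group);
        return q->group;
    }

    new_group->id = id;
    q->group = new_group;
    return new_group;
}
```

The important part is the recheck after retaking the lock: anything tested before the lock was dropped has to be tested again, because the queue may have been torn down or the group installed by someone else in between.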

18 May, 2011 1 commit

block: don't delay blk_run_queue_async · 3ec717b7

Shaohua Li authored 14 years ago


Let's check a scenario:
1. blk_delay_queue(q, SCSI_QUEUE_DELAY);
2. blk_run_queue_async();
the second one will become a noop, because q->delay_work already has
WORK_STRUCT_PENDING_BIT set, so the delayed work will still run after
SCSI_QUEUE_DELAY. But blk_run_queue_async() actually expects the delayed
work to run immediately.

Fix this by doing a cancel on potentially pending delayed work
before queuing an immediate run of the workqueue.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

3ec717b7
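
The scenario in this commit can be reproduced with a small userspace model. This is only a sketch: `model_delayed_work` and the `model_*`/`run_async_*` functions are hypothetical names, a boolean models the pending bit, and a tick counter models time. It is not the workqueue API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a delayed work item: a pending bit plus a due time. */
struct model_delayed_work {
    bool pending;
    unsigned long due;   /* tick at which the work will run */
};

/* Like queueing delayed work: does nothing if the item is already pending. */
static bool model_queue_delayed(struct model_delayed_work *w,
                                unsigned long now, unsigned long delay)
{
    assert(w != NULL);
    if (w->pending)
        return false;
    w->pending = true;
    w->due = now + delay;
    return true;
}

static void model_cancel(struct model_delayed_work *w)
{
    assert(w != NULL);
    w->pending = false;
}

/* Behaviour before the fix: an earlier delayed queueing wins. */
static void run_async_old(struct model_delayed_work *w, unsigned long now)
{
    model_queue_delayed(w, now, 0);
}

/* Behaviour after the fix: cancel any pending delay, then queue immediately. */
static void run_async_fixed(struct model_delayed_work *w, unsigned long now)
{
    model_cancel(w);
    model_queue_delayed(w, now, 0);
}
```

With a delay of 3 ticks queued at tick 100, `run_async_old()` leaves the due time at 103, while `run_async_fixed()` moves it to 100.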

19 Apr, 2011 2 commits

block: remove stale kerneldoc member from __blk_run_queue() · d350e6b6

Jens Axboe authored 14 years ago


We don't pass in a 'force_kblockd' anymore, get rid of the
stale comment.
Reported-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

d350e6b6

block: get rid of QUEUE_FLAG_REENTER · c21e6beb

Jens Axboe authored 14 years ago


We are currently using this flag to check whether it's safe
to call into ->request_fn(). If it is set, we punt to kblockd.
But we get a lot of false positives and excessive punts to
kblockd, which hurts performance.

The only real abuser of this infrastructure is SCSI. So export
the async queue run and convert SCSI over to use that. There's
room for improvement in that SCSI need not always use the async
call, but this fixes our performance issue and they can fix that
up in due time.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

c21e6beb

18 Apr, 2011 7 commits

block: kill blk_flush_plug_list() export · bd900d45

Jens Axboe authored 14 years ago


With all drivers and file systems converted, we only have
in-core use of this function. So remove the export.
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

bd900d45

block, xen/blkback: remove blk_[get|put]_queue calls. · d2436eda

Konrad Rzeszutek Wilk authored 14 years ago


They were used to check if the queue does not have QUEUE_FLAG_DEAD
set. That is not necessary anymore as the 'submit_io' call
ends up doing that for us.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

d2436eda

block: add blk_run_queue_async · 24ecfbe2

Christoph Hellwig authored 14 years ago


Instead of overloading __blk_run_queue to force an offload to kblockd
add a new blk_run_queue_async helper to do it explicitly.  I've kept
the blk_queue_stopped check for now, but I suspect it's not needed
as the check we do when the workqueue item runs should be enough.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

24ecfbe2

block: blk_delay_queue() should use kblockd workqueue · 4521cc4e

Jens Axboe authored 14 years ago


Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

4521cc4e

block: drop queue lock before calling __blk_run_queue() for kblockd punt · 99e22598

Jens Axboe authored 14 years ago


If we know we are going to punt to kblockd, we can drop the queue
lock before calling into __blk_run_queue() since it only does a
safe bit test and a workqueue call. Since kblockd needs to grab
this very lock as one of the first things it does, it's a good
optimization to drop the lock before waking kblockd.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

99e22598

Revert "block: add callback function for unplug notification" · b4cb290e

Jens Axboe authored 14 years ago

MD can't use this since it really requires us to be able to
keep more than a single piece of state for the unplug. Commit
048c9374 added the required support for MD, so get rid of this
now unused code.

This reverts commit f7566457.

Conflicts:

	block/blk-core.c
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

b4cb290e

block: Enhance new plugging support to support general callbacks · 048c9374

NeilBrown authored 14 years ago


md/raid requires an unplug callback, but as it does not use
requests the current code cannot provide one.

So allow arbitrary callbacks to be attached to the blk_plug.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

048c9374

16 Apr, 2011 1 commit

block: make unplug timer trace event correspond to the schedule() unplug · 49cac01e

Jens Axboe authored 14 years ago


It's a pretty close match to what we had before - the timer triggering
would mean that nobody unplugged the plug in due time, in the new
scheme this matches very closely what the schedule() unplug now is.
It's essentially the difference between an explicit unplug (IO unplug)
or an implicit unplug (timer unplug, we scheduled with pending IO
queued).
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

49cac01e

15 Apr, 2011 2 commits

block: only force kblockd unplugging from the schedule() path · f6603783

Jens Axboe authored 14 years ago


For the explicit unplugging, we'd prefer to kick things off
immediately and not pay the penalty of the latency to switch
to kblockd. So let blk_finish_plug() do the run inline, while
the implicit-on-schedule-out unplug will punt to kblockd.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

f6603783

block: cleanup the block plug helper functions · 88b996cd

Christoph Hellwig authored 14 years ago


It's a bit of a mess currently. task->plug is being cleared
and reset in __blk_finish_plug(), and blk_finish_plug() is
testing for a NULL plug which cannot happen even from schedule()
anymore since it uses blk_needs_flush_plug() to determine
whether to call into this function at all.

So get rid of some of the cruft.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

88b996cd

14 Apr, 2011 1 commit

block: export blk_get/put_queue for blkback · 690f1b63

Jeremy Fitzhardinge authored 16 years ago


Impact: build fix

I'm not sure if blkback should be using these functions, but in the
meantime export them to allow blkback to be a module.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

690f1b63

12 Apr, 2011 6 commits

block: move queue run on unplug to kblockd · f4af3c3d

Jens Axboe authored 14 years ago


There are worries that we are now consuming a lot more stack in
some cases, since we potentially call into IO dispatch from
schedule() or io_schedule(). We can reduce this problem by moving
the running of the queue to kblockd, like the old plugging scheme
did as well.

This may or may not be a good idea from a performance perspective,
depending on how many tasks have queue plugs running at the same
time. For even the slightly contended case, doing just a single
queue run from kblockd instead of multiple runs directly from the
unpluggers will be faster.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

f4af3c3d

block: kill queue_sync_plugs() · cf82c798

Jens Axboe authored 14 years ago


The original use for this dates back to when we had to track write
requests for serializing around barriers. That's not needed anymore,
so kill it.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

cf82c798

block: readd plug trace event · dc6d36c9

Jens Axboe authored 14 years ago


This was removed with the queue plug state. But we can easily readd
by checking if this is the first request going to this queue. It's
good information to have when tracing to see how effective the
plugging is.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

dc6d36c9

block: add callback function for unplug notification · f7566457

Jens Axboe authored 14 years ago


MD would like to know when a queue is unplugged, so it can flush
its bitmap writes. Add such a callback.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

f7566457

block: add comment on why we save and disable interrupts in flush_plug_list() · 18811272

Jens Axboe authored 14 years ago


It's done at the top to avoid doing it for every queue we unplug.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

18811272

block: fixup block IO unplug trace call · 94b5eb28

Jens Axboe authored 14 years ago


It was removed with the on-stack plugging, readd it and track the
depth of requests added when flushing the plug.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

94b5eb28

11 Apr, 2011 1 commit

block: splice plug list to local context · 109b8129

NeilBrown authored 14 years ago


If the request_fn ends up blocking, we could be re-entering
the plug flush. Since the list is protected by explicitly
not allowing schedule events, this isn't a terribly good idea.

Additionally, it can cause us to recurse. As request_fn called by
__blk_run_queue is allowed to 'schedule()' (after dropping the queue
lock of course), it is possible to get a recursive call:

 schedule -> blk_flush_plug -> __blk_finish_plug -> flush_plug_list
      -> __blk_run_queue -> request_fn -> schedule

We must make sure that the second schedule does not call into
blk_flush_plug again.  So instead of leaving the list of requests on
blk_plug->list, move them to a separate list leaving blk_plug->list
empty.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

109b8129
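
The effect of splicing the plug list to a local list before dispatching can be shown with a userspace model. This is a sketch under stated assumptions: `model_plug`, `model_flush()`, and `model_request_fn()` are hypothetical names, an array copy stands in for the list splice, and the "driver" always re-enters the flush to imitate a request_fn that schedules.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PLUG_MAX 8

/* Hypothetical model of a per-task plug holding pending request ids. */
struct model_plug {
    int pending[MODEL_PLUG_MAX];
    size_t count;
    int dispatched;   /* total requests handed to the "driver" */
};

static void model_flush(struct model_plug *plug);

/* Stands in for request_fn: it "schedules", which triggers another flush. */
static void model_request_fn(struct model_plug *plug, int rq)
{
    (void)rq;
    plug->dispatched++;
    model_flush(plug);   /* the nested schedule() -> flush path */
}

static void model_flush(struct model_plug *plug)
{
    int local[MODEL_PLUG_MAX];
    size_t n, i;

    assert(plug->count <= MODEL_PLUG_MAX);
    if (plug->count == 0)
        return;          /* nested calls end up here */

    /* Splice: move everything to a local list and empty the plug first. */
    n = plug->count;
    memcpy(local, plug->pending, n * sizeof(local[0]));
    plug->count = 0;

    for (i = 0; i < n; i++)
        model_request_fn(plug, local[i]);
}
```

Because the plug is emptied before any request is dispatched, the nested flush finds nothing to do and every request is dispatched exactly once. If the loop walked `plug->pending` directly, the nested call would see the same requests again.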

05 Apr, 2011 2 commits

block: fix request sorting at unplug · f83e8261

Konstantin Khlebnikov authored 14 years ago


Comparison function for list_sort() must be anticommutative,
otherwise it is not sorting in the ordinary sense.

But fortunately list_sort() always checks ((*cmp)(priv, a, b) <= 0);
it does not distinguish negative and zero, so the comparison function can
implement only less-or-equal instead of a full three-way comparison.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

f83e8261
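
The point about the `<= 0` test can be illustrated in userspace C. This is a sketch, not the kernel code: the sort below is a simple insertion sort swapped in for list_sort() (which is a merge sort), and the comparator bodies are assumptions made for illustration rather than quotes from the patch. The only property carried over is that the sort tests nothing but `cmp(a, b) <= 0`.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 when a sorts after b, 0 otherwise: "less-or-equal" only. */
static int cmp_after(int a, int b)
{
    return !(a <= b);
}

/* Broken example: reports "different", which defines no ordering. */
static int cmp_not_equal(int a, int b)
{
    return a != b;
}

/* Stable insertion sort whose only test is cmp(a, b) <= 0. */
static void model_sort(int *v, size_t n, int (*cmp)(int, int))
{
    size_t i, j;

    assert(v != NULL || n == 0);
    for (i = 1; i < n; i++) {
        int key = v[i];

        for (j = i; j > 0 && !(cmp(v[j - 1], key) <= 0); j--)
            v[j] = v[j - 1];
        v[j] = key;
    }
}

/* Helper for checking the result. */
static int is_sorted(const int *v, size_t n)
{
    size_t i;

    for (i = 1; i < n; i++)
        if (v[i - 1] > v[i])
            return 0;
    return 1;
}
```

A comparator that returns only 0 or 1 is enough here because the sort never asks whether the result is negative. A comparator that merely reports inequality gives the same answer for (a, b) and (b, a), so the result depends on input order rather than on the values.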

block: dump request state on seeing a corrupted request completion · 8182924b

Jens Axboe authored 14 years ago


Currently we just dump a non-informative 'request botched' message.
Let's actually try and print something sane to help debug issues
around this.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

8182924b

31 Mar, 2011 1 commit

Fix common misspellings · 25985edc

Lucas De Marchi authored 14 years ago


Fixes generated by 'codespell' and manually reviewed.
Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>

25985edc

25 Mar, 2011 2 commits

block: fix issue with calling blk_stop_queue() from the request_fn handler · ad3d9d7e

Jens Axboe authored 14 years ago


When the queue work handler was converted to delayed work, the
stopping was inadvertently made sync as well. Change this back
to being async stop, using __cancel_delayed_work() instead of
cancel_delayed_work().
Reported-by: Jeremy Fitzhardinge <jeremy@goop.org>
Reported-by: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

ad3d9d7e

block: fix bug with inserting flush requests as sort/merge · 401a18e9

Jens Axboe authored 14 years ago


With the introduction of the on-stack plugging, we would assume
that any request being inserted was a normal file system request.
As flush/fua requires a special insert mode, this caused problems.

Fix this up by checking for this in flush_plug_list() and use
the appropriate insert mechanism.

Big thanks go to Markus Trippelsdorf for tirelessly testing
patches, and to Sergey Senozhatsky for helping find the real
issue.
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

401a18e9

21 Mar, 2011 1 commit

block: attempt to merge with existing requests on plug flush · 5e84ea3a

Jens Axboe authored 14 years ago


One of the disadvantages of on-stack plugging is that we potentially
lose out on merging since all pending IO isn't always visible to
everybody. When we flush the on-stack plugs, right now we don't do
any checks to see if potential merge candidates could be utilized.

Correct this by adding a new insert variant, ELEVATOR_INSERT_SORT_MERGE.
It works just like ELEVATOR_INSERT_SORT, but first checks whether we can
merge with an existing request before doing the insertion (if we fail
merging).

This fixes a regression with multiple processes issuing IO that
can be merged.

Thanks to Shaohua Li <shaohua.li@intel.com> for testing and fixing
an accounting bug.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

5e84ea3a
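
The "try to merge first, otherwise do a sorted insert" idea can be sketched in userspace C. Everything here is a simplified model with hypothetical names (`model_rq`, `model_queue`, `model_insert_sort_merge()`): requests are sector ranges in a small sorted array, and a merge is allowed whenever two ranges are exactly contiguous. The real elevator applies more conditions than that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_QUEUE_MAX 16

struct model_rq {
    unsigned long sector;   /* first sector */
    unsigned long nr;       /* number of sectors */
};

struct model_queue {
    struct model_rq rq[MODEL_QUEUE_MAX];   /* kept sorted by sector */
    size_t count;
};

/* Try to fold new_rq into a contiguous existing request. */
static bool model_try_merge(struct model_queue *q, struct model_rq new_rq)
{
    size_t i;

    for (i = 0; i < q->count; i++) {
        struct model_rq *r = &q->rq[i];

        if (r->sector + r->nr == new_rq.sector) {       /* back merge */
            r->nr += new_rq.nr;
            return true;
        }
        if (new_rq.sector + new_rq.nr == r->sector) {   /* front merge */
            r->sector = new_rq.sector;
            r->nr += new_rq.nr;
            return true;
        }
    }
    return false;
}

/* Plain sorted insertion, the fallback when no merge is possible. */
static void model_insert_sort(struct model_queue *q, struct model_rq new_rq)
{
    size_t i = q->count;

    assert(q->count < MODEL_QUEUE_MAX);
    while (i > 0 && q->rq[i - 1].sector > new_rq.sector) {
        q->rq[i] = q->rq[i - 1];
        i--;
    }
    q->rq[i] = new_rq;
    q->count++;
}

/* Merge if possible, otherwise fall back to the sorted insert. */
static void model_insert_sort_merge(struct model_queue *q,
                                    struct model_rq new_rq)
{
    if (!model_try_merge(q, new_rq))
        model_insert_sort(q, new_rq);
}
```

Inserting the ranges starting at sectors 100, 50, and 108 (8 sectors each) leaves two requests in this model, because the third range back-merges into the first one instead of being queued separately.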

10 Mar, 2011 4 commits

block: kill off REQ_UNPLUG · 721a9602

Jens Axboe authored 14 years ago


With the plugging now being explicitly controlled by the
submitter, callers need not pass down unplugging hints
to the block layer. If they want to unplug, it's because they
manually plugged on their own - in which case, they should just
unplug at will.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

721a9602

block: remove per-queue plugging · 7eaceacc

Jens Axboe authored 14 years ago


Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So let's kill off the old plugging along with aops->sync_page().
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

7eaceacc

block: initial patch for on-stack per-task plugging · 73c10101

Jens Axboe authored 14 years ago


This patch adds support for creating a queuing context outside
of the queue itself. This enables us to batch up pieces of IO
before grabbing the block device queue lock and submitting them to
the IO scheduler.

The context is created on the stack of the process and assigned in
the task structure, so that we can auto-unplug it if we hit a schedule
event.

The current queue plugging happens implicitly if IO is submitted to
an empty device, yet callers have to remember to unplug that IO when
they are going to wait for it. This is an ugly API and has caused bugs
in the past. Additionally, it requires hacks in the vm (->sync_page()
callback) to handle that logic. By switching to an explicit plugging
scheme we make the API a lot nicer and can get rid of the ->sync_page()
hack in the vm.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

73c10101
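
The batching benefit described in this commit can be modelled in userspace C. This is a sketch with hypothetical names (`model_queue`, `model_plug`, `model_submit()`, and so on): a counter stands in for taking the queue lock, and the plug is passed explicitly rather than being looked up through the task structure as the commit message describes.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PLUG_MAX 16

struct model_queue {
    int lock_acquisitions;   /* how often the queue lock was taken */
    int submitted;           /* requests handed to the IO scheduler */
};

/* Lives on the submitter's stack; collects requests without the queue lock. */
struct model_plug {
    int pending[MODEL_PLUG_MAX];
    size_t count;
};

static void model_start_plug(struct model_plug *plug)
{
    plug->count = 0;
}

/* Hand the whole batch to the queue under a single lock acquisition. */
static void model_finish_plug(struct model_queue *q, struct model_plug *plug)
{
    if (plug->count == 0)
        return;
    q->lock_acquisitions++;
    q->submitted += (int)plug->count;
    plug->count = 0;
}

static void model_submit(struct model_queue *q, struct model_plug *plug, int rq)
{
    assert(q != NULL);
    if (plug) {
        if (plug->count == MODEL_PLUG_MAX)
            model_finish_plug(q, plug);   /* batch full: flush early */
        plug->pending[plug->count++] = rq;
        return;
    }
    q->lock_acquisitions++;               /* unplugged: one lock per request */
    q->submitted++;
}
```

Submitting four requests without a plug costs four lock acquisitions in this model. With a plug, the same four requests cost one acquisition at `model_finish_plug()` time.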

block: add API for delaying work/request_fn a little bit · 3cca6dc1

Jens Axboe authored 14 years ago


Currently we use plugging for that, but as plugging is going away,
we need an alternative mechanism.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

3cca6dc1

03 Mar, 2011 2 commits

block: Move blk_throtl_exit() call to blk_cleanup_queue() · da527770

Vivek Goyal authored 14 years ago

Move blk_throtl_exit() into blk_cleanup_queue() as blk_throtl_exit() is
written in such a way that it needs queue lock. In blk_release_queue()
there is no guarantee that ->queue_lock is still around.

Initially blk_throtl_exit() was in blk_cleanup_queue() but Ingo reported
one problem.

  https://lkml.org/lkml/2010/10/23/86

  And a quick fix moved blk_throtl_exit() to blk_release_queue().

        commit 7ad58c02
        Author: Jens Axboe <jaxboe@fusionio.com>
        Date:   Sat Oct 23 20:40:26 2010 +0200

        block: fix use-after-free bug in blk throttle code

This patch reverts above change and does not try to shutdown the
throtl work in blk_sync_queue(). By avoiding call to
throtl_shutdown_timer_wq() from blk_sync_queue(), we should also avoid
the problem reported by Ingo.

blk_sync_queue() seems to be used only by md driver and it seems to be
using it to make sure q->unplug_fn is not called as md registers its
own unplug functions and it is about to free up the data structures
used by unplug_fn(). Block throttle does not call back into unplug_fn()
or into md. So there is no need to cancel blk throttle work.

In fact I think cancelling block throttle work is bad because it might
happen that some bios are throttled and scheduled to be dispatched later
with the help of pending work and if work is cancelled, these bios might
never be dispatched.

Block layer also uses blk_sync_queue() during blk_cleanup_queue() and
blk_release_queue() time. That should be safe as we are also calling
blk_throtl_exit() which should make sure all the throttling related
data structures are cleaned up.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

da527770

block: Initialize ->queue_lock to internal lock at queue allocation time · c94a96ac

Vivek Goyal authored 14 years ago


There does not seem to be a clear convention whether q->queue_lock is
initialized or not when blk_cleanup_queue() is called. In the past it
was not necessary but now blk_throtl_exit() takes up queue lock by
default and needs queue lock to be available.

In fact elevator_exit() code also has similar requirement just that it
is less stringent in the sense that elevator_exit() is called only if
elevator is initialized.

Two problems have been noticed because of ambiguity about spin lock
status.

      - If a driver calls blk_alloc_queue() and then calls
        blk_cleanup_queue() almost immediately (because some other
        driver structure allocation failed or some other error happened),
        then blk_throtl_exit() will run into issues as the queue lock is
        not initialized. The loop driver ran into this issue recently and
        I noticed error paths in the md driver too. Similar error paths
        should exist in other drivers too.

      - If some driver provided external spin lock and zapped the lock
        before blk_cleanup_queue(), then it can lead to issues.

So this patch initializes the default queue lock at queue allocation time.

block throttling code is one of the users of queue lock and it is
initialized at the queue allocation time, so it makes sense to
initialize ->queue_lock also to internal lock. A driver can override that
lock later. This will take care of the issue where a driver does not have
to worry about initializing the queue lock to default before calling
blk_cleanup_queue().
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

c94a96ac
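
The convention this commit establishes can be sketched in userspace C: the queue lock pointer is valid from allocation onward, and a driver may replace it later. The names below (`model_lock`, `model_queue`, `model_alloc_queue()`, `model_set_driver_lock()`, `model_cleanup_queue()`) are hypothetical, and a boolean stands in for a properly initialized spinlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a spinlock; only tracks whether it was set up. */
struct model_lock {
    bool initialized;
};

struct model_queue {
    struct model_lock internal_lock;   /* embedded default lock */
    struct model_lock *queue_lock;     /* lock actually used by the queue */
};

/* Allocation already points queue_lock at the embedded lock. */
static struct model_queue *model_alloc_queue(void)
{
    struct model_queue *q = calloc(1, sizeof(*q));

    if (!q)
        return NULL;
    q->internal_lock.initialized = true;
    q->queue_lock = &q->internal_lock;
    return q;
}

/* A driver may still substitute its own lock later. */
static void model_set_driver_lock(struct model_queue *q,
                                  struct model_lock *lock)
{
    assert(q != NULL);
    if (lock)
        q->queue_lock = lock;
}

/* Cleanup needs a usable lock; report whether it had one, then free. */
static bool model_cleanup_queue(struct model_queue *q)
{
    bool usable;

    assert(q != NULL);
    usable = q->queue_lock && q->queue_lock->initialized;
    free(q);
    return usable;
}
```

In this model, an error path that calls `model_cleanup_queue()` right after `model_alloc_queue()` always finds a usable lock, which is the situation the first bullet in the commit message is concerned with.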

02 Mar, 2011 1 commit

block: add @force_kblockd to __blk_run_queue() · 1654e741

Tejun Heo authored 14 years ago


__blk_run_queue() automatically either calls q->request_fn() directly
or schedules kblockd depending on whether the function is recursed.
blk-flush implementation needs to be able to explicitly choose
kblockd.  Add @force_kblockd.

All the current users are converted to specify %false for the
parameter and this patch doesn't introduce any behavior change.

stable: This is prerequisite for fixing ide oops caused by the new
        blk-flush implementation.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jan Beulich <JBeulich@novell.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

1654e741

01 Mar, 2011 1 commit

blk-throttle: Do not use kblockd workqueue for throtl work · 450adcbe

Vivek Goyal authored 14 years ago

o Dominik Klein reported a system hang issue while doing some blkio
  throttling testing.

  https://lkml.org/lkml/2011/2/24/173



o Some tracing revealed that CFQ was not dispatching any more jobs as
  queue unplug was not happening. And queue unplug was not happening
  because unplug work was not being called as there was one throttling
  work on same cpu which had not finished yet. And throttling work had not
  finished as it was trying to dispatch a bio to CFQ but all the request
  descriptors were consumed so it was put to sleep.

o So basically it is a cyclic dependency between CFQ unplug work and
  throtl dispatch work. Tejun suggested using a separate workqueue for
  such cases.

o This patch uses a separate workqueue for throttle related work and
  does not rely on kblockd workqueue anymore.

Cc: stable@kernel.org
Reported-by: Dominik Klein <dk@in-telegence.net>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

450adcbe