1. 06 May, 2011 1 commit
    • block: hold queue if flush is running for non-queueable flush drive · 3ac0cc45
      shaohua.li@intel.com authored
      In some drives, flush requests are non-queueable. While a flush request is
      running, normal read/write requests can't run. If the block layer dispatches
      such a request, the driver can't handle it and requeues it.  Tejun suggested
      that we hold the queue while a flush is running. This avoids unnecessary
      requeues and also improves performance. For example, say we have the
      requests flush1, write1, flush2. flush1 is dispatched, then the queue is
      held, so write1 isn't inserted into the queue. After flush1 finishes, flush2
      is dispatched. Since the disk cache is already clean, flush2 finishes
      very quickly, so it looks as if flush2 was folded into flush1.
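
      The holding rule described above can be sketched as a small userspace
      model. This is only an illustration under stated assumptions: the names
      model_queue, mq_add, mq_dispatch and mq_flush_done are hypothetical and
      are not kernel APIs, and the real patch works on struct request_queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum req_kind { REQ_KIND_RW, REQ_KIND_FLUSH };

struct model_queue {
    bool flush_queueable;      /* can the drive run other commands during a flush? */
    bool flush_in_flight;      /* a flush was dispatched and has not completed yet */
    enum req_kind pending[8];  /* simple FIFO of requests waiting for dispatch */
    size_t head, tail;
};

static void mq_add(struct model_queue *q, enum req_kind kind)
{
    assert(q->tail < 8);
    q->pending[q->tail++] = kind;
}

/*
 * Returns true and stores the next request in *out if one may be dispatched.
 * Returns false if the queue is empty or is being held for a running flush.
 */
static bool mq_dispatch(struct model_queue *q, enum req_kind *out)
{
    if (q->head == q->tail)
        return false;
    /* Hold the queue: do not hand the driver a request it would requeue. */
    if (q->flush_in_flight && !q->flush_queueable)
        return false;
    *out = q->pending[q->head++];
    if (*out == REQ_KIND_FLUSH)
        q->flush_in_flight = true;
    return true;
}

/* Called when the drive reports that the flush has completed. */
static void mq_flush_done(struct model_queue *q)
{
    q->flush_in_flight = false;
}
```

      In this model, a write queued behind a flush stays in the FIFO until
      mq_flush_done() is called, so the driver never sees a request that it
      would have to requeue.
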
      
      In my test, the queue holding completely solves a regression introduced by
      commit 53d63e6b:
      
          block: make the flush insertion use the tail of the dispatch list
      
          It's not a preempt type request, in fact we have to insert it
          behind requests that do specify INSERT_FRONT.
      
      which causes about 20% regression running a sysbench fileio
      workload.
      
      Stable: 2.6.39 only
      
      Cc: stable@kernel.org
      Signed-off-by: Shaohua Li <shaohua.li@intel.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
  2. 18 Apr, 2011 1 commit
  3. 05 Apr, 2011 2 commits
  4. 10 Mar, 2011 2 commits
    • block: remove per-queue plugging · 7eaceacc
      Jens Axboe authored
      
      Code has been converted over to the new explicit on-stack plugging,
      and delay users have been converted to use the new API for that.
      So let's kill off the old plugging along with aops->sync_page().
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • block: initial patch for on-stack per-task plugging · 73c10101
      Jens Axboe authored
      
      This patch adds support for creating a queuing context outside
      of the queue itself. This enables us to batch up pieces of IO
      before grabbing the block device queue lock and submitting them to
      the IO scheduler.
      
      The context is created on the stack of the process and assigned in
      the task structure, so that we can auto-unplug it if we hit a schedule
      event.
      
      The current queue plugging happens implicitly if IO is submitted to
      an empty device, yet callers have to remember to unplug that IO when
      they are going to wait for it. This is an ugly API and has caused bugs
      in the past. Additionally, it requires hacks in the vm (->sync_page()
      callback) to handle that logic. By switching to an explicit plugging
      scheme we make the API a lot nicer and can get rid of the ->sync_page()
      hack in the vm.
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
  5. 02 Mar, 2011 2 commits
    • block: blk-flush shouldn't call directly into q->request_fn() __blk_run_queue() · 255bb490
      Tejun Heo authored
      blk-flush decomposes a flush into sequence of multiple requests.  On
      completion of a request, the next one is queued; however, block layer
      must not implicitly call into q->request_fn() directly from completion
      path.  This makes the queue behave unexpectedly when seen from the
      drivers and violates the assumption that q->request_fn() is called
      with process context + queue_lock.
      
      This patch makes the following two changes to blk-flush to make sure
      q->request_fn() is not called directly from the request completion path.
      
      - blk_flush_complete_seq_end_io() now asks __blk_run_queue() to always
        use kblockd instead of calling directly into q->request_fn().
      
      - queue_next_fseq() uses ELEVATOR_INSERT_REQUEUE instead of
        ELEVATOR_INSERT_FRONT so that elv_insert() doesn't try to unplug the
        request queue directly.
      
      Reported by Jan in the following threads.
      
       http://thread.gmane.org/gmane.linux.ide/48778
       http://thread.gmane.org/gmane.linux.ide/48786
      
      
      
      stable: applicable to v2.6.37.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Jan Beulich <JBeulich@novell.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: stable@kernel.org
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • block: add @force_kblockd to __blk_run_queue() · 1654e741
      Tejun Heo authored
      
      __blk_run_queue() automatically either calls q->request_fn() directly
      or schedules kblockd depending on whether the function is recursed.
      blk-flush implementation needs to be able to explicitly choose
      kblockd.  Add @force_kblockd.
      
      All the current users are converted to specify %false for the
      parameter and this patch doesn't introduce any behavior change.
      
      stable: This is prerequisite for fixing ide oops caused by the new
              blk-flush implementation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jan Beulich <JBeulich@novell.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: stable@kernel.org
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
  6. 25 Jan, 2011 2 commits
    • block: reimplement FLUSH/FUA to support merge · ae1b1539
      Tejun Heo authored
      
      The current FLUSH/FUA support has evolved from the implementation
      which had to perform queue draining.  As such, sequencing is done
      queue-wide one flush request after another.  However, with the
      draining requirement gone, there's no reason to keep the queue-wide
      sequential approach.
      
      This patch reimplements FLUSH/FUA support such that each FLUSH/FUA
      request is sequenced individually.  The actual FLUSH execution is
      double buffered and whenever a request wants to execute one for either
      PRE or POSTFLUSH, it queues on the pending queue.  Once certain
      conditions are met, a flush request is issued and on its completion
      all pending requests proceed to the next sequence.
      
      This allows arbitrary merging of different types of flushes.  How they
      are merged can be primarily controlled and tuned by adjusting the
      above-mentioned 'conditions' used to determine when to issue the next
      flush.
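
      The double-buffered pending queue can be pictured with a small
      userspace model. This is a sketch, not the patch's code: counters stand
      in for the request lists, all names (flush_model, fm_kick,
      fm_wait_for_flush, fm_flush_done) are hypothetical, and the issue
      condition is assumed to be simply "no flush in flight and at least one
      waiter", which is only one possible choice of the 'conditions'
      mentioned above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: counts stand in for the kernel's request lists. */
struct flush_model {
    unsigned waiting[2];     /* requests waiting on each of the two buffers */
    unsigned pending_idx;    /* buffer that collects newly arriving waiters */
    unsigned running_idx;    /* buffer served by the flush now in flight */
    unsigned flushes_issued; /* how many device flushes the model has issued */
};

static bool fm_flush_in_flight(const struct flush_model *m)
{
    return m->pending_idx != m->running_idx;
}

/* Issue one flush for everything pending, unless one is already running. */
static bool fm_kick(struct flush_model *m)
{
    if (fm_flush_in_flight(m) || m->waiting[m->pending_idx] == 0)
        return false;
    m->pending_idx ^= 1;     /* new waiters now go to the other buffer */
    m->flushes_issued++;
    return true;
}

/* A request that needs a PRE or POSTFLUSH queues on the pending buffer. */
static void fm_wait_for_flush(struct flush_model *m)
{
    m->waiting[m->pending_idx]++;
    fm_kick(m);
}

/* Flush completion: all waiters on the running buffer advance together. */
static unsigned fm_flush_done(struct flush_model *m)
{
    unsigned advanced;

    assert(fm_flush_in_flight(m));
    advanced = m->waiting[m->running_idx];
    m->waiting[m->running_idx] = 0;
    m->running_idx ^= 1;
    fm_kick(m);              /* start the next flush if waiters piled up */
    return advanced;
}
```

      With three requests arriving back to back, the first one triggers a
      flush immediately and the other two share the second flush, which is
      the merging effect the message describes.
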
      
      This is inspired by Darrick's patches to merge multiple zero-data
      flushes which helps workloads with highly concurrent fsync requests.
      
      * As flush requests are never put on the IO scheduler, request fields
        used for flush share space with rq->rb_node.  rq->completion_data is
        moved out of the union.  This increases the request size by one
        pointer.
      
        As rq->elevator_private* are used only by the iosched too, it is
        possible to reduce the request size further.  However, to do that,
        we need to modify request allocation path such that iosched data is
        not allocated for flush requests.
      
      * FLUSH/FUA processing happens on insertion now instead of dispatch.
      
      - Comments updated as per Vivek and Mike.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: "Darrick J. Wong" <djwong@us.ibm.com>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • block: add REQ_FLUSH_SEQ · 414b4ff5
      Tejun Heo authored
      
      rq == &q->flush_rq was used to determine whether a rq is part of a
      flush sequence, which worked because all requests in a flush sequence
      were sequenced using the single dedicated request.  This is about to
      change, so introduce REQ_FLUSH_SEQ flag to distinguish flush sequence
      requests.
      
      This patch doesn't cause any behavior change.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
  7. 16 Sep, 2010 1 commit
    • block: remove BLKDEV_IFL_WAIT · dd3932ed
      Christoph Hellwig authored
      
      All the blkdev_issue_* helpers can only sanely be used for synchronous
      callers.  To issue cache flushes or barriers asynchronously the caller needs
      to set up a bio by itself with a completion callback to move the asynchronous
      state machine ahead.  So drop the BLKDEV_IFL_WAIT flag that is always
      specified when calling blkdev_issue_* and also remove the now unused flags
      argument to blkdev_issue_flush and blkdev_issue_zeroout.  For
      blkdev_issue_discard we need to keep it for the secure discard flag, which
      gains a more descriptive name and loses the bitops vs flag confusion.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
  8. 10 Sep, 2010 12 commits
    • block: use REQ_FLUSH in blkdev_issue_flush() · d391a2dd
      Tejun Heo authored
      
      Update blkdev_issue_flush() to use new REQ_FLUSH interface.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • block: make sure FSEQ_DATA request has the same rq_disk as the original · 09d60c70
      Tejun Heo authored
      
      rq->rq_disk and bio->bi_bdev->bd_disk may differ if a request has
      passed through remapping drivers.  FSEQ_DATA request incorrectly
      followed bio->bi_bdev->bd_disk ending up being issued w/ mismatching
      rq_disk.  Make it follow orig_rq->rq_disk.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Tested-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • block: kick queue after sequencing REQ_FLUSH/FUA · 47f70d5a
      Tejun Heo authored
      
      While completing a request from a REQ_FLUSH/FUA sequence, another
      request can be pushed to the request queue.  If a driver tests
      elv_queue_empty() before completing a request and runs the queue again
      only if the queue wasn't empty, this may lead to a hang.  Please note
      that most drivers either kick the queue unconditionally or test queue
      emptiness after completing the current request and don't have this
      problem.
      
      This patch removes this possibility by making REQ_FLUSH/FUA sequence
      code kick the queue if the queue was empty before completing a request
      from REQ_FLUSH/FUA sequence.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • block: initialize flush request with WRITE_FLUSH instead of REQ_FLUSH · 337238be
      Tejun Heo authored
      
      init_flush_request() only set REQ_FLUSH when initializing flush
      requests making them READ requests.  Use WRITE_FLUSH instead.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • block: simplify queue_next_fseq · cde4c406
      Christoph Hellwig authored
      
      We need to call blk_rq_init and elv_insert for all cases in queue_next_fseq,
      so take these calls into common code.  Also move the end_io initialization
      from queue_flush into queue_next_fseq and rename queue_flush to
      init_flush_request now that its old name doesn't apply anymore.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • block: implement REQ_FLUSH/FUA based interface for FLUSH/FUA requests · 4fed947c
      Tejun Heo authored
      
      Now that the backend conversion is complete, export sequenced
      FLUSH/FUA capability through REQ_FLUSH/FUA flags.  REQ_FLUSH means the
      device cache should be flushed before executing the request.  REQ_FUA
      means that the data in the request should be on non-volatile media on
      completion.
      
      Block layer will choose the correct way of implementing the semantics
      and execute it.  The request may be passed to the device directly if
      the device can handle it; otherwise, it will be sequenced using one or
      more proxy requests.  A device will never see a REQ_FLUSH and/or FUA
      which it doesn't support.
      
      Also, unlike the original REQ_HARDBARRIER, REQ_FLUSH/FUA requests are
      never failed with -EOPNOTSUPP.  If the underlying device doesn't
      support FLUSH/FUA, the block layer simply makes those noops.  IOW, it no
      longer distinguishes between writeback cache which doesn't support
      cache flush and writethrough/no cache.  Devices which have WB cache
      w/o flush are very difficult to come by these days and there's nothing
      much we can do anyway, so it doesn't make sense to require everyone to
      implement -EOPNOTSUPP handling.  This will simplify filesystems and
      block drivers as they can drop -EOPNOTSUPP retry logic for barriers.
      
      * QUEUE_ORDERED_* are removed and QUEUE_FSEQ_* are moved into
        blk-flush.c.
      
      * REQ_FLUSH w/o data can also be directly passed to drivers without
        sequencing but some drivers assume that zero length requests don't
        have rq->bio which isn't true for these requests requiring the use
        of proxy requests.
      
      * REQ_COMMON_MASK now includes REQ_FLUSH | REQ_FUA so that they are
        copied from bio to request.
      
      * WRITE_BARRIER is marked deprecated and WRITE_FLUSH, WRITE_FUA and
        WRITE_FLUSH_FUA are added.
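
      The choice between passing a request through and sequencing it can be
      sketched as a small decision function. This is a userspace
      illustration with hypothetical names (X_FLUSH, X_FUA, STEP_*,
      plan_steps), not the kernel's code. It assumes that FUA on a device
      without native FUA support is emulated by a cache flush after the data
      write, which the message above does not spell out.

```c
#include <assert.h>

/* Hypothetical flag and step values for illustration; not the kernel's. */
enum { X_FLUSH = 1 << 0, X_FUA = 1 << 1 };
enum { STEP_PREFLUSH = 1 << 0, STEP_DATA = 1 << 1, STEP_POSTFLUSH = 1 << 2 };

/*
 * dev_caps:  X_FLUSH if the device has a write cache it can flush,
 *            X_FUA if it handles FUA writes natively.
 * req_flags: X_FLUSH and/or X_FUA as requested by the submitter.
 * has_data:  nonzero if the request carries data.
 * Returns the set of steps needed to honour the request.
 */
static unsigned plan_steps(unsigned dev_caps, unsigned req_flags, int has_data)
{
    unsigned steps = has_data ? STEP_DATA : 0;

    /* No flushable cache: FLUSH and FUA both degrade to no-ops. */
    if (!(dev_caps & X_FLUSH))
        return steps;

    if (req_flags & X_FLUSH)
        steps |= STEP_PREFLUSH;

    /* Assumption: missing native FUA is emulated by a flush after the data. */
    if (has_data && (req_flags & X_FUA) && !(dev_caps & X_FUA))
        steps |= STEP_POSTFLUSH;

    return steps;
}
```

      A result of only STEP_DATA means the request needs no extra flush
      steps, while a result with more than one step set corresponds to the
      proxy-request sequencing described above.
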
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • block: rename barrier/ordered to flush · dd4c133f
      Tejun Heo authored
      
      With ordering requirements dropped, barrier and ordered are misnomers.
      Now all the block layer does is sequence FLUSH and FUA.  Rename them to
      flush.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • block: rename blk-barrier.c to blk-flush.c · 8839a0e0
      Tejun Heo authored
      
      Without ordering requirements, barrier and ordering are misnomers.
      Rename block/blk-barrier.c to block/blk-flush.c.  Rename of symbols
      will follow.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • block: drop barrier ordering by queue draining · 28e7d184
      Tejun Heo authored
      
      Filesystems will take all the responsibilities for ordering requests
      around commit writes and will only indicate how the commit writes
      themselves should be handled by block layers.  This patch drops
      barrier ordering by queue draining from block layer.  Ordering by
      draining implementation was somewhat invasive to request handling.
      A list of notable changes follows.
      
      * Each queue has 1 bit color which is flipped on each barrier issue.
        This is used to track whether a given request is issued before the
        current barrier or not.  REQ_ORDERED_COLOR flag and coloring
        implementation in __elv_add_request() are removed.
      
      * Requests which shouldn't be processed yet for draining were stalled
        by returning -EAGAIN from blk_do_ordered() according to the test
        result between blk_ordered_req_seq() and blk_ordered_cur_seq().
        This logic is removed.
      
      * Draining completion logic in elv_completed_request() removed.
      
      * All barrier sequence requests were queued to request queue and then
        trickled to lower layer according to progress and thus maintaining
        request orders during requeue was necessary.  This is replaced by
        queueing the next request in the barrier sequence only after the
        current one is complete from blk_ordered_complete_seq(), which
        removes the need for multiple proxy requests in struct request_queue
        and the request sorting logic in the ELEVATOR_INSERT_REQUEUE path of
        elv_insert().
      
      * As barriers no longer have ordering constraints, there's no need to
        dump the whole elevator onto the dispatch queue on each barrier.
        Insert barriers at the front instead.
      
      * If other barrier requests come to the front of the dispatch queue
        while one is already in progress, they are stored in
        q->pending_barriers and restored to dispatch queue one-by-one after
        each barrier completion from blk_ordered_complete_seq().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • block: misc cleanups in barrier code · dd831006
      Tejun Heo authored
      
      Make the following cleanups in preparation of barrier/flush update.
      
      * blk_do_ordered() declaration is moved from include/linux/blkdev.h to
        block/blk.h.
      
      * blk_do_ordered() now returns pointer to struct request, with %NULL
        meaning "try the next request" and ERR_PTR(-EAGAIN) "try again
        later".  The third case will be dropped with further changes.
      
      * In the initialization of proxy barrier request, data direction is
        already set by init_request_from_bio().  Drop unnecessary explicit
        REQ_WRITE setting and move init_request_from_bio() above REQ_FUA
        flag setting.
      
      * add_request() is collapsed into __make_request().
      
      These changes don't make any functional difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • block: deprecate barrier and replace blk_queue_ordered() with blk_queue_flush() · 4913efe4
      Tejun Heo authored
      
      Barrier is deemed too heavy and will soon be replaced by FLUSH/FUA
      requests.  Deprecate barrier.  All REQ_HARDBARRIERs are failed with
      -EOPNOTSUPP and blk_queue_ordered() is replaced with simpler
      blk_queue_flush().
      
      blk_queue_flush() takes combinations of REQ_FLUSH and FUA.  If a
      device has write cache and can flush it, it should set REQ_FLUSH.  If
      the device can handle FUA writes, it should also set REQ_FUA.
      
      All blk_queue_ordered() users are converted.
      
      * ORDERED_DRAIN is mapped to 0 which is the default value.
      * ORDERED_DRAIN_FLUSH is mapped to REQ_FLUSH.
      * ORDERED_DRAIN_FLUSH_FUA is mapped to REQ_FLUSH | REQ_FUA.
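
      The conversion table above can be written as a tiny lookup function.
      This is a userspace sketch only: the enum names and numeric values
      below are stand-ins chosen for illustration and do not match the
      kernel's definitions.

```c
#include <assert.h>

/* Illustrative stand-ins; the numeric values are not the kernel's. */
enum old_mode { OLD_DRAIN, OLD_DRAIN_FLUSH, OLD_DRAIN_FLUSH_FUA };
enum { CAP_FLUSH = 1 << 0, CAP_FUA = 1 << 1 };

/* Map an old ordered mode to the flush capability bits that replace it. */
static unsigned old_mode_to_flush_caps(enum old_mode mode)
{
    switch (mode) {
    case OLD_DRAIN_FLUSH:
        return CAP_FLUSH;
    case OLD_DRAIN_FLUSH_FUA:
        return CAP_FLUSH | CAP_FUA;
    case OLD_DRAIN:
    default:
        return 0;   /* the default: no cache flush support advertised */
    }
}
```
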
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Boaz Harrosh <bharrosh@panasas.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Alasdair G Kergon <agk@redhat.com>
      Cc: Pierre Ossman <drzeus@drzeus.cx>
      Cc: Stefan Weinhuber <wein@de.ibm.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • block: kill QUEUE_ORDERED_BY_TAG · 6958f145
      Tejun Heo authored
      
      Nobody is making meaningful use of ORDERED_BY_TAG now and queue
      draining for barrier requests will be removed soon which will render
      the advantage of tag ordering moot.  Kill ORDERED_BY_TAG.  The
      following users are affected.
      
      * brd: converted to ORDERED_DRAIN.
      * virtio_blk: ORDERED_TAG path was already marked deprecated.  Removed.
      * xen-blkfront: ORDERED_TAG case dropped.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
  9. 07 Aug, 2010 8 commits
  10. 28 Apr, 2010 3 commits
  11. 30 Mar, 2010 1 commit
    • include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo authored
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch a large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        bloc...
  12. 29 Dec, 2009 1 commit
  13. 01 Oct, 2009 4 commits
    • block: allow large discard requests · 67efc925
      Christoph Hellwig authored
      
      Currently we set the bio size to the byte equivalent of the blocks to
      be trimmed when submitting the initial DISCARD ioctl.  That means it
      is subject to the max_hw_sectors limitation of the HBA which is
      much lower than the size of a DISCARD request we can support.
      Add a separate max_discard_sectors tunable to limit the size for discard
      requests.
      
      We limit the max discard request size in bytes to 32bit as that is the
      limit for bio->bi_size.  This could be much larger if we had a way to pass
      that information through the block layer.
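
      The size limit can be illustrated with a little arithmetic. This is a
      userspace sketch under assumptions: the helper names are hypothetical,
      a 512-byte sector is assumed, and the idea that a large range is
      covered by several discard requests of at most the limit each is an
      illustration rather than something stated in the message.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT 9   /* assumed 512-byte sectors */

/* Largest sector count whose byte size still fits an unsigned 32-bit field. */
static uint64_t clamp_discard_sectors(uint64_t max_discard_sectors)
{
    uint64_t cap = UINT32_MAX >> SECTOR_SHIFT;

    return max_discard_sectors < cap ? max_discard_sectors : cap;
}

/* Number of discard requests needed to cover nr_sects; 0 if discard is unsupported. */
static uint64_t discard_chunks(uint64_t nr_sects, uint64_t max_discard_sectors)
{
    uint64_t per = clamp_discard_sectors(max_discard_sectors);

    if (per == 0)
        return 0;
    return (nr_sects + per - 1) / per;   /* round up */
}
```
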
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • block: use normal I/O path for discard requests · c15227de
      Christoph Hellwig authored
      
      prepare_discard_fn() was being called in a place where memory allocation
      was effectively impossible.  This makes it inappropriate for all but
      the most trivial translations of Linux's DISCARD operation to the block
      command set.  Additionally adding a payload there makes the ownership
      of the bio backing unclear as it's now allocated by the device driver
      and not the submitter as usual.
      
      It is replaced with QUEUE_FLAG_DISCARD which is used to indicate whether
      the queue supports discard operations or not.  blkdev_issue_discard now
      allocates a one-page, sector-length payload which is the right thing
      for the common ATA and SCSI implementations.
      
      The mtd implementation of prepare_discard_fn() is replaced with simply
      checking for the request being a discard.
      
      Largely based on a previous patch from Matthew Wilcox <matthew@wil.cx>
      which did the prepare_discard_fn but not the different payload allocation
      yet.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • block: allow large discard requests · ca80650c
      Christoph Hellwig authored
      
      Currently we set the bio size to the byte equivalent of the blocks to
      be trimmed when submitting the initial DISCARD ioctl.  That means it
      is subject to the max_hw_sectors limitation of the HBA which is
      much lower than the size of a DISCARD request we can support.
      Add a separate max_discard_sectors tunable to limit the size for discard
      requests.
      
      We limit the max discard request size in bytes to 32bit as that is the
      limit for bio->bi_size.  This could be much larger if we had a way to pass
      that information through the block layer.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • block: use normal I/O path for discard requests · 1122a26f
      Christoph Hellwig authored
      
      prepare_discard_fn() was being called in a place where memory allocation
      was effectively impossible.  This makes it inappropriate for all but
      the most trivial translations of Linux's DISCARD operation to the block
      command set.  Additionally adding a payload there makes the ownership
      of the bio backing unclear as it's now allocated by the device driver
      and not the submitter as usual.
      
      It is replaced with QUEUE_FLAG_DISCARD which is used to indicate whether
      the queue supports discard operations or not.  blkdev_issue_discard now
      allocates a one-page, sector-length payload which is the right thing
      for the common ATA and SCSI implementations.
      
      The mtd implementation of prepare_discard_fn() is replaced with simply
      checking for the request being a discard.
      
      Largely based on a previous patch from Matthew Wilcox <matthew@wil.cx>
      which did the prepare_discard_fn but not the different payload allocation
      yet.
      Signed-off-by: Christoph Hellwig <hch@lst.de>