- 20 Sep, 2012 1 commit
-
-
Martin K. Petersen authored
Introduce a BLKZEROOUT ioctl which can be used to clear block ranges by way of blkdev_issue_zeroout(). Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Acked-by:
Mike Snitzer <snitzer@redhat.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- 18 Sep, 2012 1 commit
-
-
Alan Cox authored
65536 should be ludicrous anyway but without it we overflow the memory computation doing the allocation and badness occurs. Signed-off-by:
Alan Cox <alan@linux.intel.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- 01 Aug, 2012 1 commit
-
-
Vivek Goyal authored
Add a new operation code (BLKPG_RESIZE_PARTITION) to the BLKPG ioctl that allows altering the size of an existing partition, even if it is currently in use. This patch converts hd_struct->nr_sects into sequence counter because One might extend a partition while IO is happening to it and update of nr_sects can be non-atomic on 32bit machines with 64bit sector_t. This can lead to issues like reading inconsistent size of a partition. Sequence counter have been used so that readers don't have to take bdev mutex lock as we call sector_in_part() very frequently. Now all the access to hd_struct->nr_sects should happen using sequence counter read/update helper functions part_nr_sects_read/part_nr_sects_write. There is one exception though, set_capacity()/get_capacity(). I think theoritically race should exist there too but this patch does not modify set_capacity()/get_capacity() due to sheer number of call sites and I am afraid that change might break something. I have left that as a TODO item. We can handle it later if need be. This patch does not introduce any new races as such w.r.t set_capacity()/get_capacity(). v2: Add CONFIG_LBDAF test to UP preempt case as suggested by Phillip. Signed-off-by:
Vivek Goyal <vgoyal@redhat.com> Signed-off-by:
Phillip Susi <psusi@ubuntu.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- 11 Jan, 2012 1 commit
-
-
Martin K. Petersen authored
Introduce an ioctl which permits applications to query whether a block device is rotational. Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- 05 Jan, 2012 1 commit
-
-
Linus Torvalds authored
We're doing some odd things there, which already messes up various users (see the net/socket.c code that this removes), and it was going to add yet more crud to the block layer because of the incorrect error code translation. ENOIOCTLCMD is not an error return that should be returned to user mode from the "ioctl()" system call, but it should *not* be translated as EINVAL ("Invalid argument"). It should be translated as ENOTTY ("Inappropriate ioctl for device"). That EINVAL confusion has apparently so permeated some code that the block layer actually checks for it, which is sad. We continue to do so for now, but add a big comment about how wrong that is, and we should remove it entirely eventually. In the meantime, this tries to keep the changes localized to just the EINVAL -> ENOTTY fix, and removing code that makes it harder to do the right thing. Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 04 Jan, 2012 1 commit
-
-
Al Viro authored
Move invalidate_bdev, block_sync_page into fs/block_dev.c. Export kill_bdev as well, so brd doesn't have to open code it. Reduce buffer_head.h requirement accordingly. Removed a rather large comment from invalidate_bdev, as it looked a bit obsolete to bother moving. The small comment replacing it says enough. Signed-off-by:
Nick Piggin <npiggin@suse.de> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
- 31 Oct, 2011 1 commit
-
-
Paul Gortmaker authored
These files were getting <linux/module.h> via an implicit include path, but we want to crush those out of existence since they cost time during compiles of processing thousands of lines of headers for no reason. Give them the lightweight header that just contains the EXPORT_SYMBOL infrastructure. Signed-off-by:
Paul Gortmaker <paul.gortmaker@windriver.com>
-
- 23 Aug, 2011 1 commit
-
-
Tejun Heo authored
There are cases where suppressing partition scan is useful - e.g. for lo devices and pseudo SATA devices which advertise to be a disk but get upset on partition scan (some port multiplier control devices show such behavior). This patch adds GENHD_FL_NO_PART_SCAN which suppresses partition scan regardless of the number of possible partitions. disk_partitionable() is renamed to disk_part_scan_enabled() as suppressing partition scan doesn't imply the device can't be partitioned using BLKPG_ADD/DEL_PARTITION calls from userland. show_partition() now directly tests disk_max_parts() to maintain backward-compatibility. -v2: Updated to make it clear that only partition scan is suppressed not partitioning itself as suggested by Kay Sievers. Signed-off-by:
Tejun Heo <tj@kernel.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
- 24 Feb, 2011 1 commit
-
-
Miklos Szeredi authored
Adam Kovari and others reported that disconnecting an USB drive with an ntfs-3g filesystem would cause "kernel BUG at fs/inode.c:1421!" to be triggered. The BUG could be traced back to ioctl(BLKBSZSET), which would erroneously decrement the refcount on the bdev. This is because blkdev_get() expects the refcount to be already incremented and either returns success or decrements the refcount and returns an error. The bug was introduced by e525fd89 (block: make blkdev_get/put() handle exclusive access), which didn't take into account this behavior of blkdev_get(). This fixes https://bugzilla.kernel.org/show_bug.cgi?id=29202 (and likely 29792 too) Reported-by:
Adam Kovari <kovariadam@gmail.com> Acked-by:
Tejun Heo <tj@kernel.org> Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 17 Nov, 2010 1 commit
-
-
Arnd Bergmann authored
The big kernel lock has been removed from all these files at some point, leaving only the #include. Remove this too as a cleanup. Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 13 Nov, 2010 1 commit
-
-
Tejun Heo authored
Over time, block layer has accumulated a set of APIs dealing with bdev open, close, claim and release. * blkdev_get/put() are the primary open and close functions. * bd_claim/release() deal with exclusive open. * open/close_bdev_exclusive() are combination of open and claim and the other way around, respectively. * bd_link/unlink_disk_holder() to create and remove holder/slave symlinks. * open_by_devnum() wraps bdget() + blkdev_get(). The interface is a bit confusing and the decoupling of open and claim makes it impossible to properly guarantee exclusive access as in-kernel open + claim sequence can disturb the existing exclusive open even before the block layer knows the current open if for another exclusive access. Reorganize the interface such that, * blkdev_get() is extended to include exclusive access management. @holder argument is added and, if is @FMODE_EXCL specified, it will gain exclusive access atomically w.r.t. other exclusive accesses. * blkdev_put() is similarly extended. It now takes @mode argument and if @FMODE_EXCL is set, it releases an exclusive access. Also, when the last exclusive claim is released, the holder/slave symlinks are removed automatically. * bd_claim/release() and close_bdev_exclusive() are no longer necessary and either made static or removed. * bd_link_disk_holder() remains the same but bd_unlink_disk_holder() is no longer necessary and removed. * open_bdev_exclusive() becomes a simple wrapper around lookup_bdev() and blkdev_get(). It also has an unexpected extra bdev_read_only() test which probably should be moved into blkdev_get(). * open_by_devnum() is modified to take @holder argument and pass it to blkdev_get(). Most of bdev open/close operations are unified into blkdev_get/put() and most exclusive accesses are tested atomically at the open time (as it should). This cleans up code and removes some, both valid and invalid, but unnecessary all the same, corner cases. open_bdev_exclusive() and open_by_devnum() can use further cleanup - rename to blkdev_get_by_path() and blkdev_get_by_devt() and drop special features. Well, let's leave them for another day. Most conversions are straight-forward. drbd conversion is a bit more involved as there was some reordering, but the logic should stay the same. Signed-off-by:
Tejun Heo <tj@kernel.org> Acked-by:
Neil Brown <neilb@suse.de> Acked-by:
Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Acked-by:
Mike Snitzer <snitzer@redhat.com> Acked-by:
Philipp Reisner <philipp.reisner@linbit.com> Cc: Peter Osterlund <petero2@telia.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jan Kara <jack@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <joel.becker@oracle.com> Cc: Alex Elder <aelder@sgi.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: dm-devel@redhat.com Cc: drbd-dev@lists.linbit.com Cc: Leo Chen <leochen@broadcom.com> Cc: Scott Branden <sbranden@broadcom.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Cc: Joern Engel <joern@logfs.org> Cc: reiserfs-devel@vger.kernel.org Cc: Alexander Viro <viro@zeniv.linux.org.uk>
-
- 10 Nov, 2010 2 commits
-
-
Vasiliy Kulikov authored
Structure hd_geometry is copied to userland with 4 padding bytes between cylinders and start fields uninitialized on 64-bit platforms. It leads to leaking of contents of kernel stack memory. Currently there is no memset() in real implementations of getgeo() in drivers/block/, so it makes sense to have memset() in blkdev_ioctl(). Signed-off-by:
Vasiliy Kulikov <segooon@gmail.com> Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
Mike Snitzer authored
Convert direct reads of an inode's i_size to using i_size_read(). i_size_{read,write} use a seqcount to protect reads from accessing incomple writes. Concurrent i_size_write()s require mutual exclussion to protect the seqcount that is used by i_size_{read,write}. But i_size_read() callers do not need to use additional locking. Signed-off-by:
Mike Snitzer <snitzer@redhat.com> Acked-by:
NeilBrown <neilb@suse.de> Acked-by:
Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
- 16 Sep, 2010 1 commit
-
-
Christoph Hellwig authored
All the blkdev_issue_* helpers can only sanely be used for synchronous caller. To issue cache flushes or barriers asynchronously the caller needs to set up a bio by itself with a completion callback to move the asynchronous state machine ahead. So drop the BLKDEV_IFL_WAIT flag that is always specified when calling blkdev_issue_* and also remove the now unused flags argument to blkdev_issue_flush and blkdev_issue_zeroout. For blkdev_issue_discard we need to keep it for the secure discard flag, which gains a more descriptive name and loses the bitops vs flag confusion. Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
- 15 Sep, 2010 1 commit
-
-
Will Drewry authored
I'm reposting this patch series as v4 since there have been no additional comments, and I cleaned up one extra bit of unneeded code (in 3/3). The patches are against Linus's tree: 2bfc96a1 (2.6.36-rc3). Would this patchset be suitable for inclusion in an mm branch? This changes adds a partition_meta_info struct which itself contains a union of structures that provide partition table specific metadata. This change leaves the union empty. The subsequent patch includes an implementation for CONFIG_EFI_PARTITION-based metadata. Signed-off-by:
Will Drewry <wad@chromium.org> Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
- 12 Aug, 2010 1 commit
-
-
Adrian Hunter authored
Secure discard is the same as discard except that all copies of the discarded sectors (perhaps created by garbage collection) must also be erased. Signed-off-by:
Adrian Hunter <adrian.hunter@nokia.com> Acked-by:
Jens Axboe <axboe@kernel.dk> Cc: Kyungmin Park <kmpark@infradead.org> Cc: Madhusudhan Chikkature <madhu.cr@ti.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Ben Gardiner <bengardiner@nanometrics.ca> Cc: <linux-mmc@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 07 Aug, 2010 4 commits
-
-
Arnd Bergmann authored
The blkpg_ioctl and blkdev_reread_part access fields of the bdev and gendisk structures, yet they always do so under the protection of bdev->bd_mutex, which seems sufficient. Signed-off-by:
Arnd Bergmann <arnd@arndb.de> cked-by:
Christoph Hellwig <hch@infradead.org> Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
Arnd Bergmann authored
We only call the functions set_device_ro(), invalidate_bdev(), sync_filesystem() and sync_blockdev() while holding the BKL in these commands. All of these are also done in other code paths without the BKL, which leads me to the conclusion that the BKL is not needed here either. The reason we hold it here is that it was originally pushed down into the ioctl function from vfs_ioctl. Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
Arnd Bergmann authored
The blktrace driver currently needs the BKL, but we should not need to take that in the block layer, so just push it down into the driver itself. It is quite likely that the BKL is not actually required in blktrace code and could be removed in a follow-on patch. Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Acked-by:
Christoph Hellwig <hch@infradead.org> Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
Arnd Bergmann authored
As a preparation for the removal of the big kernel lock in the block layer, this removes the BKL from the common ioctl handling code, moving it into every single driver still using it. Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Acked-by:
Christoph Hellwig <hch@infradead.org> Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
- 28 Apr, 2010 1 commit
-
-
Dmitry Monakhov authored
The patch just convert all blkdev_issue_xxx function to common set of flags. Wait/allocation semantics preserved. Signed-off-by:
Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 30 Mar, 2010 1 commit
-
-
Tejun Heo authored
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include bloc...
-
- 03 Dec, 2009 1 commit
-
-
Martin K. Petersen authored
The discard ioctl is used by mkfs utilities to clear a block device prior to putting metadata down. However, not all devices return zeroed blocks after a discard. Some drives return stale data, potentially containing old superblocks. It is therefore important to know whether discarded blocks are properly zeroed. Both ATA and SCSI drives have configuration bits that indicate whether zeroes are returned after a discard operation. Implement a block level interface that allows this information to be bubbled up the stack and queried via a new block device ioctl. Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 03 Oct, 2009 1 commit
-
-
Martin K. Petersen authored
Not all users of the topology information want to use libblkid. Provide the topology information through bdev ioctls. Also clarify sector size comments for existing BLK ioctls. Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 14 Sep, 2009 1 commit
-
-
Christoph Hellwig authored
blk_ioctl_discard duplicates large amounts of code from blkdev_issue_discard, the only difference between the two is that blkdev_issue_discard needs to send a barrier discard request and blk_ioctl_discard a non-barrier one, and blk_ioctl_discard needs to wait on the request. To facilitates this add a flags argument to blkdev_issue_discard to control both aspects of the behaviour. This will be very useful later on for using the waiting funcitonality for other callers. Based on an earlier patch from Matthew Wilcox <matthew@wil.cx>. Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 22 May, 2009 2 commits
-
-
Martin K. Petersen authored
Convert all external users of queue limits to using wrapper functions instead of poking the request queue variables directly. Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
Martin K. Petersen authored
Until now we have had a 1:1 mapping between storage device physical block size and the logical block sized used when addressing the device. With SATA 4KB drives coming out that will no longer be the case. The sector size will be 4KB but the logical block size will remain 512-bytes. Hence we need to distinguish between the physical block size and the logical ditto. This patch renames hardsect_size to logical_block_size. Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 15 Apr, 2009 1 commit
-
-
Nikanth Karthikesan authored
Remove code handling bio_alloc failure with __GFP_WAIT. GFP_KERNEL implies __GFP_WAIT. Signed-off-by:
Nikanth Karthikesan <knikanth@suse.de> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 29 Dec, 2008 1 commit
-
-
Wu Fengguang authored
There's no need to take queue_lock or kernel_lock when modifying bdi->ra_pages. So remove them. Also remove out of date comment for queue_max_sectors_store(). Signed-off-by:
Wu Fengguang <wfg@linux.intel.com> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 18 Nov, 2008 1 commit
-
-
Tejun Heo authored
Make add_partition() return pointer to the new hd_struct on success and ERR_PTR() value on failure. This change will be used to fix md autodetection bug. Signed-off-by:
Tejun Heo <tj@kernel.org> Cc: Neil Brown <neilb@suse.de> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 21 Oct, 2008 7 commits
-
-
Al Viro authored
Now we can switch blkdev_ioctl() block_device/mode Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
We need to do bd_claim() only if file hadn't been opened with O_EXCL and then we have no need to use file itself as owner. Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
Most of that stuff doesn't need BKL at all; expand in the (only) caller, merge the switch into one there and leave BKL only around the stuff that might actually need it. Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
convert remaining callers to __blkdev_driver_ioctl() Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
To keep the size of changesets sane we split the switch by drivers; to keep the damn thing bisectable we do the following: 1) rename the affected methods, add ones with correct prototypes, make (few) callers handle both. That's this changeset. 2) for each driver convert to new methods. *ALL* drivers are converted in this series. 3) kill the old (renamed) methods. Note that it _is_ a flagday; all in-tree drivers are converted and by the end of this series no trace of old methods remain. The only reason why we do that this way is to keep the damn thing bisectable and allow per-driver debugging if anything goes wrong. New methods: open(bdev, mode) release(disk, mode) ioctl(bdev, mode, cmd, arg) /* Called without BKL */ compat_ioctl(bdev, mode, cmd, arg) locked_ioctl(bdev, mode, cmd, arg) /* Called with BKL, legacy */ Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
Analog of blkdev_driver_ioctl() with sane arguments. For now uses fake struct file, by the end of the series it won't and blkdev_driver_ioctl() will become a wrapper around it. Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
- 09 Oct, 2008 3 commits
-
-
Tejun Heo authored
disk->__part used to be statically allocated to the maximum possible number of partitions. This patch makes partition array allocation dynamic. The added overhead is minimal as only real change is one memory dereference changed to RCU one. This saves both a bit of memory and cpu cycles iterating through unoccupied slots and makes increasing partition limit easier. Signed-off-by:
Tejun Heo <tj@kernel.org> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
Tejun Heo authored
genhd and partition code handled disk and partitions separately. All information about the whole disk was in struct genhd and partitions in struct hd_struct. However, the whole disk (part0) and other partitions have a lot in common and the data structures end up having good number of common fields and thus separate code paths doing the same thing. Also, the partition array was indexed by partno - 1 which gets pretty confusing at times. This patch introduces partition 0 and makes the partition array indexed by partno. Following patches will unify the handling of disk and parts piece-by-piece. This patch also implements disk_partitionable() which tests whether a disk is partitionable. With coming dynamic partition array change, the most common usage of disk_max_parts() will be testing whether a disk is partitionable and the number of max partitions will become much less important. Signed-off-by:
Tejun Heo <tj@kernel.org> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
Tejun Heo authored
disk->part[] is protected by its matching bdev's lock. However, non-critical accesses like collecting stats and printing out sysfs and proc information used to be performed without any locking. As partitions can come and go dynamically, partitions can go away underneath those non-critical accesses. As some of those accesses are writes, this theoretically can lead to silent corruption. This patch fixes the race by using RCU for the partition array and dev reference counter to hold partitions. * Rename disk->part[] to disk->__part[] to make sure no one outside genhd layer proper accesses it directly. * Use RCU for disk->__part[] dereferencing. * Implement disk_{get|put}_part() which can be used to get and put partitions from gendisk respectively. * Iterators are implemented to help iterate through all partitions safely. * Functions which require RCU readlock are marked with _rcu suffix. * Use disk_put_part() in __blkdev_put() instead of directly putting the contained kobject. Signed-off-by:
Tejun Heo <tj@kernel.org> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-