- 18 Nov, 2020 1 commit
-
-
Arvind Sankar authored
Commit 4b47cdbd ("x86/head/64: Move early exception dispatch to C code") removed the usage of GET_CR2_INTO(). Drop the definition as well, and related definitions in paravirt.h and asm-offsets.h Signed-off-by:
Arvind Sankar <nivedita@alum.mit.edu> Signed-off-by:
Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201005151208.2212886-3-nivedita@alum.mit.edu
-
- 29 Oct, 2020 2 commits
-
-
Arvind Sankar authored
Commit b4e0409a ("x86: check vmlinux limits, 64-bit") added a check that the size of the 64-bit kernel is less than KERNEL_IMAGE_SIZE. The check uses (_end - _text), but this is not enough. The initial PMD used in startup_64() (level2_kernel_pgt) can only map upto KERNEL_IMAGE_SIZE from __START_KERNEL_map, not from _text, and the modules area (MODULES_VADDR) starts at KERNEL_IMAGE_SIZE. The correct check is what is currently done for 32-bit, since LOAD_OFFSET is defined appropriately for the two architectures. Just check (_end - LOAD_OFFSET) against KERNEL_IMAGE_SIZE unconditionally. Note that on 32-bit, the limit is not strict: KERNEL_IMAGE_SIZE is not really used by the main kernel. The higher the kernel is located, the less the space available for the vmalloc area. However, it is used by KASLR in the compressed stub to limit the maximum address of the kernel to a safe value. Clean up various comments to clarify that despite the name, KERNEL_IMAGE_SIZE is not a limit on the size of the kernel image, but a limit on the maximum virtual address that the image can occupy. Signed-off-by:
Arvind Sankar <nivedita@alum.mit.edu> Signed-off-by:
Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201029161903.2553528-1-nivedita@alum.mit.edu
-
Joerg Roedel authored
When SEV is enabled, the kernel requests the C-bit position again from the hypervisor to build its own page-table. Since the hypervisor is an untrusted source, the C-bit position needs to be verified before the kernel page-table is used. Call sev_verify_cbit() before writing the CR3. [ bp: Massage. ] Signed-off-by:
Joerg Roedel <jroedel@suse.de> Signed-off-by:
Borislav Petkov <bp@suse.de> Reviewed-by:
Tom Lendacky <thomas.lendacky@amd.com> Link: https://lkml.kernel.org/r/20201028164659.27002-5-joro@8bytes.org
-
- 09 Sep, 2020 3 commits
-
-
Joerg Roedel authored
The APs are not ready to handle exceptions when verify_cpu() is called in secondary_startup_64(). Signed-off-by:
Joerg Roedel <jroedel@suse.de> Signed-off-by:
Borislav Petkov <bp@suse.de> Reviewed-by:
Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20200907131613.12703-69-joro@8bytes.org
-
Joerg Roedel authored
Add the infrastructure to handle #VC exceptions when the kernel runs on virtual addresses and has mapped a GHCB. This handler will be used until the runtime #VC handler takes over. Since the handler runs very early, disable instrumentation for sev-es.c. [ bp: Make vc_ghcb_invalidate() __always_inline so that it can be inlined in noinstr functions like __sev_es_nmi_complete(). ] Signed-off-by:
Joerg Roedel <jroedel@suse.de> Signed-off-by:
Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200908123816.GB3764@8bytes.org
-
Joerg Roedel authored
Setup an early handler for #VC exceptions. There is no GHCB mapped yet, so just re-use the vc_no_ghcb_handler(). It can only handle CPUID exit-codes, but that should be enough to get the kernel through verify_cpu() and __startup_64() until it runs on virtual addresses. Signed-off-by:
Joerg Roedel <jroedel@suse.de> Signed-off-by:
Borislav Petkov <bp@suse.de> [ boot failure Error: kernel_ident_mapping_init() failed. ] Reported-by:
kernel test robot <lkp@intel.com> Link: https://lkml.kernel.org/r/20200908123517.GA3764@8bytes.org
-
- 07 Sep, 2020 6 commits
-
-
Joerg Roedel authored
Move the assembly coded dispatch between page-faults and all other exceptions to C code to make it easier to maintain and extend. Also change the return-type of early_make_pgtable() to bool and make it static. Signed-off-by:
Joerg Roedel <jroedel@suse.de> Signed-off-by:
Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-36-joro@8bytes.org
-
Joerg Roedel authored
Add a separate bringup IDT for the CPU bringup code that will be used until the kernel switches to the idt_table. There are two reasons for a separate IDT: 1) When the idt_table is set up and the secondary CPUs are booted, it contains entries (e.g. IST entries) which require certain CPU state to be set up. This includes a working TSS (for IST), MSR_GS_BASE (for stack protector) or CR4.FSGSBASE (for paranoid_entry) path. By using a dedicated IDT for early boot this state need not to be set up early. 2) The idt_table is static to idt.c, so any function using/modifying must be in idt.c too. That means that all compiler driven instrumentation like tracing or KASAN is also active in this code. But during early CPU bringup the environment is not set up for this instrumentation to work correctly. To avoid all of these hassles and make early exception handling robust, use a dedicated bringup IDT. The IDT is loaded two times, first on the boot CPU while the kernel is still running on direct mapped addresses, and again later after the switch to kernel addresses has happened. The second IDT load happens on the boot and secondary CPUs. Signed-off-by:
Joerg Roedel <jroedel@suse.de> Signed-off-by:
Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-34-joro@8bytes.org
-
Joerg Roedel authored
Make sure there is a stack once the kernel runs from virtual addresses. At this stage any secondary CPU which boots will have lost its stack because the kernel switched to a new page-table which does not map the real-mode stack anymore. This is needed for handling early #VC exceptions caused by instructions like CPUID. Signed-off-by:
Joerg Roedel <jroedel@suse.de> Signed-off-by:
Borislav Petkov <bp@suse.de> Reviewed-by:
Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20200907131613.12703-33-joro@8bytes.org
-
Joerg Roedel authored
Make sure segments are properly set up before setting up an IDT and doing anything that might cause a #VC exception. This is later needed for early exception handling. Signed-off-by:
Joerg Roedel <jroedel@suse.de> Signed-off-by:
Borislav Petkov <bp@suse.de> Reviewed-by:
Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20200907131613.12703-32-joro@8bytes.org
-
Joerg Roedel authored
Load the GDT right after switching to virtual addresses to make sure there is a defined GDT for exception handling. Signed-off-by:
Joerg Roedel <jroedel@suse.de> Signed-off-by:
Borislav Petkov <bp@suse.de> Reviewed-by:
Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20200907131613.12703-31-joro@8bytes.org
-
Joerg Roedel authored
Handling exceptions during boot requires a working GDT. The kernel GDT can't be used on the direct mapping, so load a startup GDT and setup segments. Signed-off-by:
Joerg Roedel <jroedel@suse.de> Signed-off-by:
Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-30-joro@8bytes.org
-
- 11 Jun, 2020 1 commit
-
-
Thomas Gleixner authored
Remove all the code which was there to emit the system vector stubs. All users are gone. Move the now unused GET_CR2_INTO macro muck to head_64.S where the last user is. Fixup the eye hurting comment there while at it. Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Ingo Molnar <mingo@kernel.org> Acked-by:
Andy Lutomirski <luto@kernel.org> Link: https://lore.kernel.org/r/20200521202119.927433002@linutronix.de
-
- 09 Jun, 2020 2 commits
-
-
Mike Rapoport authored
The replacement of <asm/pgrable.h> with <linux/pgtable.h> made the include of the latter in the middle of asm includes. Fix this up with the aid of the below script and manual adjustments here and there. import sys import re if len(sys.argv) is not 3: print "USAGE: %s <file> <header>" % (sys.argv[0]) sys.exit(1) hdr_to_move="#include <linux/%s>" % sys.argv[2] moved = False in_hdrs = False with open(sys.argv[1], "r") as f: lines = f.readlines() for _line in lines: line = _line.rstrip(' ') if line == hdr_to_move: continue if line.startswith("#include <linux/"): in_hdrs = True elif not moved and in_hdrs: moved = True print hdr_to_move print line Signed-off-by:
Mike Rapoport <rppt@linux.ibm.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin....
-
Mike Rapoport authored
The include/linux/pgtable.h is going to be the home of generic page table manipulation functions. Start with moving asm-generic/pgtable.h to include/linux/pgtable.h and make the latter include asm/pgtable.h. Signed-off-by:
Mike Rapoport <rppt@linux.ibm.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-3-rppt@kernel.org Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 18 Oct, 2019 4 commits
-
-
Jiri Slaby authored
Change all assembly code which is marked using END (and not ENDPROC). Switch all these to the appropriate new annotation SYM_CODE_START and SYM_CODE_END. Signed-off-by:
Jiri Slaby <jslaby@suse.cz> Signed-off-by:
Borislav Petkov <bp@suse.de> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> [xen bits] Cc: Andy Lutomirski <luto@kernel.org> Cc: Cao jin <caoj.fnst@cn.fujitsu.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: linux-arch@vger.kernel.org Cc: Maran Wilson <maran.wilson@oracle.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Cc: xen-devel@lists.xenproject.org Link: https://lkml.kernel.org/r/20191011115108.12392-24-jslaby@suse.cz
-
Jiri Slaby authored
GLOBAL is an x86's custom macro and is going to die very soon. It was meant for global symbols, but here, it was used for functions. Instead, use the new macros SYM_FUNC_START* and SYM_CODE_START* (depending on the type of the function) which are dedicated to global functions. And since they both require a closing by SYM_*_END, do that here too. startup_64, which does not use GLOBAL but uses .globl explicitly, is converted too. "No alignments" are preserved. Signed-off-by:
Jiri Slaby <jslaby@suse.cz> Signed-off-by:
Borislav Petkov <bp@suse.de> Cc: Allison Randal <allison@lohutok.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Cao jin <caoj.fnst@cn.fujitsu.com> Cc: Enrico Weigelt <info@metux.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: linux-arch@vger.kernel.org Cc: Maran Wilson <maran.wilson@oracle.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20191011115108.12392-17-jslaby@suse.cz
-
Jiri Slaby authored
Use the new SYM_DATA, SYM_DATA_START, and SYM_DATA_END in both 32 and 64 bit head_*.S. In the 64-bit version, define also SYM_DATA_START_PAGE_ALIGNED locally using the new SYM_START. It is used in the code instead of NEXT_PAGE() which was defined in this file and had been using the obsolete macro GLOBAL(). Now, the data in the 64-bit object file look sane: Value Size Type Bind Vis Ndx Name 0000 4096 OBJECT GLOBAL DEFAULT 15 init_level4_pgt 1000 4096 OBJECT GLOBAL DEFAULT 15 level3_kernel_pgt 2000 2048 OBJECT GLOBAL DEFAULT 15 level2_kernel_pgt 3000 4096 OBJECT GLOBAL DEFAULT 15 level2_fixmap_pgt 4000 4096 OBJECT GLOBAL DEFAULT 15 level1_fixmap_pgt 5000 2 OBJECT GLOBAL DEFAULT 15 early_gdt_descr 5002 8 OBJECT LOCAL DEFAULT 15 early_gdt_descr_base 500a 8 OBJECT GLOBAL DEFAULT 15 phys_base 0000 8 OBJECT GLOBAL DEFAULT 17 initial_code 0008 8 OBJECT GLOBAL DEFAULT 17 initial_gs 0010 8 OBJECT GLOBAL DEFAULT 17 initial_stack 0000 4 OBJECT GLOBAL DEFAULT 19 early_recursion_flag 1000 4096 OBJECT GLOBAL DEFAULT 19 early_level4_pgt 2000 0x40000 OBJECT GLOBAL DEFAULT 19 early_dynamic_pgts 0000 4096 OBJECT GLOBAL DEFAULT 22 empty_zero_page All have correct size and type now. Note that this also removes implicit 16B alignment previously inserted by ENTRY: * initial_code, setup_once_ref, initial_page_table, initial_stack, boot_gdt are still aligned * early_gdt_descr is now properly aligned as was intended before ENTRY was added there long time ago * phys_base's alignment is kept by an explicitly added new alignment Signed-off-by:
Jiri Slaby <jslaby@suse.cz> Signed-off-by:
Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Cao jin <caoj.fnst@cn.fujitsu.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: linux-arch@vger.kernel.org Cc: Maran Wilson <maran.wilson@oracle.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20191011115108.12392-12-jslaby@suse.cz
-
Jiri Slaby authored
Use the newly added SYM_CODE_START_LOCAL* to annotate beginnings of all pseudo-functions (those ending with END until now) which do not have ".globl" annotation. This is needed to balance END for tools that generate debuginfo. Note that ENDs are switched to SYM_CODE_END too so that everybody can see the pairing. C-like functions (which handle frame ptr etc.) are not annotated here, hence SYM_CODE_* macros are used here, not SYM_FUNC_*. Note that the 32bit version of early_idt_handler_common already had ENDPROC -- switch that to SYM_CODE_END for the same reason as above (and to be the same as 64bit). While early_idt_handler_common is LOCAL, it's name is not prepended with ".L" as it happens to appear in call traces. bad_get_user*, and bad_put_user are now aligned, as they are separate functions. They do not mind to be aligned -- no need to be compact there. early_idt_handler_common is aligned now too, as it is after early_idt_handler_array, so as well no need to be compact there. verify_cpu is self-standing and included in other .S files, so align it too. The others have alignment preserved to what it used to be (using the _NOALIGN variant of macros). Signed-off-by:
Jiri Slaby <jslaby@suse.cz> Signed-off-by:
Borislav Petkov <bp@suse.de> Cc: Alexios Zavras <alexios.zavras@intel.com> Cc: Allison Randal <allison@lohutok.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Cao jin <caoj.fnst@cn.fujitsu.com> Cc: Enrico Weigelt <info@metux.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: linux-arch@vger.kernel.org Cc: Maran Wilson <maran.wilson@oracle.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20191011115108.12392-6-jslaby@suse.cz
-
- 05 Oct, 2019 1 commit
-
-
Jiri Slaby authored
Moving early_recursion_flag (4 bytes) after early_level4_pgt (4k) and early_dynamic_pgts (256k) saves 4k which are used for alignment of early_level4_pgt after early_recursion_flag. The real improvement is merely on the source code side. Previously it was: * __INITDATA + .balign * early_recursion_flag variable * a ton of CPP MACROS * __INITDATA (again) * early_top_pgt and early_recursion_flag variables * .data Now, it is a bit simpler: * a ton of CPP MACROS * __INITDATA + .balign * early_top_pgt and early_recursion_flag variables * early_recursion_flag variable * .data On the binary level the change looks like this: Before: (sections) 12 .init.data 00042000 0000000000000000 0000000000000000 00008000 2**12 (symbols) 000000 4 OBJECT GLOBAL DEFAULT 22 early_recursion_flag 001000 4096 OBJECT GLOBAL DEFAULT 22 early_top_pgt 002000 0x40000 OBJECT GLOBAL DEFAULT 22 early_dynamic_pgts After: (sections) 12 .init.data 00041004 0000000000000000 0000000000000000 00008000 2**12 (symbols) 000000 4096 OBJECT GLOBAL DEFAULT 22 early_top_pgt 001000 0x40000 OBJECT GLOBAL DEFAULT 22 early_dynamic_pgts 041000 4 OBJECT GLOBAL DEFAULT 22 early_recursion_flag So the resulting vmlinux is smaller by 4k with my toolchain as many other variables can be placed after early_recursion_flag to fill the rest of the page. Note that this is only .init data, so it is freed right after being booted anyway. Savings on-disk are none -- compression of zeros is easy, so the size of bzImage is the same pre and post the change. Signed-off-by:
Jiri Slaby <jslaby@suse.cz> Signed-off-by:
Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20191003095238.29831-1-jslaby@suse.cz
-
- 22 Jul, 2019 1 commit
-
-
Cao jin authored
Commit e6401c13 ("x86/irq/64: Split the IRQ stack into its own pages") missed to update one piece of comment as it did to its peer in Xen, which will confuse people who still need to read comment. Signed-off-by:
Cao jin <caoj.fnst@cn.fujitsu.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190719081635.26528-1-caoj.fnst@cn.fujitsu.com
-
- 18 Jul, 2019 1 commit
-
-
Josh Poimboeuf authored
After an objtool improvement, it complains about the fact that start_cpu0() jumps to code which has an LRET instruction. arch/x86/kernel/head_64.o: warning: objtool: .head.text+0xe4: unsupported instruction in callable function Technically, start_cpu0() is callable, but it acts nothing like a callable function. Prevent objtool from treating it like one by removing its ELF function annotation. Signed-off-by:
Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Acked-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/6b1b4505fcb90571a55fa1b52d71fb458ca24454.1563413318.git.jpoimboe@redhat.com
-
- 17 Jul, 2019 1 commit
-
-
Peter Zijlstra authored
The one paravirt read_cr2() implementation (Xen) is actually quite trivial and doesn't need to clobber anything other than the return register. Making read_cr2() CALLEE_SAVE avoids all the PUSH/POP nonsense and allows more convenient use from assembly. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Reviewed-by:
Juergen Gross <jgross@suse.com> Cc: bp@alien8.de Cc: rostedt@goodmis.org Cc: luto@kernel.org Cc: torvalds@linux-foundation.org Cc: hpa@zytor.com Cc: dave.hansen@linux.intel.com Cc: zhe.he@windriver.com Cc: joel@joelfernandes.org Cc: devel@etsukata.com Link: https://lkml.kernel.org/r/20190711114335.887392493@infradead.org
-
- 17 Apr, 2019 1 commit
-
-
Andy Lutomirski authored
Currently, the IRQ stack is hardcoded as the first page of the percpu area, and the stack canary lives on the IRQ stack. The former gets in the way of adding an IRQ stack guard page, and the latter is a potential weakness in the stack canary mechanism. Split the IRQ stack into its own private percpu pages. [ tglx: Make 64 and 32 bit share struct irq_stack ] Signed-off-by:
Andy Lutomirski <luto@kernel.org> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Borislav Petkov <bp@suse.de> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: "Chang S. Bae" <chang.seok.bae@intel.com> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Feng Tang <feng.tang@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Joerg Roedel <jroedel@suse.de> Cc: Jordan Borgner <mail@jordan-borgner.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Maran Wilson <maran.wilson@oracle.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nicolai Stange <nstange@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Pu Wen <puwen@hygon.cn> Cc: "Rafael Ávila de Espíndola" <rafael@espindo.la> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: x86-ml <x86@kernel.org> Cc: xen-devel@lists.xenproject.org Link: https://lkml.kernel.org/r/20190414160146.267376656@linutronix.de
-
- 13 Dec, 2018 1 commit
-
-
Maran Wilson authored
In order to pave the way for hypervisors other than Xen to use the PVH entry point for VMs, we need to factor the PVH entry code into Xen specific and hypervisor agnostic components. The first step in doing that, is to create a new config option for PVH entry that can be enabled independently from CONFIG_XEN. Signed-off-by:
Maran Wilson <maran.wilson@oracle.com> Reviewed-by:
Juergen Gross <jgross@suse.com> Acked-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Boris Ostrovsky <boris.ostrovsky@oracle.com>
-
- 20 Sep, 2018 1 commit
-
-
Feng Tang authored
We met a kernel panic when enabling earlycon, which is due to the fixmap address of earlycon is not statically setup. Currently the static fixmap setup in head_64.S only covers 2M virtual address space, while it actually could be in 4M space with different kernel configurations, e.g. when VSYSCALL emulation is disabled. So increase the static space to 4M for now by defining FIXMAP_PMD_NUM to 2, and add a build time check to ensure that the fixmap is covered by the initial static page tables. Fixes: 1ad83c85 ("x86_64,vsyscall: Make vsyscall emulation configurable") Suggested-by:
Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Feng Tang <feng.tang@intel.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Tested-by:
kernel test robot <rong.a.chen@intel.com> Reviewed-by: Juergen Gross <jgross@suse.com> (Xen parts) Cc: H Peter Anvin <hpa@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirsky <luto@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180920025828.23699-1-feng.tang@intel.com
-
- 03 Sep, 2018 2 commits
-
-
Juergen Gross authored
Most of the paravirt ops defined in pv_mmu_ops are for Xen PV guests only. Define them only if CONFIG_PARAVIRT_XXL is set. Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Cc: xen-devel@lists.xenproject.org Cc: virtualization@lists.linux-foundation.org Cc: akataria@vmware.com Cc: rusty@rustcorp.com.au Cc: boris.ostrovsky@oracle.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/20180828074026.820-15-jgross@suse.com
-
Juergen Gross authored
Most of the paravirt ops defined in pv_cpu_ops are for Xen PV guests only. Define them only if CONFIG_PARAVIRT_XXL is set. Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Cc: xen-devel@lists.xenproject.org Cc: virtualization@lists.linux-foundation.org Cc: akataria@vmware.com Cc: rusty@rustcorp.com.au Cc: boris.ostrovsky@oracle.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/20180828074026.820-13-jgross@suse.com
-
- 03 Jul, 2018 1 commit
-
-
Jan Beulich authored
Some Intel CPUs don't recognize 64-bit XORs as zeroing idioms. Zeroing idioms don't require execution bandwidth, as they're being taken care of in the frontend (through register renaming). Use 32-bit XORs instead. Signed-off-by:
Jan Beulich <jbeulich@suse.com> Cc: Alok Kataria <akataria@vmware.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: davem@davemloft.net Cc: herbert@gondor.apana.org.au Cc: pavel@ucw.cz Cc: rjw@rjwysocki.net Link: http://lkml.kernel.org/r/5B39FF1A02000078001CFB54@prv1-mh.provo.novell.com Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
- 12 Apr, 2018 1 commit
-
-
Dave Hansen authored
I was mystified as to where the _PAGE_GLOBAL in the kernel page tables for kernel text came from. I audited all the places I could find, but I missed one: head_64.S. The page tables that we create in here live for a long time, and they also have _PAGE_GLOBAL set, despite whether the processor supports it or not. It's harmless, and we got *lucky* that the pageattr code accidentally clears it when we wipe it out of __supported_pte_mask and then later try to mark kernel text read-only. Comment some of these properties to make it easier to find and understand in the future. Signed-off-by:
Dave Hansen <dave.hansen@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hugh Dickins <hughd@google.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nadav Amit <namit@vmware.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20180406205513.079BB265@viggo.jf.intel.com Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
- 21 Feb, 2018 3 commits
-
-
Kirill A. Shutemov authored
By this point we have functioning boot-time switching between 4- and 5-level paging mode. But naive approach comes with cost. Numbers below are for kernel build, allmodconfig, 5 times. CONFIG_X86_5LEVEL=n: Performance counter stats for 'sh -c make -j100 -B -k >/dev/null' (5 runs): 17308719.892691 task-clock:u (msec) # 26.772 CPUs utilized ( +- 0.11% ) 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 331,993,164 page-faults:u # 0.019 M/sec ( +- 0.01% ) 43,614,978,867,455 cycles:u # 2.520 GHz ( +- 0.01% ) 39,371,534,575,126 stalled-cycles-frontend:u # 90.27% frontend cycles idle ( +- 0.09% ) 28,363,350,152,428 instructions:u # 0.65 insn per cycle # 1.39 stalled cycles per insn ( +- 0.00% ) 6,316,784,066,413 branches:u # 364.948 M/sec ( +- 0.00% ) 250,808,144,781 branch-misses:u # 3.97% of all branches ( +- 0.01% ) 646.531974142 seconds time elapsed ( +- 1.15% ) CONFIG_X86_5LEVEL=y: Performance counter stats for 'sh -c make -j100 -B -k >/dev/null' (5 runs): 17411536.780625 task-clock:u (msec) # 26.426 CPUs utilized ( +- 0.10% ) 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 331,868,663 page-faults:u # 0.019 M/sec ( +- 0.01% ) 43,865,909,056,301 cycles:u # 2.519 GHz ( +- 0.01% ) 39,740,130,365,581 stalled-cycles-frontend:u # 90.59% frontend cycles idle ( +- 0.05% ) 28,363,358,997,959 instructions:u # 0.65 insn per cycle # 1.40 stalled cycles per insn ( +- 0.00% ) 6,316,784,937,460 branches:u # 362.793 M/sec ( +- 0.00% ) 251,531,919,485 branch-misses:u # 3.98% of all branches ( +- 0.00% ) 658.886307752 seconds time elapsed ( +- 0.92% ) The patch tries to fix the performance regression by using cpu_feature_enabled(X86_FEATURE_LA57) instead of pgtable_l5_enabled in all hot code paths. These will statically patch the target code for additional performance. CONFIG_X86_5LEVEL=y + the patch: Performance counter stats for 'sh -c make -j100 -B -k >/dev/null' (5 runs): 17381990.268506 task-clock:u (msec) # 26.907 CPUs utilized ( +- 0.19% ) 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 331,862,625 page-faults:u # 0.019 M/sec ( +- 0.01% ) 43,697,726,320,051 cycles:u # 2.514 GHz ( +- 0.03% ) 39,480,408,690,401 stalled-cycles-frontend:u # 90.35% frontend cycles idle ( +- 0.05% ) 28,363,394,221,388 instructions:u # 0.65 insn per cycle # 1.39 stalled cycles per insn ( +- 0.00% ) 6,316,794,985,573 branches:u # 363.410 M/sec ( +- 0.00% ) 251,013,232,547 branch-misses:u # 3.97% of all branches ( +- 0.01% ) 645.991174661 seconds time elapsed ( +- 1.19% ) Unfortunately, this approach doesn't help with text size: vmlinux.before .text size: 8190319 vmlinux.after .text size: 8200623 The .text section is increased by about 4k. Not sure if we can do anything about this. Signed-off-by:
Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20180216114948.68868-4-kirill.shutemov@linux.intel.com Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
Kirill A. Shutemov authored
With boot-time switching between paging modes, XEN_PV and XEN_PVH can be boot into 4-level paging mode. Tested-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by:
Juergen Gross <jgross@suse.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20180216114948.68868-2-kirill.shutemov@linux.intel.com Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
Peter Zijlstra authored
The objtool retpoline validation found this indirect jump. Seeing how it's on CPU bringup before we run userspace it should be safe, annotate it. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by:
David Woodhouse <dwmw@amazon.co.uk> Acked-by:
Thomas Gleixner <tglx@linutronix.de> Acked-by:
Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
- 16 Feb, 2018 2 commits
-
-
Kirill A. Shutemov authored
Early boot code should be able to initialize page tables for both 4- and 5-level paging modes. Signed-off-by:
Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20180214182542.69302-7-kirill.shutemov@linux.intel.com Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
Kirill A. Shutemov authored
For 4- and 5-level paging we have different 'page_offset_base'. Let's initialize it at boot-time accordingly to machine capability. We also have to split __PAGE_OFFSET_BASE into two constants -- for 4- and 5-level paging. Signed-off-by:
Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20180214182542.69302-4-kirill.shutemov@linux.intel.com Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
- 23 Dec, 2017 1 commit
-
-
Dave Hansen authored
Kernel page table isolation requires to have two PGDs. One for the kernel, which contains the full kernel mapping plus the user space mapping and one for user space which contains the user space mappings and the minimal set of kernel mappings which are required by the architecture to be able to transition from and to user space. Add the necessary preliminaries. [ tglx: Split out from the big kaiser dump. EFI fixup from Kirill ] Signed-off-by:
Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Reviewed-by:
Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
- 02 Nov, 2017 2 commits
-
-
Greg Kroah-Hartman authored
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard...
-
Andy Lutomirski authored
These code paths will diverge soon. Signed-off-by:
Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/dccf8c7b3750199b4b30383c812d4e2931811509.1509609304.git.luto@kernel.org Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
- 01 Nov, 2017 1 commit
-
-
Ricardo Neri authored
Both head_32.S and head_64.S utilize the same value to initialize the control register CR0. Also, other parts of the kernel might want to access this initial definition (e.g., emulation code for User-Mode Instruction Prevention uses this state to provide a sane dummy value for CR0 when emulating the smsw instruction). Thus, relocate this definition to a header file from which it can be conveniently accessed. Suggested-by:
Borislav Petkov <bp@alien8.de> Signed-off-by:
Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Reviewed-by:
Borislav Petkov <bp@suse.de> Reviewed-by:
Andy Lutomirski <luto@kernel.org> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: ricardo.neri@intel.com Cc: linux-mm@kvack.org Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Shuah Khan <shuah@kernel.org> Cc: linux-arch@vger.kernel.org Cc: Jonathan Corbet <corbet@lwn.net> Cc: Jiri Slaby <jslaby@suse.cz> Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Chen Yucong <slaoub@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lkml.kernel.org/r/1509135945-13762-3-git-send-email-ricardo.neri-calderon@linux.intel.com
-
- 23 Oct, 2017 1 commit
-
-
Josh Poimboeuf authored
I find the '.ifeq <expression>' directive to be confusing. Reading it quickly seems to suggest its opposite meaning, or that it's missing an argument. Improve readability by replacing all of its x86 uses with '.if <expression> == 0'. Signed-off-by:
Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrei Vagin <avagin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/757da028e802c7e98d23fbab8d234b1063e161cf.1508516398.git.jpoimboe@redhat.com Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-