Kernel Boot Overview
====================


Actions before the kernel starts
--------------------------------

Loading the kernel image into DDR
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

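The kernel normally resides on a storage device such as flash, eMMC, or an SD card. The kernel image therefore has to be loaded into RAM before the CPU can access it.
How this is done is up to the bootloader: the image may be copied automatically, or copied in response to commands typed at the bootloader command line.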


Hardware requirements
^^^^^^^^^^^^^^^^^^^^^

The header comment of ``stext`` (the kernel entry function) in ``arch/arm64/kernel/head.S`` reads:

::


    /*
     * Kernel startup entry point.
     * ---------------------------
     *
     * The requirements are:
     *   MMU = off, D-cache = off, I-cache = on or off,
     *   x0 = physical address to the FDT blob.
     *
     * This code is mostly position independent so you call this at
     * __pa(PAGE_OFFSET + TEXT_OFFSET).
     *
     * Note that the callee-saved registers are used for storing variables
     * that are useful before the MMU is enabled. The allocations are described
     * in the entry routines.
     */
        __HEAD
    _head:
        /*
         * DO NOT MODIFY. Image header expected by Linux boot-loaders.
         */
    #ifdef CONFIG_EFI
        /*
         * This add instruction has no meaningful effect except that
         * its opcode forms the magic "MZ" signature required by UEFI.
         */
        add	x13, x18, #0x16
        b	stext
    #else
        b	stext				// branch to kernel start, magic
        .long	0				// reserved
        ...

This gives the following requirements:

- MMU = off

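The MMU translates virtual addresses to physical addresses, so software must first set up its translation tables (the page tables). While the MMU is off, every address the CPU issues is a physical address and reaches the hardware without translation.
Once the MMU is on, every address the CPU issues is treated as a virtual address and is translated by the MMU to the real physical address; even a physical address written in the code is then interpreted as a virtual address.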

The kernel creates the translation tables itself, and every address used before they exist is a physical address, so the MMU must be off on entry.
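
The difference between the two modes can be illustrated with a toy model. The sketch below is not how real ARM64 translation works (real page tables are multi-level and walked by hardware); ``struct toy_mmu``, ``toy_translate`` and the 4 KiB page size are hypothetical names and assumptions chosen only for illustration:

::

    #include <stdbool.h>
    #include <stdint.h>

    #define TOY_PAGE_SHIFT 12u   /* assumed 4 KiB pages */
    #define TOY_NUM_PAGES  16u   /* tiny toy address space */

    /* Hypothetical toy MMU: one flat table, virtual page -> physical page. */
    struct toy_mmu {
        bool     enabled;
        uint64_t table[TOY_NUM_PAGES];
    };

    /* Return the physical address the toy CPU would access for addr. */
    uint64_t toy_translate(const struct toy_mmu *mmu, uint64_t addr)
    {
        if (!mmu->enabled)
            return addr;         /* MMU off: the address is used as-is */

        uint64_t vpn    = (addr >> TOY_PAGE_SHIFT) % TOY_NUM_PAGES;
        uint64_t offset = addr & ((1u << TOY_PAGE_SHIFT) - 1);

        /* MMU on: the same number is now looked up as a virtual address. */
        return (mmu->table[vpn] << TOY_PAGE_SHIFT) | offset;
    }

With the toy MMU disabled, ``toy_translate`` returns its input unchanged; once enabled, the same numeric address is routed through the table, which is why the kernel must not be entered with a mapping it did not create.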

- D-cache = off

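A cache is a high-speed buffer between the CPU and memory, split into a data cache (D-cache) and an instruction cache (I-cache). The D-cache must be off; otherwise data accesses early in kernel startup could be served from the cache
while the cache contents do not yet match what is in RAM, and the kernel would read or write wrong data.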


Jumping to the kernel image entry point
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To hand over control, the bootloader sets the PC to the kernel entry code, which is the address at which the kernel was loaded.
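
In C, a bootloader typically expresses this jump as a call through a function pointer. The following is a minimal user-space sketch, not taken from any particular bootloader: ``kernel_entry_t``, ``boot_jump`` and ``fake_kernel_entry`` are hypothetical names, and the fake entry only records its arguments so the sketch can run outside a real boot environment:

::

    #include <stdint.h>

    /* Under AAPCS64 the first four integer arguments travel in x0..x3. */
    typedef void (*kernel_entry_t)(uint64_t x0, uint64_t x1,
                                   uint64_t x2, uint64_t x3);

    /* Hypothetical stand-in for the real kernel entry point. */
    uint64_t seen_args[4];

    void fake_kernel_entry(uint64_t x0, uint64_t x1, uint64_t x2, uint64_t x3)
    {
        seen_args[0] = x0;
        seen_args[1] = x1;
        seen_args[2] = x2;
        seen_args[3] = x3;
    }

    /* Hand over: x0 = physical address of the FDT, x1..x3 = 0. */
    void boot_jump(kernel_entry_t entry, uint64_t fdt_phys)
    {
        entry(fdt_phys, 0, 0, 0);
    }

On real hardware ``entry`` would be the load address of the image cast to ``kernel_entry_t``, and the call would never return.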


Kernel boot stage one
---------------------

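The first stage of Linux kernel boot, commonly called the assembly stage, is the body of the ``stext`` function. Its main jobs are: saving the boot arguments, dropping to EL1 and recording the boot mode, creating the initial page tables,
preparing the CPU and enabling the MMU, setting up the environment for C code, and finally jumping to ``start_kernel``, the kernel's first real C function.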

Specifying the kernel entry address
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In ``arch/arm64/kernel/vmlinux.lds.S``:

::

    OUTPUT_ARCH(aarch64)    /* the final output format is aarch64 */
    ENTRY(_text)            /* the entry address is _text */

    ...

    . = KIMAGE_VADDR + TEXT_OFFSET; /* start of the link address range */

    .head.text : {
        _text = .;
        HEAD_TEXT
    }
    .text : {           /* Real text segment        */

    ...


So the kernel entry address is the first address of the ``.head.text`` section.

The ``.head.text`` section is selected by the ``__HEAD`` macro defined in ``include/linux/init.h``:

::

    /* For assembly routines */
    #define __HEAD		.section	".head.text","ax"
    #define __INIT		.section	".init.text","ax"
    #define __FINIT		.previous

    #define __INITDATA	.section	".init.data","aw",%progbits
    #define __INITRODATA	.section	".init.rodata","a",%progbits
    #define __FINITDATA	.previous

    #define __MEMINIT        .section	".meminit.text", "ax"
    #define __MEMINITDATA    .section	".meminit.data", "aw"
    #define __MEMINITRODATA  .section	".meminit.rodata", "a"


The kernel entry point is in ``arch/arm64/kernel/head.S``:

::

        __HEAD
    _head:
        /*
         * DO NOT MODIFY. Image header expected by Linux boot-loaders.
         */
    #ifdef CONFIG_EFI
        /*
         * This add instruction has no meaningful effect except that
         * its opcode forms the magic "MZ" signature required by UEFI.
         */
        add	x13, x18, #0x16
        b	stext
    #else
        b	stext				// branch to kernel start, magic
        .long	0				// reserved
    #endif
        le64sym	_kernel_offset_le		// Image load offset from start of RAM, little-endian
        le64sym	_kernel_size_le			// Effective size of kernel image, little-endian
        le64sym	_kernel_flags_le		// Informative flags, little-endian
        .quad	0				// reserved
        .quad	0				// reserved
        .quad	0				// reserved
        .ascii	ARM64_IMAGE_MAGIC		// Magic number
    #ifdef CONFIG_EFI
        .long	pe_header - _head		// Offset to the PE header.

    pe_header:
        __EFI_PE_HEADER
    #else
        .long	0				// reserved
    #endif

The key instruction in this assembly is ``b stext``: once the kernel image has been loaded, ``stext`` is the first function to run.
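
The fields emitted by ``_head`` form a fixed 64-byte image header that a bootloader can inspect before jumping. The sketch below lays the header out as a C struct following the order of the directives above; ``arm64_image_magic_ok`` is a hypothetical helper, and the magic bytes ``"ARM\x64"`` are what ``ARM64_IMAGE_MAGIC`` is assumed to expand to:

::

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    struct arm64_image_header {
        uint32_t code0;        /* "MZ" add (EFI) or b stext */
        uint32_t code1;        /* b stext (EFI) or reserved */
        uint64_t text_offset;  /* image load offset, little-endian */
        uint64_t image_size;   /* effective image size, little-endian */
        uint64_t flags;        /* informative flags, little-endian */
        uint64_t res2, res3, res4;
        uint32_t magic;        /* assumed to be "ARM\x64" */
        uint32_t res5;         /* offset to the PE header, or reserved */
    };

    /* Hypothetical check a loader could run on the first bytes of an image. */
    bool arm64_image_magic_ok(const void *image, size_t len)
    {
        struct arm64_image_header hdr;

        if (len < sizeof(hdr))
            return false;
        memcpy(&hdr, image, sizeof(hdr));   /* avoid unaligned access */
        return memcmp(&hdr.magic, "ARM\x64", 4) == 0;
    }

Copying into a local struct rather than casting the buffer keeps the check independent of the alignment of the buffer holding the image.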


The stext function
""""""""""""""""""

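The assembly stage of boot starts in ``arch/arm64/kernel/head.S``, and execution begins at ``stext``. The entry is placed by the ``vmlinux.lds`` linker script, and the function itself is declared in ``head.S`` with ``ENTRY(stext)``.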

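In assembly code the ``ENTRY`` and ``ENDPROC`` macros come in pairs and together define a function. The section that holds the code is specified as well, for example with ``__INIT``: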

::

    #define __INIT  .section    ".init.text","ax"

        __INIT

    ENTRY(stext)
        ....
    ENDPROC(stext)

Preconditions for kernel entry: MMU off, D-cache off, and ``x0`` holding the physical address of the FDT blob.

.. note::
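    The data cache must be off because data accesses during kernel startup go to the cache first, and the cache may still hold data left over from U-Boot, which is wrong from the kernel's point of view.
    The instruction cache may stay on because the U-Boot code and the kernel code do not overlap, so no conflicting instructions can be cached.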


``stext`` then begins executing:


::

    __INIT

    /*
     * The following callee saved general purpose registers are used on the
     * primary lowlevel boot path:
     *
     *  Register   Scope                      Purpose
     *  x21        stext() .. start_kernel()  FDT pointer passed at boot in x0
     *  x23        stext() .. start_kernel()  physical misalignment/KASLR offset
     *  x28        __create_page_tables()     callee preserved temp register
     *  x19/x20    __primary_switch()         callee preserved temp registers
     *  x24        __primary_switch() .. relocate_kernel()
     *                                        current RELR displacement
     */
    ENTRY(stext)
    bl	preserve_boot_args
    bl	el2_setup			// Drop to EL1, w0=cpu_boot_mode
    adrp	x23, __PHYS_OFFSET
    and	x23, x23, MIN_KIMG_ALIGN - 1	// KASLR offset, defaults to 0
    bl	set_cpu_boot_mode_flag
    bl	__create_page_tables
    /*
     * The following calls CPU setup code, see arch/arm64/mm/proc.S for
     * details.
     * On return, the CPU will be ready for the MMU to be turned on and
     * the TCR will have been set.
     */
    bl	__cpu_setup			// initialise processor
    b	__primary_switch
    ENDPROC(stext)


- preserve_boot_args

Saves the ``x0`` to ``x3`` arguments passed by the bootloader into the ``boot_args`` array.

::

    /*
     * Preserve the arguments passed by the bootloader in x0 .. x3
     */
    preserve_boot_args:
        mov	x21, x0				// x21=FDT  // stash the dtb address in x21 so that x0 is free for use

        adr_l	x0, boot_args			// record the contents of   // x0 = address of boot_args
        stp	x21, x1, [x0]			// x0 .. x3 at kernel entry     // store x0, x1 into boot_args[0], boot_args[1]
        stp	x2, x3, [x0, #16]       // store x2, x3 into boot_args[2], boot_args[3]

        dmb	sy				// needed before dc ivac with
                            // MMU off

        mov	x1, #0x20			// 4 x 8 bytes
        b	__inval_dcache_area		// tail call
    ENDPROC(preserve_boot_args)
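
The effect of the two ``stp`` instructions can be written as a rough C equivalent. This is a user-space model only: ``preserve_boot_args_model`` and ``boot_protocol_violated`` are hypothetical names, and the second helper mirrors the ``x1``-``x3`` check that ``setup_arch`` performs later:

::

    #include <stdbool.h>
    #include <stdint.h>

    /* User-space model of the kernel's boot_args array. */
    uint64_t boot_args[4];

    void preserve_boot_args_model(uint64_t x0, uint64_t x1,
                                  uint64_t x2, uint64_t x3)
    {
        boot_args[0] = x0;   /* stp x21, x1, [x0]      */
        boot_args[1] = x1;
        boot_args[2] = x2;   /* stp x2, x3, [x0, #16]  */
        boot_args[3] = x3;
    }

    /* The boot protocol expects x1..x3 to be zero at entry. */
    bool boot_protocol_violated(void)
    {
        return boot_args[1] || boot_args[2] || boot_args[3];
    }

The real routine additionally invalidates the cache lines covering ``boot_args``, because the stores were made with the MMU and D-cache off.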
    

- set_cpu_boot_mode_flag

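This function sets the ``__boot_cpu_mode`` flag. Its precondition is that ``w0`` holds the exception level at which the CPU booted (the ``cpu_boot_mode`` value returned by ``el2_setup``).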

::


    /*
     * Sets the __boot_cpu_mode flag depending on the CPU boot mode passed
     * in w0. See arch/arm64/include/asm/virt.h for more info.
     */
    set_cpu_boot_mode_flag:
        adr_l	x1, __boot_cpu_mode
        cmp	w0, #BOOT_CPU_MODE_EL2
        b.ne	1f
        add	x1, x1, #4
    1:	str	w0, [x1]			// This CPU has booted in EL1
        dmb	sy
        dc	ivac, x1			// Invalidate potentially stale cache line
        ret
    ENDPROC(set_cpu_boot_mode_flag)

After boot the system still needs to know the exception level at which each CPU started, so the boot-time CPU mode is saved in the global variable ``__boot_cpu_mode``.

The global variable ``__boot_cpu_mode`` is defined as follows:

::

    /*
     * We need to find out the CPU boot mode long after boot, so we need to
     * store it in a writable variable.
     *
     * This is not in .bss, because we set it sufficiently early that the boot-time
     * zeroing of .bss would clobber it.
     */
    ENTRY(__boot_cpu_mode)
        .long	BOOT_CPU_MODE_EL2
        .long	BOOT_CPU_MODE_EL1
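
The two words start out deliberately "crossed" (EL2, then EL1), and each CPU overwrites one of them depending on its boot mode. The C model below is a sketch of that logic: the function names are hypothetical, and the numeric values of the two constants are assumptions recalled from ``arch/arm64/include/asm/virt.h`` rather than something shown in this document:

::

    #include <stdbool.h>
    #include <stdint.h>

    /* Assumed values; check asm/virt.h for the authoritative ones. */
    #define BOOT_CPU_MODE_EL1 0xe11u
    #define BOOT_CPU_MODE_EL2 0xe12u

    /* Model of __boot_cpu_mode: initial contents as in the listing above. */
    uint32_t boot_cpu_mode[2] = { BOOT_CPU_MODE_EL2, BOOT_CPU_MODE_EL1 };

    /* Model of set_cpu_boot_mode_flag: w0 is the mode this CPU booted in. */
    void set_cpu_boot_mode_flag_model(uint32_t w0)
    {
        /* "add x1, x1, #4" selects the second word for EL2 boots. */
        unsigned idx = (w0 == BOOT_CPU_MODE_EL2) ? 1 : 0;

        boot_cpu_mode[idx] = w0;
    }

    /* Both words equal EL2 only if every CPU so far booted in EL2. */
    bool all_cpus_booted_in_el2(void)
    {
        return boot_cpu_mode[0] == BOOT_CPU_MODE_EL2 &&
               boot_cpu_mode[1] == BOOT_CPU_MODE_EL2;
    }

    /* Differing words mean the CPUs did not all boot in the same mode. */
    bool boot_modes_mismatched(void)
    {
        return boot_cpu_mode[0] != boot_cpu_mode[1];
    }

Because an EL2 boot only touches the second word and an EL1 boot only touches the first, the two words end up equal exactly when all CPUs agree.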


- __create_page_tables

Builds the initial page tables.

::

    __create_page_tables:
        mov	x28, lr

        /*
         * Invalidate the init page tables to avoid potential dirty cache lines
         * being evicted. Other page tables are allocated in rodata as part of
         * the kernel image, and thus are clean to the PoC per the boot
         * protocol.
         */
        adrp	x0, init_pg_dir
        adrp	x1, init_pg_end
        sub	x1, x1, x0
        bl	__inval_dcache_area

        /*
         * Clear the init page tables.
         */
        adrp	x0, init_pg_dir
        adrp	x1, init_pg_end
        sub	x1, x1, x0
    1:	stp	xzr, xzr, [x0], #16
        stp	xzr, xzr, [x0], #16
        stp	xzr, xzr, [x0], #16
        stp	xzr, xzr, [x0], #16
        subs	x1, x1, #64
        b.ne	1b

        mov	x7, SWAPPER_MM_MMUFLAGS

        /*
         * Create the identity mapping.
         */
        adrp	x0, idmap_pg_dir
        adrp	x3, __idmap_text_start		// __pa(__idmap_text_start)

    #ifdef CONFIG_ARM64_VA_BITS_52
        mrs_s	x6, SYS_ID_AA64MMFR2_EL1
        and	x6, x6, #(0xf << ID_AA64MMFR2_LVA_SHIFT)
        mov	x5, #52
        cbnz	x6, 1f
    #endif
        mov	x5, #VA_BITS_MIN
    1:
        adr_l	x6, vabits_actual
        str	x5, [x6]
        dmb	sy
        dc	ivac, x6		// Invalidate potentially stale cache line

        /*
         * VA_BITS may be too small to allow for an ID mapping to be created
         * that covers system RAM if that is located sufficiently high in the
         * physical address space. So for the ID map, use an extended virtual
         * range in that case, and configure an additional translation level
         * if needed.
         *
         * Calculate the maximum allowed value for TCR_EL1.T0SZ so that the
         * entire ID map region can be mapped. As T0SZ == (64 - #bits used),
         * this number conveniently equals the number of leading zeroes in
         * the physical address of __idmap_text_end.
         */
        adrp	x5, __idmap_text_end
        clz	x5, x5
        cmp	x5, TCR_T0SZ(VA_BITS)	// default T0SZ small enough?
        b.ge	1f			// .. then skip VA range extension

        adr_l	x6, idmap_t0sz
        str	x5, [x6]
        dmb	sy
        dc	ivac, x6		// Invalidate potentially stale cache line

    #if (VA_BITS < 48)
    #define EXTRA_SHIFT	(PGDIR_SHIFT + PAGE_SHIFT - 3)
    #define EXTRA_PTRS	(1 << (PHYS_MASK_SHIFT - EXTRA_SHIFT))

        /*
         * If VA_BITS < 48, we have to configure an additional table level.
         * First, we have to verify our assumption that the current value of
         * VA_BITS was chosen such that all translation levels are fully
         * utilised, and that lowering T0SZ will always result in an additional
         * translation level to be configured.
         */
    #if VA_BITS != EXTRA_SHIFT
    #error "Mismatch between VA_BITS and page size/number of translation levels"
    #endif

        mov	x4, EXTRA_PTRS
        create_table_entry x0, x3, EXTRA_SHIFT, x4, x5, x6
    #else
        /*
         * If VA_BITS == 48, we don't have to configure an additional
         * translation level, but the top-level table has more entries.
         */
        mov	x4, #1 << (PHYS_MASK_SHIFT - PGDIR_SHIFT)
        str_l	x4, idmap_ptrs_per_pgd, x5
    #endif
    1:
        ldr_l	x4, idmap_ptrs_per_pgd
        mov	x5, x3				// __pa(__idmap_text_start)
        adr_l	x6, __idmap_text_end		// __pa(__idmap_text_end)

        map_memory x0, x1, x3, x6, x7, x3, x4, x10, x11, x12, x13, x14

        /*
         * Map the kernel image (starting with PHYS_OFFSET).
         */
        adrp	x0, init_pg_dir
        mov_q	x5, KIMAGE_VADDR + TEXT_OFFSET	// compile time __va(_text)
        add	x5, x5, x23			// add KASLR displacement
        mov	x4, PTRS_PER_PGD
        adrp	x6, _end			// runtime __pa(_end)
        adrp	x3, _text			// runtime __pa(_text)
        sub	x6, x6, x3			// _end - _text
        add	x6, x6, x5			// runtime __va(_end)

        map_memory x0, x1, x5, x6, x7, x3, x4, x10, x11, x12, x13, x14

        /*
         * Since the page tables have been populated with non-cacheable
         * accesses (MMU disabled), invalidate the idmap and swapper page
         * tables again to remove any speculatively loaded cache lines.
         */
        adrp	x0, idmap_pg_dir
        adrp	x1, init_pg_end
        sub	x1, x1, x0
        dmb	sy
        bl	__inval_dcache_area

        ret	x28
    ENDPROC(__create_page_tables)
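
The ``clz`` trick described in the comment above (``T0SZ == 64 - #bits used``) can be checked with a few lines of C. This is an illustrative sketch; ``clz64``, ``t0sz_for`` and ``idmap_needs_extension`` are hypothetical helper names, and the loop stands in for the hardware ``clz`` instruction:

::

    #include <stdbool.h>
    #include <stdint.h>

    /* Portable count-leading-zeros for a 64-bit value. */
    unsigned clz64(uint64_t v)
    {
        unsigned n = 0;

        if (v == 0)
            return 64;
        while (!(v >> 63)) {
            v <<= 1;
            n++;
        }
        return n;
    }

    /* T0SZ == 64 - number of virtual address bits in use. */
    unsigned t0sz_for(unsigned va_bits)
    {
        return 64 - va_bits;
    }

    /* Mirrors "cmp x5, TCR_T0SZ(VA_BITS); b.ge 1f":
     * the range is extended only when clz is smaller than the default T0SZ. */
    bool idmap_needs_extension(uint64_t idmap_end_pa, unsigned va_bits)
    {
        return clz64(idmap_end_pa) < t0sz_for(va_bits);
    }

For example, with 39 virtual address bits the default T0SZ is 25, so an identity map ending at a physical address with bit 40 set (23 leading zeros) would need the extended range, while one ending below 4 GiB would not.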


- __cpu_setup

Performs the initial CPU setup.

::

    /*
     *	__cpu_setup
     *
     *	Initialise the processor for turning the MMU on.  Return in x0 the
     *	value of the SCTLR_EL1 register.
     */
        .pushsection ".idmap.text", "awx"
    ENTRY(__cpu_setup)
        tlbi	vmalle1				// Invalidate local TLB
        dsb	nsh

        mov	x0, #3 << 20
        msr	cpacr_el1, x0			// Enable FP/ASIMD
        mov	x0, #1 << 12			// Reset mdscr_el1 and disable
        msr	mdscr_el1, x0			// access to the DCC from EL0
        isb					// Unmask debug exceptions now,
        enable_dbg				// since this is per-cpu
        reset_pmuserenr_el0 x0			// Disable PMU access from EL0
        /*
         * Memory region attributes for LPAE:
         *
         *   n = AttrIndx[2:0]
         *			n	MAIR
         *   DEVICE_nGnRnE	000	00000000
         *   DEVICE_nGnRE	001	00000100
         *   DEVICE_GRE		010	00001100
         *   NORMAL_NC		011	01000100
         *   NORMAL		100	11111111
         *   NORMAL_WT		101	10111011
         */
        ldr	x5, =MAIR(0x00, MT_DEVICE_nGnRnE) | \
                 MAIR(0x04, MT_DEVICE_nGnRE) | \
                 MAIR(0x0c, MT_DEVICE_GRE) | \
                 MAIR(0x44, MT_NORMAL_NC) | \
                 MAIR(0xff, MT_NORMAL) | \
                 MAIR(0xbb, MT_NORMAL_WT)
        msr	mair_el1, x5
        /*
         * Prepare SCTLR
         */
        mov_q	x0, SCTLR_EL1_SET
        /*
         * Set/prepare TCR and TTBR. We use 512GB (39-bit) address range for
         * both user and kernel.
         */
        ldr	x10, =TCR_TxSZ(VA_BITS) | TCR_CACHE_FLAGS | TCR_SMP_FLAGS | \
                TCR_TG_FLAGS | TCR_KASLR_FLAGS | TCR_ASID16 | \
                TCR_TBI0 | TCR_A1 | TCR_KASAN_FLAGS
        tcr_clear_errata_bits x10, x9, x5

    #ifdef CONFIG_ARM64_VA_BITS_52
        ldr_l		x9, vabits_actual
        sub		x9, xzr, x9
        add		x9, x9, #64
        tcr_set_t1sz	x10, x9
    #else
        ldr_l		x9, idmap_t0sz
    #endif
        tcr_set_t0sz	x10, x9

        /*
         * Set the IPS bits in TCR_EL1.
         */
        tcr_compute_pa_size x10, #TCR_IPS_SHIFT, x5, x6
    #ifdef CONFIG_ARM64_HW_AFDBM
        /*
         * Enable hardware update of the Access Flags bit.
         * Hardware dirty bit management is enabled later,
         * via capabilities.
         */
        mrs	x9, ID_AA64MMFR1_EL1
        and	x9, x9, #0xf
        cbz	x9, 1f
        orr	x10, x10, #TCR_HA		// hardware Access flag update
    1:
    #endif	/* CONFIG_ARM64_HW_AFDBM */
        msr	tcr_el1, x10
        ret					// return to head.S
    ENDPROC(__cpu_setup)


It mainly covers:

1) cache and TLB handling
2) creating the memory attribute lookup table (``MAIR_EL1``)
3) preparing the ``SCTLR_EL1`` and ``TCR_EL1`` values
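
The memory attribute table is a single 64-bit register with one attribute byte per index. The sketch below rebuilds the value from the attribute table in the ``__cpu_setup`` comment; the ``MAIR`` macro is written here as a plain shift (assumed to match the kernel's definition), and ``build_mair`` and ``mair_attr`` are hypothetical helper names:

::

    #include <stdint.h>

    /* Attribute indices, as listed in the comment inside __cpu_setup. */
    enum {
        MT_DEVICE_nGnRnE = 0,
        MT_DEVICE_nGnRE  = 1,
        MT_DEVICE_GRE    = 2,
        MT_NORMAL_NC     = 3,
        MT_NORMAL        = 4,
        MT_NORMAL_WT     = 5,
    };

    /* Place an 8-bit attribute in the byte selected by index mt. */
    #define MAIR(attr, mt) ((uint64_t)(attr) << ((mt) * 8))

    uint64_t build_mair(void)
    {
        return MAIR(0x00, MT_DEVICE_nGnRnE) |
               MAIR(0x04, MT_DEVICE_nGnRE)  |
               MAIR(0x0c, MT_DEVICE_GRE)    |
               MAIR(0x44, MT_NORMAL_NC)     |
               MAIR(0xff, MT_NORMAL)        |
               MAIR(0xbb, MT_NORMAL_WT);
    }

    /* Read back the attribute byte that a page table AttrIndx selects. */
    uint8_t mair_attr(uint64_t mair, unsigned idx)
    {
        return (uint8_t)(mair >> (idx * 8));
    }

A page table entry stores only the 3-bit index; the hardware looks up the actual attribute byte in this register.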

- __primary_switch

Its main job is to enable the MMU and hand over to the code that runs with the MMU on.

::

    __primary_switch:
    #ifdef CONFIG_RANDOMIZE_BASE
        mov	x19, x0				// preserve new SCTLR_EL1 value
        mrs	x20, sctlr_el1			// preserve old SCTLR_EL1 value
    #endif

        adrp	x1, init_pg_dir
        bl	__enable_mmu        // enable the MMU
    #ifdef CONFIG_RELOCATABLE
    #ifdef CONFIG_RELR
        mov	x24, #0				// no RELR displacement yet
    #endif
        bl	__relocate_kernel
    #ifdef CONFIG_RANDOMIZE_BASE
        ldr	x8, =__primary_switched
        adrp	x0, __PHYS_OFFSET
        blr	x8

        /*
         * If we return here, we have a KASLR displacement in x23 which we need
         * to take into account by discarding the current kernel mapping and
         * creating a new one.
         */
        pre_disable_mmu_workaround
        msr	sctlr_el1, x20			// disable the MMU
        isb
        bl	__create_page_tables		// recreate kernel mapping

        tlbi	vmalle1				// Remove any stale TLB entries
        dsb	nsh

        msr	sctlr_el1, x19			// re-enable the MMU
        isb
        ic	iallu				// flush instructions fetched
        dsb	nsh				// via old mapping
        isb

        bl	__relocate_kernel
    #endif
    #endif
        ldr	x8, =__primary_switched
        adrp	x0, __PHYS_OFFSET
        br	x8
    ENDPROC(__primary_switch)

The function enables the MMU through ``__enable_mmu`` and then calls ``__primary_switched``:

::

    /*
     * The following fragment of code is executed with the MMU enabled.
     *
     *   x0 = __PHYS_OFFSET
     */
    __primary_switched:
        adrp	x4, init_thread_union
        add	sp, x4, #THREAD_SIZE
        adr_l	x5, init_task
        msr	sp_el0, x5			// Save thread_info

        adr_l	x8, vectors			// load VBAR_EL1 with virtual
        msr	vbar_el1, x8			// vector table address
        isb

        stp	xzr, x30, [sp, #-16]!
        mov	x29, sp

        str_l	x21, __fdt_pointer, x5		// Save FDT pointer

        ldr_l	x4, kimage_vaddr		// Save the offset between
        sub	x4, x4, x0			// the kernel virtual and
        str_l	x4, kimage_voffset, x5		// physical mappings

        // Clear BSS
        adr_l	x0, __bss_start
        mov	x1, xzr
        adr_l	x2, __bss_stop
        sub	x2, x2, x0
        bl	__pi_memset
        dsb	ishst				// Make zero page visible to PTW

    #ifdef CONFIG_KASAN
        bl	kasan_early_init
    #endif
    #ifdef CONFIG_RANDOMIZE_BASE
        tst	x23, ~(MIN_KIMG_ALIGN - 1)	// already running randomized?
        b.ne	0f
        mov	x0, x21				// pass FDT address in x0
        bl	kaslr_early_init		// parse FDT for KASLR options
        cbz	x0, 0f				// KASLR disabled? just proceed
        orr	x23, x23, x0			// record KASLR offset
        ldp	x29, x30, [sp], #16		// we must enable KASLR, return
        ret					// to __primary_switch()
    0:
    #endif
        add	sp, sp, #16
        mov	x29, #0
        mov	x30, #0
        b	start_kernel
    ENDPROC(__primary_switched)

This function prepares the environment for C code and ends by branching to ``start_kernel``, at which point kernel boot enters the C stage.
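
One of the values set up here, ``kimage_voffset``, is simply the difference between the kernel image's virtual address and its physical address (``x0 = __PHYS_OFFSET``). A small sketch of how such an offset is computed and used, with hypothetical function names and made-up example addresses:

::

    #include <stdint.h>

    /* Model of "ldr_l x4, kimage_vaddr; sub x4, x4, x0; str_l x4, kimage_voffset". */
    uint64_t compute_kimage_voffset(uint64_t kimage_vaddr, uint64_t phys_offset)
    {
        return kimage_vaddr - phys_offset;
    }

    /* Convert a kernel-image virtual address to a physical address. */
    uint64_t kimg_to_phys_model(uint64_t vaddr, uint64_t voffset)
    {
        return vaddr - voffset;
    }

    /* Convert a physical address inside the image to its virtual address. */
    uint64_t phys_to_kimg_model(uint64_t paddr, uint64_t voffset)
    {
        return paddr + voffset;
    }

Storing the offset once lets later code convert image addresses in either direction with a single addition or subtraction.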


Kernel boot stage two
---------------------

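The second stage of Linux kernel boot, commonly called the C stage, starts at ``start_kernel``. ``start_kernel`` is the entry function that every Linux platform reaches once kernel initialization begins. It completes the remaining hardware-platform initialization;
some steps are common, others run only when configured. The subsystems the kernel needs, such as memory management, the scheduler, and exception handling, are initialized one after another.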


start_kernel
^^^^^^^^^^^^

``start_kernel`` lives in ``init/main.c`` and initializes the Linux subsystems. There is a great deal of initialization here, so most of it is skipped for now.

::


    asmlinkage __visible void __init start_kernel(void)
    {
        char *command_line;
        char *after_dashes;

        set_task_stack_end_magic(&init_task);
        smp_setup_processor_id();
        debug_objects_early_init();

        cgroup_init_early();

        local_irq_disable();
        early_boot_irqs_disabled = true;

        /*
         * Interrupts are still disabled. Do necessary setups, then
         * enable them.
         */
        boot_cpu_init();
        page_address_init();
        pr_notice("%s", linux_banner);
        early_security_init();
        setup_arch(&command_line);
        setup_command_line(command_line);
        setup_nr_cpu_ids();
        setup_per_cpu_areas();
        smp_prepare_boot_cpu();	/* arch-specific boot-cpu hooks */
        boot_cpu_hotplug_init();

        build_all_zonelists(NULL);
        page_alloc_init();

        pr_notice("Kernel command line: %s\n", boot_command_line);
        /* parameters may set static keys */
        jump_label_init();
        parse_early_param();
        after_dashes = parse_args("Booting kernel",
                      static_command_line, __start___param,
                      __stop___param - __start___param,
                      -1, -1, NULL, &unknown_bootoption);
        if (!IS_ERR_OR_NULL(after_dashes))
            parse_args("Setting init args", after_dashes, NULL, 0, -1, -1,
                   NULL, set_init_arg);

        /*
         * These use large bootmem allocations and must precede
         * kmem_cache_init()
         */
        setup_log_buf(0);
        vfs_caches_init_early();
        sort_main_extable();
        trap_init();
        mm_init();

        ftrace_init();

        /* trace_printk can be enabled here */
        early_trace_init();

        /*
         * Set up the scheduler prior starting any interrupts (such as the
         * timer interrupt). Full topology setup happens at smp_init()
         * time - but meanwhile we still have a functioning scheduler.
         */
        sched_init();
        /*
         * Disable preemption - early bootup scheduling is extremely
         * fragile until we cpu_idle() for the first time.
         */
        preempt_disable();
        if (WARN(!irqs_disabled(),
             "Interrupts were enabled *very* early, fixing it\n"))
            local_irq_disable();
        radix_tree_init();

        /*
         * Set up housekeeping before setting up workqueues to allow the unbound
         * workqueue to take non-housekeeping into account.
         */
        housekeeping_init();

        /*
         * Allow workqueue creation and work item queueing/cancelling
         * early.  Work item execution depends on kthreads and starts after
         * workqueue_init().
         */
        workqueue_init_early();

        rcu_init();

        /* Trace events are available after this */
        trace_init();

        if (initcall_debug)
            initcall_debug_enable();

        context_tracking_init();
        /* init some links before init_ISA_irqs() */
        early_irq_init();
        init_IRQ();
        tick_init();
        rcu_init_nohz();
        init_timers();
        hrtimers_init();
        softirq_init();
        timekeeping_init();

        /*
         * For best initial stack canary entropy, prepare it after:
         * - setup_arch() for any UEFI RNG entropy and boot cmdline access
         * - timekeeping_init() for ktime entropy used in rand_initialize()
         * - rand_initialize() to get any arch-specific entropy like RDRAND
         * - add_latent_entropy() to get any latent entropy
         * - adding command line entropy
         */
        rand_initialize();
        add_latent_entropy();
        add_device_randomness(command_line, strlen(command_line));
        boot_init_stack_canary();

        time_init();
        perf_event_init();
        profile_init();
        call_function_init();
        WARN(!irqs_disabled(), "Interrupts were enabled early\n");

        early_boot_irqs_disabled = false;
        local_irq_enable();

        kmem_cache_init_late();

        /*
         * HACK ALERT! This is early. We're enabling the console before
         * we've done PCI setups etc, and console_init() must be aware of
         * this. But we do want output early, in case something goes wrong.
         */
        console_init();
        if (panic_later)
            panic("Too many boot %s vars at `%s'", panic_later,
                  panic_param);

        lockdep_init();

        /*
         * Need to run this when irqs are enabled, because it wants
         * to self-test [hard/soft]-irqs on/off lock inversion bugs
         * too:
         */
        locking_selftest();

        /*
         * This needs to be called before any devices perform DMA
         * operations that might use the SWIOTLB bounce buffers. It will
         * mark the bounce buffers as decrypted so that their usage will
         * not cause "plain-text" data to be decrypted when accessed.
         */
        mem_encrypt_init();

    #ifdef CONFIG_BLK_DEV_INITRD
        if (initrd_start && !initrd_below_start_ok &&
            page_to_pfn(virt_to_page((void *)initrd_start)) < min_low_pfn) {
            pr_crit("initrd overwritten (0x%08lx < 0x%08lx) - disabling it.\n",
                page_to_pfn(virt_to_page((void *)initrd_start)),
                min_low_pfn);
            initrd_start = 0;
        }
    #endif
        setup_per_cpu_pageset();
        numa_policy_init();
        acpi_early_init();
        if (late_time_init)
            late_time_init();
        sched_clock_init();
        calibrate_delay();
        pid_idr_init();
        anon_vma_init();
    #ifdef CONFIG_X86
        if (efi_enabled(EFI_RUNTIME_SERVICES))
            efi_enter_virtual_mode();
    #endif
        thread_stack_cache_init();
        cred_init();
        fork_init();
        proc_caches_init();
        uts_ns_init();
        buffer_init();
        key_init();
        security_init();
        dbg_late_init();
        vfs_caches_init();
        pagecache_init();
        signals_init();
        seq_file_init();
        proc_root_init();
        nsfs_init();
        cpuset_init();
        cgroup_init();
        taskstats_init_early();
        delayacct_init();

        poking_init();
        check_bugs();

        acpi_subsystem_init();
        arch_post_acpi_subsys_init();
        sfi_init_late();

        /* Do the rest non-__init'ed, we're now alive */
        arch_call_rest_init();
    }

::

    pr_notice("%s", linux_banner);

::

    /* FIXED STRINGS! Don't touch! */
    const char linux_banner[] =
            "Linux version " UTS_RELEASE " (" LINUX_COMPILE_BY "@"
            LINUX_COMPILE_HOST ") (" LINUX_COMPILER ") " UTS_VERSION "\n";


The effect is that the kernel version and build information are printed early in boot:

::

    [    0.000000 ] Linux version 4.14.74 (jenkins@MonoCI) (gcc version 6.5.0 (Linaro GCC 6.5-2018.12)) #2 SMP PREEMPT Mon Aug 23 12:17:44 CST 2021
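
The banner is assembled entirely at compile time through C string-literal concatenation. The sketch below reproduces the idea in user space; the macro values are assumptions filled in from the sample log line above (in a real build they are generated by the build system), and ``banner_model`` is a hypothetical name:

::

    /* Assumed values, copied from the sample boot log above. */
    #define UTS_RELEASE        "4.14.74"
    #define LINUX_COMPILE_BY   "jenkins"
    #define LINUX_COMPILE_HOST "MonoCI"
    #define LINUX_COMPILER     "gcc version 6.5.0 (Linaro GCC 6.5-2018.12)"
    #define UTS_VERSION        "#2 SMP PREEMPT Mon Aug 23 12:17:44 CST 2021"

    /* Adjacent string literals are joined by the compiler into one string. */
    const char banner_model[] =
        "Linux version " UTS_RELEASE " (" LINUX_COMPILE_BY "@"
        LINUX_COMPILE_HOST ") (" LINUX_COMPILER ") " UTS_VERSION "\n";

Because the string is a constant, printing it costs nothing beyond the ``pr_notice`` call itself, which matters this early in boot.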


setup_arch
^^^^^^^^^^^

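``setup_arch`` is architecture-specific. It configures the system for the particular processor and hardware platform, parses the kernel command line, initializes memory management, and collects and registers system resources. Every architecture has its own ``setup_arch``;
which one is built is decided by the ``ARCH`` variable in the top-level Makefile. Its argument is the address of the not-yet-initialized local variable ``command_line``.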

::

    void __init setup_arch(char **cmdline_p)
    {
        init_mm.start_code = (unsigned long) _text;
        init_mm.end_code   = (unsigned long) _etext;
        init_mm.end_data   = (unsigned long) _edata;
        init_mm.brk	   = (unsigned long) _end;

        *cmdline_p = boot_command_line;

        early_fixmap_init();
        early_ioremap_init();

        setup_machine_fdt(__fdt_pointer);

        /*
         * Initialise the static keys early as they may be enabled by the
         * cpufeature code and early parameters.
         */
        jump_label_init();
        parse_early_param();

        /*
         * Unmask asynchronous aborts and fiq after bringing up possible
         * earlycon. (Report possible System Errors once we can report this
         * occurred).
         */
        local_daif_restore(DAIF_PROCCTX_NOIRQ);

        /*
         * TTBR0 is only used for the identity mapping at this stage. Make it
         * point to zero page to avoid speculatively fetching new entries.
         */
        cpu_uninstall_idmap();

        xen_early_init();
        efi_init();
        arm64_memblock_init();

        paging_init();

        acpi_table_upgrade();

        /* Parse the ACPI tables for possible boot-time configuration */
        acpi_boot_table_init();

        if (acpi_disabled)
            unflatten_device_tree();

        bootmem_init();

        kasan_init();

        request_standard_resources();

        early_ioremap_reset();

        if (acpi_disabled)
            psci_dt_init();
        else
            psci_acpi_init();

        cpu_read_bootcpu_ops();
        smp_init_cpus();
        smp_build_mpidr_hash();

        /* Init percpu seeds for random tags after cpus are set up. */
        kasan_init_tags();

    #ifdef CONFIG_ARM64_SW_TTBR0_PAN
        /*
         * Make sure init_thread_info.ttbr0 always generates translation
         * faults in case uaccess_enable() is inadvertently called by the init
         * thread.
         */
        init_task.thread_info.ttbr0 = __pa_symbol(empty_zero_page);
    #endif

    #ifdef CONFIG_VT
        conswitchp = &dummy_con;
    #endif
        if (boot_args[1] || boot_args[2] || boot_args[3]) {
            pr_err("WARNING: x1-x3 nonzero in violation of boot protocol:\n"
                "\tx1: %016llx\n\tx2: %016llx\n\tx3: %016llx\n"
                "This indicates a broken bootloader or old kernel\n",
                boot_args[1], boot_args[2], boot_args[3]);
        }
    }


- setup_machine_fdt

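The argument of ``setup_machine_fdt`` is the start address of the device tree blob (dtb). The U-Boot bootloader reads the device tree into memory and passes its start address to the kernel when it starts the kernel; the ``__fdt_pointer`` argument
is that address. The "fdt" in the function name (flattened device tree) refers to the device tree being stored in one contiguous block of memory.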

::


    static void __init setup_machine_fdt(phys_addr_t dt_phys)
    {
        int size;
        void *dt_virt = fixmap_remap_fdt(dt_phys, &size, PAGE_KERNEL);  // the MMU is already on, so the dtb physical address must be mapped to a virtual address
        const char *name;

        if (dt_virt)
            memblock_reserve(dt_phys, size);

        if (!dt_virt || !early_init_dt_scan(dt_virt)) {     // FDT scan; after this the kernel can query the FDT through the fdt interface functions
            pr_crit("\n"
                "Error: invalid device tree blob at physical address %pa (virtual address 0x%p)\n"
                "The dtb must be 8-byte aligned and must not exceed 2 MB in size\n"
                "\nPlease check your bootloader.",
                &dt_phys, dt_virt);

            while (true)
                cpu_relax();
        }

        /* Early fixups are done, map the FDT as read-only now */
        fixmap_remap_fdt(dt_phys, &size, PAGE_KERNEL_RO);

        name = of_flat_dt_get_machine_name();
        if (!name)
            return;

        pr_info("Machine model: %s\n", name);
        dump_stack_set_arch_desc("%s (DT)", name);
    }
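
The constraints named in the error message (8-byte alignment, at most 2 MB) together with the FDT header magic can be checked with very little code. The sketch below is a user-space illustration, not the kernel's actual validation path: ``fdt_looks_valid`` and ``be32_read`` are hypothetical helpers, and the header layout (big-endian magic ``0xd00dfeed`` at offset 0, total size at offset 4) follows the flattened devicetree format:

::

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    #define FDT_MAGIC    0xd00dfeedu
    #define MAX_FDT_SIZE (2u * 1024u * 1024u)   /* 2 MB, per the error message */

    /* FDT header fields are stored big-endian. */
    static uint32_t be32_read(const uint8_t *p)
    {
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    }

    /* dt_phys: address passed in x0; blob: the bytes found at that address. */
    bool fdt_looks_valid(uint64_t dt_phys, const uint8_t *blob, size_t avail)
    {
        if (dt_phys % 8 != 0)
            return false;                    /* must be 8-byte aligned */
        if (avail < 8)
            return false;                    /* need magic + totalsize */
        if (be32_read(blob) != FDT_MAGIC)
            return false;
        return be32_read(blob + 4) <= MAX_FDT_SIZE;
    }

A bootloader author can run an equivalent check before jumping to the kernel to catch the "invalid device tree blob" failure early.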

- console_init

``console_init`` initializes the console. Messages passed to ``printk`` before ``console_init`` runs are held in a buffer and only appear on the console after ``console_init`` has run.

::

    kernel/printk/printk.c

    /*
     * Initialize the console device. This is called *early*, so
     * we can't necessarily depend on lots of kernel help here.
     * Just do some early initializations, and do the complex setup
     * later.
     */
    void __init console_init(void)
    {
        int ret;
        initcall_t call;
        initcall_entry_t *ce;

        /* Setup the default TTY line discipline. */
        n_tty_init();

        /*
         * set up the console device so that later boot sequences can
         * inform about problems etc..
         */
        ce = __con_initcall_start;
        trace_initcall_level("console");
        while (ce < __con_initcall_end) {
            call = initcall_from_entry(ce);
            trace_initcall_start(call);
            ret = call();
            trace_initcall_finish(call, ret);
            ce++;
        }
    }

The function runs everything between the addresses ``__con_initcall_start`` and ``__con_initcall_end``, both of which can be found in ``vmlinux.lds``:

::

    __con_initcall_start = .; 
    KEEP(*(.con_initcall.init)) 
    __con_initcall_end = .;

The ``.con_initcall.init`` section is stored between these two addresses.

::

    include/linux/init.h

    #define console_initcall(fn)    \
        static initcall_t __initcall_##fn##id __used  \
        __attribute__((__section__(".con_initcall.init"))) = fn

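The ``console_initcall(fn)`` macro stores the function pointer ``fn`` in the ``.con_initcall.init`` section. When ``console_init()`` is later called, it walks the region from ``__con_initcall_start`` to ``__con_initcall_end``
and runs each stored function ``fn`` in turn.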
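
The pattern is a table of function pointers walked from a start address to an end address. The user-space sketch below models it with an ordinary array instead of a linker section; all names in it (``con_initcalls``, ``console_init_model``, the two fake console init functions) are hypothetical:

::

    #include <stddef.h>

    typedef int (*initcall_t)(void);

    /* Counts how many fake consoles have been set up. */
    int consoles_registered;

    /* Hypothetical stand-ins for real console init functions. */
    int serial_console_init_model(void) { consoles_registered++; return 0; }
    int vt_console_init_model(void)     { consoles_registered++; return 0; }

    /* Plays the role of the .con_initcall.init section. */
    initcall_t con_initcalls[] = {
        serial_console_init_model,
        vt_console_init_model,
    };

    /* Walk from "start" to "end" and call every entry; return the count. */
    int console_init_model(void)
    {
        initcall_t *ce  = con_initcalls;
        initcall_t *end = con_initcalls +
                          sizeof(con_initcalls) / sizeof(con_initcalls[0]);
        int ran = 0;

        while (ce < end) {
            (*ce)();
            ce++;
            ran++;
        }
        return ran;
    }

In the kernel the array is not written by hand: each ``console_initcall`` use drops one pointer into the section, and the linker script supplies the start and end symbols.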


rest_init
^^^^^^^^^

After this long series of initializations, ``rest_init`` starts three processes, ``idle``, ``kernel_init``, and ``kthreadd``, and normal operation of the system begins.

::


    noinline void __ref rest_init(void)
    {
        struct task_struct *tsk;
        int pid;

        rcu_scheduler_starting();
        /*
         * We need to spawn init first so that it obtains pid 1, however
         * the init task will end up wanting to create kthreads, which, if
         * we schedule it before we create kthreadd, will OOPS.
         */
        pid = kernel_thread(kernel_init, NULL, CLONE_FS);   //create the kernel_init kernel thread, i.e. init, process 1
        /*
         * Pin init on the boot CPU. Task migration is not properly working
         * until sched_init_smp() has been run. It will set the allowed
         * CPUs for init to the non isolated CPUs.
         */
        rcu_read_lock();
        tsk = find_task_by_pid_ns(pid, &init_pid_ns);
        set_cpus_allowed_ptr(tsk, cpumask_of(smp_processor_id()));
        rcu_read_unlock();

        numa_default_policy();
        pid = kernel_thread(kthreadd, NULL, CLONE_FS | CLONE_FILES);    //create the kthreadd kernel thread, process 2, which creates and manages other kernel threads
        rcu_read_lock();
        kthreadd_task = find_task_by_pid_ns(pid, &init_pid_ns);
        rcu_read_unlock();

        /*
         * Enable might_sleep() and smp_processor_id() checks.
         * They cannot be enabled earlier because with CONFIG_PREEMPTION=y
         * kernel_thread() would trigger might_sleep() splats. With
         * CONFIG_PREEMPT_VOLUNTARY=y the init task might have scheduled
         * already, but it's stuck on the kthreadd_done completion.
         */
        system_state = SYSTEM_SCHEDULING;

        complete(&kthreadd_done);

        /*
         * The boot idle thread must execute schedule()
         * at least once to get things moving:
         */
        schedule_preempt_disabled();    //call the scheduler with kernel preemption disabled
        /* Call into cpu_idle with preempt disabled */
        cpu_startup_entry(CPUHP_ONLINE);    //process 0 has finished kernel initialization and enters the idle loop
    }

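1) idle is the operating system's idle process; the CPU runs it when nothing else is runnable
2) kernel_init starts out as an ordinary kernel function that is launched as a thread; it becomes the init process, which has PID 1 and exists for as long as the system runs
3) kthreadd is the kernel thread daemon; it always runs in kernel space, creates and manages all other kernel threads, and has PID 2

In other words, the first process after boot is idle (PID 0), the only process that is not created through kernel_thread or fork. idle creates kernel_init as process 1 and kthreadd as process 2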
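
The ordering constraint in rest_init (init is created first so that it gets PID 1, but must not proceed until kthreadd exists) can be modelled in user space. The sketch below is an assumption-laden analogy built on POSIX threads, not kernel code: ``struct completion``, ``complete`` and ``wait_for_completion`` are re-implemented here with a mutex and a condition variable, and ``model_rest_init`` is a hypothetical name.

::

    #include <assert.h>
    #include <pthread.h>

    /* Minimal user-space stand-in for the kernel's struct completion. */
    struct completion {
        pthread_mutex_t lock;
        pthread_cond_t  cond;
        int             done;
    };

    static struct completion kthreadd_done = {
        PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0
    };

    static void complete(struct completion *c)
    {
        pthread_mutex_lock(&c->lock);
        c->done = 1;
        pthread_cond_broadcast(&c->cond);
        pthread_mutex_unlock(&c->lock);
    }

    static void wait_for_completion(struct completion *c)
    {
        pthread_mutex_lock(&c->lock);
        while (!c->done)
            pthread_cond_wait(&c->cond, &c->lock);
        pthread_mutex_unlock(&c->lock);
    }

    /* Records which "process" made progress first: 2 = kthreadd, 1 = init. */
    int order[2];
    static int order_pos;

    static void *init_thread(void *arg)
    {
        (void)arg;
        wait_for_completion(&kthreadd_done);   /* blocked until kthreadd exists */
        order[order_pos++] = 1;
        return 0;
    }

    /* Mirrors the sequence in rest_init(): create init, create kthreadd,
     * then signal kthreadd_done. Returns 0 on success. */
    int model_rest_init(void)
    {
        pthread_t init;

        if (pthread_create(&init, 0, init_thread, 0) != 0)
            return -1;
        order[order_pos++] = 2;                /* "kthreadd" is set up here */
        complete(&kthreadd_done);
        assert(pthread_join(init, 0) == 0);
        return 0;
    }

After ``model_rest_init()`` returns, ``order`` holds 2 followed by 1: the thread created first still runs its body second, because the completion holds it back.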

kernel_init
^^^^^^^^^^^^

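The kernel_init function is started as a thread when the kernel_init process is created. Although it begins as an ordinary kernel function, at the end it uses execve to load the init program from the root filesystem. That takes it from kernel mode to user mode and turns it into user-space process 1.
This init process is the ancestor of all user-space processes and spawns a large number of children. It keeps PID 1 and exists for as long as the system runs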

kernel_init_freeable
""""""""""""""""""""""

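The main tasks of this function are

1) Wait until the kthreadd kernel thread has been created
2) Initialize the driver model and run the built-in driver initcalls through do_basic_setup
3) Open the default console /dev/console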

::


    static noinline void __init kernel_init_freeable(void)
    {
        /*
         * Wait until kthreadd is all set-up.
         */
        wait_for_completion(&kthreadd_done);
        //kernel_init is created first, but it may only continue once the kthreadd thread has been created

        /* Now the scheduler is fully set up and can do blocking allocations */
        gfp_allowed_mask = __GFP_BITS_MASK;

        /*
         * init can allocate pages on any node
         */
        set_mems_allowed(node_states[N_MEMORY]);

        cad_pid = task_pid(current);

        smp_prepare_cpus(setup_max_cpus);

        workqueue_init();

        init_mm_internals();

        do_pre_smp_initcalls();
        lockup_detector_init();

        smp_init();
        sched_init_smp();

        page_alloc_init_late();
        /* Initialize page ext after all struct pages are initialized. */
        page_ext_init();

        do_basic_setup();

        /* Open the /dev/console on the rootfs, this should never fail */
        if (ksys_open((const char __user *) "/dev/console", O_RDWR, 0) < 0)
            pr_err("Warning: unable to open an initial console.\n");

        (void) ksys_dup(0);
        (void) ksys_dup(0);
        /*
         * check if there is an early userspace init.  If yes, let it do all
         * the work
         */

        if (!ramdisk_execute_command)
            ramdisk_execute_command = "/init";

        if (ksys_access((const char __user *)
                ramdisk_execute_command, 0) != 0) {
            ramdisk_execute_command = NULL;
            prepare_namespace();
        }

        /*
         * Ok, we have completed the initial bootup, and
         * we're essentially up and running. Get rid of the
         * initmem segments and start the user-mode stuff..
         *
         * rootfs is available now, try loading the public keys
         * and default modules
         */

        integrity_load_keys();
    }
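
The ``ksys_open`` plus two ``ksys_dup(0)`` calls above are what give the future init process its standard input, output and error. A minimal user-space sketch of the same pattern follows; ``open_std_streams`` is a hypothetical helper, and in an ordinary process the returned descriptors are not 0, 1 and 2 because those are already in use.

::

    #include <assert.h>
    #include <fcntl.h>
    #include <unistd.h>

    /* Open path read/write and duplicate the descriptor twice, the way
     * kernel_init_freeable() does with /dev/console.
     * Returns 0 on success and -1 on failure. */
    int open_std_streams(const char *path, int fds[3])
    {
        assert(path != 0 && fds != 0);

        fds[0] = open(path, O_RDWR);
        if (fds[0] < 0)
            return -1;

        fds[1] = dup(fds[0]);
        fds[2] = dup(fds[0]);
        if (fds[1] < 0 || fds[2] < 0) {
            if (fds[1] >= 0)
                close(fds[1]);
            if (fds[2] >= 0)
                close(fds[2]);
            close(fds[0]);
            return -1;
        }
        return 0;
    }

In the kernel thread no files are open yet, so the first open returns descriptor 0 and the two duplicates become 1 and 2. All three refer to the same open file.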

- do_basic_setup

::

    /*
     * Ok, the machine is now initialized. None of the devices
     * have been touched yet, but the CPU subsystem is up and
     * running, and memory and process management works.
     *
     * Now we can finally start doing some real work..
     */
    static void __init do_basic_setup(void)
    {
        cpuset_init_smp();
        driver_init();
        init_irq_proc();
        do_ctors();
        usermodehelper_enable();
        do_initcalls();
    }

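driver_init creates all the subsystems that the driver model relies on and so sets up the overall framework for Linux device drivers. It only builds the directory hierarchy, which is the first part of device driver initialization.
The individual driver modules are loaded in do_initcalls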

::

    /**
     * driver_init - initialize driver model.
     *
     * Call the driver model init functions to initialize their
     * subsystems. Called early from init/main.c.
     */
    void __init driver_init(void)
    {
        /* These are the core pieces */
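        devtmpfs_init();    //register the devtmpfs filesystem and start the devtmpfsd kernel thread
        devices_init();     //initialize part of the driver model: /sys/devices, /sys/dev/char, /sys/dev/block
        buses_init();       //initialize the bus subsystem of the driver model
        classes_init();     //initialize the class subsystem of the driver model
        firmware_init();    //initialize the firmware subsystem of the driver model
        hypervisor_init();  //initialize the hypervisor subsystem of the driver model

        /* These are also core pieces, but must come after the
         * core core pieces.
         */
        of_core_init();     //initialize device tree access
        platform_bus_init();    //initialize the bus/platform subsystem of the driver model, the bus model for all platform devices and drivers
                                //every platform device and driver is attached to this bus
        cpu_dev_init();     //initialize the devices/system/cpu subsystem of the driver model, which holds the CPU attributes
        memory_dev_init();  //initialize the devices/system/memory subsystem of the driver model, which holds the memory attributes
        container_dev_init();   //register the container bus type as a system subsystem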
    }


- do_initcalls

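At build time the start addresses of the module initialization functions are placed, in a defined order, into dedicated sections. During boot, do_initcalls reads those addresses back as function pointers
and calls them one by one to initialize the corresponding modules. This is the second part of device driver initialization. Because built-in modules can depend on one another, the order in which they are initialized matters a great deal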

::

    // init/main.c

    static void __init do_initcalls(void)
    {
        int level;

        for (level = 0; level < ARRAY_SIZE(initcall_levels) - 1; level++)
            do_initcall_level(level);
    }

do_initcall_level walks the functions that belong to one level and runs them in order

::

    static void __init do_initcall_level(int level)
    {
        initcall_entry_t *fn;

        strcpy(initcall_command_line, saved_command_line);
        parse_args(initcall_level_names[level],
               initcall_command_line, __start___param,
               __stop___param - __start___param,
               level, level,
               NULL, &repair_env_string);

        trace_initcall_level(initcall_level_names[level]);
        for (fn = initcall_levels[level]; fn < initcall_levels[level+1]; fn++)
            do_one_initcall(initcall_from_entry(fn));
    }
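
The two loops above rely on one idea: all initcall pointers sit in a single contiguous array, and a second array marks where each level begins. The following is a simplified sketch of that layout with hand-written arrays instead of linker-generated sections; every name in it (``initcalls``, ``levels``, ``run_all_levels`` and the sample functions) is made up for illustration.

::

    #include <assert.h>

    typedef int (*initcall_t)(void);

    int trace[4];          /* order in which the sample initcalls ran */
    static int ntrace;

    static int pure_a(void)   { trace[ntrace++] = 0;  return 0; }
    static int core_a(void)   { trace[ntrace++] = 10; return 0; }
    static int core_b(void)   { trace[ntrace++] = 11; return 0; }
    static int device_a(void) { trace[ntrace++] = 60; return 0; }

    /* All pointers in "link order": lower levels first. */
    static initcall_t initcalls[] = { pure_a, core_a, core_b, device_a };

    /* levels[i] is the first entry of level i; the last element is the
     * end marker, like __initcall_end in the kernel. */
    static initcall_t *levels[] = {
        &initcalls[0], &initcalls[1], &initcalls[3], &initcalls[4],
    };

    #define NUM_LEVELS ((int)(sizeof(levels) / sizeof(levels[0])) - 1)

    /* Run every function between the start of this level and the next. */
    static void run_level(int level)
    {
        assert(level >= 0 && level < NUM_LEVELS);
        for (initcall_t *fn = levels[level]; fn < levels[level + 1]; fn++)
            (*fn)();
    }

    /* Run all levels in ascending order; returns the number of calls made. */
    int run_all_levels(void)
    {
        for (int level = 0; level < NUM_LEVELS; level++)
            run_level(level);
        return ntrace;
    }

Because a level ends exactly where the next one starts, no per-level length has to be stored. That is also why the kernel loop stops at ``ARRAY_SIZE(initcall_levels) - 1``.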

do_one_initcall runs one specific function

::

    int __init_or_module do_one_initcall(initcall_t fn)
    {
        int count = preempt_count();
        char msgbuf[64];
        int ret;

        if (initcall_blacklisted(fn))
            return -EPERM;

        do_trace_initcall_start(fn);
        ret = fn();
        do_trace_initcall_finish(fn, ret);

        msgbuf[0] = 0;

        if (preempt_count() != count) {
            sprintf(msgbuf, "preemption imbalance ");
            preempt_count_set(count);
        }
        if (irqs_disabled()) {
            strlcat(msgbuf, "disabled interrupts ", sizeof(msgbuf));
            local_irq_enable();
        }
        WARN(msgbuf[0], "initcall %pS returned with %s\n", fn, msgbuf);

        add_latent_entropy();
        return ret;
    }


The function pointers consumed by these loops are registered with the ``__define_initcall`` macro

::

    // include/linux/init.h

    #define __define_initcall(fn, id)   \
        static initcall_t __initcall_##fn##id __used \
        __attribute__((__section__(".initcall" #id ".init"))) = fn;

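``__attribute__((__section__("name")))`` places the object in the section whose name is given inside the parentheses


The ``__define_initcall()`` macro does three things

1) It declares a function pointer named ``__initcall_<fn><id>`` (``##`` pastes the tokens on either side of it into a single identifier)
2) It initializes that function pointer to fn
3) It tells the compiler to place the pointer in the section named ``".initcall" #id ".init"``, for example ``.initcall6.init`` when id is 6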
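
The effect of ``#`` and ``##`` can be checked with plain C preprocessor macros. This is an illustrative sketch only: ``INITCALL_SECTION``, ``INITCALL_NAME``, ``STR`` and ``XSTR`` are helper names invented here, and ``my_driver_init`` is a hypothetical function name.

::

    #include <assert.h>
    #include <string.h>

    /* #id turns the macro argument into a string literal; adjacent string
     * literals are then concatenated by the compiler. */
    #define INITCALL_SECTION(id)   ".initcall" #id ".init"

    /* ## pastes its neighbours into one identifier. */
    #define INITCALL_NAME(fn, id)  __initcall_##fn##id

    /* Two-step stringification so that the argument is expanded first. */
    #define STR(x)   #x
    #define XSTR(x)  STR(x)

    const char *device_section(void)    { return INITCALL_SECTION(6); }
    const char *late_sync_section(void) { return INITCALL_SECTION(7s); }
    const char *device_symbol(void)
    {
        return XSTR(INITCALL_NAME(my_driver_init, 6));
    }

    /* Self-check: aborts if the preprocessor output differs from what the
     * list above describes. */
    void check_initcall_names(void)
    {
        assert(strcmp(device_section(), ".initcall6.init") == 0);
        assert(strcmp(late_sync_section(), ".initcall7s.init") == 0);
        assert(strcmp(device_symbol(), "__initcall_my_driver_init6") == 0);
    }

Appending the id to the variable name keeps the symbols distinct when the same function is registered at more than one level.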

``__define_initcall`` is not used directly; it is wrapped by the following macros

::

    // include/linux/init.h


    #define pure_initcall(fn)		__define_initcall(fn, 0)

    #define core_initcall(fn)		__define_initcall(fn, 1)
    #define core_initcall_sync(fn)		__define_initcall(fn, 1s)
    #define postcore_initcall(fn)		__define_initcall(fn, 2)
    #define postcore_initcall_sync(fn)	__define_initcall(fn, 2s)
    #define arch_initcall(fn)		__define_initcall(fn, 3)
    #define arch_initcall_sync(fn)		__define_initcall(fn, 3s)
    #define subsys_initcall(fn)		__define_initcall(fn, 4)
    #define subsys_initcall_sync(fn)	__define_initcall(fn, 4s)
    #define fs_initcall(fn)			__define_initcall(fn, 5)
    #define fs_initcall_sync(fn)		__define_initcall(fn, 5s)
    #define rootfs_initcall(fn)		__define_initcall(fn, rootfs)
    #define device_initcall(fn)		__define_initcall(fn, 6)
    #define device_initcall_sync(fn)	__define_initcall(fn, 6s)
    #define late_initcall(fn)		__define_initcall(fn, 7)
    #define late_initcall_sync(fn)		__define_initcall(fn, 7s)


The definitions related to initcall can be found in the generated vmlinux.lds file

::


  __initcall_start = .; 
  KEEP(*(.initcallearly.init))
  __initcall0_start = .; 
  KEEP(*(.initcall0.init)) 
  KEEP(*(.initcall0s.init)) 
  __initcall1_start = .; 
  KEEP(*(.initcall1.init)) 
  KEEP(*(.initcall1s.init)) 
  __initcall2_start = .; 
  KEEP(*(.initcall2.init)) 
  KEEP(*(.initcall2s.init)) 
  __initcall3_start = .; 
  KEEP(*(.initcall3.init)) 
  KEEP(*(.initcall3s.init)) 
  __initcall4_start = .; 
  KEEP(*(.initcall4.init)) 
  KEEP(*(.initcall4s.init)) 
  __initcall5_start = .; 
  KEEP(*(.initcall5.init)) 
  KEEP(*(.initcall5s.init)) 
  __initcallrootfs_start = .; 
  KEEP(*(.initcallrootfs.init)) 
  KEEP(*(.initcallrootfss.init)) 
  __initcall6_start = .; 
  KEEP(*(.initcall6.init)) 
  KEEP(*(.initcall6s.init)) 
  __initcall7_start = .; 
  KEEP(*(.initcall7.init)) 
  KEEP(*(.initcall7s.init)) 
  __initcall_end = .;

The overall start of these sections is marked by ``__initcall_start`` and the end by ``__initcall_end``

free_initmem
""""""""""""""

free_initmem releases all memory occupied by the init sections

::

    // arch/arm64/mm/init.c

    void free_initmem(void)
    {
        free_reserved_area(lm_alias(__init_begin),
                   lm_alias(__init_end),
                   0, "unused kernel");
        /*
         * Unmap the __init region but leave the VM area in place. This
         * prevents the region from being reused for kernel modules, which
         * is not supported by kallsyms.
         */
        unmap_kernel_range((u64)__init_begin, (u64)(__init_end - __init_begin));
    }


Starting the user-space init process
""""""""""""""""""""""""""""""""""""""

::

	if (!try_to_run_init_process("/sbin/init") ||
	    !try_to_run_init_process("/etc/init") ||
	    !try_to_run_init_process("/bin/init") ||
	    !try_to_run_init_process("/bin/sh"))
		return 0;


::

    static int try_to_run_init_process(const char *init_filename)
    {
        int ret;

        ret = run_init_process(init_filename);

        if (ret && ret != -ENOENT) {
            pr_err("Starting init: %s exists but couldn't execute it (error %d)\n",
                   init_filename, ret);
        }

        return ret;
    }

::

    static int run_init_process(const char *init_filename)
    {
        argv_init[0] = init_filename;
        pr_info("Run %s as init process\n", init_filename);
        return do_execve(getname_kernel(init_filename),
            (const char __user *const __user *)argv_init,
            (const char __user *const __user *)envp_init);
    }

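On most systems the bootloader passes a command line to the kernel, and it often contains an init= parameter such as init=/linuxrc. In that case kernel_init finds execute_command = "/linuxrc" and hands it to
run_init_process, which calls do_execve to replace the kernel thread with that program. Execution then continues in user space as the init process, which starts initializing the filesystem environment

If no init= parameter is passed, the kernel tries /sbin/init, /etc/init, /bin/init and /bin/sh in that order and stops at the first one that executes successfully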
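
The selection order can be summarised in a few lines of user-space C. This is a hedged sketch rather than the kernel's logic verbatim: ``pick_init`` is a hypothetical helper that only checks with ``access()`` whether a candidate is executable, whereas the kernel actually attempts the exec. It also assumes, as in the kernel version quoted here, that an explicitly requested init that fails is not followed by the fallback list.

::

    #include <assert.h>
    #include <stddef.h>
    #include <unistd.h>

    /* Return the path that would be started as init, or NULL if none is
     * usable. requested models execute_command (the init= parameter) and
     * may be NULL; fallbacks is a NULL-terminated list. */
    const char *pick_init(const char *requested, const char *const *fallbacks)
    {
        assert(fallbacks != NULL);

        /* An explicit init= request is tried alone, without fallbacks. */
        if (requested != NULL)
            return access(requested, X_OK) == 0 ? requested : NULL;

        for (; *fallbacks != NULL; fallbacks++) {
            if (access(*fallbacks, X_OK) == 0)
                return *fallbacks;
        }
        return NULL;
    }

With the fallback list ``{"/sbin/init", "/etc/init", "/bin/init", "/bin/sh", NULL}`` the function returns the first entry that exists on the root filesystem, so a minimal rootfs that contains only a shell still boots to ``/bin/sh``.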


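Work performed by the init process

1) Set up the signal handlers for init
2) Initialize the console
3) Parse the /etc/inittab file
4) Run the system initialization command, usually /etc/init.d/rcS
5) Run all inittab commands that make init wait until they finish (action type: wait)
6) Run all inittab commands that are executed only once (action type: once)

Once this is done, init loops forever doing the following

1) Run all inittab commands that must be restarted whenever they exit (action type: respawn)
2) Run all inittab commands that must be restarted whenever they exit, but only after asking the user (action type: askfirst)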
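
The respawn behaviour boils down to a fork and wait loop. The sketch below is a deliberately bounded model: ``respawn_loop`` and ``sample_child`` are hypothetical names, and the ``max_runs`` limit exists only so that the example terminates, whereas a real init keeps restarting the command indefinitely.

::

    #include <assert.h>
    #include <stddef.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Stand-in for a respawn entry such as a getty; it exits immediately. */
    void sample_child(void)
    {
    }

    /* Start child_main in a new process, wait for it to exit, and start it
     * again, max_runs times in total. Returns the number of children
     * reaped, or -1 on error. */
    int respawn_loop(void (*child_main)(void), int max_runs)
    {
        int runs = 0;

        assert(child_main != NULL);

        while (runs < max_runs) {
            pid_t pid = fork();

            if (pid < 0)
                return -1;
            if (pid == 0) {            /* child: run the command and exit */
                child_main();
                _exit(0);
            }
            if (waitpid(pid, NULL, 0) != pid)
                return -1;
            runs++;                    /* child exited, so respawn it */
        }
        return runs;
    }

An askfirst entry would differ only in that the parent waits for a key press on the entry's terminal before calling ``fork()`` again.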

- inittab

The init program parses the /etc/inittab configuration file

::

    # /etc/inittab: init(8) configuration.
    # $Id: inittab,v 1.91 2002/01/25 13:35:21 miquels Exp $

    # The default runlevel.
    id:5:initdefault:

    # Boot-time system configuration/initialization script.
    # This is run first except when booting in emergency (-b) mode.
    si::sysinit:/etc/init.d/rcS

    # What to do in single-user mode.
    ~:S:wait:/sbin/sulogin

    # /etc/init.d executes the S and K scripts upon change
    # of runlevel.
    #
    # Runlevel 0 is halt.
    # Runlevel 1 is single-user.
    # Runlevels 2-5 are multi-user.
    # Runlevel 6 is reboot.

    l0:0:wait:/etc/init.d/rc 0
    l1:1:wait:/etc/init.d/rc 1
    l2:2:wait:/etc/init.d/rc 2
    l3:3:wait:/etc/init.d/rc 3
    l4:4:wait:/etc/init.d/rc 4
    l5:5:wait:/etc/init.d/rc 5
    l6:6:wait:/etc/init.d/rc 6
    # Normally not reached, but fallthrough in case of emergency.
    z6:6:respawn:/sbin/sulogin
    # AMA0:12345:respawn:/bin/start_getty 38400 ttyAMA0 vt102
    S0:12345:respawn:/bin/start_getty 0 ttyS0 vt102


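inittab is processed line by line. The lines are independent of one another and each one is a separate entry. An entry consists of four fields separated by three colons

The format of an inittab entry is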

::

    <id>:<runlevels>:<action>:<process>

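.. note::
    - id: /dev/<id>, the terminal that provides the process's stdin, stdout and stderr (used by printf, scanf and error output)
    - runlevels: the runlevels in which the entry applies; the field may be empty
    - action: when the entry is run; one of sysinit, respawn, askfirst, wait, once, restart, ctrlaltdel, shutdown
    - process: the program or script to run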
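
Splitting an entry into its four fields is straightforward. The parser below is a minimal sketch and not taken from any particular init implementation: ``struct inittab_entry``, ``parse_inittab_line`` and the field size limits are assumptions made for this example. Only the first three colons are treated as separators, so the process field may itself contain colons.

::

    #include <assert.h>
    #include <string.h>

    struct inittab_entry {
        char id[16];
        char runlevels[16];
        char action[32];
        char process[128];
    };

    /* Copy the text in [start, end) into dst; fails if it does not fit. */
    static int copy_field(char *dst, size_t cap,
                          const char *start, const char *end)
    {
        size_t len = (size_t)(end - start);

        if (len >= cap)
            return -1;
        memcpy(dst, start, len);
        dst[len] = '\0';
        return 0;
    }

    /* Parse "<id>:<runlevels>:<action>:<process>". Returns 0 on success
     * and -1 for comments, blank lines and malformed entries. */
    int parse_inittab_line(const char *line, struct inittab_entry *e)
    {
        const char *c1, *c2, *c3, *end;

        assert(line != NULL && e != NULL);

        if (line[0] == '#' || line[0] == '\n' || line[0] == '\0')
            return -1;
        if ((c1 = strchr(line, ':')) == NULL ||
            (c2 = strchr(c1 + 1, ':')) == NULL ||
            (c3 = strchr(c2 + 1, ':')) == NULL)
            return -1;

        end = c3 + 1 + strcspn(c3 + 1, "\n");  /* stop at newline or end */

        if (copy_field(e->id, sizeof(e->id), line, c1) ||
            copy_field(e->runlevels, sizeof(e->runlevels), c1 + 1, c2) ||
            copy_field(e->action, sizeof(e->action), c2 + 1, c3) ||
            copy_field(e->process, sizeof(e->process), c3 + 1, end))
            return -1;
        return 0;
    }

For the line ``si::sysinit:/etc/init.d/rcS`` from the file above, the parser yields id ``si``, an empty runlevels field, action ``sysinit`` and process ``/etc/init.d/rcS``.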