844 Matching Annotations
  1. Oct 2023
    1. 规整的指令编码。RISC-V的指令集编码非常规整,指令所需的通用寄存器的索引(Index)都被放在固定的位置。因此指令译码器(Instruction Decoder)可以非常便捷地译码出寄存器索引,然后读取通用寄存器组(Register File,Regfile)。

      [!NOTE] RISC-V 的指令编码有什么特点?

      flashcard

      非常规整 - 指令所需的通用寄存器的索引(Index)都被放在固定的位置。因此指令译码器(Instruction Decoder)可以非常便捷地译码出寄存器索引,然后读取通用寄存器组(Register File,Regfile)。

    2. RISC-V最基本也是唯一强制要求实现的指令集部分是由I字母表示的基本整数指令子集,使用该整数指令子集,便能够实现完整的软件编译器。其他的指令子集部分均为可选的模块,具有代表性的模块包括M/A/F/D/C。

      [!NOTE] RV32I 最后的 I 是什么意思?

      flashcard

      整数指令集模块, 其他具有代表性的模块包括M/A/F/D/C

    1. S-Mode虚存的地址转换 S、U-Mode中虚拟地址会以从根部遍历页表的方式转换为物理地址:satp.PPN 给出了一级页表基址,VA[31:22] 给出了一级页号,CPU会读取位于地址(satp.PPN × 4096 + VA[31:22] × 4)的页表项。该 PTE 包含二级页表基址,VA[21:12] 给出了二级页号,CPU读取位于地址(PTE.PPN × 4096 + VA[21:12] × 4)的叶节点页表项。叶节点页表项的PPN字段和页内偏移(原始虚址的最低 12 个有效位)组成了最终结果:物理地址(LeafPTE.PPN × 4096 + VA[11:0])

      [!NOTE] 虚拟内存的地址转换在寄存器层面是如何实现的?

      flashcard

      satp 的 PPN 段给出根页表的物理页号;CPU 依次用 VA 的各级页号字段索引页表,最终将叶 PTE 的 PPN 与页内偏移拼接得到物理地址
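
上述两级查表的算术过程可以用一段示意性的 Python 代码演示(页表内容与 satp.PPN 均为虚构值,且省略了 V/R/W/X 有效位与权限检查、大页等细节):

```python
PAGE_SIZE = 4096

def sv32_translate(va, satp_ppn, read_mem):
    """两级 Sv32 页表遍历(示意实现)。read_mem(addr) 返回该物理地址处的 32 位 PTE。"""
    vpn1 = (va >> 22) & 0x3FF      # VA[31:22],一级页号
    vpn0 = (va >> 12) & 0x3FF      # VA[21:12],二级页号
    offset = va & 0xFFF            # VA[11:0],页内偏移

    pte1 = read_mem(satp_ppn * PAGE_SIZE + vpn1 * 4)
    pte1_ppn = pte1 >> 10          # Sv32 PTE 的 PPN 字段位于 bits 31:10

    leaf = read_mem(pte1_ppn * PAGE_SIZE + vpn0 * 4)
    leaf_ppn = leaf >> 10
    return leaf_ppn * PAGE_SIZE + offset

# 虚构的两级页表:根页表在 PPN=2,二级页表在 PPN=3,叶页在 PPN=7
mem = {
    2 * PAGE_SIZE + 1 * 4: (3 << 10) | 0x1,   # 根 PTE,指向二级页表
    3 * PAGE_SIZE + 5 * 4: (7 << 10) | 0xF,   # 叶 PTE,指向物理页
}
va = (1 << 22) | (5 << 12) | 0x123
pa = sv32_translate(va, satp_ppn=2, read_mem=mem.__getitem__)
```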

    2. satp(Supervisor Address Translation and Protection,监管者地址转换和保护)S模式控制状态寄存器控制分页。satp 有三个域: MODE 域可以开启分页并选择页表级数 ASID(Address Space Identifier,地址空间标识符)域是可选的,避免了切换进程时将TLB刷新的问题,降低上下文切换的开销 PPN 字段保存了根页表的物理页号

      [!NOTE] 虚拟内存由什么 CSR 控制?

      flashcard

      satp

    3. mcause CSR寄存器 当发生异常时,mcause CSR中被写入一个指示导致异常的事件的代码,如果事件由中断引起,则置上Interrupt位,Exception Code字段包含指示最后一个异常的编码

      [!NOTE] mcause CSR 有什么功能?

      flashcard

      • 第一位表示是否中断(0 表示异常)
      • 其他位是 "Exception Code"
    4. mstatus(Machine Status)保存全局中断以及其他的状态 SIE控制S-Mode下全局中断,MIE控制M-Mode下全局中断。 SPIE、MPIE记录发生中断之前MIE和SIE的值。 SPP表示变化之前的特权级别是S-Mode还是U-Mode MPP表示变化之前是S-Mode还是U-Mode还是M-Mode

      [!NOTE] CSR mstatus 有什么用?

      flashcard

      保存全局中断以及其他的状态 - 控制中断 - 记录中断前的值 - 记录中断前的特权级别

    5. PP:Previous Privilege

      [!NOTE] mstatus 的编码中 PP 表示什么?

      flashcard

      Previous Privilege

    6. M-Mode的中断控制和状态寄存器 mtvec(MachineTrapVector)保存发生中断/异常时要跳转到的中断处理例程入口地址 mepc(Machine Exception PC)指向发生中断/异常时的指令 mcause(Machine Exception Cause)指示发生中断/异常的种类 mie(Machine Interrupt Enable)中断使能寄存器 mip(Machine Interrupt Pending)中断请求寄存器 mtval(Machine Trap Value)保存陷入(trap)附加信息 mscratch(Machine Scratch)它暂时存放一个字大小的数据 mstatus(Machine Status)保存全局中断以及其他的状态

      [!NOTE] M-Mode 中的中断涉及到哪些 CSR?

      flashcard

      • mtvec(MachineTrapVector)保存发生中断/异常时要跳转到的中断处理例程入口地址
      • mepc(Machine Exception PC)指向发生中断/异常时的指令
      • mcause(Machine Exception Cause)指示发生中断/异常的种类
      • mie(Machine Interrupt Enable)中断使能寄存器
      • mip(Machine Interrupt Pending)中断请求寄存器
      • mtval(Machine Trap Value)保存陷入(trap)附加信息
      • mscratch(Machine Scratch)它暂时存放一个字大小的数据
      • mstatus(Machine Status)保存全局中断以及其他的状态
    7. 中断/异常开销 建立中断/异常/系统调用号与对应服务的开销; 内核堆栈的建立; 验证系统调用参数; 内核态数据拷贝到用户态; 内存状态改变(Cache/TLB 刷新的开销)。

      [!NOTE] 要实现异常机制,需要做哪些事情?

      flashcard

      1. 建立中断/异常/系统调用号与对应服务的开销;
      2. 内核堆栈的建立;
      3. 验证系统调用参数;
      4. 内核态数据拷贝到用户态;
      5. 内存状态改变(Cache/TLB 刷新的开销)。
    8. 内存状态改变(Cache/TLB 刷新的开销)

      [!NOTE] QEMU 与真实板子有哪些常见差别?

      flashcard

      • 内存状态维护?
    9. RISC-V 要求实现精确异常:保证异常之前的所有指令都完整执行,后续指令都没有开始执行

      [!NOTE] RISC-V 要求实现的精确异常具体是什么含义?

      flashcard

      保证 - 异常之前的所有指令都完整执行, - 后续指令都没有开始执行

    10. Risc-V中异常和中断统称Trap

      [!NOTE] RISC-V 中,Trap 包含哪些异常?

      flashcard

      异常和中断统称 Trap

    11. CSR寄存器功能 信息类:主要用于获取当前芯片id和cpu核id等信息。 Trap设置:用于设置中断和异常相关寄存器。 Trap处理:用于处理中断和异常相关寄存器。 内存保护:有效保护内存资源

      [!NOTE] CSR 有哪些主要功能?

      flashcard

      • 信息查询
      • Trap 设置+处理
      • 内存保护
    12. OS通过硬件隔离手段(三防)来保障计算机的安全可靠 设置 CSR(控制状态寄存器) 实现隔离 权力:防止应用访问系统管控相关寄存器 地址空间配置寄存器:mstatus/sstatus CSR(中断及状态) 时间:防止应用长期使用 100%的 CPU 中断配置寄存器:sstatus/stvec CSR(中断跳转地址) 数据:防止应用破坏窃取数据 地址空间相关寄存器:sstatus/stvec/satp CSR (分页系统)

      [!NOTE] OS 通过什么实现对用户态应用的三种隔离?

      flashcard

      设置对应的三类 CSR

    13. 控制状态寄存器(CSR:Control and Status Registers) 通过控制状态寄存器指令访问,可以有4096个CSR 运行在用户态的应用程序不能访问大部分的CSR寄存器 运行在内核态的操作系统通过访问CSR寄存器控制计算机

      [!NOTE] 计算机系统中,CSR 是什么?

      flashcard

      运行在内核态的操作系统通过访问CSR寄存器控制计算机

    14. M-Mode(Machine Mode, Physical Machine Mode) 特权级模式:控制物理内存,直接关机 是Bootloader/BIOS运行的Machine Mode CPU执行模式 能执行M-Mode特权指令,能直接影响上述其他软件的执行

      [!NOTE] Machine Mode 对应的程序以及权限是什么?

      flashcard

      Bootloader/BIOS M-Mode 特权指令+能干涉所有其他软件+能操作硬件

    15. H-Mode(Hypervisor Mode, Virtual Machine Mode,虚拟机监控器) 特权级模式:限制OS访问的内存空间 是虚拟机监控器运行的Hypervisor Mode CPU执行模式,能执行H-Mode特权指令,能直接影响OS执行

      [!NOTE] Hypervisor Mode 对应的程序以及权限是什么?

      flashcard

      虚拟机监控器 H-Mode 特权指令+能干涉 OS

    16. S-Mode(Supervisor Mode, Kernel Mode,内核态,内核模式) 在内核态的操作系统具有足够强大的硬件控制能力 特权级模式(Privileged Mode):限制APP的执行与内存访问 是操作系统运行的内核态CPU执行模式 能执行内核态特权指令,能直接影响应用程序执行

      [!NOTE] Supervisor Mode 对应的程序以及权限是什么?

      flashcard

      操作系统 内核态特权指令+能干涉应用程序

    17. U-Mode (User Mode,用户模式、用户态) 非特权级模式(Unprivileged Mode):基本计算 是应用程序运行的用户态CPU执行模式 不能执行特权指令,不能直接影响其他应用程序执行

      [!NOTE] User Mode 对应的程序以及权限是什么?

      flashcard

      应用程序 无特权指令+不能干涉其他应用程序

    18. M Mode:小型设备(蓝牙耳机等) U+M Mode:嵌入式设备(电视遥控器、刷卡机等)

      [!QUESTION] 嵌入式设备通常有 U+M Mode,而耳机等小型设备只有 M Mode,为什么要有这种差别?

      flashcard

      防止嵌入式设备破坏主设备?

  2. Sep 2023
    1. 如果是正常退出,uCore 会自动关闭 QEMU;但如果 OS 跑飞了,我们不能通过 Ctrl + C 来退出。此时可以先按下 Ctrl+A,再按下 X 来退出 QEMU。

      [!NOTE] QEMU 如果陷入死循环,该如何退出?

      flashcard

      Ctrl+A -> X

    1. ld rd, offset(rs1) x[rd] = M[x[rs1] + sext(offset)][63:0]双字加载 (Load Doubleword). I-type, RV32I and RV64I.从地址 x[rs1] + sign-extend(offset)读取八个字节,写入 x[rd]。压缩形式:c.ldsp rd, offset; c.ld rd, offset(rs1)offset[11:0] rs1 011 rd 0000011

      [!NOTE] ld rd, offset(rs1) 的功能为?

      flashcard

      双字加载 (Load Doubleword) 从地址 x[rs1] + sign-extend(offset) 读取八个字节,写入 x[rd]

    2. mv rd, rs1 x[rd] = x[rs1]移动(Move). 伪指令(Pseudoinstruction), RV32I and RV64I.把寄存器 x[rs1]复制到 x[rd]中。实际被扩展为 addi rd, rs1, 0

      [!NOTE] RISC-V 中,mv rd, rs1 的功能为?

      flashcard

      把寄存器 x[rs1] 复制到 x[rd]中。 实际被扩展为 addi rd, rs1, 0

    3. 保存寄存器和临时寄存器为什么不是连续编号的?为了支持 RV32E——一个只有 16 个寄存器的嵌入式版本的 RISC-V(参见第 11 章),只使用寄存器 x0 到 x15——一部分保存寄存器和一部分临时寄存器都在这个范围内。

      [!NOTE] RISC-V 中,保存寄存器和临时寄存器为什么不是连续编号的?

      flashcard

      为了支持 RV32E——一个只有 16 个寄存器的嵌入式版本的 RISC-V(参见第 11 章),只使用寄存器 x0 到 x15——一部分保存寄存器和一部分临时寄存器都在这个范围内

    4. 保存返回地址(ra 寄存器)

      [!NOTE] RISC-V 调用中,保存/加载 ra 寄存器的值时,对于偏移量需要注意什么?

      flashcard

      起始(较低)地址为 sp + framesize - 4 (4B=32b)

    5. 根据 ABI 规范,我们来看看标准的 RV32I 函数入口和出口。下面是函数的开头:

      [!NOTE] RISC-V ABI 规定的标准 RV32I 函数入口和出口的汇编代码为?

      flashcard

      入口(prologue):addi sp, sp, -framesize;sw ra, framesize-4(sp)(必要时再保存其他寄存器)
      出口(epilogue):lw ra, framesize-4(sp);addi sp, sp, framesize;ret

    6. jalr rd, offset(rs1) t = pc+4; pc = (x[rs1]+sext(offset)) & ~1; x[rd] = t 跳转并寄存器链接 (Jump and Link Register). I-type, RV32I and RV64I.把 pc 设置为 x[rs1] + sign-extend(offset),把计算出的地址的最低有效位设为 0,并将原 pc+4 的值写入 x[rd]。rd 默认为 x1。压缩形式:c.jr rs1; c.jalr rs1 offset[11:0] rs1 000 rd 1100111

      [!NOTE] RISC-V 中,jalr rd, offset(rs1) 的功能为?

      flashcard

      跳转并寄存器链接 (Jump and Link Register) 1. 把 pc 设置为 x[rs1] + sign-extend(offset), 2. 把计算出的地址的最低有效位设为 0, 3. 并将原 pc+4 的值写入 x[rd](rd 默认为 x1)

    7. ret pc = x[1]返回(Return). 伪指令(Pseudoinstruction), RV32I and RV64I.从子过程返回。实际被扩展为 jalr x0, 0(x1)。

      [!NOTE] RISC-V 中,ret 实际被扩展为?

      flashcard

      jalr x0, 0(x1)

    8. bne rs1, rs2, offset if (rs1 ≠ rs2) pc += sext(offset)不相等时分支 (Branch if Not Equal). B-type, RV32I and RV64I.若寄存器 x[rs1]和寄存器 x[rs2]的值不相等,把 pc 的值设为当前值加上符号位扩展的偏移offset。压缩形式:c.bnez rs1, offsetoffset[12|10:5] rs2 rs1 001 offset[4:1|11] 1100011

      [!NOTE] RISC-V 中,bne rs1, rs2, offset 系列指令的功能为?

      flashcard

      不相等时分支 (Branch if Not Equal) 若寄存器 x[rs1] 和寄存器 x[rs2] 的值不相等,把 pc 的值设为当前值加上符号位扩展的偏移 offset

    9. 图 3.2 列出了寄存器的 RISC-V 应用程序二进制接口(ABI)名称和它们在函数调用中是否保留的规定。

      [!NOTE] RISC-V 中哪些寄存器需要在函数调用前后保持不变?

      flashcard

      sp、fp(s0)以及保存寄存器 s1–s11

    10. RISC-V 有足够多的寄存器来达到两全其美的结果:既能将操作数存放在寄存器中,同时也能减少保存和恢复寄存器的次数。其中的关键在于,在函数调用的过程中不保留部分寄存器存储的值,称它们为临时寄存器;另一些寄存器则对应地称为保存寄存器。不再调用其它函数的函数称为叶函数。当一个叶函数只有少量的参数和局部变量时,它们可以都被存储在寄存器中,而不会“溢出(spilling)”到内存中。但如果函数参数和局部变量很多,程序还是需要把寄存器的值保存在内存中,不过这种情况并不多见。

      [!NOTE] RISC-V 函数调用中,存储数据的寄存器分为哪两类?

      flashcard

      • 在函数调用的过程中不保留部分寄 存器存储的值,称它们为临时寄存器;
      • 另一些寄存器则对应地称为保存寄存器
    11. 第三章 RISC-V 汇编语言

      [!NOTE] RISC-V 汇编语言有什么优质学习材料?

      flashcard

      RISC-V Reader

    12. 图 2.4:RV32I 的寄存器

      [!NOTE] RV32I 有哪些寄存器?

      flashcard

      如图

    13. 为常量 0 单独分配一个寄存器是 RISC-V ISA 能如此简单的一个很大的因素。第 3 章的第 36 页的图 3 给出了许多 ARM-32 和 x86-32 的原生指令操作,这两个指令集中没有零寄存器。我们可以用 RV32I 指令完成功能相同的操作,只需使用零寄存器作为操作数。

      [!NOTE] RISC-V 寄存器分配有什么独特之处?

      flashcard

      0 寄存器

    14. sd rs2, offset(rs1) M[x[rs1] + sext(offset)] = x[rs2][63:0] 存双字(Store Doubleword). S-type, RV64I only.将 x[rs2]中的 8 字节存入内存地址 x[rs1]+sign-extend(offset)。压缩形式:c.sdsp rs2, offset; c.sd rs2, offset(rs1) offset[11:5] rs2 rs1 011 offset[4:0] 0100011

      [!NOTE] RISC-V 中,sd 指令有什么用?

      flashcard

      存储双字 1. 将 x[rs2] 中的 8 字节 2. 存入内存地址 x[rs1]+sign-extend(offset)

    1. s0/fp Saved register/frame pointer

      [!NOTE] RISC-V 中,要获取当前栈的栈底地址,可以使用?

      flashcard

      s0/fp

    2. In RV64, 32-bit types, such as int, are stored in integer registers as proper sign extensions of their32-bit values; that is, bits 63..31 are all equal. This restriction holds even for unsigned 32-bit types.

      [!NOTE] RV64 中,32位数据类型是如何存储的?

      flashcard

      存储在(64位)整数寄存器中,使用恰当的符号扩展
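
可以用一小段 Python 模拟这一约定(仅演示位运算,寄存器模型为示意):

```python
def store_int32_in_rv64_reg(value):
    """示意 RV64 的约定:32 位值放入 64 位整数寄存器时做符号扩展,
    即 bits 63..31 全部等于 bit 31(对 unsigned 32 位类型同样如此)。"""
    v = value & 0xFFFFFFFF
    if v & 0x80000000:          # bit 31 为 1,高位全部补 1
        v |= 0xFFFFFFFF00000000
    return v

assert store_int32_in_rv64_reg(5) == 5
assert store_int32_in_rv64_reg(-1) == 0xFFFFFFFFFFFFFFFF          # int -1
assert store_int32_in_rv64_reg(0x80000000) == 0xFFFFFFFF80000000  # unsigned 也符号扩展
```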

    3. Table 18.1 summarizes the datatypes natively supported by RISC-V C programs. In both RV32and RV64 C compilers, the C type int is 32 bits wide. longs and pointers, on the other hand, areboth as wide as a integer register, so in RV32, both are 32 bits wide, while in RV64, both are 64bits wide. Equivalently, RV32 employs an ILP32 integer model, while RV64 is LP64. In both RV32and RV64, the C type long long is a 64-bit integer, float is a 32-bit IEEE 754-2008 floating-pointnumber, double is a 64-bit IEEE 754-2008 floating-point number, and long double is a 128-bitIEEE floating-point number.The C types char and unsigned char are 8-bit unsigned integers and are zero-extended whenstored in a RISC-V integer register. unsigned short is a 16-bit unsigned integer and is zero-extended when stored in a RISC-V integer register. signed char is an 8-bit signed integer and issign-extended when stored in a RISC-V integer register, i.e. bits (XLEN-1)..7 are all equal. shortis a 16-bit signed integer and is sign-extended when stored in a register.

      [!NOTE] RISC-V 中 C 基本数据类型的 bit 长度分别为?

      flashcard

      • char/unsigned char/signed char:8 位;short/unsigned short:16 位;int:32 位;long long:64 位
      • float:32 位;double:64 位;long double:128 位
      • 注意:long 和指针类型长度均为整数寄存器的长度(RV32 为 ILP32,RV64 为 LP64)

  3. mit-public-courses-cn-translatio.gitbook.io mit-public-courses-cn-translatio.gitbook.io
    1. 如果对某一个Stack Frame感兴趣,可以先定位到那个frame再输入info frame,假设对syscall的Stack Frame感兴趣。

      [!NOTE] gdb 中,使用 backtrace 后,要定位到查询得到的第 i 个栈帧,可以使用?

      flashcard

      frame i

    2. 如果输入backtrace(简写bt)可以看到从当前调用栈开始的所有Stack Frame。

      [!NOTE] gdb 中,要查询从当前调用栈开始的所有Stack Frame 的简略信息,可以使用?

      flashcard

      backtrace / bt

    3. 如果我们在gdb中输入info frame,可以看到有关当前Stack Frame许多有用的信息。Stack level 0,表明这是调用栈的最底层pc,当前的程序计数器saved pc,demo4的位置,表明当前函数要返回的位置source language c,表明这是C代码Arglist at,表明参数的起始地址。当前的参数都在寄存器中,可以看到argc=3,argv是一个地址

      [!NOTE] gdb 中,要查询当前栈帧信息,可以使用?

      flashcard

      info frame

    4. 学生提问,为什在最开始要对sp寄存器减16?TA:是为了Stack Frame创建空间。减16相当于内存地址向前移16,这样对于我们自己的Stack Frame就有了空间,我们可以在那个空间存数据。我们并不想覆盖原来在Stack Pointer位置的数据。学生提问:为什么不减4呢?TA:我认为我们不需要减16那么多,但是4个也太少了,你至少需要减8,因为接下来要存的ra寄存器是64bit(8字节)。这里的习惯是用16字节,因为我们要存Return address和指向上一个Stack Frame的地址,只不过我们这里没有存指向上一个Stack Frame的地址。如果你看kernel.asm,你可以发现16个字节通常就是编译器的给的值。

      [!NOTE] RV64 中,栈帧通常至少开辟多大空间?

      flashcard

      16 字节 = 返回地址(8 字节)+ 指向上一个栈帧的指针(8 字节)

    5. ra寄存器是0x80006392,它指向demo2函数,也就是sum_then_double的调用函数

      [!NOTE] 进入被调用函数时,ra 寄存器储存的值为?

      flashcard

      调用命令后的位置,即接下来要返回的位置

    6. leaf函数是指不调用别的函数的函数,它的特别之处在于它不用担心保存自己的Return address或者任何其他的Caller Saved寄存器,因为它不会调用别的函数。

      [!NOTE] 不调用任何别的函数的函数可以称为?

      flashcard

      leaf 函数

    7. 在汇编代码中,函数的最开始你们可以看到Function prologue,之后是函数的本体,最后是Epilogue。这就是一个汇编函数通常的样子。

      [!NOTE] 汇编函数有哪些基本组成部分?

      flashcard

      1. function prologue (保存原状态)
      2. body
      3. epilogue (恢复原状态)
    8. SP(Stack Pointer),它指向Stack的底部并代表了当前Stack Frame的位置。

      [!NOTE] RISC-V 中,sp 指向哪里?

      flashcard

      当前栈帧的地址最低处

    9. FP(Frame Pointer),它指向当前Stack Frame的顶部

      [!NOTE] RISC-V 中,fp 指向什么?

      flashcard

      当前栈帧的地址最高处

    10. Return address总是会出现在Stack Frame的第一位

      [!NOTE] 栈帧的底部(地址最高)储存了什么数据?

      flashcard

      返回地址(也即上一个栈帧的底部?)

    11. Stack Frame大小并不总是一样,即使在这个图里面看起来是一样大的。不同的函数有不同数量的本地变量,不同的寄存器,所以Stack Frame的大小是不一样的。

      [!NOTE] Stack Frame 的大小是固定的吗?

      flashcard

      不是

    12. 如果函数的参数多于8个,额外的参数会出现在Stack中。

      [!NOTE] RISC-V 中,什么条件下参数才会被保存到栈中?

      flashcard

      参数多于 8 个

    1. For the C language, the asm keyword is a GNU extension. When writing C code that can be compiled with -ansi and the -std options that select C dialects without GNU extensions, use __asm__ instead of asm (see Alternate Keywords). For the C++ language, asm is a standard keyword, but __asm__ can be used for code compiled with -fno-asm.

      [!NOTE] C 内联汇编的关键词是什么?

      flashcard

      • asm 是 GNU 对 C 的扩展,使用 -ansi 或不含 GNU 扩展的 -std 方言时不可用
      • __asm__ 更为通用(C++ 中 asm 是标准关键字;配合 -fno-asm 时可用 __asm__)
    2. an extended asm statement (see Extended Asm - Assembler Instructions with C Expression Operands) includes one or more operands. The extended form is preferred for mixing C and assembly language within a function, but to include assembly language at top level you must use basic asm.

      [!NOTE] C 的扩展汇编一般用于什么位置?

      flashcard

      在函数内混合使用 C 和汇编

    1. QEMU有两种运行模式: User mode 模式,即用户态模拟,如 qemu-riscv64 程序, 能够模拟不同处理器的用户态指令的执行,并可以直接解析ELF可执行文件, 加载运行那些为不同处理器编译的用户级Linux应用程序。 System mode 模式,即系统态模式,如 qemu-system-riscv64 程序, 能够模拟一个完整的基于不同CPU的硬件系统,包括处理器、内存及其他外部设备,支持运行完整的操作系统。

      [!NOTE] QEMU 有哪些运行模式?

      flashcard

      User mode(如 qemu-riscv64,用户态模拟)和 System mode(如 qemu-system-riscv64,全系统模拟)

    1. #[panic_handler] 告知编译器采用我们的实现: // os/src/lang_items.rs use core::panic::PanicInfo; #[panic_handler] fn panic(_info: &PanicInfo) -> ! { loop {} }

      [!NOTE] Rust 中,如何告知编译器为 handler 采用特定实现?

      flashcard

      形如 #[panic_handler]

    2. 在 main.rs 的开头加上一行 #![no_std], 告诉 Rust 编译器不使用 Rust 标准库 std 转而使用核心库 core。

      [!NOTE] Rust 中,如何禁用标准库 std

      flashcard

      main.rs 的开头加上一行 #![no_std]

    3. 这种编译器运行的平台(x86_64)与可执行文件运行的目标平台不同的情况,称为 交叉编译 (Cross Compile)。

      [!NOTE] 交叉编译的含义是?

      flashcard

      编译器运行的平台(例如 x86_64)与可执行文件运行的目标平台(例如 rCore)不同

    1. 除了 std 之外,Rust 还有一个不需要任何操作系统支持的核心库 core, 它包含了 Rust 语言相当一部分核心机制,可以满足本门课程的需求。 有很多第三方库也不依赖标准库 std,而仅仅依赖核心库 core。

      [!NOTE] Rust 最核心的库是?

      flashcard

      core 而非 std

    2. riscv64gc-unknown-none-elf 的 CPU 架构是 riscv64gc,厂商是 unknown,操作系统是 none, elf 表示没有标准的运行时库。没有任何系统调用的封装支持,但可以生成 ELF 格式的执行程序。 我们不选择有 linux-gnu 支持的 riscv64gc-unknown-linux-gnu,是因为我们的目标是开发操作系统内核,而非在 linux 系统上运行的应用程序。

      [!NOTE] riscv64gc-unknown-none-elf 表示什么?

      flashcard

      CPU 架构是 riscv64gc,厂商是 unknown,操作系统是 none, elf 表示没有标准的运行时库

    3. 目标三元组 (Target Triplet) 描述了目标平台的 CPU 指令集、操作系统类型和标准运行时库。

      [!NOTE] 目标三元组 (Target Triplet) 是什么?

      flashcard

    1. -w, --nodelist={<node_name_list>|<filename>}: Request a specific list of hosts.

      [!NOTE] srun 中,如何指定要使用的 node?

      flashcard

      -w, --nodelist={<node_name_list>|<filename>}

    2. -N, --nodes=<minnodes>[-maxnodes]|<size_string>: Request that a minimum of minnodes nodes be allocated to this job.

      [!NOTE] srun--nodes 的语义为?

      flashcard

      指定需要的节点数的范围

    1. GresUsed: Generic resources (gres) currently in use on the nodes.

      [!NOTE] SLURM 如何查看显存占用情况?

      flashcard

      sinfo 的 -O/--Format 中的 GresUsed 字段(显示节点上正在使用的通用资源,如 GPU)

    2. -t, --states=<states>: List nodes only having the given state(s). Multiple states may be comma separated and the comparison is case insensitive. If the states are separated by '&', then the nodes must be in all states. Possible values include (case insensitive): ALLOC, ALLOCATED, CLOUD, COMP, COMPLETING, DOWN, DRAIN (for node in DRAINING or DRAINED states), DRAINED, DRAINING, FAIL, FUTURE, FUTR, IDLE, MAINT, MIX, MIXED, NO_RESPOND, NPC, PERFCTRS, PLANNED, POWER_DOWN, POWERING_DOWN, POWERED_DOWN, POWERING_UP, REBOOT_ISSUED, REBOOT_REQUESTED, RESV, RESERVED, UNK, and UNKNOWN. By default nodes in the specified state are reported whether they are responding or not. The --dead and --responding options may be used to filter nodes by the corresponding flag.

      [!NOTE] sinfo 中,要过滤节点状态,可以使用?

      flashcard

      -t, --states=<states>

    3. --json: Dump all node information as JSON. All information is dumped, even if it would normally not be. Formatting options are ignored; however, filtering arguments are still used.

      [!NOTE] sinfo 如何以 JSON 形式导出所有信息?

      flashcard

      sinfo --json:默认输出所有字段(即使通常不显示),格式化选项被忽略,但过滤参数仍然生效;如需存入文件可重定向输出

    4. The format of each field is "%[[.]size]type[suffix]". size: minimum field size; if no size is specified, whatever is needed to print the information will be used. ".": indicates the output should be right justified; size must be specified (by default output is left justified). suffix: arbitrary string to append to the end of the field.

      [!NOTE] sinfo 指定格式时,每个字段的设置格式为?

      flashcard

      "%[[.]size]type[suffix]"

    5. -O, --Format=<output_format>: Specify the information to be displayed. Also see the -o <output_format>, --format=<output_format> option (which supports greater flexibility in formatting, but does not support access to all fields because we ran out of letters)

      [!NOTE] sinfo 中,大写和小写的 -O, --Format 有什么差别?

      flashcard

      小写 -o 的格式控制更灵活,但不能访问全部字段(因为字母不够用);大写 -O 可以访问全部字段

    1. 之所以叫做二进制接口,是因为它和在同一种编程语言内部调用接口不同,是汇编指令级的一种接口。事实上 M/S/U 三个特权级的软件可能分别由不同的编程语言实现,即使是用同一种编程语言实现的,其调用也并不是普通的函数调用执行流,而是**陷入异常控制流** ,在该过程中会 切换 CPU 特权级。因此只有将接口下降到汇编指令级才能够满足其通用性和灵活性。

      [!NOTE] ABI/SBI 为什么叫二进制接口 (BI)?为什么要如此设置?

      flashcard

      因为是汇编指令级的接口 其调用并不是普通的函数调用执行流,而是陷入异常控制流 ,在该过程中会 切换 CPU 特权级

    2. 之前我们给出过支持应用程序运行的一套 执行环境栈 ,现在我们站在特权级架构的角度去重新看待它: 和之前一样,白色块表示一层执行环境,黑色块表示相邻两层执行环境之间的接口。这张图片给出了能够支持运行 Unix 这类复杂系统的软件栈。其中 内核代码运行在 S 模式上;应用程序运行在 U 模式上。运行在 M 模式上的软件被称为 监督模式执行环境 (SEE, Supervisor Execution Environment) ,这是站在运行在 S 模式上的软件的视角来看,它的下面也需要一层执行环境支撑,因此被命名为 SEE,它需要在相比 S 模式更高的特权级下运行, 一般情况下在 M 模式上运行。

      [!NOTE] 如何从特权级结构的视角组织 SBI/ABI 等执行环境层级之间的关系?

      flashcard

      U-ABI-S-SBI-M

    3. RISC-V 架构中一共定义了 4 种特权级: RISC-V 特权级¶ 级别 编码 名称 0 00 用户/应用模式 (U, User/Application) 1 01 监督模式 (S, Supervisor) 2 10 H, Hypervisor 3 11 机器模式 (M, Machine) 其中,级别的数值越大,特权级越高,掌控硬件的能力越强。从表中可以看出, M 模式处在最高的特权级,而 U 模式处于最低的特权级。

      [!NOTE] RISC-V 架构定义了哪些特权级别?

      flashcard

      U<S<H<M

    1. 如果测例仓库有所更新或者你切换了代码仓库的分支,你可能需要清理掉测例仓库原版的编译结果,此时需要执行 $ make -C user clean 它的作用基本等价于如下写法,但是更简便 $ cd user $ make clean $ cd ..

      [!NOTE] make 中,要临时切换到特定目录执行操作,可以使用?

      flashcard

      make -C /path/to/source/directory "-C" 是一个选项,用于指定Make执行时的工作目录(或称为切换目录)。

    1. 每次在make run之前,尽量先执行make clean以删除缓存,特别是在切换ch分支之后。

      [!NOTE] make run 之前有什么好习惯?

      flashcard

      make clean 清理缓存

    1. --me: Equivalent to --user=<my username>.

      [!NOTE] squeue 中如何方便地查看当前用户的队列?

      flashcard

      --me

    1. The batch script may contain options preceded with "#SBATCH" before any executable commands in the script. sbatch will stop processing further #SBATCH directives once the first non-comment non-whitespace line has been reached in the script.

      [!NOTE] sbatch#SBATCH ... 的位置有什么要求?

      flashcard

      在任何可执行命令之前

    2. --export={[ALL,]<environment_variables>|ALL|NIL|NONE}: Identify which environment variables from the submission environment are propagated to the launched application. Note that SLURM_* variables are always propagated. --export=ALL is the default mode if --export is not specified: all of the user's environment will be loaded (either from the caller's environment or from a clean environment if --get-user-env is specified).

      [!NOTE] sbatch 中,如何将提交时的环境变量传播到启动的 SLURM 进程中?默认行为是?

      flashcard

      --export=...,有多种可选值。默认行为:
      • SLURM_* 变量总是会被传播
      • 未指定 --export 时默认等价于 --export=ALL,即加载用户的全部环境变量

    3. filename pattern sbatch allows for a filename pattern to contain one or more replacement symbols, which are a percent sign "%" followed by a letter (e.g. %j).

      [!NOTE] SLURM 中,输出文件名里以 % 开头的占位符机制称为?

      flashcard

      filename pattern:由 % 加一个字母构成的 replacement symbols(如 %j)

    4. Use the "B:" option to signal only the batch shell, none of the other processes will be signaled. By default all job steps will be signaled, but not the batch shell itself.

      [!NOTE] SLURM 的 --signal=... 中,选项 B: 表示什么?默认如何处理?

      flashcard

      使用 B: 表示仅对 batch shell 发送信号 默认对所有 job step 发送信号

    5. Use the "R:" option to allow this job to overlap with a reservation with MaxStartDelay set.

      [!NOTE] SLURM 的 --signal=... 中,选项 R: 表示什么?

      flashcard

      允许该 job 与“保留资源并延迟开始”重叠

    6. Due to the resolution of event handling by Slurm, the signal may be sent up to 60 seconds earlier than specified.

      [!NOTE] SLURM 中,要对时间进行精细操作,需要注意什么?

      flashcard

      SLURM 事件处理的分辨率有限:信号可能比指定时间最多提前 60 秒发出

    7. --signal=[{R|B}:]<sig_num>[@sig_time]: When a job is within sig_time seconds of its end time, send it the signal sig_num.

      [!NOTE] SLURM 中,--signal=<sig_num>[@sig_time] 表示什么?

      flashcard

      When a job is within sig_time seconds of its end time, send it the signal sig_num

    8. --mem=<size>[units]: Specify the real memory required per node.

      [!NOTE] SLURM 中,--mem 选项用于指定?

      flashcard

      每个节点可以使用的内存上限

    9. use an enforcement mechanism. See ConstrainRAMSpace in the cgroup.conf(5) man page and OverMemoryKill in the slurm.conf(5) man page for more details.

      [!NOTE] SLURM 可以对内存采取哪些限制措施?

      flashcard

      • 限制 RAM 空间
      • kill 内存占用过多的进程
    1. Bootloader初始化外设,在存储设备上找到OS,加载执行OS

      [!NOTE] Bootloader 有什么主要功能?

      flashcard

      1. 初始化外设,
      2. 在存储设备上找到 OS,加载执行OS
    1. The weighted amount of a resource can be adjusted by adding a suffix of K,M,G,T or P after the billing weight. For example, a memory weight of "mem=.25" on a job allocated 8GB will be billed 2048 (8192MB *.25) units. A memory weight of "mem=.25G" on the same job will be billed 2 (8192MB * (.25/1024)) units.

      [!NOTE] TRESBillingWeights=...,mem=0.25G.... 表示什么?

      flashcard

      每使用 1GB 内存对计费的影响是 0.25 个计费单位

    1. You can launch projects from a repository on GitHub.com to your server by using a deploy key, which is an SSH key that grants access to a single repository.

      [!NOTE] GitHub 要给单个仓库添加 SSH key,可以使用?

      flashcard

      Deploy key

    1. The general form for this is “git-credential-foo [args] <action>.” The stdin/stdout protocol is the same as git-credential, but they use a slightly different set of actions: get is a request for a username/password pair. store is a request to save a set of credentials in this helper’s memory. erase purge the credentials for the given properties from this helper’s memory. For the store and erase actions, no response is required (Git ignores it anyway). For the get action, however, Git is very interested in what the helper has to say. If the helper doesn’t know anything useful, it can simply exit with no output, but if it does know, it should augment the provided information with the information it has stored. The output is treated like a series of assignment statements; anything provided will replace what Git already knows.

      [!NOTE] git-credential-foo [args] <action> 如何使用?

      flashcard

      get / store / erase

    2. The credential system is actually invoking a program that’s separate from Git itself; which one and how depends on the credential.helper configuration value. There are several forms it can take: Configuration Value Behavior foo Runs git-credential-foo foo -a --opt=bcd Runs git-credential-foo -a --opt=bcd /absolute/path/foo -xyz Runs /absolute/path/foo -xyz !f() { echo "password=s3cre7"; }; f Code after ! evaluated in shell So the helpers described above are actually named git-credential-cache, git-credential-store, and so on, and we can configure them to take command-line arguments.

      [!NOTE] Git 中,credential.helper <val> 的行为具体是怎么样的?

      flashcard

      • foo:运行 git-credential-foo
      • foo -a --opt=bcd:运行 git-credential-foo -a --opt=bcd
      • /absolute/path/foo -xyz:运行该绝对路径的程序并附带参数
      • !f() { ...; }; f:! 之后的代码在 shell 中求值
    3. The osxkeychain and wincred helpers use the native format of their backing stores, while cache uses its own in-memory format (which no other process can read).

      [!NOTE] Git 中,credential.helper cache 将 key 储存在哪里?

      flashcard

      内存中,故其他进程无法访问

    4. Git even allows you to configure several helpers. When looking for credentials for a particular host, Git will query them in order, and stop after the first answer is provided. When saving credentials, Git will send the username and password to all of the helpers in the list, and they can choose what to do with them.

      [!NOTE] Git 若设置了多个 credential.helper 会有什么行为?

      flashcard

      • 查找:按顺序依次询问各 helper,得到第一个回答即停止
      • 保存:将用户名和密码发给列表中的所有 helper,由各自决定如何处理
    5. The “store” helper can take a --file <path> argument, which customizes where the plain-text file is saved (the default is ~/.git-credentials).

      [!NOTE] Git credential.helper store 如何指定存储的文件?默认值为?

      flashcard

      git config --global credential.helper 'store --file ~/.my-credentials' 默认值为 ~/.git-credentials

    6. Git has a few options provided in the box: The default is not to cache at all. Every connection will prompt you for your username and password. The “cache” mode keeps credentials in memory for a certain period of time. None of the passwords are ever stored on disk, and they are purged from the cache after 15 minutes. The “store” mode saves the credentials to a plain-text file on disk, and they never expire. This means that until you change your password for the Git host, you won’t ever have to type in your credentials again. The downside of this approach is that your passwords are stored in cleartext in a plain file in your home directory.

      [!NOTE] Git 提供了哪几种 credential 处理方式?分别有什么功能?

      flashcard

      • none:不保存
      • "cache": 保留 15min
      • "store":保存在文件中
    1. If you authenticate without GitHub CLI, you must authenticate with a personal access token. When Git prompts you for your password, enter your personal access token. Alternatively, you can use a credential helper like Git Credential Manager. Password-based authentication for Git has been removed in favor of more secure authentication methods.

      [!NOTE] GitHub 能否使用密码验证身份?

      flashcard

      已经被移除了

    1. If someone runs your script and you have to set anonymous="allow":Auto-create temporary account: W&B checks for an account that's already signed in. If there's no account, we automatically create a new anonymous account and save that API key for the session.Log results quickly: The user can run and re-run the script, and automatically see results show up in the W&B dashboard UI. These unclaimed anonymous runs will be available for 7 days.Claim data when it's useful: Once the user finds valuable results in W&B, they can easily click a button in the banner at the top of the page to save their run data to a real account. If they don't claim a run, it will be deleted after 7 days.

      [!NOTE] WandB 中,匿名模式提供哪些功能?

      flashcard

      1. 自动创建临时匿名账户
      2. 所有记录默认保存 7 天
      3. 其他用户可以一键收集产生的记录
    2. What happens to users with existing accounts?​If you set anonymous="allow" in your script, we will check to make sure there's not an existing account first, before creating an anonymous account. This means that if a W&B user finds your script and runs it, their results will be logged correctly to their account, just like a normal run.

      [!NOTE] WandB 中,anonymous="allow" 在已经有账户登录时的行为是?

      flashcard

      使用已登录的账户 而不会创建匿名账户

    1. There is currently no option to list multiple api keys in a single netrc file. If you were to do this manually, wandb will default to the last listed api key to authenticate the user.

      [!NOTE] WandB 如果在 ~/.netrc 中列出多个 API Keys,会发生什么?

      flashcard

      WandB 默认会使用最后一个

    1. The WandbCallback that Trainer uses will call wandb.init under the hood when Trainer is initialized. You can alternatively set up your runs manually by calling wandb.init before theTrainer is initialized. This gives you full control over your W&B run configuration.

      [!NOTE] 🤗 Trainer 中,如何自定义 wandb.init()

      flashcard

      在 Trainer 初始化之前手动调用 wandb.init(),即可完全控制 W&B run 配置;否则 Trainer 初始化时 WandbCallback 会自动调用 wandb.init()

    1. Multiple wandb users on shared machines​If you're using a shared machine and another person is a wandb user, it's easy to make sure your runs are always logged to the proper account. Set the WANDB_API_KEY environment variable to authenticate. If you source it in your env, when you log in you'll have the right credentials, or you can set the environment variable from your script.Run this command export WANDB_API_KEY=X where X is your API key. When you're logged in, you can find your API key at wandb.ai/authorize.

      [!NOTE] WandB 中,如何实现进程组级的登录?

      flashcard

      WANDB_API_KEY

    2. WANDB_TAGSA comma separated list of tags to be applied to the run.

      [!NOTE] WandB 如何通过环境变量控制 tags?

      flashcard

      WANDB_TAGS

    3. WANDB_RESUMEBy default this is set to never. If set to auto wandb will automatically resume failed runs. If set to must forces the run to exist on startup. If you want to always generate your own unique ids, set this to allow and always set WANDB_RUN_ID.

      [!NOTE] WandB 如何使用环境变量设置 resume?

      flashcard

      1. WANDB_RESUME='must'
      2. WANDB_RUN_ID='<run_id>'
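
一个最小示意(run id "abc123" 为虚构;实际使用还需安装并登录 wandb):

```python
import os

# 示意:通过环境变量指定要恢复的 W&B run,在 wandb.init() 之前设置
os.environ["WANDB_RESUME"] = "must"    # 启动时要求该 run 必须已存在
os.environ["WANDB_RUN_ID"] = "abc123"  # 要恢复的 run id(虚构)
# import wandb; wandb.init(project="demo")  # 此时会尝试恢复 run "abc123"
```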
    1. You can also turn on Source Mode globally, instead of on a document-by-document basis. This is helpful if you need to copy/paste a lot of data, or want to see what the data in your files looks like “behind the scenes”. To do this, open settings in Obsidian, and look under “Editor”. You should see an option that says “Properties in Document”. Change that to “Source”:

      [!NOTE] Obsidian 中,如何关闭 properties?

      flashcard

      open settings in Obsidian, and look under “Editor”. You should see an option that says “Properties in Document”. Change that to “Source”:

    1. # Set the priority when loading # e.g., zsh-syntax-highlighting must be loaded # after executing compinit command and sourcing other plugins # (If the defer tag is given 2 or above, run after compinit command) zplug "zsh-users/zsh-syntax-highlighting", defer:2

      [!NOTE] zplug load 中,要设置加载优先级,可以使用?

      flashcard

      defer tag - If the defer tag is given 2 or above, run after compinit command

    1. 在输出中野蛮地强制插入约束短语 is fast 的问题在于,大多数情况下,你最终会得到像上面的 The is fast 这样的无意义输出。我们需要解决这个问题。你可以从 huggingface/transformers 代码库中的这个 问题 中了解更多有关这个问题及其复杂性的深入讨论。 组方法通过在满足约束和产生合理输出两者之间取得平衡来解决这个问题。 我们把所有候选波束按照其 满足了多少步约束分到不同的组中,其中组 $n$ 里包含的是 满足了 $n$ 步约束的波束列表 。然后我们按照顺序轮流选择各组的候选波束。在上图中,我们先从组 2 (Bank 2) 中选择概率最大的输出,然后从组 1 (Bank 1) 中选择概率最大的输出,最后从组 0 (Bank 0) 中选择最大的输出; 接着我们从组 2 (Bank 2) 中选择概率次大的输出,从组 1 (Bank 1) 中选择概率次大的输出,依此类推。因为我们使用的是 num_beams=3,所以我们只需执行上述过程三次,就可以得到 ["The is fast", "The dog is", "The dog and"]。 这样,即使我们 强制 模型考虑我们手动添加的约束词分支,我们依然会跟踪其他可能更有意义的高概率序列。尽管 The is fast 完全满足约束,但这并不是一个有意义的短语。幸运的是,我们有 "The dog is" 和 "The dog and" 可以在未来的步骤中使用,希望在将来这会产生更有意义的输出。 图示如下 (以上例的第 3 步为例): 请注意,上图中不需要强制添加 "The is fast",因为它已经被包含在概率排序中了。另外,请注意像 "The dog is slow" 或 "The dog is mad" 这样的波束实际上是属于组 0 (Bank 0) 的,为什么呢?因为尽管它包含词 "is" ,但它不可用于生成 "is fast" ,因为 fast 的位子已经被 slow 或 mad 占掉了,也就杜绝了后续能生成 "is fast" 的可能性。从另一个角度讲,因为 slow 这样的词的加入,该分支 满足约束的进度 被重置成了 0。

      [!NOTE] 包含特定短语的约束波束搜索一般是如何实现的?

      flashcard

      bank method - 在每一步的候选 beam 中,注入特定短语中的一个 token - 分为 n 个 bank,其中 bank i 表示该组内的 beam 包含了目标短语的 i 个 token - 在每个 bank 里选概率最大的 beam

    1. repetition_penalty 可用于对生成重复的单词这一行为进行惩罚。它首先由 Keskar 等人 (2019) 引入,在 Welleck 等人 (2019) 的工作中,它是训练目标的一部分。它可以非常有效地防止重复,但似乎对模型和用户场景非常敏感,其中一个例子见 Github 上的 讨论。

      [!NOTE] repetition_penalty 有什么缺点?

      flashcard

      似乎对模型和用户场景非常敏感,其中一个例子见 Github 上的 讨论。

    2. 生成的文本看起来不错 - 但仔细观察会发现它不是很连贯。3-grams new hand sense 和 local batte harness 非常奇怪,看起来不像是人写的。这就是对单词序列进行采样时的大问题: 模型通常会产生不连贯的乱码,参见 Ari Holtzman 等人 (2019) 的论文。

      [!NOTE] 单纯采样的生成策略有什么常见问题?

      flashcard

      连贯性不好

    3. 在机器翻译或摘要等任务中,因为所需生成的长度或多或少都是可预测的,所以波束搜索效果比较好 - 参见 Murray 等人 (2018) 和 Yang 等人 (2018) 的工作。但开放域文本生成情况有所不同,其输出文本长度可能会有很大差异,如对话和故事生成的输出文本长度就有很大不同。

      [!NOTE] 生成文本长度与 beam search 有什么关系?

      flashcard

      搜索宽度 num_beams 越大,beam search 越难结束,生成序列平均越长 这不适合开放域文本生成,可能是因为开放域文本生成容易出现(在合理长度内)无法停止的倾向

    4. 正如 Ari Holtzman 等人 (2019) 所论证的那样,高质量的人类语言并不遵循最大概率法则。换句话说,作为人类,我们希望生成的文本能让我们感到惊喜,而可预测的文本使人感觉无聊。论文作者画了一个概率图,很好地展示了这一点,从图中可以看出人类文本带来的惊喜度比波束搜索好不少。

      [!NOTE] 序列概率是否是生成质量的良好指标?

      flashcard

      否:高质量的人类语言并不遵循最大概率法则,人类文本的惊喜度明显高于波束搜索的输出 详见 Ari Holtzman 等人 (2019)

    5. 波束搜索通过在每个时间步保留最可能的 num_beams 个词,并从中最终选择出概率最高的序列来降低丢失潜在的高概率序列的风险。以 num_beams=2 为例: 在时间步 1,除了最有可能的假设 $(\text{“The”}, \text{“nice”})$,波束搜索还跟踪第二可能的假设 $(\text{“The”}, \text{“dog”})$。在时间步 2,波束搜索发现序列 $(\text{“The”}, \text{“dog”}, \text{“has”})$ 概率为$0.36$,比 $(\text{“The”}, \text{“nice”}, \text{“woman”})$ 的 $0.2$ 更高。太棒了,在我们的例子中它已经找到了最有可能的序列!

      [!NOTE] beam search 的基本实现为?

      flashcard

      • 每个时间步,在全局保留最可能的 num_beams 个词,因此时刻维护 num_beams 个 beam
      • 选择词/beam 的原则:序列总概率最高
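
下面是一个玩具版 beam search 示意(非任何库的真实实现),概率表沿用文中 num_beams=2 的例子:

```python
import math

def beam_search(next_probs, start, num_beams, steps):
    """next_probs(prefix) -> {token: P(token | prefix)}。"""
    beams = [((start,), 0.0)]  # (序列, 对数概率)
    for _ in range(steps):
        candidates = []
        for seq, logp in beams:
            for tok, p in next_probs(seq).items():
                candidates.append((seq + (tok,), logp + math.log(p)))
        # 全局保留概率最高的 num_beams 个序列
        beams = sorted(candidates, key=lambda x: -x[1])[:num_beams]
    return beams[0][0]

table = {
    ("The",): {"nice": 0.5, "dog": 0.4, "car": 0.1},
    ("The", "nice"): {"woman": 0.4, "house": 0.3, "guy": 0.3},
    ("The", "dog"): {"has": 0.9, "runs": 0.05, "and": 0.05},
}
print(beam_search(lambda s: table[s], "The", num_beams=2, steps=2))
# → ('The', 'dog', 'has')  因为 0.4*0.9 = 0.36 > 0.5*0.4 = 0.2
```
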
    6. 虽然结果比贪心搜索更流畅,但输出中仍然包含重复。一个简单的补救措施是引入 n-grams (即连续 n 个词的词序列) 惩罚,该方法是由 Paulus 等人 (2017) 和 Klein 等人 (2017) 引入的。最常见的 n-grams 惩罚是确保每个 n-gram 都只出现一次,方法是如果看到当前候选词与其上文所组成的 n-gram 已经出现过了,就将该候选词的概率设置为 0。

      [!NOTE] n-grams 惩罚如何缓解模型的重复输出问题?

      flashcard

      将重复出现的 n-gram 概率中的候选词概率置为 0
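
一个 n-gram 惩罚的玩具示意(假设性接口,非 transformers 内部实现):若候选词与上文组成的 n-gram 已经出现过,则将其概率置 0:

```python
def apply_ngram_penalty(prefix, probs, n=2):
    """prefix: 已生成的 token 列表;probs: {token: prob}。"""
    seen = {tuple(prefix[i:i + n]) for i in range(len(prefix) - n + 1)}
    out = dict(probs)
    for token in probs:
        candidate = tuple(prefix[-(n - 1):] + [token]) if n > 1 else (token,)
        if candidate in seen:          # 该 n-gram 已出现过
            out[token] = 0.0
    return out

prefix = ["the", "dog", "barks", "at", "the"]
probs = {"dog": 0.5, "cat": 0.3, "mailman": 0.2}
print(apply_ngram_penalty(prefix, probs, n=2))
# → {'dog': 0.0, 'cat': 0.3, 'mailman': 0.2},bigram ("the", "dog") 已出现
```
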

    7. 模型很快开始输出重复的文本!这在语言生成中是一个非常普遍的问题,在贪心搜索和波束搜索中似乎更是如此 - 详见 Vijayakumar 等人,2016 和 Shao 等人,2017 的论文。

      [!NOTE] 贪心/波束搜索有什么常见问题?

      flashcard

      容易生成重复文本

    1. The only caveat seems to be that train loss doesn’t exist at this point so the wandb plots are offset.

```python
class EvaluateFirstStepCallback(TrainerCallback):
    def on_step_begin(self, args, state, control, **kwargs):
        if state.global_step == 1:
            control.should_evaluate = True

trainer.add_callback(EvaluateFirstStepCallback())
```

      [!NOTE] 🤗 Trainer 中如何安全地实现在训练前 evaluate?

      flashcard

      官方工程师:直接调用 trainer.evaluate(),实测会引起 DeepSpeed loss 无法正确计算? 实测使用 callback 可以实现,但似乎存在 offset 问题?

  4. huggingface.co
    1. auto_find_batch_size (bool, optional, defaults to False) — Whether to find a batch size that will fit into memory automatically through exponential decay, avoiding CUDA Out-of-Memory errors. Requires accelerate to be installed (pip install accelerate)

      [!NOTE] 🤗 Trainer 中,要自动寻找合适的 batch size,可以使用?

      flashcard

      auto_find_batch_size = True

    2. full_determinism (bool, optional, defaults to False) — If True, enable_full_determinism() is called instead of set_seed() to ensure reproducible results in distributed training. Important: this will negatively impact the performance, so only use it for debugging.

      [!NOTE] 🤗 Trainer 中,要实现完全确定的结果,可以使用?

      flashcard

      full_determinism = True 参数

    3. dataloader_num_workers (int, optional, defaults to 0) — Number of subprocesses to use for data loading (PyTorch only). 0 means that the data will be loaded in the main process.

      [!NOTE] 🤗 Trainer 中,dataloader_num_workers 的具体含义为?

      flashcard

      使用多少个子进程来加载数据

    4. debug (str or list of DebugOption, optional, defaults to "") — Enable one or more debug features. This is an experimental feature. Possible options are: "underflow_overflow": detects overflow in model’s input/outputs and reports the last frames that led to the event "tpu_metrics_debug": print debug metrics on TPU The options should be separated by whitespaces.

      [!NOTE] 🤗 TrainingArguments 中,要使用调试功能,需要注意什么?

      flashcard

      TrainingArguments 内置 debug 参数,取值为以空格分隔的选项字符串("underflow_overflow"、"tpu_metrics_debug");若要添加自定义调试参数,需使用其他名字以避免冲突

    5. seed (int, optional, defaults to 42) — Random seed that will be set at the beginning of training. To ensure reproducibility across runs, use the ~Trainer.model_init function to instantiate the model if it has some randomly initialized parameters.

      [!NOTE] 🤗 Trainer 中的 seed 具体用在哪里?

      flashcard

      模型初始化 若模型没有随机参数,则没有作用

    6. seed (int, optional, defaults to 42) — Random seed that will be set at the beginning of training. To ensure reproducibility across runs, use the ~Trainer.model_init function to instantiate the model if it has some randomly initialized parameters. data_seed (int, optional) — Random seed to be used with data samplers. If not set, random generators for data sampling will use the same seed as seed. This can be used to ensure reproducibility of data sampling, independent of the model seed.

      [!NOTE] 🤗 Trainer 中,如何设置随机种子?

      flashcard

      1. seed
      2. data_seed
    7. logging_first_step (bool, optional, defaults to False) — Whether to log and evaluate the first global_step or not.

      [!NOTE] 🤗 Trainer 中,如何 evaluate 并 log 第一步?

      flashcard

      设置训练参数 logging_first_step=True

    8. By default Trainer will use logging.INFO for the main process and logging.WARNING for the replicas if any. These defaults can be overridden to use any of the 5 logging levels with TrainingArguments’s arguments: log_level - for the main process log_level_replica - for the replicas

      [!NOTE] 🤗 Trainer 的 logging level 默认为?如何设置?

      flashcard

      默认:主进程为 logging.INFO,副本进程为 logging.WARNING 可通过 TrainingArguments 的 log_level(主进程)与 log_level_replica(副本)设置

    9. By default, Trainer will save all checkpoints in the output_dir you set in the TrainingArguments you are using. Those will go in subfolder named checkpoint-xxx with xxx being the step at which the training was at.

      [!NOTE] 🤗 Trainer 默认将 ckpt 储存在?

      flashcard

      output_dir/checkpoint-<step>

    10. sharded_ddp (bool, str or list of ShardedDDPOption, optional, defaults to '') — Use Sharded DDP training from FairScale (in distributed training only). This is an experimental feature.

      [!NOTE] 🤗 Trainer 中,要使用 Sharded DDP,可以使用?

      flashcard

      训练参数 sharded_ddp=...

    11. compute_metrics (Callable[[EvalPrediction], Dict], optional) — The function that will be used to compute metrics at evaluation. Must take a EvalPrediction and return a dictionary string to metric values.

      [!NOTE] 🤗 Trainer compute_metrics 的参数与返回值要求为?

      flashcard

      • 参数:EvalPrediction,其成员均为 NumPy 数组
      • 返回值:映射字符串到 metric 的字典
    12. preprocess_logits_for_metrics (Callable[[torch.Tensor, torch.Tensor], torch.Tensor], optional) — A function that preprocess the logits right before caching them at each evaluation step. Must take two tensors, the logits and the labels, and return the logits once processed as desired. The modifications made by this function will be reflected in the predictions received by compute_metrics.

      [!NOTE] 🤗 Trainer 中,要自定义 compute_metrics 前如何预处理 logits,可以使用?

      flashcard

      设置参数 preprocess_logits_for_metrics 该函数的参数为 (logits, labels) 返回值为 compute_metrics 使用的 predictions

    13. remove_unused_columns (bool, optional, defaults to True) — Whether or not to automatically remove the columns unused by the model forward method. (Note that this behavior is not implemented for TFTrainer yet.)

      [!NOTE] 🤗 Trainer 默认会如何处理模型 forward() 中用不到的列?

      flashcard

      remove_unused_columns: bool = True

    1. ($@) Expands to the positional parameters, starting from one. In contexts where word splitting is performed, this expands each positional parameter to a separate word; if not within double quotes, these words are subject to word splitting. In contexts where word splitting is not performed, this expands to a single word with each positional parameter separated by a space. When the expansion occurs within double quotes, and word splitting is performed, each parameter expands to a separate word. That is, "$@" is equivalent to "$1" "$2" …. If the double-quoted expansion occurs within a word, the expansion of the first parameter is joined with the beginning part of the original word, and the expansion of the last parameter is joined with the last part of the original word. When there are no positional parameters, "$@" and $@ expand to nothing (i.e., they are removed).

      [!NOTE] $@ 会展开成什么?

      flashcard

      语义上,$@ 扩展为位置参数;当没有位置参数时,"$@" 和 $@ 会被直接移除(扩展为空)。 具体使用中,需要注意单词划分双引号

    1. the modulecmd outputs valid shell commands to stdout which manipulates the shell's environment. Any text that is meant to be seen by the user must be sent to stderr. For example: puts stderr "\n\tSome Text to Show\n"

      [!NOTE] modulecmd 有什么用?

      flashcard

      输出 Shell 命令到标准输出,来操控 Shell 的环境

    1. Path elements registered in the MODULEPATH environment variable may contain reference to environment variables which are converted to their corresponding value by module command each time it looks at the MODULEPATH value. If an environment variable referred in a path element is not defined, its reference is converted to an empty string. MODULERCFILE

      [!NOTE] 如何在 MODULEPATH 中使用环境变量?

      flashcard

      直接加入即可,module 会自动解析

    2. MODULEPATH¶ The path that the module command searches when looking for modulefiles. Typically, it is set to the main modulefiles directory, /usr/share/Modules/modulefiles, by the initialization script.

      [!NOTE] Environment Modules 中,MODULEPATH 有什么用?

      flashcard

      module 搜索 modulefiles 的位置,类似 PATH

    1. Once the Modules package is initialized, the environment can be modified on a per-module basis using the module command which interprets modulefiles. Typically modulefiles instruct the module command to alter or set shell environment variables such as PATH, MANPATH, etc.

      [!NOTE] Environment Modules 的工作原理为?

      flashcard

      根据 modulefile 的配置自动设置 PATH 等环境变量

    1. 操作系统结构

      [!NOTE] 操作系统有哪些常见结构?

      flashcard

      • 单体
      • 分层
      • 微内核
      • 外核
      • 虚拟机
      • 混合
    2. 微内核结构(Micro Kernel) 尽可能把内核功能移到用户空间 用户模块间的通信使用消息传递 好处: 灵活/安全... 缺点: 性能

      [!NOTE] 操作系统中,微内核的含义是?

      flashcard

      尽可能把内核功能移到用户空间 内核仅保留最基本的机制,用户模块间通过消息传递通信

    1. stingy QoS is for users who want to run jobs with the lowest priority without charging (UsageFactor=0). These jobs may be preempted (suspended, re-queued or terminated) when higher priority jobs are submitted to the system.

      [!NOTE] QoS 中,stingy 表示?

      flashcard

      优先级最低

    1. IncompleteOperate only on jobs (or tasks of a job array) which have not completed. Specifically only jobs in the following states will be requeued: CONFIGURING, RUNNING, STOPPED or SUSPENDED.

      [!NOTE] scontrol requeue 中,要只重新排队未完成的 job,可以使用?

      flashcard

      Incomplete 选项

    1. In your training program, you are supposed to call the following function at the beginning to start the distributed backend. It is strongly recommended that init_method=env://. Other init methods (e.g. tcp://) may work, but env:// is the one that is officially supported by this module. >>> torch.distributed.init_process_group(backend='YOUR BACKEND', >>> init_method='env://')

      [!NOTE] torch.distributed.init_process_group 中,init_method 有什么用?一般取什么值?

      flashcard

      指定各进程初始化分布式后端时相互发现的方式 官方支持并推荐 'env://'(从环境变量读取地址与端口)

    2. torch.distributed.barrier(group=None, async_op=False, device_ids=None)[source] Synchronizes all processes. This collective blocks processes until the whole group enters this function, if async_op is False, or if async work handle is called on wait().

      [!NOTE] torch.distributed 中,要同步一个 group 内的进程,可以使用?

      flashcard

      torch.distributed.barrier(group, ...)

    1. 可以使用其他的device map映射方式,通过设置device_map参数(例如"auto", "balanced", "balanced_low_0", "sequential"),或者手工设置这个字典(如果控制欲很强,或者有特殊需求)。读者可以操控模型在meta设备上的所有层(计算device_map)。当读者没有足够GPU显存来加载完整的模型(因为都会按照先占满GPU显存,再占满CPU/内存资源,最后占据硬盘的顺序来完成模型加载),上面所有的选项得到的层和设备对应结果将会相同。当读者有足够的GPU资源来加载模型,那么上面4个选项得到的结果会有所不同。"auto" 和 "balanced" 将会在所有的GPU上平衡切分模型,那么可以计算批尺寸大于1的输入"balanced_low_0" 会在除了第一个GPU上的其它GPU上平衡划分模型,并且在第一个GPU上占据较少资源。这个选项符合需要在第一个GPU上进行额外操作的需求,例如需要在第一个GPU执行generate函数(迭代过程)。"sequential" 按照GPU的顺序分配模型分片,从GPU 0开始,直到最后的GPU(那么最后的GPU往往不会被占满,和"balanced_low_0"的区别就是第一个还是最后一个,以及非均衡填充)这里"auto"和"balanced" 会得到相同的结果,但是未来"auto"模式可能会被改变,主要是有可能发现更高效的分配策略。"balanced"参数的功能则保持稳定。

      [!NOTE] 🤗 Accelerate 中 device_map 有什么预设选项?

      flashcard

      • "auto" 和 "balanced" 将会在所有的GPU上平衡切分模型,那么可以计算批尺寸大于1的输入
      • "balanced_low_0" 会在除了第一个GPU上的其它GPU上平衡划分模型,并且在第一个GPU上占据较少资源。这个选项符合需要在第一个GPU上进行额外操作的需求,例如需要在第一个GPU执行generate函数(迭代过程)。
      • "sequential" 按照GPU的顺序分配模型分片,从GPU 0开始,直到最后的GPU(那么最后的GPU往往不会被占满,和"balanced_low_0"的区别就是第一个还是最后一个,以及非均衡填充)
    1. A maximum number of simultaneously running tasks from the job array may be specified using a "%" separator. For example "--array=0-15%4" will limit the number of simultaneously running tasks from this job array to 4.

      [!NOTE] SLURM array 中,如何设置最多同时运行几个 job?

      flashcard

      --array=...%<num>

    1. 出现问题的条件 在 pytorch 1.5 + 以上的版本 在多卡训练 在import torch 在 import numpy 之前 原因 如果在 numpy 之前导入了 torch,那么这里的子进程将获得一个 GNU 线程层(即使父进程没有定义变量) 但是如果 numpy 在 Torch 之前被导入,子进程将获得一个 INTEL 线程层,这种情况会导致线程之间打架

      [!NOTE] PyTorch 多进程计算中,torchnumpy 导入顺序有什么需要注意的?

      flashcard

      若 import torch 在 import numpy 之前,会导致线程冲突, 即 Error: mkl-service + Intel® MKL: MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library. Try to import numpy first or set the threading layer accordingly. Set MKL_SERVICE_FORCE_INTEL to force it. 具体来说 如果在 numpy 之前导入了 torch,那么这里的子进程将获得一个 GNU 线程层(即使父进程没有定义变量) 但是如果 numpy 在 Torch 之前被导入,子进程将获得一个 INTEL 线程层,这种情况会导致线程之间打架

    1. OS 原理与设计思想 操作系统结构 中断及系统调用 内存管理 进程管理 处理机调度 同步互斥 文件系统 I/O 子系统

      [!NOTE] 操作系统有哪些核心内容?

      flashcard

      • 操作系统结构
      • 中断及系统调用
      • 内存管理
      • 进程管理
      • 处理机调度
      • 同步互斥
      • 文件系统
      • I/O 子系统

    1. wandb login [OPTIONS] [KEY]...

      [!NOTE] 命令行 wandb login 如何直接使用 key 登录?

      flashcard

      wandb login [OPTIONS] [KEY]... 最后加上 key 即可

    1. fallback selects an available policy by priority. The availability is tested by accessing an URL, just like an auto url-test group.

      [!NOTE] Clash 的 proxy-groups 中,fallback 策略是如何实现的?

      flashcard

      间隔 intervalurl 进行测试来执行 fallback

    2. you can use RESTful API to switch proxy is recommended for use in GUI.

      [!NOTE] Clash 如何选择代理节点?

      flashcard

      通过 HTTP RESTful API(external controller)切换;第三方 GUI 也是基于该 API 实现的

    1. If you prefer to launch your training job using MPI (e.g., mpirun), we provide support for this. It should be noted that DeepSpeed will still use the torch distributed NCCL backend and not the MPI backend.

      [!NOTE] DeepSpeed 能否启动 MPI 的多机训练?需要注意什么?

      flashcard

      能, 但需要注意:使用的是 torch 分布式的 NCCL 后端

    2. If you would like to propagate additional variables you can specify them in a dot-file named .deepspeed_env that contains a new-line separated list of VAR=VAL entries. The DeepSpeed launcher will look in the local path you are executing from and also in your home directory (~/). If you would like to override the default name of this file or path and name with your own, you can specify this with the environment variable, DS_ENV_FILE.

      [!NOTE] DeepSpeed 中,要自定义要广播的环境变量,可以使用?

      flashcard

      文件 .deepspeed_env - 内容:new-line separated list of VAR=VAL entries - 位置:1) 默认为 ./ > ~/ 2) 环境变量 DS_ENV_FILE 自定义

    3. By default DeepSpeed will propagate all NCCL and PYTHON related environment variables that are set.

      [!NOTE] DeepSpeed 默认会 propagate 哪些环境变量?

      flashcard

      与 NCCL/PYTHON 相关的环境变量

    4. Important: all processes must call this method and not just the process with rank 0. It is because each process needs to save its master weights and scheduler+optimizer states. This method will hang waiting to synchronize with other processes if it’s called just for the process with rank 0.

      [!NOTE] DeepSpeed 保存 ckpt 时,关于调用的进程,有什么需要注意的?

      flashcard

      每个进程都需要调用 model_engine.save_checkpoint 与常见的主进程保存不同

    5. the user may want to save additional data that are unique to a given model training. To support these items, save_checkpoint accepts a client state dictionary client_sd for saving. These items can be retrieved from load_checkpoint as a return argument.

      [!NOTE] DeepSpeed 如何保存/读取自定义的 ckpt 附带信息?

      flashcard

      client_sd: dict 参数/返回值

    6. Learning Rate Scheduler: when using a DeepSpeed’s learning rate scheduler (specified in the ds_config.json file), DeepSpeed calls the step() method of the scheduler at every training step (when model_engine.step() is executed). When not using DeepSpeed’s learning rate scheduler: if the schedule is supposed to execute at every training step, then the user can pass the scheduler to deepspeed.initialize when initializing the DeepSpeed engine and let DeepSpeed manage it for update or save/restore. if the schedule is supposed to execute at any other interval (e.g., training epochs), then the user should NOT pass the scheduler to DeepSpeed during initialization and must manage it explicitly.

      [!NOTE] DeepSpeed 的代码中,如何执行学习率调度?

      flashcard

      • 如果要每步调度一次,可以通过 DeepSpeed config 使用了 DeepSpeed 的调度器/将 scheduler 传入 deepspeed.initialize(),这会在 model_engine.step() 中执行学习率调度
      • 否则就需要显式处理(如何修改学习率?)
    7. Once the DeepSpeed engine has been initialized, it can be used to train the model using three simple APIs for forward propagation (callable object), backward propagation (backward), and weight updates (step). for step, batch in enumerate(data_loader): #forward() method loss = model_engine(batch) #runs backpropagation model_engine.backward(loss) #weight update model_engine.step()

      [!NOTE] DeepSpeed 如何编写 training loop?

      flashcard

      前向+后向+更新

    8. To initialize the DeepSpeed engine: model_engine, optimizer, _, _ = deepspeed.initialize(args=cmd_args, model=model, model_parameters=params) deepspeed.initialize ensures that all of the necessary setup required for distributed data parallel or mixed precision training are done appropriately under the hood. In addition to wrapping the model, DeepSpeed can construct and manage the training optimizer, data loader, and the learning rate scheduler based on the parameters passed to deepspeed.initialize and the DeepSpeed configuration file.

      [!NOTE] DeepSpeed 如何初始化模型、优化器等要素?

      flashcard

      deepspeed.initialize()

    9. If you already have a distributed environment setup, you’d need to replace: torch.distributed.init_process_group(...) with: deepspeed.init_distributed()

      [!NOTE] DeepSpeed 如何初始化分布式环境?

      flashcard

      deepspeed.init_distributed()

    10. But if you don’t need the distributed environment setup until after deepspeed.initialize() you don’t have to use this function, as DeepSpeed will automatically initialize the distributed environment during its initialize.

      [!NOTE] DeepSpeed 何时需要初始化分布式环境

      flashcard

      仅当需要在 deepspeed.initialize() 前使用分布式环境时 否则 deepspeed.initialize() 会自动初始化分布式环境

    11. The default is to use the NCCL backend, which DeepSpeed has been thoroughly tested with, but you can also override the default.

      [!NOTE] DeepSpeed 默认使用什么分布式后端

      flashcard

      NCCL

    12. DeepSpeed on AMD can be used via our ROCm images, e.g., docker pull deepspeed/rocm501:ds060_pytorch110.

      [!NOTE] 在 AMD 上如何使用 DeepSpeed?

      flashcard

      docker image

    13. If no hostfile is specified, DeepSpeed searches for /job/hostfile. If no hostfile is specified or found, DeepSpeed queries the number of GPUs on the local machine to discover the number of local slots available.

      [!NOTE] DeepSpeed 多机计算中,hostfile 的默认位置为?

      flashcard

      默认位置 /job/hostfile

    14. A hostfile is a list of hostnames (or SSH aliases), which are machines accessible via passwordless SSH, and slot counts, which specify the number of GPUs available on the system. For example, worker-1 slots=4 worker-2 slots=4 specifies that two machines named worker-1 and worker-2 each have four GPUs to use for training.

      [!NOTE] DeepSpeed 多机计算中,hostfile 的内容为?

      flashcard

      • hostname/SSH alias (要求无需密码
      • slots=<num> 对应 node 可用的 GPU 数
    1. By default DeepSpeed expects that a multi-node environment uses a shared storage. If this is not the case and each node can only see the local filesystem, you need to adjust the config file to include a checkpoint_section with the following setting: Copied { "checkpoint": { "use_node_local_storage": true } } Alternatively, you can also use the Trainer’s --save_on_each_node argument, and the above config will be added automatically for you.

      [!NOTE] DeepSpeed 多机计算中,关于文件系统,有什么需要注意的?

      flashcard

      DeepSpeed 默认多节点共享同一个存储 如果不是,需要特别设置

    2. For example, to use torch.distributed.run, you could do: Copied python -m torch.distributed.run --nproc_per_node=8 --nnode=2 --node_rank=0 --master_addr=hostname1 \ --master_port=9901 your_program.py <normal cl args> --deepspeed ds_config.json You have to ssh to each node and run this same command on each one of them! There is no rush, the launcher will wait until both nodes will synchronize.

      [!NOTE] torch.distributed.run 如何启动多机计算?

      flashcard

      需要在每个节点上输入一遍命令

    3. The information in this section isn’t not specific to the DeepSpeed integration and is applicable to any multi-node program. But DeepSpeed provides a deepspeed launcher that is easier to use than other launchers unless you are in a SLURM environment.

      [!NOTE] 多机计算中,哪个框架比较简单易用?

      flashcard

      DeepSpeed

    4. Stage 0 is disabling all types of sharding and just using DeepSpeed as DDP. You can turn it on with: Copied { "zero_optimization": { "stage": 0 } } This will essentially disable ZeRO without you needing to change anything else.

      [!NOTE] DeepSpeed 中,stage 0 (DDP) 的 config 有什么需要注意的?

      flashcard

      除了 "zero_optimization" 外还需要其他字段,至少包括: - "train_micro_batch_size_per_gpu"
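
一个满足上述要求的最小 ds_config 示意(batch size 等数值为占位示例):

```json
{
  "train_micro_batch_size_per_gpu": 8,
  "gradient_accumulation_steps": 1,
  "zero_optimization": {
    "stage": 0
  }
}
```
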

    1. MATCH,policy routes the rest of the packets to policy. This rule is required and is usually used as the last rule.

      [!NOTE] Clash rules 中,MATCH 为?

      flashcard

      兜底规则:将其余所有流量路由到指定 policy,通常作为最后一条规则

    1. curl -X PUT -H "Content-Type: application/json" -d '{"name":"节点名"}' http://localhost:port/proxies/:Selector -H 添加 HTTP 请求的标头Content-Type: application/json,根据链接2,不设置标头为application/json可能会有问题。 -d 参数用于发送 POST 请求的数据体。 最后的网址为clash的external-controller的网址端口,最后Selector为要选择的proxy-groups的名称。 实际指令类似下面这条: 1curl -X PUT -H "Content-Type: application/json" -d '{"name":"HongKong"}' http://127.0.0.1:9090/proxies/Proxy

      [!NOTE] Clash API 如何使用 curl 更改代理节点?

      flashcard

      curl -X PUT -H "Content-Type: application/json" -d '{"name":"节点名"}' http://localhost:port/proxies/:<proxy-group>
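
上述 curl 调用也可以用标准库 urllib 改写(host、端口、分组名与节点名均为示例值):

```python
import json
import urllib.request

def build_switch_request(selector, node, host="127.0.0.1", port=9090):
    """构造将分组 selector 切换到节点 node 的 PUT 请求。"""
    return urllib.request.Request(
        f"http://{host}:{port}/proxies/{selector}",
        data=json.dumps({"name": node}).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

req = build_switch_request("Proxy", "HongKong")
print(req.get_method(), req.full_url)
# → PUT http://127.0.0.1:9090/proxies/Proxy
```

实际发送请求:urllib.request.urlopen(req)(需本地运行 Clash 且已开启 external-controller)。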

    2. PUT 切换 Selector 中选中的代理/proxies/:name (这边的:name可以为节点名称,也可以为Selector。只要在proxies/后直接加上字符串就可以,不需要引号或者:)

      [!NOTE] Clash API 如何切换代理?

      flashcard

      PUT /proxies/...

    1. External Controller enables users to control Clash programmatically with the HTTP RESTful API. The third-party Clash GUIs are heavily based on this feature. Enable this feature by specifying an address in external-controller.

      [!NOTE] Clash 的外部控制方式为?

      flashcard

      RESTful API

    1. 看一看 youtube.com 这条规则加在那个分组里 - DOMAIN-SUFFIX,youtube.com,🌍 国外媒体可以看到 youtube 是在 国外媒体分组里,那么在这里要看国外媒体选择的是那个节点可以看到国外媒体选择的是 香港2-5 这个节点,选择是的延迟最低的节点,所以 youtube 走的是这个代理。

      [!NOTE] Clash 是如何决定使用什么分组的什么代理节点的?

      flashcard

      根据 rules 匹配分组,再根据分组的 type 选择节点

    1. While Clash is meant to be run in the background, there's currently no elegant way to implement daemons with Golang, hence we recommend you to daemonize Clash with third-party tools.

      [!NOTE] Golang 对 daemon 的支持如何?

      flashcard

      截至 2023-9-14 还没有优雅的实现方式

    1. If you want to place your configurations elsewhere (e.g. /etc/clash), you can use command-line option -d to specify a configuration directory:shellclash -d . # current directory clash -d /etc/clashOr, you can use option -f to specify a configuration file:shellclash -f ./config.yaml clash -f /etc/clash/config.yaml

      [!NOTE] clash 如何指定配置目录或配置文件?

      flashcard

      -d / -f

    2. The main configuration file is called config.yaml. By default, Clash reads the configuration files at $HOME/.config/clash.

      [!NOTE] Clash 的默认配置文件为?

      flashcard

      ~/.config/clash/config.yaml

    1. ShadowsocksR(简称SSR)是一种基于Socks5代理方式的加密传输协议,也可以指实现这个协议的各种开发包。

      [!NOTE] ShadowsocksR (SSR) 是什么?

      flashcard

      • 一种基于Socks5代理方式的加密传输协议
      • 也可以指实现这个协议的各种开发包
    1. 脂溢性皮炎患者皮肤上都有大量的马拉色菌定植,它们能释放出溶脂酶,而溶质酶可以将皮脂中的甘油三酯分解成油酸和花生四烯酸等游离脂肪酸,这些代谢物会导致角质细胞的分化异常,影响表皮和角质层的正常代谢功能,造成屏障功能受损和角化不全。其中,角化不全会导致角栓形成,也就是堆积的角质+皮脂+微生物及其代谢产物形成的毛囊栓塞,即黑头的基础;屏障功能受损会导致更多的炎症,炎症反应又会进一步影响角质分化,然后恶性循环。。。为了帮助大家理解,我简化一下这个逻辑链吧:马拉色菌(某品种真菌)——产生溶脂酶——皮脂被分解成油酸、花生四烯酸等游离脂肪酸——角质细胞分化异常——角栓形成/屏障功能受损——黑头/炎症——炎症进一步影响角质分化——加剧角栓——进入恶性循环~而甲硝唑是明确有效的抗真菌药,可以干掉定植在人皮肤表面的马拉色菌,自然后面描述的这串恶性循环也就无法开启了~“黑头宝宝”也就失去了赖以生存的肥沃土壤了。事实上,在大量翻阅资料的过程中我发现,并不仅是马拉色菌有这个分解甘油三酯的能力,痘肌宝宝都听过的痤疮丙酸杆菌也可以

      [!NOTE] 马拉色菌、痤疮丙酸杆菌等微生物引起痤疮的主要机理为?

      flashcard

      释放出溶脂酶,将皮脂中的甘油三酯分解成油酸和花生四烯酸等游离脂肪酸,导致皮肤角化异常

    2. 皮肤相关的主要作用:抗厌氧菌(比如痤疮丙酸杆菌)+抗真菌(比如马拉色菌)+抗毛囊虫(即蠕形螨,俗称的螨虫)+免疫调节+抗炎抗氧化;作用机理:抗厌氧菌——甲硝唑的硝基在无氧环境中可以还原成氨基(等中间产物),并作用于厌氧菌的DNA,使其链断裂,或阻断其转录、复制等过程,从而杀灭厌氧菌;绝大多数专性厌氧菌能被8~16ug/g或更低浓度的甲硝唑所杀灭(这数据感觉屌屌的);在厌氧菌细胞内被铁硫蛋白还原,其还原产物可抑制细菌DNA的合成;抗真菌——直接损伤真菌的胞质膜(比如核膜、线粒体膜、溶酶体膜等),使其通透性改变,造成细胞内重要物质的摄取障碍或漏失而致真菌死亡;抗毛囊虫——甲硝唑的硝基在无氧环境中可以还原成氨基(等中间产物),与蠕形螨虫体细胞分子发生反应而导致蠕形螨死亡;免疫调节——甲硝唑可以通过其选择素与综合素作用干扰免疫系统,包括干扰白细胞和内皮细胞的粘附作用、抑制炎症细胞的移行等,从而达到免疫调节功能;抗炎抗氧化——甲硝唑可灭活、减少炎症中产生的ROS,达到局部抗炎的效果。

      [!NOTE] 甲硝唑有什么关于皮肤的作用?对应的机理是怎样的?

      flashcard

      • 抗菌、抗虫
      • 消炎
    3. 黑色素说则要比污垢说晚一百年,主要观点是认为你看到的黑头表面的黑色主要来源是由角质细胞带上来的黑色素。对,就是你们深恶痛绝的黑素细胞产生的那种黑色素!(当然,也可能同时夹杂有皮脂及其衍生物、细菌及其代谢物、甚至不小心和它们吸附在一起的极少量污垢颗粒之类的,只是这些东西并非主要因素)不过,关于黑色素说其实也有争论,在这个假说出现的几年后有人报道说通过对黑头的超微结构观察后发现,并没有看到黑头里有黑色素颗粒的存在,而只看到了越靠近黑头表面(也就是最黑的那部分),角质细胞的堆积越致密且高度定向,所以,他们猜测黑头表面的黑色更可能是这种高度定向的角质细胞致密堆积后造成的

      [!QUESTION] “黑头”的黑色产生自什么?

      flashcard

      • 带有黑色素的角质细胞的高度定向、致密堆积?
    1. The moderations endpoint is a tool you can use to check whether content complies with OpenAI's usage policies. Developers can thus identify content that our usage policies prohibits and take action, for instance by filtering it.The models classifies the following categories:

      [!NOTE] OpenAI API 提供了什么检测模型响应安全性的工具?

      flashcard

      moderations endpoint

    1. The frequency and presence penalties found in the Chat completions API and Legacy Completions API can be used to reduce the likelihood of sampling repetitive sequences of tokens. They work by directly modifying the logits (un-normalized log-probabilities) with an additive contribution.mu[j] -> mu[j] - c[j] * alpha_frequency - float(c[j] > 0) * alpha_presence

      [!NOTE] OpenAI API 的 frequency and presence penalties 是如何工作的?

      flashcard

      直接在 logits 上做加性修改: mu[j] -> mu[j] - c[j] * alpha_frequency - 1[c[j] > 0] * alpha_presence
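
引文中的公式可以直接照搬实现如下(纯 Python 示意,c[j] 为 token j 至今出现的次数;数值均为示例):

```python
def penalize(logits, counts, alpha_frequency, alpha_presence):
    """logits: {token: mu};counts: {token: 已出现次数}。"""
    return {
        tok: mu
        - counts.get(tok, 0) * alpha_frequency          # 频率惩罚,随次数线性增长
        - (counts.get(tok, 0) > 0) * alpha_presence     # 存在惩罚,出现过即一次性扣减
        for tok, mu in logits.items()
    }

logits = {"the": 2.0, "dog": 1.0, "cat": 1.0}
counts = {"the": 3, "dog": 1}
print(penalize(logits, counts, alpha_frequency=0.2, alpha_presence=0.5))
# "the" ≈ 0.9,"dog" ≈ 0.3,"cat" 未出现过故不变
```
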

    1. For reference, gpt-3.5-turbo performs at a similar capability level to text-davinci-003 but at 10% the price per token!

      [!NOTE] gpt-3.5-turbo 性价比有多高?

      flashcard

      能力类似 text-davinci-003 token 单价为其 10%

    1. The response format is similar to the response format of the Chat completions API but also includes the optional field logprobs.

      [!NOTE] OpenAI API 什么时候有 logprobs 字段?

      flashcard

      completions API 有 chat API 没有

    1. For example, if you want to use secondary GPU, put "1".(add a new line to webui-user.bat not in COMMANDLINE_ARGS): set CUDA_VISIBLE_DEVICES=0Alternatively, just use --device-id flag in COMMANDLINE_ARGS.

      [!NOTE] SD Web UI 通过什么方式设置使用的设备?

      flashcard

      在 webui-user.bat 中新增一行 set CUDA_VISIBLE_DEVICES=...(不放在 COMMANDLINE_ARGS 里),或在 COMMANDLINE_ARGS 中使用 --device-id

    1. If you want to force the model to call a specific function you can do so by setting function_call: {"name": "<insert-function-name>"}. You can also force the model to generate a user-facing message by setting function_call: "none". Note that the default behavior (function_call: "auto") is for the model to decide on its own whether to call a function and if so which function to call.

      [!NOTE] OpenAI API 的 function_call 字段有哪些设置?

      flashcard

      • "<func_name>" 要求特定函数
      • "none" 禁止
      • "auto" 模型自行决定
    2. Hallucinated outputs in function calls can often be mitigated with a system message. For example, if you find that a model is generating function calls with functions that weren't provided to it, try using a system message that says: "Only use the functions you have been provided with."

      [!QUESTION] 如何减轻 LLM 对函数输出的幻觉?

      flashcard

      1. 在系统信息中强调限制
    3. You can see these steps in action through the example below:

      [!NOTE] 如何使用带 function 的 OpenAI API?

      flashcard

      示例代码如下

    1. In the process of training a neural network, there are multiple stages where randomness is used, for example random initialization of weights of the network before the training starts. regularization, e.g. dropout, which involves randomly dropping nodes in the network while training. optimization process like stochastic gradient descent, RMSProp or Adam also include random initializations.

      [!NOTE] DL 中,主要有哪些随机性来源?

      flashcard

      • 模型权重的随机初始化
      • 正则化,例如 dropout
      • 优化过程,例如优化器的随机初始化
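
要让上述随机性可复现,通用做法是固定种子。下面用标准库 random 做一个最小示意(真实 DL 场景还需同时固定 numpy 与框架本身的种子):

```python
import random

def seeded_init(seed, n=3):
    """用独立 RNG 生成确定性的"权重初始化"。"""
    rng = random.Random(seed)
    return [rng.gauss(0, 1) for _ in range(n)]

print(seeded_init(42) == seeded_init(42))  # → True:同一种子得到同样的"权重"
```
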
    1. Under the hood, functions are injected into the system message in a syntax the model has been trained on. This means functions count against the model's context limit and are billed as input tokens.

      [!NOTE] OpenAI API JSON function 的实现原理是?

      flashcard

      functions 会以模型训练过的特定语法注入系统消息 因此会占用上下文长度,并按输入 token 计费

    2. The latest models (gpt-3.5-turbo-0613 and gpt-4-0613) have been fine-tuned to both detect when a function should to be called (depending on the input) and to respond with JSON that adheres to the function signature.

      [!NOTE] OpenAI API 如何提供 JSON 格式的输出?

      flashcard

      0613 后的模型被微调以识别函数并生成 JSON 格式的响应
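
一个 functions 定义的示意(函数名与字段内容为假设的例子,参数用 JSON Schema 描述):

```python
functions = [{
    "name": "get_current_weather",      # 假设的函数名
    "description": "Get the current weather in a given city",
    "parameters": {                     # 参数的 JSON Schema
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]
print(functions[0]["name"])  # → get_current_weather
```
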

    1. For most models, this is 4,096 tokens or about 3,000 words. As a rough rule of thumb, 1 token is approximately 4 characters or 0.75 words for English text.

      [!NOTE] OpenAI tiktoken 的 token/word 比例约为?

      flashcard

      1 token ≈ 4 个字符 ≈ 0.75 个英文单词,即 tokens/words ≈ 4/3
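
按这条经验法则可以粗略估算 token 数(仅为近似,精确计数应使用 tiktoken):

```python
def estimate_tokens(text):
    """两条经验法则取平均;只是粗略估计。"""
    by_chars = len(text) / 4             # 约 4 字符/token
    by_words = len(text.split()) / 0.75  # 约 0.75 单词/token
    return round((by_chars + by_words) / 2)

print(estimate_tokens("Hello world this is a test"))  # → 7
```
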

    1. The completions endpoint is the core of our API and provides a simple interface that’s extremely flexible and powerful.

      [!NOTE] OpenAI API 的核心 endpoint 为?

      flashcard

      /completions

    1. As of PyTorch v1.6.0, features in torch.distributed can be categorized into three main components:

      [!NOTE] torch.distributed 的 features 主要有哪些组成部分?

      flashcard

      • Distributed Data-Parallel Training (DDP)
      • RPC-Based Distributed Training (RPC)
      • Collective Communication (c10d)
    1. We're using the time since last checkpoint to determine if we should checkpoint this - this will very slightly vary between the different processes. Eventually, there comes a batch during which one of these two processes enters the intra-epoch checkpoint saving block and the other does not. In the intra-epoch saving block, the run_on_main function has a ddp.barrier synchronisation routine. I think in this case we don't need synchronisation, we just need to run the saving on the main process exclusively. If that is so, the fix would luckily be simple.

      [!NOTE] 分布式计算与保存中,进程之间的计时可能导致什么问题?

      flashcard

      不同进程的计时可能存在微小的差别, 这可能导致部分进程进入保存 block 而其他进程没有, 这时的同步操作可能引发错误 但强制仅在主进程运行保存,并且不执行同步,即可解决问题?
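
这种失败模式可以用纯 Python 演示(间隔时长与时钟读数均为假设数值):各 rank 用本地时钟决定"是否该保存",微小的时钟偏差就会让不同 rank 走入不同分支,此时条件块内的 barrier 会死锁:

```python
CKPT_INTERVAL = 60.0  # 假设的"每 60 秒保存一次"策略

def wants_checkpoint(elapsed):
    """elapsed: 距上次保存经过的秒数(各 rank 各自测量)。"""
    return elapsed >= CKPT_INTERVAL

# 同一个 batch:rank 0 的时钟读数是 60.01s,rank 1 是 59.99s
print(wants_checkpoint(60.01), wants_checkpoint(59.99))  # → True False
```
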

    1. In this case, rank 6 is performing a different collective than other ranks. It’s performing a barrier while the others are performing a broadcast. Could it be that you have an uneven dataset and rank 6 is done while the other still going? If this is the case, you can try to use the join context manager 48.

      [!NOTE] 分布式计算中,数据集不平衡,部分 rank 提前结束导致错误,可能可以如何解决?

      flashcard

      PyTorch join context manager?

    2. The most common issues with collectives are shape and ordering mismatch. This is why Pritam’s suggestion of printing the input shapes could help you understand which issue you’re facing here.

      [!NOTE] 分布式计算中,collective 最常见的问题是?

      flashcard

      shape/ordering mismatch

    1. To enable RoPE scaling, simply pass --rope-scaling, --max-input-length and --rope-factors flags when running through CLI. --rope-scaling can take the values linear or dynamic. If your model is not fine-tuned to a longer sequence length, use dynamic. --rope-factor is the ratio between the intended max sequence length and the model’s original max sequence length.

      [!NOTE] 🤗 中,如何使用 RoPE Scaling?

      flashcard

      CLI 启动时传入 --rope-scaling(linear 或 dynamic;模型未针对长序列微调时用 dynamic)、--rope-factor(目标最大序列长度与模型原始最大长度之比)和 --max-input-length 在 transformers 中也可设置 config.rope_scaling = {"type": "linear", "factor": scaling_factor}

    1. The answer is yes, but you need two things as discussed in the issue mentioned above: use find_unused_parameters=False make sure that all model parameters contribute to the calculation of the loss. Here's an example

      [!NOTE] 在多设备上使用 gradient checkpointing,需要满足什么条件?

      flashcard

      1. use find_unused_parameters=False
      2. make sure that all model parameters contribute to the calculation of the loss.
    1. You need to have dpp_find_unused_parameters=False when using gradient_checkpointing=True.

      [!NOTE] What must you watch out for when combining gradient checkpointing with DDP?

      flashcard

      find_unused_parameters=True cannot coexist with gradient checkpointing; set it to False
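A hypothetical guard (not an existing Trainer API) that encodes this rule, so the conflict fails fast instead of surfacing as a cryptic DDP runtime error:

```python
def check_ddp_flags(gradient_checkpointing, ddp_find_unused_parameters):
    """Gradient checkpointing re-runs parts of forward during backward,
    which breaks DDP's unused-parameter search, so the two options
    must not be enabled together."""
    if gradient_checkpointing and ddp_find_unused_parameters:
        raise ValueError(
            "Use ddp_find_unused_parameters=False together with "
            "gradient_checkpointing=True"
        )
```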

    1. no_sync()[source] A context manager to disable gradient synchronizations across DDP processes. Within this context, gradients will be accumulated on module variables, which will later be synchronized in the first forward-backward pass exiting the context.

      [!NOTE] In PyTorch DDP, how do you disable gradient synchronization?

      flashcard

      `with ddp_model.no_sync(): forward(); backward()` (gradients accumulate locally and are synchronized in the first backward pass after exiting the context)
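A torch-free mock of the flag `no_sync()` actually toggles (`MockDDP` is invented for illustration; real DDP flips `require_backward_grad_sync` the same way and all-reduces gradients during backward only when it is set):

```python
from contextlib import contextmanager

class MockDDP:
    """Minimal stand-in for DistributedDataParallel's sync behaviour."""
    def __init__(self):
        self.require_backward_grad_sync = True
        self.sync_count = 0  # how many times gradients were all-reduced

    @contextmanager
    def no_sync(self):
        old = self.require_backward_grad_sync
        self.require_backward_grad_sync = False
        try:
            yield
        finally:
            self.require_backward_grad_sync = old

    def backward(self):
        # Inside no_sync() gradients just accumulate locally;
        # outside it, backward triggers the cross-rank all-reduce.
        if self.require_backward_grad_sync:
            self.sync_count += 1

model = MockDDP()
with model.no_sync():
    model.backward()   # accumulate only
    model.backward()   # accumulate only
model.backward()       # first backward after exiting: gradients synced
```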

    1. In order to support this arbitrary inclusion/exclusion our launcher sets the appropriate CUDA_VISIBLE_DEVICES at process launch time on each node. This means that if the user sets their own CUDA_VISIBLE_DEVICES on the launching node it's not clear if they want to set this value on the local node or all nodes.

      [!NOTE] When using DeepSpeed, what problem can setting CUDA_VISIBLE_DEVICES yourself cause?

      flashcard

      It is ambiguous whether the value should apply only to the launching node or to all nodes

    2. If you do not provide the deepspeed launcher with a hostfile (via -H/--hostfile) it will only launch processes within that local node.

      [!NOTE] In DeepSpeed, how do you specify the multi-node topology?

      flashcard

      -H / --hostfile
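For reference, a hostfile is a plain text file listing one node per line with its GPU slot count (the hostnames below are placeholders):

```
# passed as: deepspeed --hostfile=hostfile train.py
worker-1 slots=8
worker-2 slots=8
```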

    1. deepspeed.zero.GatheredParameters is a context manager provided by DeepSpeed # that gathers parameters partitioned across multiple devices. These parameters are kept on the CPU.

      [!NOTE] In DeepSpeed, what does `with deepspeed.zero.GatheredParameters(params):` do?

      flashcard

      Gathers the partitioned `params` into full tensors (kept on the CPU, per the quote above)

    1. DeepSpeed can automatically detect the following external parameter scenarios:

      [!NOTE] Which external parameter scenarios can DeepSpeed (>= 0.3.15) detect automatically?

      flashcard

      Parameters accessed outside their owning module's `__init__()`/`forward()`? 1. parameter access: parameters used in other functions, e.g. embeddings 2. returned parameters, e.g. CustomLinear.bias

    2. The tensor embeddings.weight is used in both embeddings.forward() and compute_logits(). We call embeddings.weight an external parameter because it is used in the training loop outside of its owning module’s forward pass.

      [!NOTE] What is an example of an external parameter in ZeRO-3?

      flashcard

      1. embeddings.weight is used in both embeddings.forward() and compute_logits()
    3. However, in some cases a parameter may be used outside of its module’s forward pass. We call these external parameters. ZeRO-3 can coordinate these parameters if they are registered either automatically or manually.

      [!QUESTION] Why does ZeRO-3 require external parameters to be registered?

      flashcard

      So they can be gathered at the point of use?

    4. Some models partitioned with deepspeed.zero.Init may need to access a module’s weights outside of the class constructor or its forward() method. We refer to these weights as external parameters, since these parameters are accessed outside of the module that created them.

      [!NOTE] In DeepSpeed, what are external parameters?

      flashcard

      Parameters accessed outside the module that created them (i.e. outside its `__init__()` or `forward()`)
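A torch-free sketch of the pattern (the classes are invented for illustration): `weight` is read both inside its module's forward and by an outside function, which is exactly what makes it "external" from ZeRO-3's point of view. In real DeepSpeed code you would register such a parameter with `deepspeed.zero.register_external_parameter` so the runtime gathers it at the outside use site too.

```python
class Embeddings:
    """Owns `weight`; reading it inside forward() is an ordinary access."""
    def __init__(self):
        self.weight = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for a tensor

    def forward(self, token_id):
        return self.weight[token_id]

def compute_logits(embeddings, hidden):
    # `embeddings.weight` is read *outside* Embeddings.forward(), so a
    # partitioning runtime like ZeRO-3 must be told (automatically or
    # via register_external_parameter) to gather it here as well.
    return [sum(w * h for w, h in zip(row, hidden))
            for row in embeddings.weight]

emb = Embeddings()
logits = compute_logits(emb, emb.forward(0))
```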

  5. huggingface.co
    1. enable_input_require_grads < source > ( ) Enables the gradients for the input embeddings. This is useful for fine-tuning adapter weights while keeping the model weights fixed.

      [!QUESTION] What does requiring gradients on the input embeddings have to do with fine-tuning adapters?

      flashcard

      With adapter fine-tuning the base weights are frozen, so (especially with gradient checkpointing) the inputs to the transformer blocks require no grad and autograd builds no graph through them; enabling grads on the input embeddings gives the graph a differentiable entry point so the adapter weights still receive gradients?
    1. clean_up_tokenization_spaces (bool, optional, defaults to True) — Whether or not the model should cleanup the spaces that were added when splitting the input text during the tokenization process.

      [!NOTE] In 🤗 `tokenizer.decode()`, how do you easily clean up the extra spaces added during tokenization?

      flashcard

      Set the parameter clean_up_tokenization_spaces=True (this is already the default)