9.6 KiB
服务器的基本信息如下:
root@wdd-ai-server:# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 7 5700X 8-Core Processor
CPU family: 25
Model: 33
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 1
Stepping: 2
Frequency boost: enabled
CPU max MHz: 3400.0000
CPU min MHz: 2200.0000
BogoMIPS: 6800.35
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse
36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rd
tscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aper
fmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic
movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapi
c cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt
tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_
l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 av
x2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb s
ha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_to
tal cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd arat npt lb
rv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists
pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pk
u ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm
Virtualization features:
Virtualization: AMD-V
Caches (sum of all):
L1d: 256 KiB (8 instances)
L1i: 256 KiB (8 instances)
L2: 4 MiB (8 instances)
L3: 32 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-15
Vulnerabilities:
Gather data sampling: Not affected
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Not affected
Retbleed: Not affected
Spec rstack overflow: Mitigation; safe RET, no microcode
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP always-on, R
SB filling, PBRSB-eIBRS Not affected
Srbds: Not affected
Tsx async abort: Not affected
root@wdd-ai-server:# lsmem
RANGE SIZE STATE REMOVABLE BLOCK
0x0000000000000000-0x00000000bfffffff 3G online yes 0-23
0x0000000100000000-0x000000043fffffff 13G online yes 32-135
Memory block size: 128M
Total online memory: 16G
Total offline memory: 0B
root@wdd-ai-server:# lspci
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Root Complex
00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD] Starship/Matisse IOMMU
00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:01.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
00:01.3 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:03.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
00:04.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:05.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:07.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:07.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]
00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 61)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 0
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 1
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 2
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 3
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 4
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 5
00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 6
00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 7
01:00.0 Non-Volatile memory controller: MAXIO Technology (Hangzhou) Ltd. NVMe SSD Controller MAP1001 (rev 01)
03:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset USB 3.1 XHCI Controller (rev 01)
03:00.1 SATA controller: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset SATA Controller (rev 01)
03:00.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Bridge (rev 01)
20:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port (rev 01)
20:01.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port (rev 01)
20:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port (rev 01)
22:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
26:00.0 VGA compatible controller: NVIDIA Corporation GA104 [GeForce RTX 3070] (rev a1)
26:00.1 Audio device: NVIDIA Corporation GA104 High Definition Audio Controller (rev a1)
27:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Function
28:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP
28:00.1 Encryption controller: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Cryptographic Coprocessor PSPCPP
28:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller
28:00.4 Audio device: Advanced Micro Devices, Inc. [AMD] Starship/Matisse HD Audio Controller
root@wdd-ai-server:# cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
root@wdd-ai-server:# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
loop0 7:0 0 63.9M 1 loop /snap/core20/2105
loop1 7:1 0 87M 1 loop /snap/lxd/27037
loop2 7:2 0 40.4M 1 loop /snap/snapd/20671
nvme0n1 259:0 0 476.9G 0 disk
├─nvme0n1p1 259:1 0 1G 0 part /boot/efi
├─nvme0n1p2 259:2 0 2G 0 part /boot
└─nvme0n1p3 259:3 0 473.9G 0 part
└─ubuntu--vg-ubuntu--lv 253:0 0 473.9G 0 lvm /
root@wdd-ai-server:# df -TH
Filesystem Type Size Used Avail Use% Mounted on
tmpfs tmpfs 1.7G 1.6M 1.7G 1% /run
/dev/mapper/ubuntu--vg-ubuntu--lv ext4 500G 7.5G 467G 2% /
tmpfs tmpfs 8.4G 0 8.4G 0% /dev/shm
tmpfs tmpfs 5.3M 0 5.3M 0% /run/lock
/dev/nvme0n1p2 ext4 2.1G 137M 1.8G 8% /boot
/dev/nvme0n1p1 vfat 1.2G 6.4M 1.2G 1% /boot/efi
tmpfs
GPU显卡的具体型号为 Nvidia Geforce RTX 3070 已经被扩容到16GB的显存
此服务器处于中国大陆境内,如果涉及到无法直连的情况,需要考虑通过局域网内的代理实现连接
代理地址为 192.168.233.206:7899
请你给出完整的操作方案,实现如下的内容
请分阶段实现,每次只给出一个阶段的具体步骤,给出提示以后再输出第二段
- 需要实现nvdia驱动的安装,实现nvidia-smi的使用
- 需要使用vllm的方式运行 Qwen3.6-35B-A3B的模型,需要考虑当前物理服务器的限制,在16GB内存和16GB显存,运行尽可能高精度的模型
- 需要给出调用vllm模型的方式
这张3070在windows环境经常出现黑屏闪屏的情况,需要nvidia控制面板设置最高性能模式才可以,现在linux环境下如何设置呢,我现在无法ssh上去了,看着像卡死了是否是同种原因造成的呢