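Fixing container startup failures on GPU nodes
Problem description
After kubelet and docker are stopped and started again on the GPU node, docker ps lists no running containers: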
# service kubelet stop
Redirecting to /bin/systemctl stop kubelet.service
# service docker stop
Redirecting to /bin/systemctl stop docker.service
# service docker start
Redirecting to /bin/systemctl start docker.service
# service kubelet start
Redirecting to /bin/systemctl start kubelet.service
# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
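Diagnosis
# docker info | grep -i cgroup
Cgroup Driver: cgroupfs
Docker's Cgroup Driver is cgroupfs at this point. kubelet and Docker must agree on the cgroup driver; the fix below assumes kubelet on this node is configured for systemd.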
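The comparison behind this diagnosis can be sketched as a small shell helper. This is only an illustration: driver_of and compare_drivers are hypothetical names, and the kubelet config path in the usage comment (/var/lib/kubelet/config.yaml, the kubeadm default) is an assumption that may not match this cluster.

```shell
#!/bin/sh
# Extract the driver name from a line such as "Cgroup Driver: cgroupfs"
# (docker info output) or "cgroupDriver: systemd" (kubelet config file).
driver_of() {
    printf '%s\n' "$1" \
        | sed -n 's/^[[:space:]]*[Cc]group *[Dd]river:[[:space:]]*//p' \
        | head -n 1
}

# Compare the two drivers and print a one-line verdict.
# Returns 0 on match, 1 on mismatch, 2 if either value could not be parsed.
compare_drivers() {
    docker_driver=$(driver_of "$1")
    kubelet_driver=$(driver_of "$2")
    if [ -z "$docker_driver" ] || [ -z "$kubelet_driver" ]; then
        echo "unknown"
        return 2
    fi
    if [ "$docker_driver" = "$kubelet_driver" ]; then
        echo "match: $docker_driver"
    else
        echo "mismatch: docker=$docker_driver kubelet=$kubelet_driver"
        return 1
    fi
}

# Example usage on a node (assumed paths, adjust as needed):
# compare_drivers "$(docker info 2>/dev/null | grep -i 'cgroup driver')" \
#                 "$(grep -i 'cgroupDriver' /var/lib/kubelet/config.yaml)"
```

A mismatch verdict here would point to the same cause as the docker info output above.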
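Solution
Set Docker's cgroup driver to systemd through exec-opts in /etc/docker/daemon.json, then restart docker while kubelet is stopped: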
# cat >/etc/docker/daemon.json <<-EOF
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  },
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "10"
  },
  "oom-score-adjust": -1000,
  "storage-driver": "overlay2",
  "storage-opts": ["overlay2.override_kernel_check=true"],
  "live-restore": true
}
EOF
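Before restarting docker, it may be worth checking that the file just written is valid JSON and really carries the systemd setting, since a malformed daemon.json keeps the daemon from starting. The helper below is a minimal sketch: check_daemon_json is a hypothetical name, and the JSON syntax check is skipped when python3 is not installed.

```shell
#!/bin/sh
# Check a Docker daemon.json file before restarting the daemon.
# Prints a one-line verdict; returns 0 only if the file is readable,
# parses as JSON (when python3 is available) and sets the systemd driver.
check_daemon_json() {
    file=$1
    if [ ! -r "$file" ]; then
        echo "missing: $file"
        return 1
    fi
    # Syntax check with Python's built-in json.tool, if python3 exists.
    if command -v python3 >/dev/null 2>&1; then
        if ! python3 -m json.tool "$file" >/dev/null 2>&1; then
            echo "invalid JSON: $file"
            return 1
        fi
    fi
    # The fix depends on this exec-opts entry being present.
    if ! grep -q 'native\.cgroupdriver=systemd' "$file"; then
        echo "systemd cgroup driver not set: $file"
        return 1
    fi
    echo "ok: $file"
}

# Example usage on the node:
# check_daemon_json /etc/docker/daemon.json
```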
# service kubelet stop
Redirecting to /bin/systemctl stop kubelet.service
# service docker restart
Redirecting to /bin/systemctl restart docker.service
# service kubelet start
Redirecting to /bin/systemctl start kubelet.service
# docker info | grep -i cgroup
Cgroup Driver: systemd
Published by: Anonymous. Please credit the source when reposting: https://www.cms2.cn/aliyun/csk/4970.html