Running the Ceph Toolbox for Rook on Kubernetes / OpenShift
How do you run Ceph commands against a Rook Ceph cluster running on Kubernetes / OpenShift?
If we use Ceph on Kubernetes with Rook, we need a way to access the Ceph command-line tools so we can troubleshoot issues when they arise.
Rook is a CNCF-certified, production-ready, open-source cloud-native storage solution for Kubernetes.
It simplifies the management of file, block, and object storage.
The Rook Ceph toolbox is a pod containing common tools used for Ceph debugging and testing.
When running Rook on Kubernetes, we can configure Ceph directly.
This is done with the Ceph CLI from the Rook Ceph toolbox pod.
From the toolbox container we can change the Ceph configuration, enable manager modules, create users and pools, and more.
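For example, here are a few typical administrative commands run from inside the toolbox; the pool and user names below are only illustrative:
# Change a cluster configuration option
ceph config set mon mon_allow_pool_delete true
# Enable a manager module
ceph mgr module enable pg_autoscaler
# Create a pool (name and PG count are examples)
ceph osd pool create test-pool 32
# Create a user scoped to that pool
ceph auth get-or-create client.demo mon 'allow r' osd 'allow rw pool=test-pool'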
Running the Ceph toolbox on Kubernetes
The Rook toolbox can run as a Deployment in a Kubernetes cluster.
After confirming that we have a running Kubernetes cluster with Rook deployed, launch the rook-ceph-tools pod.
Create the toolbox deployment file:
$ vim toolbox.yaml
Add the following to the file:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rook-ceph-tools
  namespace: rook-ceph
  labels:
    app: rook-ceph-tools
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rook-ceph-tools
  template:
    metadata:
      labels:
        app: rook-ceph-tools
    spec:
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: rook-ceph-tools
        image: rook/ceph:master
        command: ["/tini"]
        args: ["-g", "--", "/usr/local/bin/toolbox.sh"]
        imagePullPolicy: IfNotPresent
        env:
          - name: ROOK_ADMIN_SECRET
            valueFrom:
              secretKeyRef:
                name: rook-ceph-mon
                key: admin-secret
        volumeMounts:
          - mountPath: /etc/ceph
            name: ceph-config
          - name: mon-endpoint-volume
            mountPath: /etc/rook
      volumes:
        - name: mon-endpoint-volume
          configMap:
            name: rook-ceph-mon-endpoints
            items:
            - key: data
              path: mon-endpoints
        - name: ceph-config
          emptyDir: {}
      tolerations:
        - key: "node.kubernetes.io/unreachable"
          operator: "Exists"
          effect: "NoExecute"
          tolerationSeconds: 5
After saving the file, launch the rook-ceph-tools pod:
kubectl create -f toolbox.yaml
Wait for the toolbox pod to pull its container image and reach the Running state:
kubectl -n rook-ceph get pod -l "app=rook-ceph-tools"
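Instead of polling, kubectl wait can block until the pod is ready; the timeout below is an arbitrary choice:
kubectl -n rook-ceph wait --for=condition=Ready pod -l "app=rook-ceph-tools" --timeout=120s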
Once the rook-ceph-tools pod is running, we can connect to it with:
kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') -- bash
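For one-off commands there is no need for an interactive shell; assuming a reasonably recent kubectl, the deployment can be targeted directly and kubectl picks a pod for you:
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph status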
When finished with the toolbox, the deployment can be removed:
kubectl -n rook-ceph delete deployment rook-ceph-tools
Running the Ceph toolbox on OpenShift Container Storage (OCS) v4.2+
If we are running OpenShift Container Storage, which is built on Rook, first enable the Ceph tools by running the following command:
oc patch OCSInitialization ocsinit -n openshift-storage --type json --patch '[{ "op": "replace", "path": "/spec/enableCephTools", "value": true }]'
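To confirm the patch took effect, the field can be read back; it should print true:
oc get OCSInitialization ocsinit -n openshift-storage -o jsonpath='{.spec.enableCephTools}'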
Create a new file:
$ vi toolbox.yaml
Add the following content to the file:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rook-ceph-tools
  namespace: openshift-storage
  labels:
    app: rook-ceph-tools
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rook-ceph-tools
  template:
    metadata:
      labels:
        app: rook-ceph-tools
    spec:
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: rook-ceph-tools
        image: registry.redhat.io/ocs4/rook-ceph-rhel8-operator:latest
        command: ["/tini"]
        args: ["-g", "--", "/usr/local/bin/toolbox.sh"]
        imagePullPolicy: IfNotPresent
        env:
          - name: ROOK_ADMIN_SECRET
            valueFrom:
              secretKeyRef:
                name: rook-ceph-mon
                key: admin-secret
        securityContext:
          privileged: true
        volumeMounts:
          - mountPath: /dev
            name: dev
          - mountPath: /sys/bus
            name: sysbus
          - mountPath: /lib/modules
            name: libmodules
          - name: mon-endpoint-volume
            mountPath: /etc/rook
      # if hostNetwork: false, the "rbd map" command hangs, see https://github.com/rook/rook/issues/2021
      hostNetwork: true
      volumes:
        - name: dev
          hostPath:
            path: /dev
        - name: sysbus
          hostPath:
            path: /sys/bus
        - name: libmodules
          hostPath:
            path: /lib/modules
        - name: mon-endpoint-volume
          configMap:
            name: rook-ceph-mon-endpoints
            items:
            - key: data
              path: mon-endpoints
Then launch the rook-ceph-tools pod:
oc create -f toolbox.yaml
Wait for the toolbox pod to pull its container image and reach the Running state:
$ oc -n openshift-storage get pod -l "app=rook-ceph-tools"
NAME                               READY   STATUS    RESTARTS   AGE
rook-ceph-tools-86cbb6dddb-vnht9   1/1     Running   0          6m49s
Once the rook-ceph-tools pod is running, we can connect to it with:
oc -n openshift-storage exec -it $(oc -n openshift-storage get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') -- bash
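As with kubectl, recent versions of oc can also target the deployment directly for a one-off check:
oc -n openshift-storage exec deploy/rook-ceph-tools -- ceph health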
Common Ceph commands for troubleshooting
The following commands are commonly used to troubleshoot a Ceph cluster:
ceph status
ceph osd status
ceph osd df
ceph osd utilization
ceph osd pool stats
ceph osd tree
ceph pg stat
All of these commands can be executed from the toolbox container.
See the examples below.
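If you want all of them in one report, a small shell loop inside the toolbox works too; this is just a convenience sketch:
# Word splitting on $cmd is intentional: "osd status" becomes "ceph osd status"
for cmd in status "osd status" "osd df" "osd utilization" "osd pool stats" "osd tree" "pg stat"; do
  echo "== ceph $cmd =="
  ceph $cmd
done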
# ceph -s
  cluster:
    id:     58a41eac-5550-42a2-b7b2-b97c7909a833
    health: HEALTH_WARN
            1 osds down
            1 host (1 osds) down
            1 rack (1 osds) down
            Degraded data redundancy: 91080/273240 objects degraded (33.333%), 80 pgs degraded, 104 pgs undersized

  services:
    mon: 3 daemons, quorum a,b,c (age 2h)
    mgr: a(active, since 2h)
    mds: ocs-storagecluster-cephfilesystem:1 {0=ocs-storagecluster-cephfilesystem-a=up:active} 1 up:standby-replay
    osd: 3 osds: 2 up (since 2h), 3 in (since 4w)
    rgw: 1 daemon active (ocs.storagecluster.cephobjectstore.a)

  task status:

  data:
    pools:   10 pools, 104 pgs
    objects: 91.08k objects, 335 GiB
    usage:   670 GiB used, 3.3 TiB / 4.0 TiB avail
    pgs:     91080/273240 objects degraded (33.333%)
             80 active+undersized+degraded
             24 active+undersized

  io:
    client:   7.7 KiB/s rd, 24 MiB/s wr, 3 op/s rd, 236 op/s wr
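In this example the cluster reports HEALTH_WARN because one OSD (and with it one host and one rack) is down. To see the specific health checks behind a warning, run:
ceph health detail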
Check the OSD tree:
# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME                            STATUS REWEIGHT PRI-AFF
 -1       5.99698 root default
 -4       1.99899     rack rack0
 -3       1.99899         host ocs-deviceset-0-0-prf65
  0   ssd 1.99899             osd.0                    down   1.00000 1.00000
-12       1.99899     rack rack1
-11       1.99899         host ocs-deviceset-1-0-mfgmx
  2   ssd 1.99899             osd.2                    up     1.00000 1.00000
 -8       1.99899     rack rack2
 -7       1.99899         host ocs-deviceset-2-0-b96pk
  1   ssd 1.99899             osd.1                    up     1.00000 1.00000
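The tree above shows osd.0 down under rack0. To report where a given OSD lives (its address and CRUSH location), pass its ID to ceph osd find:
ceph osd find 0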
List the pools:
# ceph osd lspools
1 ocs-storagecluster-cephblockpool
2 ocs-storagecluster-cephobjectstore.rgw.control
3 ocs-storagecluster-cephfilesystem-metadata
4 ocs-storagecluster-cephobjectstore.rgw.meta
5 ocs-storagecluster-cephfilesystem-data0
6 ocs-storagecluster-cephobjectstore.rgw.log
7 .rgw.root
8 ocs-storagecluster-cephobjectstore.rgw.buckets.index
9 ocs-storagecluster-cephobjectstore.rgw.buckets.non-ec
10 ocs-storagecluster-cephobjectstore.rgw.buckets.data
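To pair the pool list with per-pool capacity and usage figures, the usual companion command is:
ceph df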