Fargate for EKS is a great serverless service from AWS, but serverless compute is ephemeral by nature: there is no way to run a DaemonSet on Fargate, so no way to run Datadog to collect metrics, logs, etc. When something goes wrong, it would be great if we could still check the logs even after the pod and the node are gone. We can mount EFS to a pod running on Fargate, and if we write logs to the EFS file system, the information stays available after the pod is destroyed.
There are already blog posts about how to enable EFS on Fargate, like https://aws.amazon.com/blogs/aws/new-aws-fargate-for-amazon-eks-now-supports-amazon-efs/, but they work for newly created EKS clusters only and assume the CSI driver is already installed in the cluster. In my case I have a really old EKS cluster that was upgraded to a newer version, so the CSI driver was never installed and we have to install it manually. Below are the steps I had to perform to make EFS work on my Fargate for EKS.
create an IAM policy for the CSI driver
- save the content below to efs-csi-iam-policy.json
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"elasticfilesystem:DescribeAccessPoints",
"elasticfilesystem:DescribeFileSystems"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"elasticfilesystem:CreateAccessPoint"
],
"Resource": "*",
"Condition": {
"StringLike": {
"aws:RequestTag/efs.csi.aws.com/cluster": "true"
}
}
},
{
"Effect": "Allow",
"Action": "elasticfilesystem:DeleteAccessPoint",
"Resource": "*",
"Condition": {
"StringEquals": {
"aws:ResourceTag/efs.csi.aws.com/cluster": "true"
}
}
}
]
}
- create the policy by running
aws iam create-policy \
--policy-name AmazonEKS_EFS_CSI_Driver_Policy \
--policy-document file://efs-csi-iam-policy.json
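The create-policy call prints the new policy's ARN, which we'll need when attaching the policy to the role later. If you lose it, one way to look it up again (assuming the policy name above) is
aws iam list-policies --scope Local \
  --query "Policies[?PolicyName=='AmazonEKS_EFS_CSI_Driver_Policy'].Arn" \
  --output text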
create an IAM role for the CSI driver
- get the cluster's OIDC provider URL
aws eks describe-cluster --name acme-Development --query "cluster.identity.oidc.issuer" --output text
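A really old cluster may not have an IAM OIDC provider associated yet, in which case the trust policy below won't work. One way to check, and to associate a provider if it's missing, is via eksctl (assuming eksctl is installed):
# the provider for the cluster's OIDC ID should appear in this list
aws iam list-open-id-connect-providers
# if it doesn't, associate one
eksctl utils associate-iam-oidc-provider --cluster acme-Development --approve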
- create the trust policy by saving the content below to efs-csi-iam-role.json (replace the xxxx placeholders with your account ID and OIDC ID)
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::xxxx:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/xxxx"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"oidc.eks.us-east-1.amazonaws.com/id/xxxx:sub": "system:serviceaccount:kube-system:efs-csi-controller-sa"
}
}
}
]
}
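The account ID and OIDC ID for those placeholders can be pulled from the CLI rather than copied by hand; a small sketch (the variable names are my own):
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
OIDC_ID=$(aws eks describe-cluster --name acme-Development \
  --query "cluster.identity.oidc.issuer" --output text | awk -F'/' '{print $NF}')
echo "$ACCOUNT_ID $OIDC_ID"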
- create the IAM role
aws iam create-role \
--role-name AmazonEKS_EFS_CSI_DriverRole \
--assume-role-policy-document file://efs-csi-iam-role.json
- attach the policy to the role (using the policy ARN from the create-policy output)
aws iam attach-role-policy \
--policy-arn arn:aws:iam::xxxxx:policy/AmazonEKS_EFS_CSI_Driver_Policy \
--role-name AmazonEKS_EFS_CSI_DriverRole
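- optionally verify the attachment took effect
aws iam list-attached-role-policies --role-name AmazonEKS_EFS_CSI_DriverRole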
install the CSI driver
- create the service account (replace xxxxx with your AWS account ID)
apiVersion: v1
kind: ServiceAccount
metadata:
name: efs-csi-controller-sa
namespace: kube-system
labels:
app.kubernetes.io/name: aws-efs-csi-driver
annotations:
eks.amazonaws.com/role-arn: arn:aws:iam::xxxxx:role/AmazonEKS_EFS_CSI_DriverRole
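Assuming the manifest above is saved as efs-csi-sa.yaml (the file name is my own), apply it and confirm the role annotation is in place:
kubectl apply -f efs-csi-sa.yaml
kubectl describe sa efs-csi-controller-sa -n kube-system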
- save the driver manifest below as driver.yaml, then apply it (see the commands after the manifest)
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
app.kubernetes.io/name: aws-efs-csi-driver
name: efs-csi-external-provisioner-role
rules:
- apiGroups:
- ""
resources:
- persistentvolumes
verbs:
- get
- list
- watch
- create
- delete
- apiGroups:
- ""
resources:
- persistentvolumeclaims
verbs:
- get
- list
- watch
- update
- apiGroups:
- storage.k8s.io
resources:
- storageclasses
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- events
verbs:
- list
- watch
- create
- apiGroups:
- storage.k8s.io
resources:
- csinodes
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
- list
- watch
- apiGroups:
- coordination.k8s.io
resources:
- leases
verbs:
- get
- watch
- list
- delete
- update
- create
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
labels:
app.kubernetes.io/name: aws-efs-csi-driver
name: efs-csi-provisioner-binding
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: efs-csi-external-provisioner-role
subjects:
- kind: ServiceAccount
name: efs-csi-controller-sa
namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app.kubernetes.io/name: aws-efs-csi-driver
name: efs-csi-controller
namespace: kube-system
spec:
replicas: 2
selector:
matchLabels:
app: efs-csi-controller
app.kubernetes.io/instance: kustomize
app.kubernetes.io/name: aws-efs-csi-driver
template:
metadata:
labels:
app: efs-csi-controller
app.kubernetes.io/instance: kustomize
app.kubernetes.io/name: aws-efs-csi-driver
spec:
containers:
- args:
- --endpoint=$(CSI_ENDPOINT)
- --logtostderr
- --v=2
- --delete-access-point-root-dir=false
env:
- name: CSI_ENDPOINT
value: unix:///var/lib/csi/sockets/pluginproxy/csi.sock
image: 602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/aws-efs-csi-driver:v1.2.1
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 5
httpGet:
path: /healthz
port: healthz
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 3
name: efs-plugin
ports:
- containerPort: 9909
name: healthz
protocol: TCP
securityContext:
privileged: true
volumeMounts:
- mountPath: /var/lib/csi/sockets/pluginproxy/
name: socket-dir
- args:
- --csi-address=$(ADDRESS)
- --v=2
- --feature-gates=Topology=true
- --leader-election
env:
- name: ADDRESS
value: /var/lib/csi/sockets/pluginproxy/csi.sock
image: public.ecr.aws/eks-distro/kubernetes-csi/external-provisioner:v2.1.1-eks-1-18-2
name: csi-provisioner
volumeMounts:
- mountPath: /var/lib/csi/sockets/pluginproxy/
name: socket-dir
- args:
- --csi-address=/csi/csi.sock
- --health-port=9909
image: public.ecr.aws/eks-distro/kubernetes-csi/livenessprobe:v2.2.0-eks-1-18-2
name: liveness-probe
volumeMounts:
- mountPath: /csi
name: socket-dir
hostNetwork: true
nodeSelector:
kubernetes.io/os: linux
priorityClassName: system-cluster-critical
serviceAccountName: efs-csi-controller-sa
tolerations:
- operator: Exists
volumes:
- emptyDir: {}
name: socket-dir
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
labels:
app.kubernetes.io/name: aws-efs-csi-driver
name: efs-csi-node
namespace: kube-system
spec:
selector:
matchLabels:
app: efs-csi-node
app.kubernetes.io/instance: kustomize
app.kubernetes.io/name: aws-efs-csi-driver
template:
metadata:
labels:
app: efs-csi-node
app.kubernetes.io/instance: kustomize
app.kubernetes.io/name: aws-efs-csi-driver
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: eks.amazonaws.com/compute-type
operator: NotIn
values:
- fargate
containers:
- args:
- --endpoint=$(CSI_ENDPOINT)
- --logtostderr
- --v=2
env:
- name: CSI_ENDPOINT
value: unix:/csi/csi.sock
image: 602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/aws-efs-csi-driver:v1.2.1
livenessProbe:
failureThreshold: 5
httpGet:
path: /healthz
port: healthz
initialDelaySeconds: 10
periodSeconds: 2
timeoutSeconds: 3
name: efs-plugin
ports:
- containerPort: 9809
name: healthz
protocol: TCP
securityContext:
privileged: true
volumeMounts:
- mountPath: /var/lib/kubelet
mountPropagation: Bidirectional
name: kubelet-dir
- mountPath: /csi
name: plugin-dir
- mountPath: /var/run/efs
name: efs-state-dir
- mountPath: /var/amazon/efs
name: efs-utils-config
- mountPath: /etc/amazon/efs-legacy
name: efs-utils-config-legacy
- args:
- --csi-address=$(ADDRESS)
- --kubelet-registration-path=$(DRIVER_REG_SOCK_PATH)
- --v=2
env:
- name: ADDRESS
value: /csi/csi.sock
- name: DRIVER_REG_SOCK_PATH
value: /var/lib/kubelet/plugins/efs.csi.aws.com/csi.sock
- name: KUBE_NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
image: public.ecr.aws/eks-distro/kubernetes-csi/node-driver-registrar:v2.1.0-eks-1-18-2
name: csi-driver-registrar
volumeMounts:
- mountPath: /csi
name: plugin-dir
- mountPath: /registration
name: registration-dir
- args:
- --csi-address=/csi/csi.sock
- --health-port=9809
- --v=2
image: public.ecr.aws/eks-distro/kubernetes-csi/livenessprobe:v2.2.0-eks-1-18-2
name: liveness-probe
volumeMounts:
- mountPath: /csi
name: plugin-dir
hostNetwork: true
nodeSelector:
beta.kubernetes.io/os: linux
priorityClassName: system-node-critical
tolerations:
- operator: Exists
volumes:
- hostPath:
path: /var/lib/kubelet
type: Directory
name: kubelet-dir
- hostPath:
path: /var/lib/kubelet/plugins/efs.csi.aws.com/
type: DirectoryOrCreate
name: plugin-dir
- hostPath:
path: /var/lib/kubelet/plugins_registry/
type: Directory
name: registration-dir
- hostPath:
path: /var/run/efs
type: DirectoryOrCreate
name: efs-state-dir
- hostPath:
path: /var/amazon/efs
type: DirectoryOrCreate
name: efs-utils-config
- hostPath:
path: /etc/amazon/efs
type: DirectoryOrCreate
name: efs-utils-config-legacy
---
apiVersion: storage.k8s.io/v1beta1
kind: CSIDriver
metadata:
annotations:
helm.sh/hook: pre-install, pre-upgrade
helm.sh/hook-delete-policy: before-hook-creation
helm.sh/resource-policy: keep
name: efs.csi.aws.com
namespace: kube-system
spec:
attachRequired: false
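- apply the manifest and make sure the controller comes up; note the DaemonSet's node affinity intentionally excludes Fargate, so it only schedules on EC2 nodes, which is expected
kubectl apply -f driver.yaml
kubectl get pods -n kube-system -l app.kubernetes.io/name=aws-efs-csi-driver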
create the storageclass
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: efs-sc
provisioner: efs.csi.aws.com
parameters:
provisioningMode: efs-ap
  fileSystemId: fs-xxxxx # the EFS file system ID; find it in the AWS console or via the CLI (see below)
directoryPerms: "700"
# gidRangeStart: "1000" # optional
# gidRangeEnd: "2000" # optional
# basePath: "/dynamic_provisioning" # optional
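Assuming the StorageClass is saved as storageclass.yaml (file name is mine), apply it; the file system ID can also be looked up from the CLI instead of the console:
aws efs describe-file-systems --query "FileSystems[*].{ID:FileSystemId,Name:Name}" --output table
kubectl apply -f storageclass.yaml
kubectl get sc efs-sc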
create the PV and PVC
apiVersion: v1
kind: PersistentVolume
metadata:
name: efs-pv-flow
spec:
capacity:
storage: 5Gi
volumeMode: Filesystem
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Retain
storageClassName: efs-sc
csi:
driver: efs.csi.aws.com
volumeHandle: fs-xxxx
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: efs-claim-flow
namespace: flow
spec:
accessModes:
- ReadWriteMany
storageClassName: efs-sc
resources:
requests:
storage: 5Gi
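Assuming both objects are saved in pv-pvc.yaml (file name is mine), apply them; since the PV is statically provisioned, the PVC should show Bound within a few seconds:
kubectl apply -f pv-pvc.yaml
kubectl get pv efs-pv-flow
kubectl get pvc efs-claim-flow -n flow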
create the pod to test EFS
apiVersion: v1
kind: Pod
metadata:
name: efs-app
namespace: flow
labels:
infrastructure: fargate
spec:
containers:
- name: app
image: centos
command: ["/bin/sh"]
args: ["-c", "while true; do echo $(date -u) >> /data/out; sleep 5; done"]
volumeMounts:
- name: persistent-storage
mountPath: /data
volumes:
- name: persistent-storage
persistentVolumeClaim:
claimName: efs-claim-flow
I've created a Fargate profile that matches the label infrastructure: fargate in the namespace flow, so the pod gets scheduled on Fargate (a sketch of the eksctl command is below).
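One way to create such a profile with eksctl (the profile name is my own):
eksctl create fargateprofile \
  --cluster acme-Development \
  --name flow-fargate \
  --namespace flow \
  --labels infrastructure=fargate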
After the pod is created, we can log in to it and check the file system: the /data folder has EFS mounted, so we can now access EFS from the pod!
kubectl exec -it efs-app -n flow -- bash
[root@efs-app /]# df -h
Filesystem Size Used Avail Use% Mounted on
overlay 30G 10G 18G 36% /
tmpfs 64M 0 64M 0% /dev
tmpfs 2.0G 0 2.0G 0% /sys/fs/cgroup
127.0.0.1:/ 8.0E 402G 8.0E 1% /data
overlay 30G 10G 18G 36% /etc/hosts
/dev/xvdcz 30G 10G 18G 36% /etc/hostname
shm 64M 0 64M 0% /dev/shm
tmpfs 2.0G 12K 2.0G 1% /run/secrets/kubernetes.io/serviceaccount
tmpfs 2.0G 0 2.0G 0% /proc/acpi
tmpfs 2.0G 0 2.0G 0% /proc/scsi
tmpfs 2.0G 0 2.0G 0% /sys/firmware
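Since the test pod appends a timestamp to /data/out every 5 seconds, we can also prove the durability claim: read the file, delete the pod, recreate it, and the earlier entries are still there.
kubectl exec efs-app -n flow -- tail -n 3 /data/out
kubectl delete pod efs-app -n flow
# recreate the pod from the same manifest, then read the file again
kubectl exec efs-app -n flow -- tail -n 3 /data/out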