Automate with Kubectl, Bash and CRON.

Category: IT
Written by Administrator

Lately I stumbled upon a requirement: get data from a pod and put it into cloud storage. It looked simple enough, so I got into it straight away.
    Nothing fancy on the infra side:

bastion -> k8s -> bastion -> tar data -> gcs.

    To accomplish that, we need a few things, starting with proper access to the k8s cluster, so I began by getting the right permissions. There are several ways to do it, but I decided to go with a ServiceAccount. This approach works fine and gives some flexibility for modifying access control further down the line (in case it's needed). Once that was set up, I moved on to installing the necessary packages:
1. gcloud - to log in to google resources
2. kubectl - to interact with k8s cluster
3. gsutil - to send data to GCS
4. gke-gcloud-auth-plugin - to authorize access to GKE.
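Before installing anything, a quick check of what is already on the machine can save time. A minimal sketch, with the tool names taken from the list above:

```shell
# Check which of the required tools are already on $PATH.
for tool in gcloud kubectl gsutil gke-gcloud-auth-plugin; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: missing"
  fi
done
```

Missing pieces can typically be added with `gcloud components install kubectl gke-gcloud-auth-plugin` (gsutil ships with the Cloud SDK itself).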
    So far so good.
    After that, I started writing a bash script, which, simply enough, looked something like this:

#!/bin/bash

CURRENT_DATE=$(date -I)

GCS="$BUCKET_NAME"

echo "Activating proper service account..."
gcloud auth activate-service-account --key-file="$PATH_TO_YOUR_SA"
gcloud container clusters get-credentials "$CLUSTER_NAME" --region "$REGION" --project "$PROJECT"

echo "Copying data into local folder..."
/home/$USER/google-cloud-sdk/bin/kubectl cp namespace/pod-name:/data local-data/

echo "Packing data..."
tar czf "backup-data-${CURRENT_DATE}.tar.gz" local-data/

echo "Sending data to gcs..."
gcloud storage cp "backup-data-${CURRENT_DATE}.tar.gz" "gs://${GCS}"

echo "Removing dumps..."
rm -r local-data/*
rm "backup-data-${CURRENT_DATE}.tar.gz"

echo "Finished! ${CURRENT_DATE}"
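Before scheduling anything, it's worth sanity-checking the archive step locally, since an empty tarball is easy to miss. A small sketch (the file names here are made up for illustration):

```shell
# Build a tiny archive the same way the script does and list its contents;
# an empty listing would mean the copy step produced nothing.
mkdir -p local-data && echo "sample" > local-data/sample.txt
tar czf backup-test.tar.gz local-data/
tar tzf backup-test.tar.gz

# Clean up the test artifacts.
rm -r local-data backup-test.tar.gz
```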

So after that, I made the script executable with a simple chmod +x script_name.sh and started testing from the console. It worked as expected, so happy days. Just one more thing to set up, cron, and voilà. I modified the crontab via crontab -e, set the time when the script needs to run, and we're done! ...well, not exactly.

I checked the next day: the script ran and cron fired, but the archive was... empty. Why? Due to several factors that are not so easy to figure out. The first thing I missed was that the Cloud SDK executables are not on cron's default $PATH, so the script run from cron simply wasn't executing them, not kubectl anyway. OK, I figured I would put the absolute path to the executable. Still no joy! But why? That turned out to be some inconsistency on the kubectl side. Through trial and error I finally managed to find some clues and resolve the problem. Here is how it progressed.

First, I wrote a test script to check why kubectl was not executing, put it into cron, and it looked something like this:

#!/bin/bash

echo "Activating proper service account..."
gcloud auth ...
gcloud container ...

echo "Get pods lists for ..."
/home/$USER/google-cloud-sdk/bin/kubectl get pods -n namespace > /home/$USER/test.log 2>&1

Execute it and... fail! It throws an error about passing arguments in the wrong order. This is very strange, as executing it from the console works fine. I figured the only reasonable workaround would be to move the namespace flag before the subcommand, like this:

/home/$USER/google-cloud-sdk/bin/kubectl -n namespace get pods > /home/$USER/test.log 2>&1

and it worked. OK... strange behaviour, but whatever; let's get to the point of fixing the main script. I applied the same change to the prototype, and guess what: it still didn't work! So I moved on to more debugging via 2>&1, and it turned out the GKE auth plugin could not be found, as if it were not installed! Another mystery, so I finally added the google-cloud-sdk executable dir to $PATH via:

export PATH="/home/$USER/google-cloud-sdk/bin:$PATH"

and it worked, as it should. All the required data is downloaded to the local directory and uploaded to GCS.
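Putting the fix together, the cron-run script wants a header along these lines. This is a sketch under two assumptions: the SDK lives in the same place as above, and $HOME is used instead of /home/$USER because cron may not set USER at all:

```shell
#!/bin/bash
# Cron runs with a minimal environment, so prepend the Cloud SDK bin dir
# explicitly (path is an assumption -- adjust to your install location).
export PATH="$HOME/google-cloud-sdk/bin:$PATH"

# Fail loudly if the directory still is not on PATH.
case ":$PATH:" in
  *":$HOME/google-cloud-sdk/bin:"*) echo "PATH ok" ;;
  *) echo "Cloud SDK dir missing from PATH" >&2; exit 1 ;;
esac
```

The crontab entry itself can then be something like `30 2 * * * /home/youruser/backup.sh >> /home/youruser/backup.log 2>&1` (schedule and paths are hypothetical); note the `>> file 2>&1` order, which sends stderr into the log as well, unlike `2>&1 >> file`.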