What is Argo Workflows?

Argo Workflows is a general solution to a whole spectrum of problems: any problem that can be expressed as a workflow with discrete steps can use Argo Workflows as a ready-made solution.

A pipeline is an excellent example of such a case. The benefit of Argo Workflows is that it gives you enough freedom to customize each step and interact with external platforms along the way – a requirement for almost all pipelines – and it does so conveniently.

Thanks to the flexibility of the tool itself, pipelines defined with Argo Workflows work for small projects as well as large ones. All pipelines can be stored in a Git repository and read or modified by anyone who understands YAML.

It can even be used in parallel with Argo CD for managing pipelines as code: pipelines are defined in a declarative way and synced by Argo CD to Argo Workflows.

Outcome:

  • Easy pipeline management
  • Managed through version control
  • Easy rollback

Anyone who understands how Kubernetes works will have no trouble onboarding onto Argo Workflows and Argo Events and creating workflows.

Argo Workflows is an open-source container-native workflow engine for orchestrating parallel jobs on Kubernetes. In practice, this means it allows you to model processes as workflows. Originally, each step in a workflow was defined as a container; other step types were added along the way, but more on that later.

On the other side, Argo Events is an event-driven workflow automation framework for Kubernetes which helps you trigger K8s objects, Argo Workflows, Serverless workloads, etc., on various events.

Argo Workflows and Argo Events are two separate projects – they can be integrated with each other so that particular events trigger workflows, but it is also possible to run either of them standalone.

Argo Workflows

As mentioned before, Argo Workflows (referred to simply as workflows in the rest of the text) makes it possible to define workflows using custom resources on k8s. Let's introduce the core concepts. I suggest reading the official core-concepts documentation in full, but you can get an overview of the core concepts below.

There are 3 crucial concepts:

  • Workflow
    • The Workflow is the custom resource that defines the workflow to be executed.
  • Template
    • A template is an actual step: it defines a container and the action(s) to be executed.
  • Template invocators
    • Template invocators are responsible for invoking templates and controlling their execution.

The Workflow custom resource defines the workflow to be executed. The most important fields of the CR live in workflow.spec: spec.templates and spec.entrypoint.

Workflows define 4 types of templates:

  • Container
    • The container template allows you to run a container and define its args.
  • Script
    • A convenience wrapper around the container template – it allows defining a script to run.
  • Resource
    • Creates a Kubernetes resource (even a custom resource) on the cluster.
  • Suspend
    • Suspends execution for a given duration or until manual intervention.

And 2 types of Template invocators:

  • Steps
    • Steps define a list of templates that are executed sequentially.
  • DAG
    • Directed acyclic graph which executes templates following rules of dependencies and flow.

The first group of templates defines actions: containers, scripts (a wrapper around a container), resource creation on k8s (Kubernetes objects), and suspend (sleep for n seconds or wait for resumption). These templates cannot do anything on their own – this is why there are Steps and DAGs. They control the execution of the other templates and are called template invocators.
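To make the template types concrete, here is a minimal sketch (not from the example repository; all names are illustrative) that chains a script template and a suspend template with a steps invocator:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: template-types-demo-
spec:
  entrypoint: main
  templates:
    - name: main
      steps:                       # steps invocator: runs templates sequentially
        - - name: say-hello
            template: hello-script
        - - name: wait-a-bit
            template: short-pause
    - name: hello-script           # script template: a wrapper around a container
      script:
        image: python:alpine3.16
        command: [python]
        source: |
          print("hello from a script template")
    - name: short-pause            # suspend template: waits for 10 seconds
      suspend:
        duration: "10s"
```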

Let’s see an example of the workflow definition:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-world-  # Name of this Workflow
spec:
  entrypoint: whalesay        # Defines "whalesay" as the "main" template
  templates:
  - name: whalesay            # Defining the "whalesay" template
    container:
      image: docker/whalesay
      command: [cowsay]
      args: ["hello world"]   # This template runs "cowsay" in the "whalesay" image with arguments "hello world"

The above k8s resource defines the following important things:

  • Template (Container) named whalesay
  • Entrypoint -> whalesay template

This workflow will execute the template whalesay and run the container with the image and arguments defined in the workflow.

Another neat feature is the WorkflowTemplate CR, which can be referenced from workflow.spec. A WorkflowTemplate's spec follows the same structure as workflow.spec.

apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: workflow-template-ws
spec:
  entrypoint: whalesay-template     
  arguments:
    parameters:
      - name: message
        value: hello world
  templates:
    - name: whalesay-template
      inputs:
        parameters:
          - name: message
      container:
        image: docker/whalesay
        command: [cowsay]
        args: ["{{inputs.parameters.message}}"]

And now this WorkflowTemplate is referenced from the actual Workflow.

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: workflow-template-hello-world-
spec:
  entrypoint: whalesay-template
  arguments:
    parameters:
      - name: message
        value: "from workflow"
  workflowTemplateRef:
    name: workflow-template-ws

It’s important to mention that Workflows are live objects: they represent both the definition of the workflow and its state. The relationship between a WorkflowTemplate and the Workflows created from it can be understood like that between a class and its instances.

This is pretty much 101 for workflows. To see more details visit the official docs and skim through them to get more familiar with the concepts: https://argoproj.github.io/argo-workflows/workflow-concepts/

Argo Events

Argo Events (referred to simply as events in the rest of the text) is Argo’s product for reacting to various external events. What distinguishes it is that it is also k8s-native and can interact with workflows as well as other entities.

“Argo Events is an event-driven workflow automation framework for Kubernetes which helps you trigger K8s objects, Argo Workflows, Serverless workloads, etc. on events from various sources like webhooks, S3, schedules, messaging queues, gcp pubsub, sns, sqs, etc.”

As can be seen from the official documentation quoted above, it comes with integrations for the most popular services out of the box.

Now let’s introduce the core concepts of the events:

  • EventSource
    • An EventSource defines the configuration for consuming events from an external source.
  • Sensor
    • A Sensor defines the set of events that must occur before an action is started.
  • EventBus
    • The EventBus is the internal communication bus of Argo Events.
  • Trigger
    • Triggers are the set of actions that will occur after the dependencies are met.

For the detailed product documentation, see the official docs.

An EventSource is a definition of the external source that will emit events. It can be a webhook, a GitHub/GitLab event, and so on. There are 26 event sources available.
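For instance, one of those sources is the calendar (schedule) EventSource; a minimal sketch (the resource and event names are illustrative) looks like this:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: EventSource
metadata:
  name: nightly
  namespace: argo-events
spec:
  calendar:
    nightly-build:            # event name referenced by a Sensor's eventName
      schedule: "0 2 * * *"   # standard cron syntax: every day at 02:00
```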

Sensors are the glue between EventSources and Triggers. A Sensor defines the set of dependencies that must occur for the Sensor to trigger an action.

Some may confuse an Argo Trigger with the thing that fires on a specific event, but it is not – EventSources are responsible for that. An Argo Trigger is a set of actions that will occur after one EventSource fires (1), or after multiple EventSources fire (2) when the Sensor's dependencies are defined across multiple event sources.

Graphically represented: 

(1) EventSource -> Sensor -> Trigger

(2) EventSource ---+
                   |---> Sensor -> Trigger
    EventSource ---+

The EventBus acts as the transport layer of Argo-Events by connecting the event sources and sensors.
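The EventBus is the only one of these resources not shown later in this article; a default NATS-backed EventBus can be sketched roughly like this (a minimal sketch, not the exact manifest used in the lab):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: EventBus
metadata:
  name: default            # EventSources and Sensors connect to the bus named "default"
  namespace: argo-events
spec:
  nats:
    native:                # let Argo Events run its own NATS streaming cluster
      replicas: 3
```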

And last, but the most important workhorse of events, are Triggers. A Trigger is the resource/workload executed by the Sensor once the event dependencies are resolved.

Argo Events custom resources will be introduced via examples from this blog.

Agenda

We will follow the example of building and deploying a simple Go API on a k3s cluster using Helm. The required configuration will be deployed before the Helm chart is installed. Along the way, we will demonstrate what is needed to implement CI/CD using Argo Workflows and Argo Events.

A fully runnable example, including the lab setup, is available in the GitHub repositories linked at the end of the article.

Follow the README.md in the argo-workflows repository, or the instructions in the Setup of the laboratory section, to run the examples.

Setup of the laboratory

To run the CI/CD example – a proper lab is required. 

Pre-Reqs

For the sake of setting up the laboratory, the following items are required:

  • Stable network connection (Pull and push docker images)
  • 4GB Memory, 4CPU, and 10GB storage (You can use lower reqs, but UX is not guaranteed)
  • A Docker Hub account with a personal access token created for pushing to the repository.

Installation and setup (MacOS)

brew install multipass --cask
git clone https://github.com/adnanselimovic-abh/argo-workflows.git 
cd argo-workflows
multipass launch -n k3s --mem 4G --disk 10G --cpus 4 --cloud-init cloud-config.yaml

Installation and setup (Linux)

snap install multipass
git clone https://github.com/adnanselimovic-abh/argo-workflows.git 
cd argo-workflows
multipass launch -n k3s --mem 4G --disk 10G --cpus 4 --cloud-init cloud-config.yaml

How to configure laboratory

You will need a Docker Hub registry account so that the CI/CD flow can push container images to the remote repository.

export DOCKER_USERNAME=[USERNAME]
export DOCKER_TOKEN=[PERSONAL_ACCESS_TOKEN]
kubectl create secret generic docker-config --from-literal="config.json={\"auths\": {\"https://index.docker.io/v1/\": {\"auth\": \"$(echo -n $DOCKER_USERNAME:$DOCKER_TOKEN|base64)\"}}}" --namespace argo-events
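For reference, the command above creates a Secret roughly equivalent to the following manifest (the auth value is the base64 encoding of USERNAME:TOKEN, shown here as a placeholder):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: docker-config
  namespace: argo-events
stringData:
  config.json: |
    {
      "auths": {
        "https://index.docker.io/v1/": {
          "auth": "<base64 of USERNAME:TOKEN>"
        }
      }
    }
```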

After multipass launches the k3s VM, a shell can be spawned into it. One more step is needed before the laboratory is configured: fetch the IP of the newly created VM.

multipass list k3s
Name                    State             IPv4             Image
k3s                     Running           192.168.64.2     Ubuntu 20.04 LTS
                                          10.42.0.0
                                          10.42.0.1
                                          172.17.0.1

In this case, it is 192.168.64.2. Now the hosts file needs to be edited.

Linux

echo "192.168.64.2 local.k3s" | sudo tee -a /etc/hosts

MacOS

echo "192.168.64.2 local.k3s" | sudo tee -a /private/etc/hosts

Replace 192.168.64.2 with the IP address you get from multipass output!

Spawn shell

multipass shell k3s
ubuntu@k3s:~$ k get pods -n kube-system
NAME                                      READY   STATUS      RESTARTS   AGE
coredns-d76bd69b-rxzfg                    1/1     Running     0          6h7m
local-path-provisioner-6c79684f77-k8mh2   1/1     Running     0          6h7m
helm-install-traefik-crd-mjzw6            0/1     Completed   0          6h7m
metrics-server-7cd5fcb6b7-4krsv           1/1     Running     0          6h7m
svclb-traefik-6a62f4c4-86rvd              2/2     Running     0          6h5m
helm-install-traefik-s6b7t                0/1     Completed   2          6h7m
traefik-df4ff85d6-mjzkt                   1/1     Running     0          6h5m

The output should look like the above. If everything is configured correctly:

  • k3s is installed
  • traefik is available and local.k3s is serving ingress endpoints
  • argo-workflows and argo-events are installed and pre-configured.

Test if everything is configured

From the host machine terminal, execute

curl local.k3s
404 page not found

Traefik returned a 404 page not found, so everything works!

Note: It can take 5-10 minutes for the virtual machine to be provisioned and everything to be deployed. It can take longer if you tweak the CPU and Memory in the multipass launch command.

How to use Argo workflows?

Hit the URL in the browser of your preference:

http://local.k3s/argo

To log in to the Argo UI, follow the instructions below:

SECRET=$(kubectl get sa atlantbh-argo-workflows-server -o=jsonpath='{.secrets[0].name}' --namespace argo)
ARGO_TOKEN="Bearer $(kubectl get secret $SECRET -o=jsonpath='{.data.token}' --namespace argo | base64 -d)"
echo "$ARGO_TOKEN"

Batteries included

VM provisioning will install k3s, configure the ingresses, create the namespaces, and deploy Argo Workflows and Argo Events with the CRs needed for the demonstration.

Traefik will work out of the box, and ingresses can be served on the cluster.

If you are interested in details of how k3s networking is set up and how Traefik v2 can be utilized on k3s, check this out.

This example can be easily extended, and experimentation with the setup to solve other problems is possible.

CI/CD

Argo Workflows

To create the pipeline, a WorkflowTemplate will be defined and then used to create Workflows via the Argo UI and via Argo Events.

All the concepts introduced at the beginning will be employed in the CI/CD example. cloud-init will prepare the k3s cluster for us (batteries included), so we don't have to worry about Kubernetes and can focus on Argo Workflows and Events.

WorkflowTemplate

Let’s start with the workflow template. The pipeline will be defined as a DAG (directed acyclic graph), which means that a step starts executing only after its dependencies have completed, and no closed loop can occur.

A DAG can be created using either a Workflow or a WorkflowTemplate. A WorkflowTemplate example is given below.

apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: cicd-workflow-template
  namespace: argo-events
  annotations:
    workflows.argoproj.io/description: |
      This workflow is implementing CICD for simple GO API.
spec:
  entrypoint: main
  templates:
    - name: main
      dag:
        tasks:
          - name: clone
            template: clone
          - name: deps
            template: deps
            dependencies:
              - clone
          - name: build
            template: build
            dependencies:
              - deps
          - name: docker-build
            template: docker-build
            dependencies:
              - build
          - name: create-config
            template: create-config-k8s
            dependencies:
              - docker-build
          - name: create-secret
            template: create-secret-k8s
            dependencies:
              - docker-build
          - name: deploy
            template: deploy
            dependencies:
              - create-config
              - create-secret
### Rest of the templates stripped down for readability

As can be seen from the YAML above, the pipeline is defined as a DAG. This DAG is the template invocator, referencing templates defined in the WorkflowTemplate (stripped from the example above for better readability). It’s easy to control the execution flow using dependencies: the graph can be shaped as needed, and parallel execution is straightforward. For example, in this pipeline, create-config and create-secret will run in parallel.

Let’s check out two types of templates referenced from the template invocators above: Container and Resource.

- name: build
  container:
    image: golang:1.18
    volumeMounts:
      - mountPath: /go/src/github.com/go-api
        name: work
        subPath: src
      - mountPath: /go/pkg/mod
        name: work
        subPath: GOMODCACHE
      - mountPath: /root/.cache/go-build
        name: work
        subPath: GOCACHE
    workingDir: /go/src/github.com/go-api
    command: [ sh, -xuce ]
    args:
      - |
        CGO_ENABLED=0 go build main.go

This template builds the Go API binary artifact needed by the rest of the pipeline steps, e.g., the Docker build. All steps share the same working PVC, so each step can mount it and use the artifacts produced by the previous steps.
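The shared PVC is declared once at the workflow level via volumeClaimTemplates; a sketch of what that declaration could look like (the name work matches the volumeMounts above; the size is illustrative):

```yaml
spec:
  volumeClaimTemplates:
    - metadata:
        name: work                     # referenced by each template's volumeMounts
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 1Gi               # illustrative size
```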

Another type of template used in the example is the resource template. The YAML is given below.

- name: create-config-k8s
  resource:
    action: apply
    manifest: |
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: go-api-config
        namespace: go-api
      data:
        branch: {{workflow.parameters.branch}}

This template will create a ConfigMap in the go-api namespace. It’s easy to define Kubernetes resources through an Argo Workflow, so pre-configuration for a pod can be handled elegantly within the pipeline. Any k8s object can be defined this way.
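On the consuming side, the deployed go-api pod could read this ConfigMap, for example as an environment variable (a hypothetical snippet, not the actual chart):

```yaml
# inside the Deployment's container spec
env:
  - name: BRANCH
    valueFrom:
      configMapKeyRef:
        name: go-api-config   # the ConfigMap created by the resource template
        key: branch
```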

To see a full workflow template and all DAG steps defining the CI/CD pipeline – click here.

Inputs, arguments, and parameters

The resource template can be parameterized via dynamic variables in the same definition. Argo Workflows makes this possible via arguments, e.g.:

{{workflow.parameters.log-level}}

To define arguments for a workflow, put a parameters array under the arguments field of the spec, e.g.:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: global-parameters-
spec:
  arguments:
    parameters:
    - name: log-level
      value: INFO
# Rest of the template ...

The parameter log-level can be used throughout the workflow in the same way {{workflow.parameters.branch}} is used in the CI/CD example.
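A template in the same workflow could then consume this parameter, for example (an illustrative sketch; the template name and image are assumptions):

```yaml
  templates:
    - name: print-log-level
      container:
        image: alpine:3.16
        command: [ sh, -c ]
        args: [ "echo log level is {{workflow.parameters.log-level}}" ]
```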

Running CI/CD via ArgoUI

Argo Workflows provides an easy-to-use UI, which is available under local.k3s/argo in the laboratory. Workflow templates can be submitted via the UI or via other mediums.

Let’s list the created workflow templates in the UI. The WorkflowTemplate we defined in YAML appears in the list.

WorkflowTemplate

Now we can submit this WorkflowTemplate to create a running Workflow. As you can see, all the parameters defined in the arguments field are available in the UI before submitting the workflow.

Argo Workflows

After the Workflow is started, you can track workflow progress via UI. The completed CI/CD pipeline is shown in the image below.

The completed CI/CD pipeline

For each step in this DAG, Argo Workflows creates a container in the argo-events namespace. Don’t let the argo-events namespace confuse you – all of this happens without Argo Events being involved at all.

It is also possible to see logs directly in the UI – for example, for the docker-build step. Logs are available only while the pod in which the container ran still exists.

 logs
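If logs need to outlive the pods, Argo Workflows can archive them to an artifact repository; assuming a repository (e.g. S3 or MinIO) is configured, it is a single switch in the workflow spec:

```yaml
spec:
  archiveLogs: true   # requires a configured artifact repository (e.g. S3/MinIO)
```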

The CI/CD example builds a Go API that serves a few endpoints. After the pipeline completes, let’s check the Kubernetes status in the go-api namespace.

TERMINAL FROM UBUNTU VM
ubuntu@k3s:~$ k get pods -n go-api
NAME                        READY   STATUS    RESTARTS   AGE
atlantbh-58c967b685-8ctkg   1/1     Running   0          41m
ubuntu@k3s:~$ k get ingress/atlantbh -n go-api
NAME       CLASS    HOSTS       ADDRESS        PORTS   AGE
atlantbh   <none>   local.k3s   192.168.64.2   80      41m
ubuntu@k3s:~$ k get svc/atlantbh -n go-api
NAME       TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)   AGE
atlantbh   ClusterIP   10.43.52.80   <none>        80/TCP    41m

The Go API is successfully deployed in the go-api namespace. Let’s observe the curl output from the host machine.

TERMINAL FROM HOST MACHINE
➜  ~ curl local.k3s/go-api/healthz --verbose
*   Trying 192.168.64.2:80...
* Connected to local.k3s (192.168.64.2) port 80 (#0)
> GET /go-api/healthz HTTP/1.1
> Host: local.k3s
> User-Agent: curl/7.79.1
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Length: 2
< Content-Type: application/json; charset=utf-8
< Date: Wed, 13 Jul 2022 14:18:28 GMT
<
* Connection #0 to host local.k3s left intact
{
    "msg": "I am healthy!"
}%

As you can see, the app returns 200 OK – a successful CI/CD pipeline using Argo Workflows.

Argo Events

Let’s use the examples from this blog to understand events.

First – the creation of EventSource.

apiVersion: argoproj.io/v1alpha1
kind: EventSource
metadata:
  name: webhook
  namespace: argo-events
spec:
  service:
    ports:
      - port: 12000
        targetPort: 12000
  webhook:
    example:
      port: "12000"
      endpoint: /trigger-workflow
      method: POST

The EventSource is defined as a webhook that will listen on the endpoint /trigger-workflow on port 12000. The Argo Events controller will create the corresponding Kubernetes Service and the webhook pod, and map the Service to the pod.
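Under the hood, the Service the controller generates is roughly equivalent to the following sketch (the exact name and selector labels are implementation details of the controller and may differ by version):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: webhook-eventsource-svc   # generated name; treat as an implementation detail
  namespace: argo-events
spec:
  ports:
    - port: 12000
      targetPort: 12000
  selector:
    eventsource-name: webhook     # selects the webhook EventSource pod
```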

Now a Sensor can use this EventSource to activate triggers.

apiVersion: argoproj.io/v1alpha1
kind: Sensor
metadata:
  name: webhook
  namespace: argo-events
spec:
  template:
    serviceAccountName: operate-workflow-sa
  dependencies:
    - name: cicd-dep
      eventSourceName: webhook
      eventName: example
  triggers:
    - template:
        name: argo-cicd-trigger
        argoWorkflow:
          operation: submit
          source:
            resource:
              apiVersion: argoproj.io/v1alpha1
              kind: Workflow
              metadata:
                generateName: ci-cd-from-webhook-
              spec:
                arguments:
                  parameters:
                    - name: branch
                      value: main
                    - name: dockerRepo
                      value: docker.io/qdnqn/argo-workflows:1.0.0
                workflowTemplateRef:
                  name: cicd-workflow-template
          parameters:
            - src:
                dependencyName: cicd-dep
                dataKey: body.branch
              dest: spec.arguments.parameters.0.value
            - src:
                dependencyName: cicd-dep
                dataKey: body.image
              dest: spec.arguments.parameters.1.value

The dependencies field defines which EventSources this Sensor depends on. In this example, there is only one EventSource – the webhook created earlier. As you can see, the trigger in this Sensor submits a Workflow, referencing the familiar CI/CD WorkflowTemplate from earlier; the parameters section maps values from the webhook payload body onto the Workflow’s arguments.

So here is what happens next: the webhook endpoint gets hit, and the CI/CD pipeline is triggered.

Let’s try it out.

TERMINAL FROM HOST MACHINE
➜  ~ curl -d '{"branch":"main", "image":"docker.io/qdnqn/argo-workflows:1.0.0"}' -H "Content-Type: application/json" -X POST http://local.k3s/trigger-workflow
success%

Note: a custom Ingress is created to access the webhook endpoint from outside the cluster.

Webhook triggered successfully.

TERMINAL FROM UBUNTU VM
ubuntu@k3s:~$ k get pods -n argo-events | grep 'from-webhook'
ci-cd-from-webhook-tfgsp-1508779606                       2/2     Running   0              8s

The CI/CD pipeline is triggered from the webhook and is also visible in the UI.

We have seen how Argo Workflows and Argo Events can be easily utilized to automate processes. They are versatile tools that can be used in many areas; automation of CI/CD is just one aspect where they are useful.

Ease of integration and interaction with Kubernetes is what matters most for workloads running on Kubernetes, and these products help in exactly that way.

Complete examples can be found in the following repositories:

To set up the laboratory, follow the README.md instructions in the argo-workflows repository.

 
