Taking Apache Camel for a ride on Kubernetes

Apache Camel is a Java framework that simplifies the life of developers when it comes to integration (be it of data or of systems).

It is a fairly mature and robust project, heavily based on the Enterprise Integration Patterns, and it comes with a beautiful DSL. It also offers a very wide list of ready-made connectors (300+).

The following code snippet shows that moving files from Dropbox to your local filesystem really comes down to writing a few lines of integration DSL:

from("direct:start")
  .to("dropbox://get?accessToken=XXX&clientIdentifier=XXX&remotePath=/root/folder1/file1.tar.gz")
  .to("file:///home/kermit/?fileName=file1.tar.gz");

Developers are still left with packaging, configuring, deploying, running and monitoring their enterprise integration applications, which can be a big chunk of work as well when you run on Kubernetes.


Welcome Apache Camel K

From the Camel K website: Camel K is a lightweight integration framework built from Apache Camel, specifically designed for serverless and microservice architectures.

It leverages recent advances in the Java ecosystem, such as Quarkus, and the ease of running Java on Kubernetes nowadays.

The promise is that developers focus on writing only the integration, and Camel K takes care of the rest. As a matter of fact, the following snippet, saved as a Groovy file,

from('timer:tick?period=3000')
  .setBody().constant('Hello world from Camel K')
  .to('log:info')

is the only thing you need to write. Running kamel run myfile.groovy will package and deploy your integration to the configured Kubernetes cluster.
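
Worth noting: the kamel CLI also offers a dev mode, which keeps the deployed integration in sync with your local file and streams its logs back to your terminal, giving a very short feedback loop:

kamel run myfile.groovy --dev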

No surprise, of course, that with Red Hat being a major contributor to Apache Camel, all the Camel K integrations run smoothly on an OpenShift cluster.

A Concrete Example

I wanted to use Camel K for a tiny but more realistic example than just printing Hello on the console. I found myself with a CSV file containing 100k news articles and realised the format wasn’t really fit for a friendly, performant and reusable usage of this dataset.

So, how hard could it be to write this as a Camel K integration?


Installing Camel K on minikube

I won’t go over the whole minikube installation with Knative and Istio; you can just follow the tutorial here (which covers the Camel K installation as well). Installing Camel K itself really comes down to the two steps below, sketched as shell commands right after the list:

  • Download the Camel K binary
  • Run kamel install
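
For reference, on a Linux box this looks roughly like the following sketch; the version in the URL is a placeholder, so pick the actual asset from the Camel K releases page:

# download the kamel CLI; replace <version> with a real release
curl -LO https://github.com/apache/camel-k/releases/download/v<version>/camel-k-client-<version>-linux-64bit.tar.gz
tar -xzf camel-k-client-<version>-linux-64bit.tar.gz
sudo mv kamel /usr/local/bin/

# Camel K needs a registry to push the integration images to;
# on minikube, the registry addon does the job
minikube addons enable registry
kamel install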

Creating your first Integration

Integrations can be written with the DSL in one of the following languages:

  • Groovy
  • Java
  • JavaScript
  • Kotlin
  • XML
  • YAML

You can also choose to deploy integrations by creating Integration definitions in YAML, like you would write Pod or Service definitions.

I’ve chosen to go for the latter.

apiVersion: camel.apache.org/v1
kind: Integration
metadata:
  name: s3-to-s3
spec:
  dependencies:
    - "camel:camel-csv"
  configuration:
    - type: configmap
      value: rss-to-s3-configmap
    - type: secret
      value: aws
  flows:
  - from:
      uri: aws-s3://{{BUCKET}}
      parameters:
        prefix: articles
        accessKey: "{{ACCESS_KEY}}"
        secretKey: "{{SECRET_KEY}}"
        region: "{{REGION}}"
        deleteAfterRead: false
      steps:
        - unmarshal:
            csv:
              use-maps: true
              allow-missing-column-names: true
        - split:
            simple: "${body}"
            streaming: true
        # Remove the consumer headers on the exchange; otherwise the producer
        # would write back to the source object key, and the original and
        # target content lengths would no longer match
        - remove-headers:
            pattern: "*"
        - set-header:
            name: CamelAwsS3Key
            simple: "articles/converted/${body[id]}.json"
        - marshal:
            json: {}
        - to:
            uri: aws-s3://{{BUCKET}}
            parameters:
                accessKey: "{{ACCESS_KEY}}"
                secretKey: "{{SECRET_KEY}}"
                region: "{{REGION}}"

Let’s analyse the flows part of the definition first. Step by step, the integration will (a worked example on a made-up row follows the list):

  • Read the files from the given bucket
  • Transform each row into a key/value map, where the keys are the column headers
  • Split on each row, in streaming mode
  • Remove the message headers, to avoid carrying the source headers over to the destination
  • Define the file name it will have on the S3 bucket, namely the id column value with a .json extension
  • Transform the row to JSON
  • Write it to a new destination on S3
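
To make that concrete with made-up data: assuming the CSV had id, title and author columns (the column names here are hypothetical), a row such as 42,Some title,Jane Doe would end up in articles/converted/42.json as:

{"id": "42", "title": "Some title", "author": "Jane Doe"}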

The {{PLACEHOLDER}} values used throughout the definition are resolved from the ConfigMap and Secret referenced in the configuration block.
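
A minimal sketch of what those two could look like, assuming the Camel K convention of shipping the values under an application.properties key (all the values here are placeholders):

apiVersion: v1
kind: ConfigMap
metadata:
  name: rss-to-s3-configmap
data:
  application.properties: |
    BUCKET=my-articles-bucket
    REGION=eu-west-1
---
apiVersion: v1
kind: Secret
metadata:
  name: aws
stringData:
  application.properties: |
    ACCESS_KEY=changeme
    SECRET_KEY=changeme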

The only thing left is deploying the integration:

kubectl apply -f my-integration.yml
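
The operator then picks the resource up. Integrations are a custom resource, so both the kamel CLI and plain kubectl can tell you when it reaches the Running phase:

kamel get
kubectl get integrations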

You can follow the instructions on my camel-k-integrations-examples repository if you want to reproduce this locally.

Running kamel log s3-to-s3 then lets you follow the output of the integration, along with the logs of the camel-k operator.


How does this really work?

In a nutshell, when you install Camel K, it actually installs a Kubernetes operator, which reconciles the state of your integrations on every event related to Camel Integration resources.

It will (each step can be inspected with the commands sketched after this list):

  • Build a Camel Quarkus application with the provided flow as configuration
  • Generate a container image from it and push it to the cluster registry
  • Create a Deployment for the integration
  • Apply the Deployment
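
All of these artefacts live on the cluster, so you can poke at each step; resource names may vary slightly across Camel K versions, and the operator namespace depends on how you installed it:

kubectl get integrationkits               # the build and image backing the integration
kubectl get deployment s3-to-s3           # the Deployment generated by the operator
kubectl logs deployment/camel-k-operator  # what the operator is currently doing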

Crazy cool, isn’t it?

Summary

The Apache Camel team has been pushing hard to bring Camel onto modern infrastructure, and the Camel K project, while still young, is very promising, with for example its native integration with Knative Eventing and Messaging.

The reality is that the project still needs to mature: most of the examples and documentation you will find focus on the developer experience, while I’m looking more at production-ready, CI/CD-based deployments. It’s also unclear how to test integrations.

There is however one feature of Camel K that deserves a dedicated blog post of its own: Kamelets. Kamelets are route templates, directly available in integrations, and I think they will be a killer feature for organisations that want to give data mesh architectural paradigms a spin. Stay tuned 😉

If you’re interested in discovering more about Apache Camel K, I suggest you follow the updates of the awesome-camel-k repository.
