
Our client runs a leading site in Europe that retrieves feeds from many different sources. We helped them scale their infrastructure to process hundreds of feeds at any given time.

Tools:

  • AWS
  • DC/OS
  • Marathon
  • Chronos
  • Jenkins
  • Cronjob Scheduler
  • GitHub

Installation:
This post assumes that you are already familiar with Docker and the AWS ecosystem, so we will not go into the details of provisioning AWS infrastructure.
You can set up the DC/OS cluster by using the Terraform module.

Installation of Marathon and Chronos
DC/OS makes it very easy to install frameworks on top of a DC/OS cluster. Once DC/OS is installed, go to Universe in the left sidebar, where you will see all the supported packages. Click Install on a package, and it will be installed on your cluster.

Scheduling Jobs:
For simplicity, we’ll store the job descriptions in GitHub here. In practice, however, we use an in-house Python application with Jenkins to store all the job information.

Job Definitions:
Marathon and Chronos are two different tools for different purposes: Marathon runs long-running workers, while Chronos is a cron replacement for distributed systems. Accordingly, our job definitions come in two types: worker and cronjob.
Let’s start with the YAML file.

YAML Example:

celery_worker:
  job_name: app_celery_worker
  image: app:latest
  cpu: 500
  memory: 512
  aws_ssm_parameters_path: /test
  type: worker
  project_name: appname
  command: while true; do sleep 1; date; done
fetch_images:
  job_name: app_fetch_images
  image: app:latest
  cpu: 500
  memory: 512
  aws_ssm_parameters_path: /test
  type: cronjob
  project_name: appname
  command: python fetch_image.py
  schedule: "R/2014-09-25T17:22:00Z/PT2M"
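
The schedule value uses the ISO 8601 repeating-interval notation that Chronos expects: a repeat count (R on its own means repeat forever), a start time, and the period between runs. As a small illustration, the sketch below splits such a string into its three parts with the standard library only. The helper name parse_schedule is hypothetical (it is not part of Chronos or of our tool), and it only handles the plain UTC form used in this post.

```python
from datetime import datetime, timezone
from typing import NamedTuple, Optional


class Schedule(NamedTuple):
    repeats: Optional[int]  # None means "repeat forever"
    start: datetime         # first run, in UTC
    period: str             # ISO 8601 duration, e.g. "PT2M" = every 2 minutes


def parse_schedule(value: str) -> Schedule:
    """Split a repeating interval such as R/2014-09-25T17:22:00Z/PT2M."""
    repeat_part, start_part, period = value.split("/")
    if not repeat_part.startswith("R"):
        raise ValueError(f"schedule must start with 'R': {value!r}")
    # "R" alone repeats forever; "R5" repeats five times.
    repeats = int(repeat_part[1:]) if len(repeat_part) > 1 else None
    # Assumption: the start time is written in UTC with a trailing "Z".
    start = datetime.strptime(start_part, "%Y-%m-%dT%H:%M:%SZ")
    return Schedule(repeats, start.replace(tzinfo=timezone.utc), period)


schedule = parse_schedule("R/2014-09-25T17:22:00Z/PT2M")
print(schedule.repeats, schedule.start.isoformat(), schedule.period)
```

Read this way, the fetch_images job above repeats indefinitely, starting at 17:22 UTC on 25 September 2014, with two minutes between runs.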

Let’s commit this YAML file to GitHub; a simple Python script will then schedule the jobs.

Python app:
We have written a custom Python CLI tool that retrieves the job definitions from the jobs datastore (in this example, the job definitions live in GitHub).
The script fetches the YAML file from GitHub, converts it to JSON, and posts it to the REST APIs of Chronos and Marathon using the requests module.
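
To make the conversion step concrete, here is a minimal sketch of how one parsed job definition could be turned into the JSON bodies that Marathon and Chronos accept. It is not our actual tool. The function names to_marathon_app, to_chronos_job and build_payload are hypothetical, the input dictionaries mirror what a YAML parser would return for a definition file like the one in this post, and the sketch assumes that cpu is expressed in thousandths of a core (adjust the conversion if your definitions use a different unit).

```python
import json
from typing import Dict


def to_marathon_app(job: dict, env: Dict[str, str]) -> dict:
    """Build a Marathon app definition for a `type: worker` job."""
    return {
        # Marathon app ids do not allow underscores, so they become hyphens.
        "id": "/" + job["job_name"].replace("_", "-"),
        "cmd": job["command"],
        "cpus": job["cpu"] / 1000,  # assumption: cpu is in thousandths of a core
        "mem": job["memory"],
        "instances": 1,
        "container": {"type": "DOCKER", "docker": {"image": job["image"]}},
        "env": dict(env),
    }


def to_chronos_job(job: dict, env: Dict[str, str]) -> dict:
    """Build a Chronos job definition for a `type: cronjob` job."""
    return {
        "name": job["job_name"],
        "command": job["command"],
        "schedule": job["schedule"],
        "cpus": job["cpu"] / 1000,  # same unit assumption as above
        "mem": job["memory"],
        "container": {"type": "DOCKER", "image": job["image"]},
        "environmentVariables": [
            {"name": key, "value": value} for key, value in sorted(env.items())
        ],
    }


def build_payload(job: dict, env: Dict[str, str]) -> dict:
    """Pick the target framework from the job's `type` field."""
    if job["type"] == "worker":
        return to_marathon_app(job, env)
    if job["type"] == "cronjob":
        return to_chronos_job(job, env)
    raise ValueError(f"unknown job type: {job['type']!r}")


# The same data a YAML parser would produce for the definition file.
definitions = {
    "celery_worker": {
        "job_name": "app_celery_worker", "image": "app:latest", "cpu": 500,
        "memory": 512, "type": "worker",
        "command": "while true; do sleep 1; date; done",
    },
    "fetch_images": {
        "job_name": "app_fetch_images", "image": "app:latest", "cpu": 500,
        "memory": 512, "type": "cronjob", "command": "python fetch_image.py",
        "schedule": "R/2014-09-25T17:22:00Z/PT2M",
    },
}

for job in definitions.values():
    print(json.dumps(build_payload(job, {"ENVIRONMENT": "test"}), indent=2))
```

Keeping the payload builders free of network calls makes them easy to unit test; the HTTP request itself can then be a thin wrapper around the resulting dictionary.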
Our Python application consists of the following methods, which turn the YAML definitions into Chronos and Marathon REST API calls.

# We store all the environment variables in AWS Parameter Store; the following method retrieves the parameters under a given path.
get_parameters_by_path

# Create Chronos Job
create_chronos_job

# Create Marathon Job
create_marathon_job

# YAML to JSON
convert_yaml_to_json
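
As a rough illustration of the methods listed above, the sketch below shows one way the Parameter Store lookup and the two REST calls might look. It is written under several assumptions rather than copied from our tool: the last segment of each parameter name is used as the environment variable name, Marathon is reached at {base_url}/v2/apps, and Chronos at {base_url}/v1/scheduler/iso8601 (older Chronos releases expose the endpoint without the /v1 prefix). The client and session arguments exist only so that stubs can be passed in for testing; by default the functions fall back to boto3 and requests.

```python
from typing import Any, Dict, Optional


def get_parameters_by_path(path: str, client: Optional[Any] = None) -> Dict[str, str]:
    """Return {NAME: value} for every SSM parameter stored under `path`."""
    if client is None:
        import boto3  # imported lazily so a stub client can be used in tests
        client = boto3.client("ssm")

    env: Dict[str, str] = {}
    # The paginator follows NextToken for us when there are many parameters.
    paginator = client.get_paginator("get_parameters_by_path")
    for page in paginator.paginate(Path=path, Recursive=True, WithDecryption=True):
        for parameter in page["Parameters"]:
            # Assumption: "/test/DB_HOST" should become the variable "DB_HOST".
            name = parameter["Name"].rsplit("/", 1)[-1]
            env[name] = parameter["Value"]
    return env


def _post_json(url: str, payload: dict, session: Optional[Any] = None) -> None:
    """POST a JSON body and raise if the server answers with an error status."""
    if session is None:
        import requests  # imported lazily for the same reason as boto3
        session = requests.Session()
    response = session.post(url, json=payload, timeout=10)
    response.raise_for_status()


def create_marathon_job(base_url: str, payload: dict, session: Optional[Any] = None) -> None:
    # Assumed endpoint: Marathon's app-creation API.
    _post_json(f"{base_url}/v2/apps", payload, session)


def create_chronos_job(base_url: str, payload: dict, session: Optional[Any] = None) -> None:
    # Assumed endpoint: Chronos' ISO 8601 scheduling API.
    _post_json(f"{base_url}/v1/scheduler/iso8601", payload, session)
```

With the aws_ssm_parameters_path from a job definition, a call such as get_parameters_by_path("/test") would return the environment to embed in the job payload before it is posted.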


No matter how many cron jobs you have, you can easily schedule them on a DC/OS cluster with a small Python script. After this implementation, development teams no longer need any involvement from the infrastructure team: they can run their workloads themselves, without any server administration. They just add an entry to the YAML definition file, and the Python app reads it and schedules the job on the cluster.

Production Setup:

  • Ensure that the Mesos slave nodes are authorized to pull Docker images. Use an instance role/profile to attach an appropriate policy for pulling images.
  • Forward logs to a centralized location; when a worker or cronjob fails, it is incredibly hard to debug without proper logging.
  • Use a monitoring tool to track your cluster performance. We use Prometheus and Grafana, and we have deployed Grafana on the same DC/OS cluster.
  • Do NOT use the latest or stable tags for Docker images. Always tag your images with a custom hash.
  • Run at least 3 master nodes in the DC/OS cluster.
  • Put your Mesos slave nodes in an Auto Scaling group, and assign appropriate policies to scale in and out.
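
The image-tag rule in the list above is easy to enforce automatically before a job is scheduled. The following is a minimal sketch of such a check, not something taken from our tool; the name uses_floating_tag is hypothetical. It relies on the fact that Docker falls back to the latest tag when an image reference has none.

```python
FLOATING_TAGS = {"latest", "stable"}


def uses_floating_tag(image: str) -> bool:
    """Return True if the image reference resolves to a floating tag."""
    if "@" in image:
        # References such as "app@sha256:..." are pinned by digest.
        return False
    # Look only at the last path component so that a registry port
    # ("registry:5000/app") is not mistaken for a tag.
    last_component = image.rsplit("/", 1)[-1]
    _, separator, tag = last_component.partition(":")
    # Docker uses "latest" when no tag is given.
    effective_tag = tag if separator else "latest"
    return effective_tag in FLOATING_TAGS


for image in ("app:latest", "app", "registry:5000/app:stable", "app:3f2a9c1"):
    print(image, uses_floating_tag(image))
```

A check like this could run when the definition file is read, so that a job pointing at a floating tag is rejected before it reaches the cluster.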

References:

DC/OS:
DC/OS is a platform for running distributed containerized software, like apps, jobs, and services. As a platform, DC/OS is distinct from and agnostic to the infrastructure layer. This means that the infrastructure may consist of virtual or physical hardware as long as it provides compute, storage, and networking.
Visit: https://dcos.io/

Marathon:
Marathon is a production-grade container orchestration platform for Mesosphere’s Datacenter Operating System (DC/OS) and Apache Mesos.
Visit: https://mesosphere.github.io/marathon/

Chronos:
Chronos is a replacement for cron. It is a distributed and fault-tolerant scheduler that runs on top of Apache Mesos that can be used for job orchestration. It supports custom Mesos executors as well as the default command executor. Thus by default, Chronos executes sh (on most systems bash) scripts.
Visit: https://mesos.github.io/chronos/

Post Author: kworx-admin
