Our client, a leading site in Europe, retrieves feeds from many different sources. We helped them scale their infrastructure to process hundreds of feeds at any given time.
- Cronjob Scheduler
This post assumes that you are already familiar with Docker and the AWS ecosystem. We won't go into the details of AWS infrastructure provisioning.
You can set up the DC/OS cluster by using the Terraform module.
Installation of Marathon and Chronos
DC/OS makes it very easy to install frameworks on top of a DC/OS cluster. Once DC/OS is installed, go to Universe in the left sidebar, and you'll see all the supported packages. Just click Install, and the package will be installed on your cluster.
For simplicity, we'll use GitHub to store the job descriptions in this post. In our own setup, however, we use an in-house Python application with Jenkins to store all the job information.
Marathon and Chronos are two different tools for different purposes: Marathon supports long-running workers, and Chronos is an alternative for cron.
Let’s start with a YAML file
command: while true; do sleep 1; date; done
command: python fetch_image.py
Let’s commit this YAML file to
We have written a custom Python CLI tool to retrieve the job definitions from the jobs datastore (in this example, the job definitions live in GitHub).
The Python script fetches the YAML file from GitHub, converts it to JSON, and posts it to the REST APIs of Chronos and Marathon using the requests module.
Our Python application consists of the following methods, which convert the YAML into Chronos and Marathon REST API requests.
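The fetch, convert, and post flow described above could look roughly like the sketch below. It is not the tool itself: the function names, the base URLs, and the `kind` switch are hypothetical, and the endpoint paths assume Marathon's `/v2/apps` and the `/v1/scheduler/iso8601` path used by recent Chronos releases (older releases expose `/scheduler/iso8601`). PyYAML and requests are imported lazily, so they are only needed when those steps actually run.

```python
import json


def fetch_yaml(raw_url, timeout=10):
    """Download the raw YAML file (for example, a raw GitHub URL)."""
    import requests  # third-party: pip install requests

    response = requests.get(raw_url, timeout=timeout)
    response.raise_for_status()
    return response.text


def parse_yaml(yaml_text):
    """Turn the YAML text into plain Python dicts and lists."""
    import yaml  # third-party: pip install pyyaml

    return yaml.safe_load(yaml_text)


def endpoint_for(kind, chronos_url, marathon_url):
    """Pick the REST endpoint for a job kind (assumed paths, see above)."""
    if kind == "chronos":
        return chronos_url.rstrip("/") + "/v1/scheduler/iso8601"
    if kind == "marathon":
        return marathon_url.rstrip("/") + "/v2/apps"
    raise ValueError("unknown job kind: %r" % kind)


def post_job(kind, payload, chronos_url, marathon_url, timeout=10):
    """Serialize the payload to JSON and POST it to the matching API."""
    import requests

    response = requests.post(
        endpoint_for(kind, chronos_url, marathon_url),
        data=json.dumps(payload),
        headers={"Content-Type": "application/json"},
        timeout=timeout,
    )
    response.raise_for_status()
    return response.status_code
```

On a DC/OS cluster the base URLs would typically point at wherever the Chronos and Marathon services are exposed, so they are left as parameters here rather than hard-coded.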
# We store all the environment variables in AWS Parameter Store, following method will retrieve the params by given path.
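A minimal sketch of such a lookup, assuming boto3's SSM client and its `get_parameters_by_path` paginator. The function name and the example path are hypothetical, and the client is passed in so the function can be exercised without AWS credentials.

```python
def load_env_from_parameter_store(ssm_client, path, decrypt=True):
    """Return {relative_name: value} for every parameter under `path`.

    `ssm_client` is expected to behave like boto3.client("ssm").
    """
    env = {}
    paginator = ssm_client.get_paginator("get_parameters_by_path")
    pages = paginator.paginate(Path=path, Recursive=True, WithDecryption=decrypt)
    for page in pages:
        for param in page["Parameters"]:
            # "/prod/feeds/DB_HOST" under "/prod/feeds" becomes "DB_HOST".
            relative_name = param["Name"][len(path):].lstrip("/")
            env[relative_name] = param["Value"]
    return env


# Usage (requires boto3 and AWS credentials; the path is an example only):
#   import boto3
#   env = load_env_from_parameter_store(boto3.client("ssm"), "/prod/feeds")
```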
# Create Chronos Job
# Create Marathon Job
# YAML to JSON
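The conversion steps named above might be sketched as follows. The YAML key names (`name`, `image`, `schedule`, `instances`, and so on) and the default resource values are assumptions made for illustration; only `command` appears in the job file shown earlier. The payload fields follow the Chronos job format and the Marathon app format as we understand them, so check them against the versions you run.

```python
import json


def to_chronos_job(job, env=None):
    """Map one parsed YAML job entry to a Chronos scheduled-job payload."""
    env = env or {}
    return {
        "name": job["name"],
        "command": job["command"],
        # ISO 8601 repeating interval, e.g. "R/2018-01-01T00:00:00Z/PT1H"
        "schedule": job["schedule"],
        "owner": job.get("owner", ""),
        "cpus": job.get("cpus", 0.1),
        "mem": job.get("mem", 128),
        "container": {"type": "DOCKER", "image": job["image"]},
        # Chronos expects a list of name/value pairs.
        "environmentVariables": [
            {"name": key, "value": value} for key, value in sorted(env.items())
        ],
    }


def to_marathon_app(job, env=None):
    """Map one parsed YAML job entry to a Marathon app definition."""
    return {
        "id": "/" + job["name"].lstrip("/"),
        "cmd": job["command"],
        "cpus": job.get("cpus", 0.1),
        "mem": job.get("mem", 128),
        "instances": job.get("instances", 1),
        "container": {"type": "DOCKER", "docker": {"image": job["image"]}},
        # Marathon expects a flat mapping.
        "env": dict(env or {}),
    }


def to_json(payload):
    """Serialize a payload for the REST call."""
    return json.dumps(payload, sort_keys=True)
```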
No matter how many
- Ensure that the Mesos slave nodes are authorized to pull Docker images. Use an instance role/profile to assign an appropriate policy for pulling images.
- Forward the logs to a centralized place. If a worker or cron job fails, it will be incredibly hard to debug without proper logging.
- Use a monitoring tool to track your cluster's performance. We are using Prometheus and Grafana, and we have deployed Grafana on the same DC/OS cluster.
- DO NOT use the latest or stable tags for Docker images. Always tag your Docker images with a unique hash.
- You should have at least 3 master nodes for the DC/OS cluster.
- Add your Mesos slave nodes to an Auto Scaling group, and assign appropriate policies to scale in and out.
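The advice on image tags can be enforced before a job is submitted. The check below is a small sketch under our own assumptions: the function name and the set of "floating" tags are hypothetical, and it only inspects the image reference string without contacting a registry.

```python
FLOATING_TAGS = {"latest", "stable"}


def is_pinned_image(image):
    """Return True if the image reference uses a fixed tag or a digest."""
    # Digest references (repo@sha256:...) always point at one exact image.
    if "@" in image:
        return True
    # Look only at the last path component so that a registry port
    # such as "registry:5000/app" is not mistaken for a tag.
    last_component = image.rsplit("/", 1)[-1]
    if ":" not in last_component:
        return False  # no tag: Docker falls back to "latest"
    tag = last_component.rsplit(":", 1)[1]
    return tag not in FLOATING_TAGS
```

A deployment script could call this on every job definition and refuse to post any job whose image is not pinned.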
DC/OS is a platform for running distributed containerized software, like apps, jobs, and services. As a platform, DC/OS is distinct from
Chronos is a replacement for cron. It is a distributed and fault-tolerant scheduler that runs on top of Apache Mesos that can be used for job orchestration. It supports custom Mesos executors as well as the default command executor. Thus by default, Chronos executes sh (on most systems bash) scripts.