How to Run SQL Queries Using Cron
“Without big data analytics, companies are blind and deaf, wandering out onto the web like deer on a freeway.” – Geoffrey Moore
For every organization, it is important to extract actionable information from its databases and make business decisions backed by data. This can be achieved by running SQL queries at regular intervals and mailing the insights to the right person. The same setup can also alert you when something breaks in day-to-day operations. Let’s look at the various solutions that can be used in production.
Approach 1: Using crontab and a Python script
In this approach, we make use of the crontab program, which comes pre-installed in most Linux distros. Crontab triggers any command on a specific schedule; in our case, it will trigger a Python script that runs a SQL query and sends us the result by email. The sample script below connects to a PostgreSQL database, runs a query, and sends an email based on a threshold.
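A minimal sketch of such a script. The hostnames, credentials, query, table name, and threshold are all placeholder assumptions for your own setup, and psycopg2 is just one common PostgreSQL driver; any DB-API driver works the same way.

```python
#!/usr/bin/env python3
"""Sketch of a cron-driven alerting script: run a SQL query and
email the result when a threshold is crossed."""
import smtplib
from email.message import EmailMessage

THRESHOLD = 100  # placeholder: alert when the metric exceeds this value


def breaches_threshold(count, threshold=THRESHOLD):
    """Return True when the queried metric crosses the alert threshold."""
    return count > threshold


def fetch_failed_orders():
    """Run the SQL query and return a single count."""
    # psycopg2 is imported here so the threshold logic above stays
    # testable on machines without a database driver installed.
    import psycopg2

    conn = psycopg2.connect(
        host="db.example.com",      # placeholder connection details
        dbname="shop",
        user="reporter",
        password="secret",
    )
    with conn, conn.cursor() as cur:
        cur.execute("SELECT count(*) FROM orders WHERE status = 'failed'")
        (count,) = cur.fetchone()
    conn.close()
    return count


def send_alert(count):
    """Email the alert via a (placeholder) SMTP relay."""
    msg = EmailMessage()
    msg["Subject"] = f"Alert: {count} failed orders"
    msg["From"] = "alerts@example.com"
    msg["To"] = "ops@example.com"
    msg.set_content(
        f"The failed-order count is {count}, above the threshold of {THRESHOLD}."
    )
    with smtplib.SMTP("smtp.example.com") as server:
        server.send_message(msg)


if __name__ == "__main__":
    count = fetch_failed_orders()
    if breaches_threshold(count):
        send_alert(count)
```

Keeping the threshold check in its own small function makes the alerting logic easy to unit-test without a live database or SMTP server.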
We then have to provision an EC2/GCE instance running Linux, on which crontab will run. Once it is provisioned, SSH into the instance and install the Python packages your script requires. Also provision a static IP for the instance and whitelist that IP in your database, so the DB is reachable from the instance. To schedule the Python script, use the commands below.
First, open your crontab for editing with `crontab -e`, then add an entry for the script. The prefix `*/5 * * * *` tells crontab that the job will be triggered every 5 minutes. The cron scheduling syntax is a little overwhelming at first, but the good news is you can use the Crontab.guru website to work out your schedule. Save the file and that’s it: you have successfully scheduled a cron job that runs every 5 minutes.
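Concretely, the crontab entry might look like this. The interpreter and script paths are placeholders for your own setup:

```shell
# Open the current user's crontab in an editor
crontab -e

# Add this line: run the alert script every 5 minutes
*/5 * * * * /usr/bin/python3 /home/ubuntu/alert_script.py >> /home/ubuntu/alert.log 2>&1
```

Redirecting stdout and stderr to a log file at least leaves a trace when the script fails, since cron itself runs silently.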
Running `crontab -l` lists all your cronjobs, and you should find your newly created job in the list. That’s it: we have successfully created a cronjob that will send us alerts according to the logic implemented in the Python script.
Approach 2: WMS (Workflow management tools)
It is very likely that you will need to set up a number of cronjobs, and they quickly become difficult to manage; this is where workflow management tools help. Airflow is the industry-standard tool and is widely used in production.
Setting up even a simple cronjob with Airflow is not that easy and comes with a learning curve. You need to create a DAG (Directed Acyclic Graph) that runs on a pre-defined schedule.
The following DAG executes the hello function, stored in /includes/nishit_modules/test.py, every day. Check out the Airflow scheduler documentation for more details about Airflow’s schedule syntax. You can deploy the entire Airflow architecture yourself, or use a managed service such as GCP’s Cloud Composer.
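A sketch of such a DAG, assuming Airflow 2.x. The `dag_id`, `task_id`, and start date are illustrative, and `hello` is inlined here rather than imported from /includes/nishit_modules/test.py so the example is self-contained:

```python
# dags/hello_dag.py -- a minimal daily DAG sketch
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def hello():
    # In the setup described above this would live in
    # /includes/nishit_modules/test.py and be imported here.
    print("Hello from Airflow!")


with DAG(
    dag_id="daily_hello",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",  # cron expressions also work, e.g. "0 8 * * *"
    catchup=False,               # don't backfill runs for past dates
) as dag:
    PythonOperator(task_id="run_hello", python_callable=hello)
```

Dropping this file into your Airflow `dags/` folder is enough for the scheduler to pick it up and run `hello` once a day.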
Approach 3: Locale.ai
There are quite a few cons to using the first two approaches; let us discuss them in detail.
- Manageability - The complexity of managing cronjobs grows quickly as their number increases, and setting up the Airflow infrastructure itself is time-consuming and complex.
- Reliability - Both approaches are unreliable. In the first approach there is no logging; if the crontab fails, there is no way to detect the failure or inspect logs. The second approach offers some visibility into logs, but you still have to check them manually every day to make sure the Airflow infra is up and running.
- Effort - Both approaches require a one-time effort to set everything up, from infrastructure and Python scripts to SQL queries and other configuration. On top of that, there is a daily effort to make sure everything runs as expected.
- Integrations - Say you realize tomorrow that email is not the right channel for alerting the right person. You then have to change the source code and write support for the new communication channel yourself.
All of these issues are solved by locale.ai. It only takes a few minutes to set up an alert on any SQL query.
You can do everything the previous approaches did on a single screen, and integrate your alerts with various destinations such as a Slack workspace, WhatsApp, email, and so on. Locale handles all the hassle of dealing with infrastructure and reliability issues, and also includes features such as Incident Management, Root Cause Analysis, and many more.
Using the discussion forum, you can collaborate with other team members to find the root cause of the problem and deploy a fix.
Locale.ai provides the simplest and most intuitive way of creating alerts and notifying your operations team when something goes wrong. Locale will not only alert you proactively, but also help the team determine the root cause of the problem and resolve it as soon as possible. It provides visibility and accountability for problems that might otherwise go unnoticed.
If you are eager to know more, book a call with one of our specialists to have all your questions answered today!