Engineering

How to Run SQL Queries Using Cron

Different ways of running SQL queries as a cron-job to get actionable data from database at regular intervals.

Nishit Patel

August 4, 2022

5 min read

Illustration comparing Cron Jobs, Airflow and Locale
“Without big data analytics, companies are blind and deaf, wandering out onto the web like deer on a freeway.” – Geoffrey Moore

For every organization, it is very important to extract actionable information from their databases, and take business decision backed by data. This can be achieved by running a SQL queries at regular interval and mailing the insights to the right person. It can also be used for alerting when something breaks in the day-to-day operations. Let’s look at various solution which can be used in production.

Approach 1: Using Crontab and python script

In this approach, we can make use of crontab program which is already installed in most Linux distros. Crontab is used to trigger any command at a specific schedule, In our case, we will trigger a python script which will run an SQL query and send us the result by email. This is a sample script that will connect to the PostgreSQL database, run a query, and send an email based on a threshold.


import smtplib, ssl
import psycopg2

def send_email_notification (email, alert_text):
    """
    Send an email to the provided email adress with the message text
    """

    smtp_server = "smtp.gmail.com"
    port = 587
    sender_email = "alert@locale.ai"
    password = "password"

    context = ssl.create_default_context()

    # Try to log in to server and send email
    try:
        server = smtplib.SMTP(smtp_server,port)
        server.starttls(context=context)
        server.login(sender_email, password)
        server.sendmail(sender_email, email, alert_text) # Send Email
    except Exception as e:
        print(e)
    finally:
        server.quit()

# connecting to postgres
conn = psycopg2.connect(
   database="mydb", user='postgres', password='password', host='127.0.0.1', port= '5432'
)

# Setting auto commit false
conn.autocommit = True

# Creating a cursor object using the cursor() method
cursor = conn.cursor()

# Executing our SQL Query
cursor.execute(''' SELECT  COUNT(*) CANCELLED_ORDERS from ORDERS WHERE cancelled = true ''')

# Fetching one row
result = cursor.fetchone();
print(result)

# checking if threshold is breached or not
threshold = 10
if result[0] > threshold:
    send_email_notification (email = "nishit@locale.ai", alert_text = "No of cancelled orders is more than the threshold")

#Closing the connection
conn.close()


We have to then provision an EC2/GCE instance where we will run the crontab, OS of the instance should be Linux. Once it is provisioned, SSH into the instance and download the required python packages for your script. Also, provision a static IP for the instance and whitelist that IP in your database, so that DB is accessible inside the instance. To schedule our python script, use the below commands.

# Edit crontab
crontab -e


# Edit crontab
crontab -e



*/5 * * * *  path/to/python/python3 /path/to/python/file/script.py

Screenshot of a crontab command screen
Crontab command screen


*/5 * * * * This line tells crontab that the following cronjob will be triggered every 5 Minutes The cron scheduling syntax is a little overwhelming in the beginning, But the good news is, you can use Crontab.guru website to work on your scheduling. Save the file and that’s it. You have successfully scheduled a cron job which will run every 5th minute.



crontab -l

This command is used to list all your cronjobs, and you can find your newly created job in the list. That’s it, we have successfully created a cronjob which will send us alerts, according to the logic implemented in python script.

Approach 2: WMS (Workflow management tools)

It is very likely that you will need to set up a number of cronjobs and it becomes very difficult to manage them, hence you can use Workflow management tools. Airflow is the industry standard tool and is widely used in production.

Setting up a simple cronjob with airflow is not that easy and comes with a learning curve. You need to create a DAG (Direct Acyclic Graph) which will run at the pre-defined schedule.


from airflow.models import DAG
from airflow.operators.python_operator import PythonOperator
from airflow.utils.dates import days_ago
from includes.nishit_modules.test import hello

args = {
    'owner': 'Nishit Patel',
    'start_date': days_ago(1) # make start date in the past
}

#defining the dag object
dag = DAG(
    dag_id='test-dag',
    default_args=args,
    schedule_interval='@daily' #to make this workflow happen every day
)

#assigning the task for our dag to do
with dag:
    hello_world = PythonOperator(
        task_id='hello',
        python_callable=hello,
    )


The following DAG will execute hello function every day stored in /includes/nishit_modules/test.py checkout airflow scheduler to understand more details about airflow schedule syntax. You can deploy this entire airflow architecture on your own GCP Composer service.

Approach 3: Locale.ai

There are lots of cons to using the first two approaches, let us discuss them in detail.

  • Manageability - Complexity in managing cronjob increases exponentially as the number of cronjobs increases. Setting up the Airflow infrastructure itself is time-consuming and complex.
  • Reliability - Both approaches are highly unreliable. In the first approach, there is no logging, even if the crontab fails there is no way to determine that the crontab has failed and check logs. While the second approach has some visibility on logs, but still one has to manually keep checking logs daily to make sure airflow infra is up and running.
  • Efforts - Both the approaches have one-time effort in setting up everything right from infrastructure, python scripts, SQL Queries and other configurations. Apart from this, there also exists a daily effort to make sure everything is up and running as expected.
  • Integrations - Let’s say tomorrow you realized that email is not the right communication channel to alert the right person. Then you have to make changes in the source code and write code to support the new communication channel.

All of these issues are solved by locale.ai. It only takes a few minutes to set up an alert on any SQL query.

A full page screenshoot of Locale's alert creation screen
Alert Creation Flow

You can do everything that the previous approaches did on a single screen. You can integrate your alerts with various destinations such as Slack workspace, WhatsApp, email, and so on. The Locale handles all the hassle of dealing with infrastructure and reliability issues. Locale also includes features such as Incident Management, Root Cause Analysis, and many more.

Screenshot showing collaboration capabilities on incidents on locale

Using the discussion forum, you can collaborate with other team members to find the root cause of the problem and deploy a fix.

Locale.ai provides the most simple and intuitive way of creating alerts and notifying your operations team when something goes wrong. Locale will not only alert you proactively, but also help the team in determining the root cause of the problem and resolving it as soon as possible. It provides visibility and accountability to the problems that might otherwise go unnoticed.

Illustation depicting how users can connect their data source to quickly setup alerts on Locale's platform in 15 minutes

If you are eager to know more, Book a call with one of our specialists to have all your questions answered today!

Featured Articles

View all Articles
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.

Receive Latest InsideOps Updates

Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.
By clicking “Accept All Cookies”, you agree to the storing of cookies on your device to enhance site navigation, analyze site usage, and assist in our marketing efforts. View our Privacy Policy for more information.