
Since version 1.15.0 (Release v1.15.0)

Post Target Execution


The Post Target Execution feature enables users to run additional processing scripts after data has been successfully loaded into the target system. This option is ideal for users who want to run custom SQL scripts, data transformations, or other post-processing tasks within the target environment. These scripts execute after the main ETL job completes, providing a flexible way to automate downstream tasks without affecting the job’s core success or failure status.

Since Post Target Execution is an optional, supplemental step, it runs independently of the main ETL job. It does, however, report its own execution status (success or failure), so users can monitor and troubleshoot as needed.

Use Cases for Post Target Execution

  1. Data Aggregation and Summarization: Users can run aggregation or summarization queries on loaded data for reporting purposes or for quick access to processed metrics (see the sketch after this list).

  2. Data Formatting and Standardization: Users can also apply specific formatting rules or normalize data structures once the main ETL load is complete.

  3. Automated Data Validations: Users might run validations or consistency checks on loaded data to confirm data quality, structure, or constraints after the initial load.

  4. Triggering External Processes: Post Target Execution scripts could also serve as triggers for downstream systems or applications that depend on the ETL process.
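
For example, a post-target script covering the first and third use cases might combine an aggregation query with a simple validation query. The sketch below is illustrative only: the sales_orders, sales_daily_summary, and load_quality_issues tables and their columns are hypothetical, and the exact SQL syntax depends on your target connector.

    -- Hypothetical post-target script: summarize the day's load for reporting.
    INSERT INTO sales_daily_summary (order_date, order_count, total_amount)
    SELECT order_date, COUNT(*), SUM(amount)
    FROM sales_orders
    WHERE order_date = CURRENT_DATE
    GROUP BY order_date;

    -- Hypothetical validation: record rows that violate a basic quality rule.
    INSERT INTO load_quality_issues (order_id, issue, detected_at)
    SELECT order_id, 'non-positive amount', CURRENT_TIMESTAMP
    FROM sales_orders
    WHERE amount <= 0;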

Considerations

  1. Test Scripts in a Development Environment: Since Post Target Execution is user-defined, it’s beneficial to test scripts in a development or staging environment first. This helps users identify errors before deploying scripts to production, reducing the likelihood of issues in the target environment. One low-risk way to dry-run a script is sketched below.
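
Where the target supports transactions, one way to dry-run a post-target SQL script without persisting anything is to wrap it in a transaction and roll back. This is only a sketch, and it assumes the script contains no statements that auto-commit (for example, DDL on MySQL):

    BEGIN;
    -- Paste the post-target script here, e.g. the summary insert from above.
    INSERT INTO sales_daily_summary (order_date, order_count, total_amount)
    SELECT order_date, COUNT(*), SUM(amount)
    FROM sales_orders
    WHERE order_date = CURRENT_DATE
    GROUP BY order_date;
    ROLLBACK;  -- discard all changes after verifying the script runs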

Supported Connectors

  • MySQL (since v1.15.0)

  • AWS Aurora MySQL (since v1.15.0)

  • AWS RDS MySQL (since v1.15.0)

  • Azure MySQL (since v1.15.0)

  • Google Cloud MySQL (since v1.15.0)

  • PostgreSQL (since v1.15.0)

  • AWS RDS PostgreSQL (since v1.15.0)

  • Azure Cosmos PostgreSQL (since v1.15.0)

  • Azure PostgreSQL (since v1.15.0)

  • Google Cloud PostgreSQL (since v1.15.0)

  • AWS Aurora PostgreSQL (since v1.15.0)

  • AWS RDS MariaDB (since v1.16.0)

  • Snowflake (since v1.16.0)

  • AWS RDS Oracle (since v1.16.0)

  • Oracle (since v1.16.0)

Agent Scripts Execution


Enhancing Your SaaS ETL Pipeline with Pre- and Post-Script Execution in the Secure Agent

For SaaS ETL pipelines, pre- and post-scripts provide flexible, command-line-based execution that can extend and streamline workflows. These scripts run in the command-line environment (CMD, Bash, or similar) managed by the secure agent, before and after the main ETL job. They’re often used for tasks that fall outside the core ETL functionality, running supplementary processes without impacting the job’s success or failure status.

What Are Pre- and Post-Scripts in This Context?

Pre- and post-scripts are a way to run inline commands in the agent environment before and after job execution.

NOTE: Multiline or more complex commands may not behave as expected, because the agent runs the script as a single inline command. We suggest creating a separate .bat or .sh file (depending on your environment) and having the pre- and post-scripts call it in a single-line command instead. To chain multiple calls, you can use && or ; as appropriate for the environment. Please refer to the documentation of the OS used in the agent environment for how to run single-line commands.
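
For example, on a Linux agent you might chain short commands with && directly in the script field, or delegate anything longer to a separate file. This is a sketch only; the paths and file names are hypothetical, and a Windows agent would use a .bat file and CMD syntax instead:

    # Inline pre-script value (a single line): chain two commands with &&.
    mkdir -p /tmp/etl_work && echo "pre-script started" >> /tmp/etl_work/run.log

    # Preferred for anything longer: call a separate script file in one line.
    /bin/bash /opt/agent/scripts/pre_job.sh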

  1. Pre-Scripts: Scripts that run in a command-line interface before the ETL job starts (see the sketch after this list). They are typically used to:

    • Initialize or prepare the ETL environment (e.g., creating necessary directories, setting up variables)

    • Run preliminary logging or monitoring tasks

    • Check for necessary system resources or permissions before job execution

  2. Post-Scripts: These scripts execute after the ETL job finishes, in a command-line environment. Post-scripts may:

    • Clean up temporary files or directories created during the job

    • Send notifications or logs summarizing job status

    • Archive outputs or move files as part of job finalization
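
A minimal sketch of a pre- and post-script pair along these lines, assuming a Linux agent running Bash (all paths, file names, and the notification endpoint are hypothetical):

    #!/bin/bash
    # pre_job.sh - hypothetical pre-script: prepare the environment.
    set -u
    WORK_DIR=/opt/agent/work/daily_load        # hypothetical working directory
    mkdir -p "$WORK_DIR"                       # create the directory if missing
    echo "$(date -Is) pre-script: job starting" >> "$WORK_DIR/job.log"
    df -h "$WORK_DIR" >> "$WORK_DIR/job.log"   # record available disk space

    #!/bin/bash
    # post_job.sh - hypothetical post-script: clean up and notify.
    set -u
    WORK_DIR=/opt/agent/work/daily_load
    echo "$(date -Is) post-script: job finished" >> "$WORK_DIR/job.log"
    rm -f "$WORK_DIR"/*.tmp                    # remove temporary files from the run
    # Hypothetical monitoring hook; replace with your own endpoint.
    curl --silent --fail -X POST https://example.internal/etl/notify \
      -d "job=daily_load&status=finished" > /dev/null || true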

Since pre- and post-scripts do not affect the job’s outcome, they’re best suited for ancillary tasks that enhance job management without creating dependencies on the core ETL logic.

Considerations

  1. Ensure Secure Agent Permissions: The secure agent must have the permissions needed to execute scripts in the command-line environment. Confirm that it can:

    • Access necessary system resources (e.g., file directories, networking capabilities)

    It’s also essential to run the secure agent with the fewest privileges necessary to reduce risk, particularly if scripts are user-generated and may vary widely. A quick way to verify access is sketched below.
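
As a hedged sanity check, you might verify from the agent’s own account that the directories and network endpoints your scripts depend on are reachable. The path and host below are hypothetical:

    #!/bin/sh
    # Hypothetical permissions check, run as the secure agent's user account.
    test -w /opt/agent/scripts || echo "scripts directory is not writable"

    # Verify outbound connectivity to a dependent endpoint (hypothetical URL).
    curl --silent --fail https://example.internal/health > /dev/null \
      || echo "cannot reach downstream health endpoint"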