I'm looking for an effective way to execute long-running Python scripts (several hours each) on an Azure VM that sits behind a VPN, initiated from an Azure Data Factory pipeline. I also need the pipeline to receive each script's execution status once it finishes. Any advice or strategies would be greatly appreciated!
3 Answers
Have you thought about wrapping the script in a systemd service and exposing its state through a small API? Bear in mind that systemctl only reports whether the service is running or has failed, not whether the script's work actually succeeded.
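A minimal sketch of that idea, assuming Flask is installed on the VM and the script runs under a hypothetical systemd unit named `etl-script.service`; ADF could poll the endpoint over the VPN (for example via a self-hosted integration runtime, if your networking routes that way) inside an Until loop:

```python
# Minimal sketch: expose the state of a (hypothetical) systemd unit
# named etl-script.service over HTTP so ADF can poll it.
import subprocess

from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/status")
def status():
    # `systemctl is-active` prints "active", "inactive", or "failed".
    # This reflects the state of the process only, not whether the
    # script's work actually succeeded.
    result = subprocess.run(
        ["systemctl", "is-active", "etl-script.service"],
        capture_output=True,
        text=True,
    )
    return jsonify({
        "service": "etl-script.service",
        "state": result.stdout.strip(),
    })

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```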
A webhook might be the cleanest solution. ADF's Webhook activity sends an HTTP call whose body includes a callback URL; the Python app posts the completion status back to that URL when the script finishes, and the activity waits until that callback arrives. That way, ADF is notified exactly when the script completes, however long it runs.
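A hedged sketch of the receiving side, assuming the Webhook activity's request body carries a `callBackUri` field (as ADF documents for that activity), that the VM is reachable from ADF over the VPN, and that `run_long_job` is a stand-in for the real script:

```python
# Sketch of the callback pattern: ADF's Webhook activity POSTs to /run
# with a body containing "callBackUri"; we start the job in the
# background and post the status back to that URI when it finishes.
import threading

import requests
from flask import Flask, request

app = Flask(__name__)

def run_long_job():
    ...  # stand-in for the several-hours-long script

def execute_and_report(callback_uri: str):
    try:
        run_long_job()
        body = {"output": {"status": "succeeded"}}
    except Exception as exc:
        # A statusCode >= 400 in the callback body marks the Webhook
        # activity as failed in ADF.
        body = {
            "statusCode": 500,
            "error": {"ErrorCode": "ScriptFailed", "Message": str(exc)},
        }
    requests.post(callback_uri, json=body)

@app.route("/run", methods=["POST"])
def run():
    callback_uri = request.get_json()["callBackUri"]
    # Return 202 immediately; the Webhook activity does the waiting.
    threading.Thread(
        target=execute_and_report, args=(callback_uri,), daemon=True
    ).start()
    return "", 202
```

Returning 202 right away keeps the initial HTTP call short; the Webhook activity itself does the waiting, so its timeout needs to be set longer than the script's expected runtime.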
But what happens if the app crashes and never sends a status? We could end up with no feedback in ADF at all.
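Two things mitigate that. First, the Webhook activity fails on its own if no callback arrives within its configured timeout, so even a dead VM eventually surfaces as a pipeline failure. Second, a small watchdog can run the script as a child process, so a crash inside the script still produces a callback. A hedged sketch, with the callback body fields mirroring the ones above and `ScriptFailed` as a made-up error code:

```python
# Watchdog wrapper: run the long script as a child process so even an
# uncaught crash or non-zero exit still results in a callback to ADF.
# Usage: python watchdog.py <callback_uri> <script_path>
import subprocess
import sys

import requests

def main(callback_uri: str, script_path: str):
    proc = subprocess.run([sys.executable, script_path])
    if proc.returncode == 0:
        body = {"output": {"status": "succeeded"}}
    else:
        body = {
            "statusCode": 500,
            "error": {
                "ErrorCode": "ScriptFailed",
                "Message": f"script exited with code {proc.returncode}",
            },
        }
    requests.post(callback_uri, json=body)

if __name__ == "__main__":
    main(sys.argv[1], sys.argv[2])
```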
While I haven't worked specifically with VMs, have you considered using an Azure Function? It could potentially handle your needs without the VM. Just a thought!
I really need to run it from a VM since we already have multiple scripts set up there. We might switch to containers later on.
True, that's only going to tell us if it's running, not whether it executed successfully or failed.