Retry mechanism when Azure Batch Quota Limit is reached
acknowledged
Adrian Navarro
When Seqera Platform submits jobs to Azure Batch and receives a 409 HTTP status code (indicating that the Batch quota limit has been reached), the run fails. Instead of failing, could the behavior be changed so that the platform keeps attempting to submit the job until Azure Batch can accept it?
Caused by:Status code 409, {
"odata.metadata":"https://babixanaldevm42aen.uaenorth.batch.azure.com/$metadata#Microsoft.Azure.Batch.Protocol.Entities.Container.errors/@Element",
"code":"ActiveJobAndScheduleQuotaReached","message": {
"lang":"en-US","value":"Active job and job schedule quota for the account has been reached.\nRequestId:e9dede46-9ab1-463b-b797-cdfee3d4db8f\nTime:2024-10-18T06:36:49.8081952Z"
}
}
Instead of surfacing the 409 status code as an error and halting the run, Seqera Platform should keep retrying the job submission until quota becomes available, so that the job is eventually accepted. This should happen without manual intervention, and the retries should be spaced out so that they do not overload the system.
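To illustrate the requested behavior, here is a minimal Python sketch of a retry loop with capped exponential backoff and jitter. It is not Seqera Platform or Azure SDK code: `submit_with_retry`, `QuotaReachedError`, and all parameter names and default values are hypothetical, and the `submit` callable stands in for whatever actually sends the job to Azure Batch and detects the 409 `ActiveJobAndScheduleQuotaReached` response.

```python
import random
import time


class QuotaReachedError(Exception):
    """Hypothetical stand-in for a 409 ActiveJobAndScheduleQuotaReached response."""


def submit_with_retry(submit, *, base_delay=5.0, max_delay=300.0,
                      max_attempts=None, sleep=time.sleep, rng=random.random):
    """Call submit() until it succeeds, backing off while the quota is exhausted.

    All names and defaults here are assumptions chosen for illustration.
    max_attempts=None retries indefinitely; any other exception propagates at once.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            return submit()
        except QuotaReachedError:
            if max_attempts is not None and attempt >= max_attempts:
                raise  # give up and let the caller fail the run
            # Exponential backoff capped at max_delay; the exponent is clamped
            # so very long retry sequences cannot overflow.
            delay = min(max_delay, base_delay * 2 ** min(attempt - 1, 20))
            # Jitter: wait a random fraction of the delay so that many
            # pending submissions do not all retry at the same moment.
            sleep(delay * rng())
```

Capping the delay keeps the wait bounded once quota frees up, while the jitter spreads out retries from concurrent runs, which addresses the "does not overload the system" part of the request. Whether retries should be unbounded or limited by a configurable maximum is a design choice left open here.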
Rob Newman
acknowledged