Campaign records appearing as Busy / Failed / Congestion
Understand the causes of campaign record statuses like Busy, Failed, and Congestion, and learn how to resolve these issues effectively.
Symptom or Need
Most or all records in an outbound dialing campaign fail to establish effective contact and show the following indicators:
- Dominant Call Statuses: Busy, Failed, or Congestion.
- Penetration: Very low or zero (0%).
- Behavior: High volume of dialing attempts without generating effective contacts.
Context / Scenarios
This incident usually occurs due to operational changes or infrastructure limitations, such as:
- Activation of new campaigns or changes in the SIP provider/trunk.
- Use of aggressive dialing modes (Aggressive Predictive or Turbo Dial) or increased dialing speed.
- Server overload (shared server or multiple campaigns on the same Skill).
- Low-quality databases (poorly formatted numbers, blank records, or unvalidated prefixes).
- Network instability or latency toward the telephony provider.
Key Concepts to Consider:
- No Answer: The customer does not pick up.
- Answer Machine: An automated machine answered the call.
- Congestion / Busy: May occur due to:
  - Issues within the customer’s carrier network.
  - Poor signal reception on the customer’s end.
  - Massive outages from the customer’s carrier.
- Abandon: The customer answers but hangs up before being transferred to an agent (due to signal issues or customer choice).
Response / Solution
1. Initial Validations and Statistics
Gather basic data to scope the failure:
- Identify the operation name, percentage of impact, and number of affected agents.
- Data Analysis: Calculate actual penetration, review average attempts per record, and determine the "Dominant Result" (Is it mostly Busy, Failed, or Congestion?).
- Compare behavior across different providers.
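The data analysis step can be sketched in a few lines of Python. This is a minimal illustration, not the platform's own report: the field names (`record_id`, `status`), the `Answer` label for an effective contact, and the definition of penetration as the share of records with at least one effective contact are all assumptions to adapt to your actual export.

```python
from collections import Counter

# Assumption: statuses that count as an effective contact in your export.
EFFECTIVE_STATUSES = {"Answer"}

def summarize_attempts(attempts):
    """Return penetration, average attempts per record, and the dominant result.

    `attempts` is a list of dicts, one per dialing attempt, with the
    hypothetical keys "record_id" and "status".
    """
    if not attempts:
        return {"penetration_pct": 0.0, "avg_attempts": 0.0, "dominant_result": None}

    status_counts = Counter(a["status"] for a in attempts)
    attempts_per_record = Counter(a["record_id"] for a in attempts)
    contacted = {a["record_id"] for a in attempts if a["status"] in EFFECTIVE_STATUSES}
    total_records = len(attempts_per_record)

    return {
        # Share of records with at least one effective contact (assumed definition).
        "penetration_pct": 100.0 * len(contacted) / total_records,
        "avg_attempts": len(attempts) / total_records,
        # Most frequent status across all attempts.
        "dominant_result": status_counts.most_common(1)[0][0],
    }

sample = [
    {"record_id": 1, "status": "Congestion"},
    {"record_id": 1, "status": "Congestion"},
    {"record_id": 2, "status": "Busy"},
    {"record_id": 3, "status": "Answer"},
    {"record_id": 4, "status": "Congestion"},
]
summary = summarize_attempts(sample)
print(summary)
```

Running the same function on each provider's attempts separately gives a simple way to compare behavior across providers.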
2. Controlled Testing
- Perform 5 to 10 manual calls to the numbers that failed in the campaign.
- Validate if calls go through correctly using an alternative provider (if available).
- Confirm which provider is currently routing the calls.
- Verify that the Caller ID is authorized by the provider.
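To choose the numbers for the manual calls without bias, a random sample of the failed records can be drawn. The sketch below assumes the failed numbers are already available as a list of strings; the function name and the sample numbers are hypothetical.

```python
import random

def pick_test_numbers(failed_numbers, max_calls=10, seed=None):
    """Pick up to `max_calls` distinct failed numbers for manual test calls."""
    unique = sorted(set(failed_numbers))   # drop duplicates, keep a stable order
    rng = random.Random(seed)              # a seed makes the selection repeatable
    return rng.sample(unique, min(len(unique), max_calls))

# Hypothetical failed numbers from the campaign export.
failed = [f"300555{n:04d}" for n in range(25)]
test_batch = pick_test_numbers(failed, seed=7)
print(test_batch)
```

Recording the seed and the selected numbers alongside the test results keeps the evidence reproducible if the case is escalated.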
3. Infrastructure and Provider Review
- Server: Check whether CPU usage exceeded 80% and review RAM usage at the time of the event. Validate whether Dials Per Minute (DPM) exceeds the server’s capacity.
- Connectivity: Validate latency and connection stability with the SIP provider.
- Carrier: If the issue is "Congestion," try switching providers to rule out spam blocking or carrier saturation.
- Shared Server: Confirm if the server is shared with other operations and evaluate the performance impact.
- Simultaneous Campaigns: Review the number of active campaigns and their resource consumption.
- SIP Configuration: If applicable, consider changing the Siptar server on the outbound trunk to optimize call stability.
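The server checks above can be expressed as a small threshold comparison. This is an illustrative sketch only: the 80% CPU threshold comes from this article, while the `ServerSnapshot` structure, the 90% RAM default, and the `max_dpm` value are assumptions that should be replaced with the figures for your own server.

```python
from dataclasses import dataclass

CPU_THRESHOLD_PCT = 80.0  # threshold named in the server review step

@dataclass
class ServerSnapshot:
    """Readings taken at the time of the event (hypothetical structure)."""
    cpu_pct: float
    ram_pct: float
    dials_per_minute: int

def capacity_warnings(snapshot, max_dpm, ram_threshold_pct=90.0):
    """List the capacity limits that the snapshot exceeds.

    max_dpm is the dialing capacity established for the server;
    ram_threshold_pct is an assumed default, not a documented limit.
    """
    warnings = []
    if snapshot.cpu_pct > CPU_THRESHOLD_PCT:
        warnings.append(f"CPU at {snapshot.cpu_pct:.0f}% (above {CPU_THRESHOLD_PCT:.0f}%)")
    if snapshot.ram_pct > ram_threshold_pct:
        warnings.append(f"RAM at {snapshot.ram_pct:.0f}% (above {ram_threshold_pct:.0f}%)")
    if snapshot.dials_per_minute > max_dpm:
        warnings.append(f"DPM at {snapshot.dials_per_minute} (capacity {max_dpm})")
    return warnings

event = ServerSnapshot(cpu_pct=92.0, ram_pct=60.0, dials_per_minute=1500)
event_warnings = capacity_warnings(event, max_dpm=1200)
for warning in event_warnings:
    print(warning)
```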
4. Parameterization Adjustments
- Reduce predictive dialing aggressiveness or disable Turbo Dial (if applicable).
- Adjust dialing speed based on the actual number of active agents.
- Database Cleanup: Validate prefixes, remove blank records, and ensure that already managed numbers are not being reprocessed.
- Confirm that multiple campaigns are not associated with the same Skill to avoid call assignment overhead.
- Evaluate switching providers (e.g., TIGO, CLARO, etc.) to rule out service blocking or degradation.
- Validate if the issue is isolated to a specific carrier.
- Confirm potential restrictions due to traffic volume or dialing flagged as spam.
- Verify number portability status in Colombia in case of routing failures.
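The database cleanup described in this step (validating prefixes, removing blank records, and skipping already managed numbers) could look like the following sketch. The prefixes, the expected length, and the sample numbers are hypothetical; use the numbering plan and formats that apply to your campaign.

```python
import re

def clean_numbers(raw_numbers, valid_prefixes, expected_length, already_managed=frozenset()):
    """Return dialable numbers: digits only, valid prefix and length, no repeats."""
    cleaned = []
    seen = set(already_managed)            # managed numbers are treated as already seen
    for raw in raw_numbers:
        digits = re.sub(r"\D", "", raw or "")   # strip spaces, dashes, and other symbols
        if not digits:
            continue                       # blank record
        if len(digits) != expected_length:
            continue                       # poorly formatted number
        if not digits.startswith(tuple(valid_prefixes)):
            continue                       # unvalidated prefix
        if digits in seen:
            continue                       # duplicate or already managed
        seen.add(digits)
        cleaned.append(digits)
    return cleaned

raw = ["300 123 4567", "", None, "3001234567", "12345",
       "310-555-0000", "9991234567", "3205550000"]
dialable = clean_numbers(
    raw,
    valid_prefixes=("30", "31", "32"),     # hypothetical allowed prefixes
    expected_length=10,
    already_managed={"3205550000"},
)
print(dialable)
```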
Possible Causes
The Busy / Failed / Congestion statuses can stem from various factors. The primary causes are detailed below according to the affected component:
A. Infrastructure (Server)
This is related to capacity limitations or environment overload:
- Sustained CPU usage exceeding 80%.
- High spikes in Dials Per Minute (DPM).
- Insufficient RAM.
- Simultaneous execution of multiple campaigns on the same server.
- High latency toward the SIP provider.
- Shared server with high resource consumption.
Typical Result: Predominance of Congestion statuses.
B. Provider / SIP Route
Associated with external or routing failures:
- Carrier saturation.
- Caller ID restrictions.
- Issues with the configured trunk.
- Impact on a specific operator.
- Portability issues (Colombia).
- Incorrect route or unauthorized provider.
Typical Result:
- Mostly Congestion → Possible provider saturation.
- Mostly Failed → Route or authorization issues.
C. Campaign Parameterization
Related to internal configurations or database quality:
- Dialing speed exceeding actual capacity.
- Overly aggressive predictive dialing.
- Turbo Dial activated without sufficient resource capacity.
- Insufficient number of active agents.
- Multiple campaigns using the same Skill.
- Incorrect prefixes.
- Blank or poorly formatted records.
- Reprocessing of already managed numbers.
Typical Result:
- Mostly Failed → Configuration or database issues.
- Mostly Busy → Database saturation or excessive retries.
D. Agent Connectivity
Related to agent availability and stability:
- Agents not truly available.
- Internal network issues.
- High TCP Delay.
- Occupancy level higher than estimated.
Typical Result:
- Call assignment failures.
- Dropped calls during handling.
Recommendations
For proper incident management, it is recommended to:
- If Congestion predominates: Prioritize reviewing the provider and server capacity.
- If Failed predominates: Validate configuration, SIP route, and numbering format.
- If Busy predominates: Check for database saturation and the number of retries.
- Do not escalate to the provider without first conducting manual tests.
- Do not increase dialing speed without first validating CPU, RAM, and DPM (Dials Per Minute).
- Compare behavior across providers if more than one is available.
- Always document the exact time of the event to facilitate technical analysis.
- Maintain evidence of tests performed before escalation.
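The first three recommendations amount to a lookup from the dominant status to the area to review first. A minimal sketch of that triage, with hypothetical names and a fallback drawn from the remaining recommendations:

```python
# Mapping taken from the recommendations in this article.
FIRST_CHECKS = {
    "Congestion": ["provider", "server capacity"],
    "Failed": ["campaign configuration", "SIP route", "numbering format"],
    "Busy": ["database saturation", "number of retries"],
}

# Fallback for any other status: steps the article asks for before escalating.
DEFAULT_CHECKS = ["manual test calls", "exact time of the event"]

def first_checks(dominant_result):
    """Return the areas to review first for the given dominant status."""
    return FIRST_CHECKS.get(dominant_result, DEFAULT_CHECKS)

print(first_checks("Failed"))
```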