
Incident Report: Yearly Auction Double Processing and Bidding Issues

Written by Admin | Sep 12, 2025 3:10:49 PM

Summary 

 

  • Title: Yearly auctions double processing and bidding issues 
  • Date & Time: Monday 07.07.2025, 12:34h (CEST) - Friday 11.07.2025, 10:40h (CEST) 
  • Affected Services: Auction processing and bidding 
  • Status: Resolved 

Description 

 

Overview 

On Monday 07.07.2025 at approximately 12:34h (CEST), two critical errors occurred during the execution of PRISMA's yearly capacity auctions, leading to the following issues: 

  • Auction results were sent and published twice; in some cases, results were not sent at all. In the remainder of this document, this issue is referred to as “Auction processing issues”. 
  • Some shippers were unable to place bids from round 2 onwards in the auctions that went to the next round. In the remainder of this document, this issue is referred to as “Bidding issues”. 

Technical Details 

Auction Processing Issues 

 The issue occurred because two application servers ended up processing the same task in parallel. 

The first server was delayed in execution, leading the platform to assume it had failed and to reassign the task to a second server. 

Re-assigning a task to a second server in case of a failure is desired behaviour and part of the platform’s redundancy. In this case, however, the platform did not correctly detect that the first server was merely delayed rather than failed. 
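This behaviour can be pictured with a small sketch. It is a simplified, hypothetical model of lease-based task reassignment (the function names, data model and timeout value are assumptions for illustration, not PRISMA’s actual implementation): a worker that is merely slow misses its heartbeat, its lease looks expired, and a second worker legitimately claims the same task.

```python
import time
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical sketch of lease-based task reassignment; names and the timeout
# value are illustrative, not PRISMA's actual implementation. A worker that is
# only delayed (for example, waiting on a slow database) misses its heartbeat,
# so the scheduler cannot tell the delay apart from a crash and hands the same
# task to a second worker, and both end up processing it in parallel.

LEASE_TIMEOUT_S = 30  # illustrative value


@dataclass
class Task:
    task_id: str
    owner: Optional[str] = None
    last_heartbeat: float = field(default_factory=time.monotonic)


def claim_if_lease_expired(task: Task, worker_id: str) -> bool:
    """Claim the task when it is unowned or its owner's lease looks expired."""
    lease_expired = time.monotonic() - task.last_heartbeat > LEASE_TIMEOUT_S
    if task.owner is None or lease_expired:
        # A delayed-but-alive owner is indistinguishable from a crashed one
        # here, which is how a second server started evaluating the same
        # auctions while the first one was still running.
        task.owner = worker_id
        task.last_heartbeat = time.monotonic()
        return True
    return False
```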

Bidding Issues 

Due to legacy code, shippers received an error message indicating they were not entitled to edit an existing bid or place a new bid. 

 

The legacy code in question incorrectly handled bid ownership checks when multiple bids from the same person, or from multiple persons within the same organisation, were involved. 
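As an illustration of the kind of flaw described above (the data model and function names below are hypothetical stand-ins, not the platform’s real code), an ownership check that derives entitlement from a single expected bid rejects a second manually placed bid or a colleague’s bid, whereas an organisation-level check handles both:

```python
from dataclasses import dataclass

# Hypothetical reconstruction of the flawed check; the data model and function
# names are illustrative stand-ins, not the platform's real legacy code.


@dataclass
class Bid:
    bid_id: str
    owner_user_id: str
    organisation_id: str


def may_edit_bid_buggy(bid: Bid, user_id: str, organisation_bids: list[Bid]) -> bool:
    # Entitlement is derived from a single "expected" bid only, so a second
    # manually placed bid, or a bid placed by a colleague, fails the check.
    expected = organisation_bids[0]
    return bid.bid_id == expected.bid_id and expected.owner_user_id == user_id


def may_edit_bid_fixed(bid: Bid, user_organisation_id: str) -> bool:
    # Ownership is evaluated at organisation level, so multiple bids and
    # multiple users of the same shipper organisation are handled correctly.
    return bid.organisation_id == user_organisation_id
```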

 

Scope of Impact 

Auction Processing Issues 

The majority of TSOs and shippers participating in the yearly capacity auctions were impacted by the double processing of auction results. 

Bidding Issues 

21 auctions out of a total of 1574 published yearly auctions were impacted by the bidding issues in the second round of the auctions. 

Duration of Impact 

The total duration of the incident, from detection on 07.07.2025 at 12:34h (CEST) to deployment of the fixes to production on 08.07.2025 at 13:07h and 16:35h (CEST), was 1 day, 4 hours and 1 minute. 

Full resolution, including all activities for re-running the auctions and data clean-up on the platform and TSO side, was reached on 11.07.2025 at 10:40h (CEST) and required an additional 2 days, 18 hours and 5 minutes. 

 

Timeline

 

Jul 7, 2025, 12:34 (CEST) | PRISMA internal | Send MS Teams message
Customer Success informed about a possible incident. Shippers were reporting seeing their booking results twice. A first analysis did not show any double bookings in the internal support tool; the assumption was that only cosmetic clean-up would be necessary after the auctions.

Jul 7, 2025, 13:05 (CEST) | PRISMA internal | Create ISR
Customer Success created an ISR (Incident & Service Request), formally starting the internal incident process for the auction processing issues.

Jul 7, 2025, 13:13 (CEST) | PRISMA Emergency Guard | Publish UMM
As part of PRISMA’s business continuity measures, the Emergency Guard posted a UMM to inform the market about issues with the auction processing.

Jul 7, 2025, 13:19 (CEST) | PRISMA internal | Send MS Teams message
Customer Success informed about shippers not being able to place a bid in round 2 of the remaining 21 auctions. A first analysis showed that it was not an isolated issue but affected all 21 auctions. As a result, this was also treated as a formal incident.

Jul 7, 2025, 14:13 (CEST) | PRISMA Emergency Guard | Publish UMM
As part of PRISMA’s business continuity measures, the Emergency Guard posted a UMM to inform the market about issues with bidding in round 2.

Jul 7, 2025, 15:25 (CEST) | PRISMA Emergency Guard | Align internally
Internal alignment of the Emergency Guard with PRISMA’s Management about the next steps regarding cancellation or continuation of the auctions. Decision: to prevent market distortion, PRISMA recommends cancellation of the remaining 21 auctions.

Jul 7, 2025, 16:19 (CEST) | PRISMA Emergency Guard | Send Email
The Emergency Guard sent an email to the TSO emergency contacts to summarise the call and the decision taken.

Jul 7, 2025, 16:28 (CEST) | PRISMA Emergency Guard | Publish UMM
As part of PRISMA’s business continuity measures, the Emergency Guard posted a UMM to inform the market about the cancellation of the auctions.

Jul 8, 2025, 10:07 (CEST) | Customer Success | Dismiss UMM
Customer Success (in alignment with the Emergency Guard) dismissed the UMM regarding the bidding issues, since the auctions had been cancelled.

Jul 8, 2025, 13:07 (CEST) | PRISMA internal | Deploy fix
After review and testing, the fix for the bidding issues was deployed to the production system.

Jul 8, 2025, 13:46 (CEST) | PRISMA Emergency Guard | Update UMM
The Emergency Guard updated the existing UMM about issues with the auction processing to reflect the new marketing time-frame for the re-run of the auctions.

Jul 8, 2025, 14:04 (CEST) | PRISMA Emergency Guard | Update UMM
The Emergency Guard updated the existing UMM about issues with the auction processing to also include the cases of the missing booking confirmations.

Jul 8, 2025, 16:28 (CEST) | Customer Success | Update UMM
Customer Success updated the existing UMM about issues with the auction processing to reflect the new auction publishing time for the auction re-run.

Jul 8, 2025, 16:35 (CEST) | PRISMA internal | Deploy fix
After review and testing, the fix for the auction processing was deployed to the production system. The fix had been available since 13:49h, but to avoid interference with the day-ahead auctions, the deployment was scheduled to happen afterwards.

Jul 11, 2025, 10:40 (CEST) | PRISMA | Execute steps for data clean-up
The necessary steps for data clean-up were executed successfully. 

 

Root Cause Analysis (RCA) 

 

Auction Processing Issues

 Cause: The application server processing the auction evaluation lost the connection to the database. A second application server started processing the same auctions as part of the redundancy implementation in the platform infrastructure. 

Assessment: The incident was caused by unanticipated high load. 

Detection: The problem was identified by a user report. Shippers contacted PRISMA’s Customer Success after experiencing the first issues with duplicate results. 

Bidding Issues 

Cause: A piece of legacy code in the backend returned wrong permissions to the frontend. Shippers with more than one bid per company were not able to edit bids in the second bidding round if the bids had been placed manually rather than via a bidding plan. 

Assessment: The incident was caused by an isolated bug. 

Detection: The problem was identified by a user report. Shippers contacted PRISMA’s Customer Success after experiencing issues with placing bids. 

 

Resolution & Recovery

 

Auction Processing Issues 

Intermediate resolution: Enlargement of the database connection pool per application server. This resolution was deployed on 08.07.2025 at 13:07h (CEST). 

During the runtime of the yearly interruptible auctions two weeks later, PRISMA closely monitored the platform systems for possible DoS attacks. No signs of attempted attacks were found. 

Long-term resolution: Segmentation of the application servers to improve task allocation and system stability. One set of servers is dedicated to scheduled activities, such as auction processing and report generation; a different set is dedicated to processing requests originating from the (public) endpoints of the platform. This allows PRISMA to reduce the connection pool size per server back to the original number of 100. 

Restoration of service: Following the intermediate fix, the necessary actions for data clean-up (invalidation of double bookings) were aligned and executed with the TSOs. 
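A minimal sketch of the two measures, assuming a SQLAlchemy-style connection pool (the library choice, placeholder connection string and the enlarged pool value are assumptions; only the figure of 100 connections is taken from the description above):

```python
import os

from sqlalchemy import create_engine

# Illustrative sketch only: the database library, connection string and the
# enlarged pool value are assumptions, not PRISMA's actual configuration.
# It contrasts the intermediate fix (a larger pool per server) with the
# long-term fix (segmented server roles, each back at 100 connections).

SEGMENTATION_ROLLED_OUT = os.environ.get("SEGMENTED_SERVERS", "0") == "1"

if SEGMENTATION_ROLLED_OUT:
    # Long-term: scheduled work (auction processing, report generation) runs
    # on a dedicated server set, separate from the servers handling requests
    # from the platform's public endpoints, so each server can return to the
    # original pool size of 100 connections.
    pool_size = 100
else:
    # Intermediate: all workloads still share the same servers, so the pool is
    # enlarged (the value here is illustrative) to avoid losing the database
    # connection under unanticipated load.
    pool_size = 300

engine = create_engine(
    "postgresql+psycopg2://user:password@db-host/platform",  # placeholder DSN
    pool_size=pool_size,
    pool_pre_ping=True,  # verify connections before handing them out
)
```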

Bidding Issues 

Intermediate resolution: A targeted fix was implemented to enhance the query logic in the backend, enabling it to correctly handle and accept multiple bids. This resolution was deployed on 08.07.2025 at 16:35h (CEST). 

Long-term resolution: Refactoring the legacy platform into a modern, independent service-based architecture will greatly reduce the likelihood of such issues. This transformation project is already in progress and will continue to be driven with the highest priority. 

Restoration of service: Following the intermediate fix, all cancelled auctions were successfully republished using a custom calendar. On 09.07.2025 at 09:00h (CEST) the affected auctions were successfully re-run, and at 13:45h (CEST) the incident was declared closed. 

 

Preventive Actions:

  • Increase and improve test coverage of long-term auctions: Identifying the gap between the existing test scenarios and additional scenarios that would have covered the specific constellation of this incident will reduce the risk of a recurrence (a sketch of one such test follows this list). 
  • Re-assess critical business phases and resource planning: Conduct a structured review of recurring, high-impact events, including potential changes in conditions or risks compared to previous years, adequate management oversight and engagement in preparation activities, as well as resource allocation, role coverage and contingency plans. 
  • Implementation of the long-term resolution for the auction processing issues: Adaptation of the legacy infrastructure to allow a segmentation of the application servers, thorough testing, and roll-out in production.
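As one illustration of what additional test coverage could look like (the names below are hypothetical and not taken from PRISMA’s test suite), a regression test could assert that a second, manually placed bid from the same organisation remains editable in a later bidding round:

```python
import unittest
from dataclasses import dataclass

# Hypothetical regression-test sketch: the domain model and the permission
# function are illustrative stand-ins for the platform's real components.


@dataclass
class Bid:
    bid_id: str
    owner_user_id: str
    organisation_id: str


def may_edit_bid(bid: Bid, user_organisation_id: str) -> bool:
    """Stand-in for the fixed entitlement check: organisation-level ownership."""
    return bid.organisation_id == user_organisation_id


class RoundTwoBiddingTest(unittest.TestCase):
    def test_multiple_manual_bids_per_organisation_remain_editable(self):
        first = Bid("bid-1", owner_user_id="alice", organisation_id="shipper-42")
        second = Bid("bid-2", owner_user_id="bob", organisation_id="shipper-42")
        # Before the fix, having more than one manually placed bid per company
        # caused the entitlement check to fail in the second bidding round.
        self.assertTrue(may_edit_bid(second, user_organisation_id="shipper-42"))
        self.assertTrue(may_edit_bid(first, user_organisation_id="shipper-42"))


if __name__ == "__main__":
    unittest.main()
```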