The 40+ Minute Incident Report
Posted: Thu Apr 30, 2026 9:13 am
Deployment Incident Report - 2026-04-30
A summary of today's deployment outage.
━━━━━━━━━━━━━━━━━━━━━━━━━━━
At a glance
- Outage window: approximately 40 minutes.
- Affected: both realms (Hardcore + Normal).
What we were trying to ship
- Night Elf resurrect bug - a resurrection path could leave a Night Elf player visually stuck as a wisp after rezzing. Fixed by always clearing the wisp transform aura on resurrect. Night Elf players were also unable to rez at a spirit healer or at their corpse; this should now be fixed.
- Default-class spells not displaying - weapon proficiencies (Bows, Guns, Crossbows, Daggers, Staves, Thrown, Fist Weapons, etc.) and other class default skills used to be kept only in memory and re-seeded on every login. Any explicit .learn of one of those IDs was overwritten on the next save. We fixed the issue and ran a backfill that wrote the missing rows for all 952 existing characters across both realms.
- Debuff cap was effectively 1 - the configured value was outside the engine's accepted range. Raised to 40.
- Crusader Strike - the tooltip did not match the actual mechanic's damage and stats. We re-synced the spell_template to the correct effect formula. This change is part of the migration that failed at apply time on a column mismatch.
- Holy Strike rank 1 - the tooltip did not match the actual mechanic's damage and stats. We re-synced the spell_template to match the tooltip. This change is also part of the migration that failed at apply time on a column mismatch. These changes didn't go through and will be pushed later.
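The proficiency backfill in the list above could be written so that it is safe to rerun. The following is only a minimal sketch of that idea, not the script we actually ran: the table name character_skills, its columns, and the IDs are hypothetical, and it uses Python's built-in sqlite3 module purely so the example is self-contained (the live database engine and schema are not described in this report).

```python
import sqlite3

# Hypothetical schema: one row per (character, skill) pair.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE character_skills ("
    "character_id INTEGER, skill_id INTEGER, "
    "PRIMARY KEY (character_id, skill_id))"
)
conn.execute("INSERT INTO character_skills VALUES (1, 45)")  # already stored


def backfill(conn, defaults):
    """Insert each default (character_id, skill_id) pair that is missing.

    Returns the number of rows actually written.
    """
    before = conn.total_changes
    # INSERT OR IGNORE (SQLite syntax) skips pairs that already exist,
    # so existing rows are left untouched and reruns are harmless.
    conn.executemany(
        "INSERT OR IGNORE INTO character_skills VALUES (?, ?)", defaults
    )
    return conn.total_changes - before


defaults = [(1, 45), (1, 46), (2, 45)]  # placeholder IDs, not real skill IDs
print(backfill(conn, defaults))  # rows written on the first run
print(backfill(conn, defaults))  # rows written on a rerun
```

Keying the table on the (character, skill) pair is what makes the second run a no-op in this sketch.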
What went wrong - Issue #1
One of the migrations failed schema-validation on apply.
The ability re-sync migration was written assuming a column count that didn't match the live database's actual column count. The first row failed at apply time, the auto-updater rejected the migration, and the world server refused to start until the migration was removed from the build queue.
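A pre-flight check along the following lines might catch this class of mismatch before a migration reaches the auto-updater. It is a rough sketch under stated assumptions: the three-column spell_template table is a hypothetical stand-in, the helper names are made up for illustration, and sqlite3 is used only to keep the example runnable on its own (PRAGMA table_info is SQLite-specific; another engine would need its own schema query).

```python
import sqlite3


def live_column_count(conn, table):
    """Return the number of columns the live table actually has."""
    # PRAGMA table_info yields one row per column (SQLite-specific).
    return len(conn.execute(f"PRAGMA table_info({table})").fetchall())


def check_rows(conn, table, rows):
    """Raise ValueError if any row's width differs from the live table."""
    expected = live_column_count(conn, table)
    for index, row in enumerate(rows):
        if len(row) != expected:
            raise ValueError(
                f"row {index}: {len(row)} values, but {table} has {expected} columns"
            )


# Hypothetical three-column stand-in for the real table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE spell_template (id INTEGER, name TEXT, effect TEXT)")

good_rows = [(1, "Example Spell", "formula_a")]
bad_rows = [(1, "Example Spell", "formula_a", "extra")]

check_rows(conn, "spell_template", good_rows)  # passes silently
try:
    check_rows(conn, "spell_template", bad_rows)
except ValueError as err:
    print(f"migration rejected before apply: {err}")
```

Running a check like this against a copy of the live schema would surface the mismatch at build time rather than at world server startup.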
━━━━━━━━━━━━━━━━━━━━━━━━━━━
What went wrong - Issue #2
A typo in a config file caused a restart loop.
After the first issue was set aside and the world server image was rebuilt, both world servers entered a restart loop. The world boot would complete, and then the server would shut itself back down within milliseconds. The cycle repeated roughly every 80 seconds.
The cause turned out to be a single mistyped character on a comment line of the world-server config file. The config parser silently failed at that line and stopped reading any further settings. One of the unread settings controls whether the server's interactive console reads from standard input. Because the setting was never loaded, the parser fell back to its built-in default of True. The console process then immediately read from the Docker container's standard input (not something that should be read from) and interpreted what it read as a shutdown command.
The fix was a single-character correction to the comment line. As soon as the fix was applied, both realms came up clean and have stayed up.
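The failure mode above (a parser that stops silently and leaves later settings on their defaults) can be guarded against by failing loudly. The sketch below is illustrative only and is not the world server's actual parser: the "Key = Value" format with "#" comments, the setting name Console.Enable, and the mistyped comment marker are all assumptions made for the example.

```python
class ConfigError(Exception):
    """Raised when a config line cannot be parsed or a required key is missing."""


def parse_config(text, required):
    """Parse 'Key = Value' lines; raise instead of skipping bad input."""
    settings = {}
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or well-formed comment
        key, sep, value = line.partition("=")
        if not sep or not key.strip():
            # Fail loudly instead of silently dropping the rest of the file.
            raise ConfigError(f"line {number}: cannot parse {raw!r}")
        settings[key.strip()] = value.strip()
    missing = set(required) - set(settings)
    if missing:
        # Never fall back to a default for a setting marked as required.
        raise ConfigError(f"required settings not loaded: {sorted(missing)}")
    return settings


good = "# console settings\nConsole.Enable = 0\n"
broken = "/ console settings\nConsole.Enable = 0\n"  # hypothetical typo

print(parse_config(good, {"Console.Enable"}))
try:
    parse_config(broken, {"Console.Enable"})
except ConfigError as err:
    print(f"refusing to start: {err}")
```

With a check like this, a bad comment line would stop the boot with a line number instead of producing a restart loop.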
━━━━━━━━━━━━━━━━━━━━━━━━━━━
What landed
Live in production right now:
- Night Elf resurrect fix.
- Default-class spell persistence + the 952-character backfill: 11 Hardcore + 38 Normal characters now have correctly stored Bows proficiency, plus all the other affected proficiencies.
- Debuff cap raised from "effectively 1" to 40.
What didn't go
- The Holy/Crusader re-sync + new trigger spells migration. We'll fix the schema mismatch and ship it in a smaller follow-up window.
What we're taking away
- Our deployment pipelines are not yet in a state to support two live-service realms. Until they are hardened, we will postpone all but the most critical bug fixes. We hope to have the pipelines rebuilt in the next 48 hours. You can read more about this in the Announcements forum section.
Thanks for the patience during the downtime. If you notice anything weird with the fixes that did land (particularly with default weapon proficiencies on characters), please let us know via an in-game /gm ticket or here on the forum.