13.05.2024 Incident - postmortem

Meikop (Vint.ee creator) 2024-05-14T08:30:47+03:00
Hello,

When vint.ee goes down, we need to write about it so that future generations can learn from it.

Yesterday, 13.05.2024, vint.ee was down for about an hour (23:00 - 00:00).
Disasters happen when several unfavorable circumstances coincide, and that's what happened this time.

The story begins two weeks ago when I discovered that the vint.ee server had started to be bombarded by particularly aggressive crawlers/bots. The average load on the vint.ee server was 3 instead of the normal 0.5.
So I started looking for a solution and found a way to block bots at the nginx level (based on the user agent).
I made changes to the config and everything worked - the bots were blocked.
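
For anyone who wants to do the same, user-agent blocking in nginx generally looks something like the sketch below. This is not the actual vint.ee config: the bot names, the $blocked_bot variable name and the server block are placeholders made up for illustration.

```nginx
# Goes in the http {} context.
# Bot names below are hypothetical examples, not the crawlers from this incident.
map $http_user_agent $blocked_bot {
    default          0;
    ~*examplebot     1;   # case-insensitive regex match on the User-Agent header
    ~*anothercrawler 1;
}

server {
    listen 80;
    server_name vint.ee;

    # Reject any request whose user agent matched an entry above.
    if ($blocked_bot) {
        return 403;
    }
}
```

Running nginx -t after an edit like this checks the config for syntax errors and missing files before a reload.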

A few days later I started to upload other updates, and it turned out that the nginx conf was tracked in git and that I couldn't push it to the git server. When I then pushed the updates to the server, merge conflicts arose. I did something there and got the conflicts resolved, but I didn't notice that the nginx conf had disappeared from the vint.ee server entirely.

On the evening of 13.05.2024 at 23:00, nginx's behavior changed. It probably detected the missing config file during a reload, and vint.ee went down.

Bringing vint.ee back took some time, and in the end I only managed it thanks to the help of my friend Kaspar.
While working on it, I discovered that the old mindoku.com configuration files were also being read into the nginx config. I removed them so they wouldn't interfere.
But it then became clear that those mindoku config files contained something the server had needed in order to work all these years. It took us over half an hour to figure out what it was and make the corresponding changes in the vint.ee config file. And that's how the server started working again.

Now for something positive:
In recent months, we have received reports that when vint.ee is opened on some computers, a message such as "Unsecured website" appears.
The current theory is that this message was caused by the mindoku.com configuration files, so maybe we solved this old problem with this incident :) We'll see.

All the best,
Marten Meikop