We continue to see latency for Podio this week
Incident Report for Podio
Postmortem

At the beginning of 2019, we noticed a significant increase in load during the early hours of the European day. This resulted in intermittently degraded performance with increased latency, webhook delays, and elevated error rates. The immediate customer effect was stalled webhooks and, on some occasions during the week, delays for customers using workflow automation.

In February, we implemented an improvement to our webhooks queue system to reduce stalls and delays for webhooks and Globiflow Advanced Workflows. Since implementing the new queue system, we have seen generally better performance of webhooks and workflows.

In March we identified the main cause of the load issues: specific updates in very large apps holding more than 500k items in the same app. These apps are heavy for the system, as a single update to an app field can result in millions of updates across calculations, search indexing, and workflow automation. We also noticed that calling the API to get data views and filtered views from these large apps could sometimes cause a bottleneck if the app was being updated at the same time, either manually by users or by workflow automation. We have worked with our API consumers and the customers running these large apps to move the heavy calls to periods when the apps are less active and not being updated at the same time.

As a result of these learnings, we will work on a larger project to refactor some of these heavy API calls, so that calls are better protected when an app is being actively used while it is served from the API. In the meantime, we ask our customers and partners to review their API usage and avoid updating data via the API while the same data is being updated in the UI or by automated workflows.
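One way an API consumer might follow this advice is to hold heavy calls until a quieter period. The sketch below is a minimal illustration, not Podio tooling: the busy window, the function names, and the use of UTC are all assumptions, to be replaced with your own observations of when your apps are being updated.

```python
from datetime import datetime, time, timezone

# Hypothetical busy window in UTC; adjust to your own observations.
# Assumes the window does not cross midnight.
BUSY_START = time(6, 0)
BUSY_END = time(11, 0)


def is_off_peak(now: datetime,
                busy_start: time = BUSY_START,
                busy_end: time = BUSY_END) -> bool:
    """Return True when `now` (timezone-aware) falls outside the busy window."""
    current = now.astimezone(timezone.utc).time()
    return not (busy_start <= current < busy_end)


def seconds_until_off_peak(now: datetime,
                           busy_start: time = BUSY_START,
                           busy_end: time = BUSY_END) -> float:
    """Return how long to wait before a heavy call may run (0 if off-peak)."""
    if is_off_peak(now, busy_start, busy_end):
        return 0.0
    utc_now = now.astimezone(timezone.utc)
    window_end = datetime.combine(utc_now.date(), busy_end, tzinfo=timezone.utc)
    return (window_end - utc_now).total_seconds()
```

A client could wait for `seconds_until_off_peak(...)` before requesting a filtered view from a very large app, rather than issuing the call while users or workflows are still updating it.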

We will also consider implementing a maximum limit of 500k items per app in Podio in the future. Podio is a project management and work management tool and was never built to serve as a database with 500k+ items in the same app. For that purpose, we recommend that customers use a real database and only bring into Podio the active items that require collaboration and team coordination.
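As an illustration of keeping only active items in Podio, the sketch below splits a set of records into those worth keeping in an app and those that could live in an external database. The `Item` type, the `last_activity` field, and the one-year cutoff are hypothetical; the 500k figure is the limit under consideration above, not an enforced one.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Tuple

# Limit under consideration in this report; not currently enforced.
ITEM_LIMIT = 500_000


@dataclass
class Item:
    item_id: int             # hypothetical identifier
    last_activity: datetime  # hypothetical "last touched" timestamp


def split_active(items: List[Item], now: datetime,
                 max_age_days: int = 365) -> Tuple[List[Item], List[Item]]:
    """Split items into (active, archive) by their last activity."""
    cutoff = now - timedelta(days=max_age_days)
    active = [item for item in items if item.last_activity >= cutoff]
    archive = [item for item in items if item.last_activity < cutoff]
    return active, archive


def exceeds_limit(item_count: int, limit: int = ITEM_LIMIT) -> bool:
    """Return True when an app holds more items than the considered limit."""
    return item_count > limit
```

The cutoff is a design choice for the sketch only; any rule that separates items still needing team coordination from historical records would serve the same purpose.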

We will share further communication before introducing a hard limit on items per app. In the meantime, please review our known limitations here:

https://help.podio.com/hc/en-us/articles/115000424752-Known-limitations-in-Podio

In the same period, the growth of very large apps on the platform also affected general Search performance, which resulted in 2 weeks of elevated search errors. We resolved this issue with a larger maintenance of the Search infrastructure on April 1-2.

Finally, in order to cope with the general increase in load, we ran database maintenance in production that culminated on April 13th. This unavoidable activity resulted in a few issues related to item availability and increased response times for customers. We now expect a general increase in stability in the coming months.

We are also taking long-term actions to secure a scalable path, so that we can better cope with unexpected load issues in the future.

Again, we are deeply sorry for the issues this caused to your business, and thank you for your continued support.

Posted 4 months ago. Apr 23, 2019 - 09:02 EDT

Resolved
We have confirmed improved performance since the completion of our database maintenance event and the follow-up work conducted Monday. We believe you should see a general improvement in stability for Podio going forward. We will also share a more detailed post-mortem on this long-running incident leading up to the maintenance.
Posted 4 months ago. Apr 18, 2019 - 14:17 EDT
Update
We saw elevated response times today between 8:45 and 9:00 AM CET. The system is back to normal now and the team is monitoring it closely.
Posted 4 months ago. Apr 16, 2019 - 04:19 EDT
Update
We believe we have a clearer understanding of the cause of the issues this morning CET: they were unfortunately an after-effect of our database maintenance over the weekend. We have made updates today, which we hope will further improve the experience tomorrow. We will keep the incident open until this is confirmed, and we will offer a more detailed post-mortem later this week for the general latency, error rates, and search issues experienced on Podio over the last few weeks.
Posted 4 months ago. Apr 15, 2019 - 15:58 EDT
Update
Systems have been back to normal since 10:15 am CET (5:15 am EST). Though we have already made fixes, we will continue to monitor throughout the high load hours today and post an update on the root cause after that.
Posted 4 months ago. Apr 15, 2019 - 06:30 EDT
Update
We continue to monitor the issue and investigate the root cause.
Posted 4 months ago. Apr 15, 2019 - 05:25 EDT
Update
We have made changes to the roll-back plan to optimize performance. However, we continue to see performance degradation. Our hypothesis is that the changes made to the database are taking time to warm up as new queries hit our database in production. We will continue to share updates as we learn more and confirm the root cause.
Posted 4 months ago. Apr 15, 2019 - 04:28 EDT
Update
We continue to investigate the current issue and work on getting the system back to normal.
Posted 4 months ago. Apr 15, 2019 - 03:57 EDT
Update
The team has identified issues stemming from the maintenance work conducted this weekend. We implemented a roll-back and clean-up plan, which is causing unexpected issues in production. We are working on improvements on our end.
Posted 4 months ago. Apr 15, 2019 - 03:04 EDT
Update
The load on Podio is increasing and leading to delays. The team is working on fixing the issue. Thank you for your patience.
Posted 4 months ago. Apr 15, 2019 - 02:52 EDT
Update
Podio will see delays for the next few minutes. The team is taking a follow-up action to the database maintenance performed over the weekend, before the high-load hours begin for Podio.
Posted 4 months ago. Apr 15, 2019 - 02:36 EDT
Monitoring
We have successfully completed maintenance today and will be monitoring over the next few days to confirm the fix during high-load hours. We are also conducting additional clean-up in the backend over the next week.

Thank you so much for everyone's patience throughout this maintenance period. We are working to ensure we can better predict and forecast platform growth in the future, in order to more proactively handle load issues before they become a problem for our customers.
Posted 4 months ago. Apr 13, 2019 - 16:43 EDT
Update
We have confirmed the time for a bigger maintenance event on Saturday the 13th this weekend. Please follow the scheduled maintenance here for more details: https://status.podio.com/incidents/txrw9ngr6ytf

We will be undergoing a scheduled maintenance to implement database enhancements to improve the general stability and reliability. We expect the maintenance to be completed within 1 hour - Podio will not be available during this time.

We expect Podio to continue to experience short periods of latency this week until the maintenance has been completed. We expect improved performance after completion of the maintenance.

Thanks for your patience.
Posted 5 months ago. Apr 08, 2019 - 07:56 EDT
Update
We have been preparing for a larger maintenance this weekend in order to resolve the general performance issues you have seen on Podio over the last few weeks. Unfortunately, the preparatory work in our backend is not processing fast enough for us to be ready this weekend. We will therefore schedule the maintenance for the weekend of April 13-14. A scheduled maintenance notification will be sent later today, confirming the exact date and time. We expect the maintenance to take 1 hour, with Podio offline.

We currently do not see issues with newly created items being unavailable; however, there is a risk of this returning as load increases on the platform later today. Please be aware that the following issue may occur again today:

Newly created items on Podio may return errors when accessed.

We are currently conducting database maintenance of Podio in the backend whenever load on the platform is low, while Podio is still online and available. This will increase pressure on the database over the next week.

The background maintenance is preparation for a larger maintenance we are planning for the weekend of April 13-14. On some occasions, the background maintenance can mean that newly created items return an error when accessed within the first few minutes; however, the items will remain on the platform and there is no risk of data loss.

The team is working to reduce the impact of this as much as possible, in order to make newly created items available faster.

We will continue to keep you updated here.
Posted 5 months ago. Apr 05, 2019 - 07:17 EDT
Update
We currently do not see issues with newly created items being unavailable; however, there is a risk of this returning as load increases on the platform later today. Please be aware that the following issue may occur again today:

Newly created items on Podio may return errors when accessed.

We are currently conducting database maintenance of Podio in the backend whenever load on the platform is low, while Podio is still online and available. This will increase pressure on the database over the next 1-2 weeks. The background maintenance is preparation for a larger maintenance we are planning for later in April. On some occasions, the background maintenance can mean that newly created items return an error when accessed within the first few minutes; however, the items will remain on the platform and there is no risk of data loss.

The team is working to reduce the impact of this as much as possible, in order to make newly created items available faster.

We will continue to keep you updated here. We hope to share an update by tomorrow, April 4th, on when we will be able to conduct the planned maintenance event that we believe will improve performance long term.
Posted 5 months ago. Apr 03, 2019 - 22:52 EDT
Update
We are continuing to work on a fix for this issue.
Posted 5 months ago. Apr 03, 2019 - 10:38 EDT
Update
We have seen improved Search performance after the fixes yesterday. Unfortunately, we still see issues with new item creation on Podio: accessing newly created items causes errors.

We are currently conducting database maintenance of Podio in the backend whenever load on the platform is low, while Podio is still online and available. This will increase pressure on the database over the next 1-2 weeks. The background maintenance is preparation for a larger maintenance we are planning for later in April. On some occasions, the background maintenance can mean that newly created items return an error when accessed within the first few minutes; however, the items will remain on the platform and there is no risk of data loss.

The team is working to reduce the impact of this as much as possible, in order to make newly created items available faster.

We will continue to keep you updated here.
Posted 5 months ago. Apr 03, 2019 - 08:56 EDT
Update
New search capacity has been added today and gradually rolled out to the full user base. We expect improved Search performance for the rest of today and going forward.

Please let us know in Support if you still experience increased errors with search after 11:30 EST (17:30 CET) today.

We are currently conducting database maintenance of Podio in the backend whenever load on the platform is low, while Podio is still online and available. This will increase pressure on the database over the next 1-2 weeks. The background maintenance is preparation for a larger maintenance we are planning for later in April. On some occasions, the background maintenance can mean that newly created items return an error when accessed within the first few minutes; however, the items will remain on the platform and there is no risk of data loss.

We will post an update scheduling the larger maintenance for a weekend once the time has been confirmed; the ETA is currently mid-April.

Thank you for your patience.
Posted 5 months ago. Apr 02, 2019 - 11:32 EDT
Update
New search capacity is still in the process of being added. We estimate improved Search performance by tomorrow, and we will keep updating here on when to expect better Search performance.

We are currently conducting database maintenance of Podio in the backend whenever load on the platform is low, while Podio is still online and available. This will increase pressure on the database over the next 1-2 weeks. The background maintenance is preparation for a larger maintenance we are planning for later in April. On some occasions, the background maintenance can mean that newly created items return an error when accessed within the first few minutes; however, the items will remain on the platform and there is no risk of data loss.

We will post an update scheduling the larger maintenance for a weekend once the time has been confirmed; the ETA is currently mid-April.

Furthermore, we are working to increase Search capacity in order to reduce the search errors that are elevated from 10 - 12 am EST. We will update here when Search capacity has been added. The current estimate is April 2-3.

Thank you for your patience.
Posted 5 months ago. Apr 02, 2019 - 07:34 EDT
Update
We are continuing to work on a fix for this issue.
Posted 5 months ago. Apr 01, 2019 - 16:53 EDT
Update
We are currently conducting database maintenance of Podio in the backend whenever load on the platform is low, while Podio is still online and available. This will increase pressure on the database over the next 1-2 weeks. The background maintenance is preparation for a larger maintenance we are planning for later in April. On some occasions, the background maintenance can mean that newly created items return an error when accessed within the first few minutes; however, the items will remain on the platform and there is no risk of data loss.

We will post an update scheduling the larger maintenance for a weekend once the time has been confirmed; the ETA is currently mid-April.

Furthermore, we are working to increase Search capacity in order to reduce the search errors that are elevated from 10 - 12 am EST. We will update here when Search capacity has been added. The current estimate is April 2-3.

Thank you for your patience.
Posted 5 months ago. Apr 01, 2019 - 07:49 EDT
Update
Response times have improved. The team continues to monitor and work on resolving the issue. Thanks for your patience.
Posted 5 months ago. Apr 01, 2019 - 04:31 EDT
Update
We are seeing increased latency right now. The team is investigating in order to fix the issue.
Posted 5 months ago. Apr 01, 2019 - 03:45 EDT
Update
We are aware of the current increase in Search errors from 10 am to 12 am EST, and we are working to scale up our search infrastructure to resolve this. The ETA for improved search performance is sometime next week. Thanks for your patience.
Posted 5 months ago. Mar 27, 2019 - 10:57 EDT
Update
Podio is operational again, and we are truly sorry for the continued errors over the last few weeks. We are working on a long-term plan to address these issues.

We are also aware of increased Search errors on the platform from 10 am to 12 am EST. In parallel, we are working to scale up our search infrastructure to resolve this. The ETA for improved search performance is sometime next week.
Posted 5 months ago. Mar 27, 2019 - 05:54 EDT
Update
We believe we have found the bottleneck, and load is coming down. You should see better response times now.
Posted 5 months ago. Mar 27, 2019 - 05:16 EDT
Update
We are looking into latency on Podio. The team is investigating.
Posted 5 months ago. Mar 27, 2019 - 04:51 EDT
Update
We no longer see errors when accessing newly created items; however, we are monitoring closely for a full resolution. We have also taken extra steps to ensure better performance tomorrow. We will keep this incident open and will update you tomorrow.
Posted 5 months ago. Mar 26, 2019 - 15:12 EDT
Identified
We have seen stable Search performance over the last hour. We will work to further scale up the Search infrastructure to reduce errors in the future.

We still see errors for newly created items on Podio due to database delays on our end, as a result of the performance issues earlier today. We are working to catch up on these delays this afternoon and expect the issue to be solved today. We will update here when this is fully recovered.
Posted 5 months ago. Mar 26, 2019 - 13:49 EDT
Update
We had 5 minutes of increased latency in the app; however, performance has been stable since. Two issues are still ongoing: we currently see an increased level of errors for Search, which the team is looking into, and we are aware that newly created items can show errors when accessed, which is due to a delay in our database. We are working towards long-term solutions for all of them.
Posted 5 months ago. Mar 26, 2019 - 10:59 EDT
Investigating
The team is investigating.
Posted 5 months ago. Mar 26, 2019 - 09:55 EDT
This incident affected: Web and API.