Legal talk:Data retention guidelines

From Wikimedia Foundation Governance Wiki


Added exception for page views investigation

The Privacy team has temporarily extended the retention period for two datasets so that the Data Engineering team can investigate the impact of a technical issue with data collection. Between June 4, 2021 and January 27, 2022, some of the Foundation’s caching nodes stopped collecting web traffic data (see the Phabricator task for more details). This resulted in data loss for web requests and the derived pageviews, which affects the Foundation’s ability to report accurately on Wikimedia pageviews and fundraising banner impressions.

The Data Engineering team required a temporary extension to the usual 90-day retention period in order to better estimate what data was not collected and which projects and geographies were most affected. The wmf.pageview_actor dataset is being used to estimate the data loss for pageviews, and the wmf.webrequest dataset is being used to estimate the data loss for fundraising banners. Information from both datasets is required because webrequest data for visited banners is not reported as pageviews. Deletion of these datasets was paused on February 16, 2022 and will resume by March 18, 2022.

If you have questions or concerns, please reach out. If you are interested in a meeting to discuss this exception and investigation, please sign up below and we will contact you with details. MMoss (WMF) (talk) 19:51, 11 March 2022 (UTC)