
Wikimedia Foundation Data Retention Guidelines

From Wikimedia Foundation Governance Wiki

Introduction

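Data is important. It is one of the ways we learn and grow as an organization and as a movement, and it is how we help make the projects better for those who use them to create, learn, and share. At the same time, we are committed to "keep your personal information for the shortest possible time that is consistent with the maintenance, understanding, and improvement of the Wikimedia Sites, and our obligations under applicable U.S. law" (quoting the Wikimedia Foundation Privacy Policy).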

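This document helps explain how we fulfill that commitment by describing the guidelines we use for data retention, system design, and ongoing review and maintenance. These guidelines are meant to be a living document: they will be updated as needed to reflect current retention practices.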

What data is covered by these guidelines?

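These guidelines apply to all non-public data we collect from the Wikimedia Sites covered by the Privacy Policy and the Non-Wiki Privacy Policy. Our Donor Privacy Policy includes separate data retention guidelines that apply to donor information.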

How long do we retain non-public data?

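Except where a third party requires otherwise or force majeure applies, we define data retention periods, as appropriate, according to the following table: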

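Data type | Source | Examples | Maximum retention period
Non-public personal information | Collected automatically from a user
  • IP addresses of site visitors (operational data)
  • IP addresses from A/B testing (analytics data)
  • Identifying user-agent information of site visitors
After at most 90 days, it will be deleted, aggregated, or de-identified
Account settings
  • Email address
Until the user deletes or changes the account setting
Non-personal information | Collected automatically from a user | Indefinitely
After at most 90 days, it will be deleted, aggregated, or de-identified
Provided by a user
  • Logs of terms entered into the site search box, or of terms in pre-filled search engine links that the user follows
After at most 90 days, it will be deleted, aggregated, or de-identified
Provided by a user
  • Language
Until the user deletes or changes the account setting
Non-personal information[T 1] | Collected automatically from a variety of users | Indefinitely
Articles viewed by a reader | Collected automatically from readers
  • List of articles visited by a reader
After at most 90 days; if retained at all, then only in aggregate form.
  1. For the purposes of this table, a user account means a username, user ID, or IP address; a reader means a visitor to the Wikimedia projects.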
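As an illustration only, the 90-day maximum in the table above could be checked with a sketch like the following. The record layout, the `collected_at` field, and the function names are hypothetical and are not taken from any Wikimedia system; the sketch simply flags records whose age has reached the limit so that they can be deleted, aggregated, or de-identified.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a single 90-day limit, mirroring the table above.
MAX_RETENTION = timedelta(days=90)


def is_due(collected_at, now, limit=MAX_RETENTION):
    """Return True once a record's age has reached the retention limit."""
    return now - collected_at >= limit


def records_due_for_action(records, now):
    """Select records that should be deleted, aggregated, or de-identified.

    Each record is a hypothetical dict with a timezone-aware
    'collected_at' datetime.
    """
    return [r for r in records if is_due(r["collected_at"], now)]


if __name__ == "__main__":
    now = datetime(2024, 4, 1, tzinfo=timezone.utc)
    sample = [
        {"id": 1, "collected_at": datetime(2024, 1, 2, tzinfo=timezone.utc)},
        {"id": 2, "collected_at": datetime(2024, 3, 1, tzinfo=timezone.utc)},
    ]
    for record in records_due_for_action(sample, now):
        print("due for action:", record["id"])
```

Treating a record as due at exactly 90 days (rather than only after) is a conservative reading of "at most 90 days"; a real pipeline would also need per-type limits and the exceptions described later on this page.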

How long do we retain public data?

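Wikimedia hosts Wikipedia and related projects as part of our mission to collect, document, and freely distribute the sum of human knowledge. As a result, when you contribute to any Wikimedia Site (including user or discussion pages), you create a permanent, public record of every piece of content you add, remove, or change. The page history will show when your contribution or deletion was made, as well as your username (if you are logged in) or your IP address (if you are not logged in). We may use your public contributions, either combined with the public contributions of others or individually, to create new features or data-related products for you, or to learn more about how the Wikimedia Sites are used. If you have mistakenly included your personal information in a contribution to a Wikimedia Site and would like it removed, please consult the community's oversight policy. Please remember that the transparency and integrity of our sites' revision histories are critical to our mission, and the Foundation supports our community's right to decline oversight requests in order to protect the projects.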

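If you choose to register an account on a Wikimedia project, you will be asked to choose a username. The username will be retained until the user requests that the account be renamed or goes through the community's vanishing process.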

Please see our Privacy Policy for more information.

Definitions

For the purposes of these guidelines:

  • "Personal information" means information you provide us or information we collect from you that identifies or could be used to personally identify you. For details, please see the Wikimedia Foundation Privacy Policy and Non-Wiki Privacy Policy.
  • Some examples of "public information" would include:
    • (a) your IP address, if you edit without logging in;
    • (b) your gender, if it is disclosed under your user profile;
    • (c) any personal information you disclose publicly on the Wikimedia Sites, such as your real name or age.
  • Some examples of types of information that are considered to be "nonpublic information" include:
    • (a) your IP address, if you edit while logged in;
    • (b) your email address, if you provided one to us during account registration (but did not post it publicly); and
    • (c) your general location information as might be derived from your IP address, if you have not posted it publicly. The types of information that are considered "nonpublic" as opposed to "public" are more fully explained in our Privacy Policy.
  • Data is "de-identified" when it has been aggregated or otherwise retained in a manner such that it can no longer be used to identify the user.
  • Data is "aggregated" when the data associated with a specific user has been combined with data from others to show general trends or values without identifying specific users.

An example of how data may be aggregated:

Using ranges rather than specific numbers, such as recording that there are "between 1 and 10 editors in language X in country Y" rather than recording that there are 4 editors.
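The range-based aggregation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the bucket size of 10, and the `language` and `country` fields are hypothetical and do not describe how the Foundation actually aggregates data.

```python
from collections import Counter


def count_to_range(count, bucket_size=10):
    """Map an exact count to a coarse range label, e.g. 4 -> 'between 1 and 10'."""
    if count < 0:
        raise ValueError("count must be non-negative")
    if count == 0:
        return "0"
    # Buckets are 1..10, 11..20, and so on for bucket_size=10.
    lower = ((count - 1) // bucket_size) * bucket_size + 1
    upper = lower + bucket_size - 1
    return f"between {lower} and {upper}"


def aggregate_editors(rows, bucket_size=10):
    """Count editors per (language, country) and report only ranges.

    Each row is a hypothetical dict with 'language' and 'country' keys;
    no per-editor fields are carried into the result.
    """
    counts = Counter((row["language"], row["country"]) for row in rows)
    return {key: count_to_range(n, bucket_size) for key, n in counts.items()}


if __name__ == "__main__":
    rows = [{"language": "X", "country": "Y"} for _ in range(4)]
    print(aggregate_editors(rows))
```

Whether a given bucket size is coarse enough to prevent identification depends on the dataset, so the value of 10 here is only a placeholder.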

Terms that are not defined in this document have the same meaning given to them in the Privacy Policy.

Exceptions to these guidelines

If we make exceptions to these guidelines, we will notify the community by describing the exception on this page.

  • Data may be retained in system backups for longer periods, but for no more than 5 years.
  • When we conduct a survey or other research, we will provide you with a privacy statement specifying the term of retention for information (including personal information) collected through your participation in such research. In certain cases, information may be retained indefinitely for educational, development, or other related purposes, unless otherwise indicated in the relevant privacy statement. Such information may be retained in raw, aggregated, or de-identified form until we receive a request from the participant to delete the information.
  • Research related to COVID-19: The Wikimedia Foundation Research team is conducting research regarding COVID-19 and its impact on Wikipedia. Retaining de-identified readership data from COVID-19 related articles will enable us to better understand how to prioritize content creation, to understand what happens to readership when there is a "shock to the system", and to empower the research community to answer such questions. By "COVID-19 related articles", we mean articles that link to the COVID-19, SARS-CoV-2 and 2019-2020 COVID-19 pandemic Wikidata items. For comparison purposes, we will retain data from a small number of articles unrelated to COVID-19 as well. In order to collect sufficient data, and obtain a picture of readership as time passes, we will be retaining this de-identified data beyond the 90-day retention limit, for a period of one year, ending on March 1, 2021. (Note that this includes a one-month extension due to staffing changes, in order to allow for the project's completion.) For technical details about the sampling and de-identification process, please see the project page on GitHub.
  • Editing research: There is a short-term extension applying to data collected as part of experimental features to improve replying on talk pages. In order to collect and analyze sufficient data, this data must be kept beyond the standard 90-day period. The retained data will be deleted, aggregated, or de-identified within 180 days.
  • Campaign landing pages: for certain events, campaigns, or marketing channels, users may create accounts on special landing pages. After creating their account on those pages, the association between their account and its source may be retained indefinitely, both to provide a good user experience for that account and for longitudinal analysis on campaign effectiveness. For more information, contact mmiller@wikimedia.org.
  • CampaignEvents extension: An exception exists for data collected by the CampaignEvents extension. The extension collects the global user IDs of event organizers and event participants, as well as which events users organized or attended and when participants registered for an event. In order for the extension features to work consistently, data collected by the CampaignEvents extension may be retained indefinitely.
  • Sound logo contest: There is a short-term extension applying to data collected as part of contest entries to allow the brand studios team to evaluate entries in preparation for announcing the winner in February 2023. The retained data will be deleted, aggregated or de-identified within 90 days after the winner is announced.
  • Webrequest datasets: There is a short, one-time extension for data from the wmf.webrequest and wmf.pageview_actor datasets. This data needs to be retained longer than usual while we correct an error in the way unique devices are calculated from the dataset. Accurate unique device statistics are necessary for engineering purposes and legal reporting requirements. The underlying data used to calculate these statistics will be retained for an extra 30 days beyond the ordinary 90-day deletion period. After 30 days, the affected data will be purged and retention settings will reset back to 90 days.
  • In rare cases, we, or particular users with certain administrative rights as described in our Privacy Policy, may need to retain your personal information, including your IP address and user agent information, for as long as reasonably necessary (which may be longer than the period described in the table above) to:
    • enforce or investigate potential violations of our Terms of Use, this Privacy Policy, or any Foundation or user community-based policies;
    • investigate and defend ourselves against legal threats or actions;
    • help protect against vandalism and abuse, fight harassment of other users, and generally try to minimize disruptive behavior on the Wikimedia Sites;
    • prevent imminent and serious bodily harm or death to a person, or to protect our organization, employees, contractors, users, or the public; or
    • detect, prevent, or otherwise assess and address potential spam, malware, fraud, abuse, unlawful activity, and security or technical concerns.

Audits and improvements

The Foundation is committed to continuous evaluation and improvement of these guidelines, and to periodic audits in order to identify such improvements. As we make changes to existing tools and systems, we will update these guidelines to reflect our changing practices.

Design of new systems

In order to support these data retention periods and our overall privacy policy, new tools and systems implemented by the Foundation will be designed with privacy in mind. This will include:

  • inclusion of these data retention guidelines as requirements during the design process;
  • legal consultation during the design and development process; and
  • inclusion of privacy considerations in the code review process.

Ongoing handling of new information

Despite our best efforts in designing and deploying new systems, we may occasionally record personal information in a way that does not comply with these guidelines. When we discover such an oversight, we will promptly comply with the guidelines by deleting, aggregating, or de-identifying the information as appropriate.

Contact us

If you think that these guidelines have potentially been breached, or if you have questions or comments about compliance with the guidelines, please contact us at privacy@wikimedia.org.
