Legal talk:Notices received from search engines

From Wikimedia Foundation Governance Wiki

Hungarian WP

Hungarian WP is listed in #Projects affected, number of notices received, and number of links removed but the notice is not available. Is that a mistake or is it intentionally non-public? --Tgr (WMF) (talk) 05:52, 2 March 2016 (UTC)

Now, there is an image, but the image shows a notice for a page on the Occitan Wikipedia, not the Hungarian Wikipedia. Samat (talk) 09:05, 25 February 2018 (UTC)

This page hasn't been updated in three years

Why has the Wikimedia Foundation stopped updating this page? Has its position on content removal notices changed? The latest Wikimedia Transparency Report from late 2021 (there were none in 2022) included a passage linking here: "We have a dedicated page where we post notices of delisted project pages that we have received from the search engines who provide such information as part of their own commitments to transparency." However, the latest information displayed on the page is from the second half of 2019, not 2021, let alone 2022. Has Wikimedia received no requests in the last three years, or is it no longer willing to publish them? And why were there no Transparency Reports last year? --Voello (talk) 02:22, 5 January 2023 (UTC)

Postings flowing again, but with missing URLs

The posting of notices to this page appears to have resumed after quite a long silence (see above), but something has gone awry with the latest uploads.

Reports for English Wikipedia through File:En72RTBF.pdf (18 May 2024) included the "Here are the affected URL(s)" section of the notice, making the affected page visible. (Although as a screenshot PDF, it's still annoyingly not possible to simply copy-paste the text, like it would be on the original notice page.)

However, starting with File:En73RTBF.pdf and as far as I can tell including every capture since, the affected URL(s) section is not visible in the captured notice, making it uselessly generic. It's of no help to anyone to simply see that some notice was sent, without the affected pages being documented as well.

I see the same problem with postings for other wikis. (See, for example, File:Fr54RTBF.pdf and File:De65RTBF.pdf.)

The files (good and bad) were all uploaded by KFrancis (WMF), so my apologies for singling you out with a ping on this. FeRDNYC (talk) 14:23, 6 July 2024 (UTC)

Also, if I might make two suggestions:
  1. While I understand the need to archive these files in a static form like a PDF, there are a number of good tools for doing so without converting the page to an image, so that embedded text remains readable and copy/paste-friendly, and all of the content is captured (not just whatever's visible on the screen at the moment).
    • For Google Chrome, there are several extensions in the Chrome Store that provide page-to-PDF functionality. A number of those also exist for Firefox. I couldn't possibly speculate on Safari, Edge, or any other terrible browsers.
    • But an extension might not even be needed. Most operating systems provide (or can have installed) a print driver that saves to PDF. Using that, a page can be archived via the browser's "Print..." functionality. (Again in Chrome, it's necessary to bypass the Chrome built-in print tools by selecting "Print using system dialog" to bring up the native print interface, but printing even secure pages to PDF using that method is typically very effective.)
  2. Issues of notice archiving and included data aside, rather than sequentially numbering the notices ("En##RTBF.pdf", "De##RTBF.pdf"), it might be more helpful to save them with the date of the notice in the filename, instead.

    For File:En84RTBF.pdf (the latest English notice), an alternate name might be EnRTBF-20240626.pdf, for example. Saved that way, the filenames would show when each notice was received, and just reading through the list(s) of files would give an at-a-glance impression of how frequently a given wiki's pages are subjected to search-engine removal.

    In the unlikely event that more than one notice is received on the same day for a given wiki (though it doesn't seem as though they're coming fast enough for that to be a concern), a _1, _2, etc. could be appended to the date portion of the filename.
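
As a rough illustration of the naming scheme suggested above, here is a minimal Python sketch. The function name `notice_filename` and the `existing` set of already-used filenames are hypothetical, and it assumes the first notice of a day keeps the plain name while later ones on the same day get _1, _2, and so on. It is not an existing tool or anything Legal currently uses.

```python
from datetime import date


def notice_filename(wiki: str, received: date, existing: set[str]) -> str:
    """Build a date-based name such as 'EnRTBF-20240626.pdf'.

    Assumption: the first notice of a day keeps the plain name; any further
    notice for the same wiki on the same day gets _1, _2, ... appended.
    """
    base = f"{wiki.capitalize()}RTBF-{received:%Y%m%d}"
    candidate = f"{base}.pdf"
    suffix = 0
    # Keep bumping the suffix until the name is not already taken.
    while candidate in existing:
        suffix += 1
        candidate = f"{base}_{suffix}.pdf"
    return candidate


# Example: the English notice dated 26 June 2024, with no earlier file that day.
print(notice_filename("en", date(2024, 6, 26), set()))
```

A side benefit of putting the date in year-month-day order is that sorting the filenames alphabetically also sorts them chronologically.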

Regarding #1, I'd be happy to provide whatever help I can in working out a solution for archiving these notices in a more complete and accessible format, while still ensuring that it meets all of Legal's criteria for permanence and immutability. FeRDNYC (talk) 14:41, 6 July 2024 (UTC)
Hello! Thank you for your message. For a bit of background, these notices were consistently posted until a glitch around mid-2019 caused Legal to stop receiving them (hence the large backlog). The glitch has since been fixed, and I've been working on getting the missed notices posted.
The new notices, from this calendar year (2024) and the end of 2023, have strangely been missing the takedown URLs (the only information included was the language of the project and some boilerplate language explaining why the takedown URL was not listed). We don't have any visibility into Google's internal policies, but it seems reasonable to assume they have an internal policy under which some takedown URLs can't be listed. I understand this is frustrating, as the notices posted recently are not all that helpful.
There are a few notices received this week, however, that DO list the takedown URL. Those will be posted in order, and the backlog is long.
If there is a better/more helpful way to post the takedown notices, I would be very happy to work with you to come up with a more useful solution. KFrancis (WMF) (talk) 00:12, 13 July 2024 (UTC)
@KFrancis (WMF) Thank you for the updates. I've found this article about Google's decision: https://www.seroundtable.com/google-right-to-be-forgotten-takedown-notices-missing-36905.html Pyb en résidence (talk) 09:50, 29 August 2024 (UTC)
@Pyb en résidence Ooh, good find! That does appear to explain the change... though the reasoning behind said change is confounding in a way that only courts of law can really aspire to.
I should apologize, @KFrancis (WMF), for my assumptions in my original message; when I saw that the recent PDFs didn't include URLs, but did have a small blue rectangle at the bottom of the page that looked like the top of the box where the URL appeared in the older messages, I assumed the newer PDFs were simply incomplete (cut off before the URL), and compounded my error by then further assuming it was a process issue on our end. Neither assumption was correct, so my apologies twice over.
Now understanding the situation, in light of that court ruling barring Google from including the URLs in question, the only thing I can't figure out now is why they're bothering to continue (or resume?) sending notices — without the URL, those notices are useless. They contain absolutely no information worth being notified about.
They probably aren't worth archiving, either, since every notice is now identical — "we removed some unspecified pages at your site from the search results for some unspecified search terms". That information is... well, it's completely useless to us, really. (Not to mention, non-actionable.)
Since this is a Legal project page I'm not going to fiddle with the content (IANAL), but it might be worth updating Legal:Notices received from search engines with...
  1. A link, in the opening "For context" section, to either the Guardian article (https://www.theguardian.com/technology/2024/feb/15/google-stops-notifying-publishers-of-right-to-be-forgotten-removals-from-search-results) that @Pyb en résidence's SEO blog article sourced its info from, or a link to the very dry, but very detailed International Association of Privacy Professionals post (https://iapp.org/news/a/swedish-court-rejects-googles-appeal-in-rtbf-case/) that the Guardian quoted and seems to have used as their primary source.
  2. Some changes to the rest of the intro text, since it discusses "the subject of the affected webpage" and editors making changes to "address the referred content", neither of which are applicable if we don't know the URLs removed. The closing paragraph also ends with, "Search engines are not required by law to provide notices, so we appreciate the companies who share our commitment to free speech and transparency. Compelled censorship is unacceptable, but compelled censorship without notice is unforgivable." which feels like it's implicitly discussing notices that include URLs, since with the URLs removed — regardless of the reasons, and whether or not Google had any choice in the matter — it's difficult to characterize the notices as transparent.
FeRDNYC (talk) 04:23, 4 September 2024 (UTC)