Legal talk:Notices received from search engines

From Wikimedia Foundation Governance Wiki

Hungarian WP

Hungarian WP is listed in #Projects affected, number of notices received, and number of links removed but the notice is not available. Is that a mistake or is it intentionally non-public? --Tgr (WMF) (talk) 05:52, 2 March 2016 (UTC)

Now, there is an image, but the image shows a notice for a page on the Occitan Wikipedia, not the Hungarian Wikipedia. Samat (talk) 09:05, 25 February 2018 (UTC)

This page hasn't been updated in three years

Why has the Wikimedia Foundation stopped updating this page? Has its position on content removal notices changed? The latest Wikimedia Transparency Report, from late 2021 (there were none in 2022), included this passage linking here: "We have a dedicated page where we post notices of delisted project pages that we have received from the search engines who provide such information as part of their own commitments to transparency." However, the latest information displayed on the page is from the second half of 2019, not 2021, let alone 2022. Has Wikimedia received no requests in the last three years, or is it no longer willing to publish them? And why were there no Transparency Reports last year? --Voello (talk) 02:22, 5 January 2023 (UTC)

Postings flowing again, but with missing URLs

The posting of notices to this page appears to have resumed after quite a long silence (see above), but something has gone awry with the latest uploads.

Reports for English Wikipedia through File:En72RTBF.pdf (18 May 2024) included the "Here are the affected URL(s)" section of the notice, making the affected page visible. (Although as a screenshot PDF, it's still annoyingly not possible to simply copy-paste the text, like it would be on the original notice page.)

However, starting with File:En73RTBF.pdf and as far as I can tell including every capture since, the affected URL(s) section is not visible in the captured notice, making it uselessly generic. It's of no help to anyone to simply see that some notice was sent, without the affected pages being documented as well.

I see the same problem with postings for other wikis. (See, for example, File:Fr54RTBF.pdf and File:De65RTBF.pdf.)

The files (good and bad) were all uploaded by KFrancis (WMF), so my apologies for singling you out with a ping on this. FeRDNYC (talk) 14:23, 6 July 2024 (UTC)

Also, if I might make two suggestions:
  1. While I understand the need to archive these files in a static form like a PDF, there are a number of good tools for doing so without converting the page to an image, so that embedded text remains readable and copy/paste-friendly, and all of the content is captured (not just whatever's visible on the screen at the moment).
    • For Google Chrome, there are several extensions in the Chrome Store that provide page-to-PDF functionality. A number of those also exist for Firefox. I couldn't possibly speculate on Safari, Edge, or any other terrible browsers.
    • But an extension might not even be needed. Most operating systems provide (or can have installed) a print driver that saves to PDF. Using that, a page can be archived via the browser's "Print..." functionality. (Again in Chrome, it's necessary to bypass the Chrome built-in print tools by selecting "Print using system dialog" to bring up the native print interface, but printing even secure pages to PDF using that method is typically very effective.)
  2. Issues of notice archiving and included data aside, rather than sequentially numbering the notices ("En##RTBF.pdf", "De##RTBF.pdf"), it might be more helpful to save them with the date of the notice in the filename, instead.

    For File:En84RTBF.pdf (the latest English notice), an alternate name might be EnRTBF-20240626.pdf, for example. Saved that way, the filenames would show when each notice was received, and just reading through the list(s) of files would give an at-a-glance impression of how frequently a given wiki's pages are subjected to search-engine removal.

    In the unlikely event that more than one notice is received on the same day for a given wiki (though it doesn't seem as though they're coming fast enough for that to be a concern), a _1, _2, etc. could be appended to the date portion of the filename.
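
    To make the naming suggestion concrete, here is a minimal Python sketch of the scheme described above. The function name `notice_filename` and its parameters are hypothetical, chosen only for illustration, and it assumes the first notice of a day gets no suffix while later ones get _1, _2, and so on. It is not a description of how the files are actually produced or uploaded.

```python
from datetime import date


def notice_filename(wiki: str, received: date, existing: set[str]) -> str:
    """Build a date-based name such as 'EnRTBF-20240626.pdf'.

    `existing` is the set of filenames already in use (hypothetical input).
    Assumption: the first notice of a day has no suffix; later notices on
    the same day get '_1', '_2', ... appended to the date portion.
    """
    base = f"{wiki.capitalize()}RTBF-{received:%Y%m%d}"
    candidate = f"{base}.pdf"
    n = 1
    # Keep adding a numeric suffix until the name is unused.
    while candidate in existing:
        candidate = f"{base}_{n}.pdf"
        n += 1
    return candidate


if __name__ == "__main__":
    used = {"EnRTBF-20240626.pdf"}
    print(notice_filename("en", date(2024, 6, 26), set()))
    print(notice_filename("en", date(2024, 6, 26), used))
```

    Because the date is written as YYYYMMDD, sorting the filenames alphabetically also sorts each wiki's notices chronologically.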

Regarding #1, I'd be happy to provide whatever help I can in working out a solution for archiving these notices in a more complete and accessible format, while still ensuring that it meets all of Legal's criteria for permanence and immutability. FeRDNYC (talk) 14:41, 6 July 2024 (UTC)
Hello! Thank you for your message. For a bit of background, these notices were consistently posted until there was a glitch where Legal stopped receiving the notices around mid-2019 (hence the large backlog). The glitch has since been fixed and I've been working on getting the missed notices posted. The new notices, from this calendar year (2024) and the end of 2023, have strangely been missing the URL of the takedown notices (the only information included was the language of the project and some boilerplate language explaining why the takedown URL was not listed). We don't have any visibility into Google's internal policies, but it seems reasonable to assume they have an internal policy where some takedown URLs can't be listed. I understand this is frustrating, as the notices posted recently are not all that helpful. There are a few notices received this week, however, that DO list the takedown URL. Those will be posted in order, and the backlog is long.
If there is a better/more helpful way to post the takedown notices, I would be very happy to work with you to come up with a more useful solution. KFrancis (WMF) (talk) 00:12, 13 July 2024 (UTC)