Google Image Search can filter images by re-use licence. How does Google know which licence an image is published under, and how can I declare that licence on my website? Is it possible to declare a licence for each image on a page, or only for the entire page including all referenced images (possibly including pre-fetched content)? And which licences does Google understand and map to its filter?


Solution

I've searched around for a while and have finally found a solution, thanks to Creative Commons. In short, for Google (and other search engines) to know what license the content on a specific page is under, you have to tell it.

This is done the same way you give Google other data such as page relationships: through the HTML structure of the page. In this case, you use the rel attribute of <a> tags. To declare a license for a single page:

<a href="license-url" rel="license">License</a>

Of course you can change the link text to whatever, but the important bit is the rel attribute. The href should point to the license itself.

I don't know how Google knows what license it is, but that's how you declare it, and Google's robots will do the magic for you. In terms of bulk licensing, I dare say you could preprocess pages with PHP (possibly in conjunction with an SQL database) to insert this license tag.
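For the bulk case, here is a rough sketch of what such a preprocessing step could look like. The answer suggests PHP; this uses Python instead to show the same idea. The function name add_license_link and the default license URL are made up for illustration, and the regex-based handling of </body> is a simplification that assumes reasonably ordinary HTML.

```python
import re
from html import escape

# Assumed example license; swap in whichever license URL applies to your pages.
LICENSE_URL = "https://creativecommons.org/licenses/by-sa/3.0/"


def add_license_link(page: str, license_url: str = LICENSE_URL, text: str = "License") -> str:
    """Insert a rel="license" link before the last </body>, unless one already exists."""
    # Skip pages that already declare a license, so reruns do not duplicate the link.
    if re.search(r'rel\s*=\s*["\']license["\']', page, flags=re.IGNORECASE):
        return page
    link = f'<a href="{escape(license_url, quote=True)}" rel="license">{escape(text)}</a>\n'
    closings = list(re.finditer(r"</body\s*>", page, flags=re.IGNORECASE))
    if not closings:
        # No closing body tag found: append the link at the end of the document.
        return page + "\n" + link
    cut = closings[-1].start()
    return page[:cut] + link + page[cut:]
```

Running it twice over the same page is a no-op the second time, so it should be safe to apply to a whole directory of files in a loop.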

Sources:
Creative Commons Licence Chooser;
MicroFormats' RelLicense

You can also have a look at Sitepoint's definition of the rel attribute and its uses.

Hope this helps.

Other tips

In this answer, I am assuming that:

  1. You have a collection of images licensed under, say, a Creative Commons license.
  2. You want image search engines to return your images when the user is filtering for Creative Commons-like images.

Creating metadata HTML pages

I think the best way to attach licensing information to an image is to create a canonical HTML page corresponding to each image--much like how Wikipedia or Flickr does it.

Let's say that we want to license a gallery of images under CC BY-SA 3.0 where every image has a URL of the format https://example.com/img1.jpg.

In that case, we embed the licensing information in HTML pages with URLs that look like https://example.com/img1.jpg.html.

<!DOCTYPE html>
<html>
  <head>
    <meta charset="UTF-8" />
    <title>Viewing img1.jpg</title>
    <meta property="og:image" content="https://example.com/img1.jpg" />
    <link rel="license" href="https://creativecommons.org/licenses/by-sa/3.0/"/>
    <link rel="canonical" href="https://example.com/img1.jpg.html"/>
  </head>
  <body>
    <div>
      <img src="https://example.com/img1.jpg" alt="img1.jpg" />
      <small>
        This image is licensed under a
        <a rel="license" href="https://creativecommons.org/licenses/by-sa/3.0/">Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)</a> license
      </small>
    </div>
  </body>
</html>
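
If you have many images, you probably don't want to write these pages by hand. A minimal sketch of generating them from a template follows. It assumes the .html-suffix URL convention described above; render_metadata_page and PAGE_TEMPLATE are hypothetical names, and the template is a slightly condensed copy of the page shown above.

```python
from html import escape

# Condensed version of the metadata page; placeholders are filled per image.
PAGE_TEMPLATE = """<!DOCTYPE html>
<html>
  <head>
    <meta charset="UTF-8" />
    <title>Viewing {name}</title>
    <meta property="og:image" content="{image_url}" />
    <link rel="license" href="{license_url}"/>
    <link rel="canonical" href="{page_url}"/>
  </head>
  <body>
    <div>
      <img src="{image_url}" alt="{name}" />
      <small>This image is licensed under a <a rel="license" href="{license_url}">{license_name}</a> license</small>
    </div>
  </body>
</html>
"""


def render_metadata_page(image_url: str, license_url: str, license_name: str) -> str:
    """Return the HTML of the metadata page for a single image."""
    fields = {
        "name": image_url.rsplit("/", 1)[-1],  # file name, e.g. img1.jpg
        "image_url": image_url,
        "page_url": image_url + ".html",       # assumed URL convention
        "license_url": license_url,
        "license_name": license_name,
    }
    # Escape every value so it is safe in both attribute and text positions.
    return PAGE_TEMPLATE.format(**{k: escape(v, quote=True) for k, v in fields.items()})
```

You would call this once per image and write the result next to the image file, e.g. as img1.jpg.html.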

Using schema.org metadata

We can also add schema.org ImageObject metadata to our metadata pages, encoded as microdata, RDFa, or JSON-LD. The schema.org markup lets us add further annotations that describe the copyright holder and how to obtain a license to use the image.

Here is an example of a JSON-LD document that you can put inside the <head> tags.

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "ImageObject",
  "author": "Bob Smith",
  "copyrightHolder": "Bob Smith's employer",
  "copyrightYear": 2021,
  "contentUrl": "https://example.com/img1.jpg",
  "license": "https://creativecommons.org/licenses/by-sa/3.0/",
  "acquireLicensePage": "https://example.com/img1.jpg.html"
}
</script>
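
If the pages are generated by a script, the JSON-LD block can be built from a dictionary rather than typed by hand. This is only a sketch: image_jsonld is a hypothetical helper, and the fields simply mirror the example above.

```python
import json


def image_jsonld(image_url: str, license_url: str, page_url: str,
                 author: str, copyright_holder: str, year: int) -> str:
    """Return a <script> element holding schema.org ImageObject metadata."""
    data = {
        "@context": "https://schema.org",
        "@type": "ImageObject",
        "author": author,
        "copyrightHolder": copyright_holder,
        "copyrightYear": year,
        "contentUrl": image_url,
        "license": license_url,
        "acquireLicensePage": page_url,
    }
    body = json.dumps(data, indent=2)
    # "\/" is a valid JSON escape for "/"; this keeps a value containing
    # "</script>" from closing the script element early.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'
```

Letting json.dumps handle the quoting avoids broken markup when an author or copyright holder name contains quotes or other special characters.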

Exposing your metadata HTML pages to search engines

Finally, you should make it easy for crawlers to find these HTML pages. You can organically pepper in links to these HTML pages whenever you embed one of your images. Alternatively, you could just list all of your HTML pages with the Sitemap Protocol.
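
For the sitemap route, a minimal sketch of producing the XML with Python's standard library is below. build_sitemap is a hypothetical helper, and it only emits the required <loc> element for each page.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemap Protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(page_urls) -> str:
    """Return a sitemap XML document listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in page_urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # special characters are escaped on output
    # Write the result to disk as UTF-8 so it matches the declaration.
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + ET.tostring(urlset, encoding="unicode")
```

You would pass it the list of metadata page URLs (e.g. https://example.com/img1.jpg.html), save the output as sitemap.xml, and point crawlers at it, for instance from robots.txt.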

Licensed under: CC-BY-SA with attribution