-->

Implementing sitemaps in a Django application

Sitemap:

  • A sitemap is a list of a website's pages, collected in a single page that is available to both search engines and users.
  • XML is used to generate the sitemap for a website.
  • A sitemap tells a search engine how frequently the content of a page changes and which priority to follow while indexing the pages.

Rules of a sitemap:

  • A sitemap can have at most 50,000 URLs.
  • A sitemap file should not be larger than 50 MB uncompressed (the limit was 10 MB in earlier versions of the sitemap protocol).
  • If you have more than 50,000 URLs, you can categorize the URLs and create a sub sitemap for each category.
  • Use a sitemap index to hold all the sub sitemaps of a website.
  • Submit the sitemap index to the search engine bots. They will automatically crawl all the web pages of the website.
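
The splitting rule can be sketched in plain Python, without Django. This is only an illustration of the idea: the domain example.com, the file naming scheme and the two helper functions are assumptions made for the sketch, not part of any library.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000  # limit taken from the rules above


def split_into_sitemaps(urls, limit=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into chunks that each fit in one sitemap file."""
    return [urls[i:i + limit] for i in range(0, len(urls), limit)]


def build_sitemap_index(sitemap_urls):
    """Return sitemap index XML with one <sitemap> entry per sub sitemap."""
    entries = "".join(
        f"  <sitemap><loc>{escape(url)}</loc></sitemap>\n" for url in sitemap_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}"
        "</sitemapindex>\n"
    )


# Hypothetical site with 120,000 product pages.
urls = [f"https://example.com/products/{n}/" for n in range(120_000)]
chunks = split_into_sitemaps(urls)

# One sub sitemap file per chunk, all listed in a single index.
index_xml = build_sitemap_index(
    f"https://example.com/sitemap-products-{page}.xml"
    for page in range(1, len(chunks) + 1)
)
```

With 120,000 URLs this produces three sub sitemaps, and the index is the only file that has to be submitted to the search engine.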

Sitemaps framework in Django

  • It automates the creation of the sitemap XML.
  • We can classify the URLs of a website as static or dynamic.

Configuring the sitemaps framework

  • Add 'django.contrib.sitemaps' to the INSTALLED_APPS setting.
  • Make sure the TEMPLATES setting contains 'APP_DIRS': True.

The sitemaps framework provides two classes: Sitemap and GenericSitemap.
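
A settings fragment along these lines would cover the two points above. It is a minimal sketch showing only the relevant entries; the 'django.contrib.sites' entry and SITE_ID are not mentioned in the text above and are included on the assumption that the domain should come from Django's sites framework.

```python
# settings.py (fragment): only the entries relevant to sitemaps are shown.
INSTALLED_APPS = [
    "django.contrib.contenttypes",
    "django.contrib.sites",      # assumption: domain is read from the sites framework
    "django.contrib.sitemaps",
]

SITE_ID = 1  # used by django.contrib.sites

TEMPLATES = [
    {
        "BACKEND": "django.template.backends.django.DjangoTemplates",
        "DIRS": [],
        "APP_DIRS": True,  # lets Django find the sitemap templates shipped with the app
        "OPTIONS": {},
    },
]
```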

Sitemap Class:

  • It has the following attributes and methods, which help in creating the sitemap.
  • items:

    • It is a method and it returns a list of objects.
    • Every object returned from this method is passed to the location(), lastmod(), changefreq() and priority() methods.
  • location:

    • It is either an attribute or a method.
    • If it is a method, it takes an item/object as an argument.
    • It is responsible for returning the absolute path.
    • The absolute path should not contain the protocol (http, https) or the domain name.
    • Example: "/contact-us/", "/country/india/"
    • If you do not override it, it calls "obj.get_absolute_url()" by default.
    • If the object doesn't have get_absolute_url, it raises an error.
  • lastmod:

    • It is either an attribute or a method.
    • If it is a method, it takes an item/object as an argument.
    • It is responsible for returning a date or datetime object.
    • It returns None by default.
    • It tells a search engine when the page was last modified.
     
  • changefreq:

    • It is either an attribute or a method.
    • If it is a method, it takes an item/object as an argument.
    • It tells search engines how frequently a web page changes.
    • Possible values are "always", "hourly", "daily", "weekly", "monthly", "yearly", "never"
  • priority:

    • It is the priority of this URL relative to other URLs on your site.
    • Its valid values range from 0.0 to 1.0.
    • It only lets the search engines know which pages you consider most important for the crawlers.
    • The default priority of a page is 0.5.
       
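  • protocol:

    • It is an attribute. It specifies which protocol to use when generating the full URL.
    • Possible values are "http" and "https".
    • If it is not set, the protocol with which the sitemap was requested is used; a fixed default ("http" in older Django versions) applies only when the sitemap is built outside a request.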
  • i18n:

    • It is a boolean attribute.
    • If it is True, the URLs are generated with a language prefix for each language in the LANGUAGES setting.
    • By default it is set to False.

GenericSitemap:

  • It inherits from the Sitemap class, so it gets all the properties mentioned above.
  • It takes the arguments "info_dict", "priority", "changefreq" and "protocol".
  • info_dict contains two keys: "queryset" and "date_field".
  • We don't need to write extra Sitemap classes; we can use GenericSitemap instead.
Now, it's coding time.
We use the sitemap view when a site has fewer than 50,000 URLs and the sitemap index view when it has more than 50,000.
