๐Ÿ“š Generate sitemap.xml and robots.txt in Nextjs


๐Ÿ’ก Newskategorie: Programmierung
๐Ÿ”— Quelle: dev.to

Introduction

When you build a site, you want it to appear at the top of Google's search results; in other words, you need to improve its Search Engine Optimization (SEO). Google ranks websites based on many factors, but one of the most important is that the crawler understands your site and knows what to expect on it. That is why we need sitemap.xml and robots.txt.

robots.txt tells the Google crawler which files it can request from your website and which it cannot.

Sitemap

Let's begin with what a sitemap represents and how it works.

A sitemap is a file where you provide information about the pages, videos, and other files on your site, and the relationships between them. Search engines like Google read this file to crawl your site more efficiently. A sitemap tells Google which pages and files you think are important in your site, and also provides valuable information about these files.

What sitemap.xml does is define the relationships between the pages on your website. Search engines use this file to index your site more accurately. You can also add extra information, such as when a page was last updated, how frequently it changes, its priority, etc.

(Image: Sitemap flow)

Static Sitemap

When you have a static website, a static sitemap will do the job. In other words, when your website does not change frequently, you can create a simple .xml file
to tell the Google crawler which content you have.

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
      <loc>https://yourapp.com</loc>
  </url>
  <url>
      <loc>https://yourapp.com/blog</loc>
  </url>
  <url>
      <loc>https://yourapp.com/library</loc>
  </url>
  <url>
      <loc>https://yourapp.com/contact</loc>
  </url>
</urlset>

Dynamic Sitemap

On the other hand, if your site changes frequently, you need a dynamic sitemap. You can build one manually by fetching all your pages and generating the .xml file yourself, but in this post we will cover an easier way.
There is a great npm module called next-sitemap which does all the dirty work for you.
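
For context, the manual approach boils down to serializing your routes into the urlset format shown earlier. Here is a minimal sketch in plain JavaScript; the hard-coded URL list stands in for whatever you would fetch from your CMS or database:

```javascript
// Build a sitemap.xml string from a list of page URLs.
// In a real app the `urls` array would come from your data source.
function generateSitemapXml(urls) {
  // One <url><loc>…</loc></url> entry per page.
  const entries = urls
    .map((url) => `  <url>\n    <loc>${url}</loc>\n  </url>`)
    .join('\n');
  return (
    '<?xml version="1.0" encoding="UTF-8"?>\n' +
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n' +
    `${entries}\n` +
    '</urlset>'
  );
}

// Example usage:
const xml = generateSitemapXml([
  'https://yourapp.com',
  'https://yourapp.com/blog',
]);
console.log(xml);
```

You would then serve this string from a route such as /sitemap.xml. As you can see, keeping this in sync with your content by hand gets tedious, which is exactly what next-sitemap automates.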

First you need to install it by using following command:

yarn add next-sitemap

Create a configuration file (next-sitemap.config.js in the project root) for next-sitemap to use. There are many properties available, but we will use these three:

  • siteUrl - sets the base URL of your website
  • generateRobotsTxt - generates a robots.txt file that lists the generated sitemaps. Default: false
  • sitemapSize - splits a large sitemap into multiple files by limiting the number of URLs per file. If the number of URLs exceeds this limit, additional files are created, so you end up with sitemap-0.xml, sitemap-1.xml, etc. Default: 5000

module.exports = {
  siteUrl: process.env.SITE_URL || 'https://yourapp.com',
  generateRobotsTxt: true, // (optional)
  sitemapSize: 7000
}
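
The configuration can be extended further. As a sketch, assuming the exclude option of your installed next-sitemap version, you can keep private routes out of the generated sitemap (the /admin and /drafts/* paths are purely illustrative):

```javascript
// next-sitemap.config.js — extended sketch.
// `exclude` is an assumed option; check the next-sitemap docs
// for the version you have installed.
module.exports = {
  siteUrl: process.env.SITE_URL || 'https://yourapp.com',
  generateRobotsTxt: true,
  sitemapSize: 7000,
  // Glob patterns for routes that should not appear in the sitemap:
  exclude: ['/admin', '/drafts/*'],
}
```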

In your package.json, add a postbuild script, which is automatically triggered after a successful build and runs the next-sitemap command.

{
  "build": "next build",
  "postbuild": "next-sitemap"
}

Output

After the build is done, you will have a generated sitemap.xml and sitemap-0.xml in the public folder by default. If you set generateRobotsTxt to true, you will get a robots.txt file as well.

If you check sitemap.xml, you should see:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap><loc>https://bojanjagetic.com/sitemap-0.xml</loc></sitemap>
</sitemapindex>

As you can see, there is only one location, referencing our sitemap-0.xml. Let's open it and check its content:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:news="http://www.google.com/schemas/sitemap-news/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns:mobile="http://www.google.com/schemas/sitemap-mobile/1.0" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
<url><loc>https://bojanjagetic.com</loc><lastmod>2022-12-03T20:01:39.202Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
<url><loc>https://bojanjagetic.com/routes/aboutme</loc><lastmod>2022-12-03T20:01:39.203Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
<url><loc>https://bojanjagetic.com/routes/blog</loc><lastmod>2022-12-03T20:01:39.203Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
<url><loc>https://bojanjagetic.com/post/npm-vs-yarn</loc><lastmod>2022-12-03T20:01:39.203Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
<url><loc>https://bojanjagetic.com/post/programming-concepts</loc><lastmod>2022-12-03T20:01:39.203Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
<url><loc>https://bojanjagetic.com/libary/crypto-scrapper</loc><lastmod>2022-12-03T20:01:39.203Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
<url><loc>https://bojanjagetic.com/libary/github-card-npm-component</loc><lastmod>2022-12-03T20:01:39.203Z</lastmod><changefreq>daily</changefreq><priority>0.7</priority></url>
... 

As you can see, it generated all the routes I have, so the Google crawler knows which resources are available.

Robots

As mentioned already, robots.txt tells the Google crawler which files and resources can be requested, and where the sitemap is located. The content of the generated robots.txt looks something like this:

# *
User-agent: *
Allow: /

# Host
Host: https://bojanjagetic.com

# Sitemaps
Sitemap: https://bojanjagetic.com/sitemap.xml
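
If you need more control over the generated robots.txt, next-sitemap exposes a robotsTxtOptions property. A minimal sketch, assuming the policies option behaves as below in your installed version (the /admin path is just an illustration):

```javascript
// next-sitemap.config.js — robots.txt customization sketch.
// `robotsTxtOptions.policies` is an assumed option; verify it
// against the next-sitemap docs for your installed version.
module.exports = {
  siteUrl: 'https://yourapp.com',
  generateRobotsTxt: true,
  robotsTxtOptions: {
    policies: [
      // Allow everything by default…
      { userAgent: '*', allow: '/' },
      // …but keep crawlers out of an admin area:
      { userAgent: '*', disallow: '/admin' },
    ],
  },
}
```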

Conclusion

Now that we have sitemap.xml and robots.txt, our site gets better visibility on Google Search and will rank better, which means more visitors.
