What is Robots.txt? Have you ever heard of it? If not, then today there is great news for you, because today you will get some real information about Robots.txt. If you have a blog or a website, you must have noticed that sometimes information we do not want public ends up public on the Internet. Do you know why this happens? And why do many of our good pages remain unindexed even after many days? If you want to know the secret behind all these things, then read this article on Robots.txt carefully, so that by the end of the article you will know all about it.
The Robots meta tag is used to tell search engines which files and folders of a website should be shown publicly and which should not. But not all search engines read meta tags, so many Robots meta tags simply go unnoticed. The better way is to use a robots.txt file, through which you can easily tell search engines about the files and folders of your website or blog. So today I thought, why not give you full information about what Robots.txt is, so that you will have no problem understanding it further. Then why the delay, let's start and learn what robots.txt is and what the secret behind it is.
What is Robots.txt?
Robots.txt is a text file that you place on your site to tell search robots which pages they should visit or crawl on your site and which they should not. Following robots.txt is not mandatory for search engines, but they do pay attention to it and avoid visiting the pages and folders mentioned in it. That makes robots.txt very important. It is therefore essential to keep it in the main (root) directory, so that search engines can find it.
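For example, assuming your domain is example.com (a hypothetical domain used here only for illustration), crawlers look for the file at one fixed address and nowhere else:

https://example.com/robots.txt        <- found and read by crawlers
https://example.com/pages/robots.txt  <- ignored, because it is not in the root directory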
- How to Write a SEO Friendly Blog Post
- What is Google AMP
- What is Bounce Rate and how to reduce it
The point to note here is that if we do not place this file in the right location, search engines will assume that you have not included a robots.txt file at all, and the pages of your site may not even get indexed. So this small file carries a lot of importance: if it is not used correctly, it can also reduce the ranking of your website. Therefore it is very important to have good information about it.
How does Robots.txt work?
When a search engine or web spider comes to your website or blog for the first time, it first crawls your robots.txt file, because it contains all the information about which parts of your website to crawl and which not to. It then indexes your permitted pages, so that those indexed pages appear in search engine results.
A robots.txt file can prove very useful for you if (a sample file follows this list):
- You want search engines to ignore duplicate pages on your website
- You do not want your internal search results pages to be indexed
- You do not want search engines to index certain pages
- You do not want certain files to be indexed, such as some images or PDFs
- You want to tell search engines where your sitemap is located
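Here is a minimal sketch of a robots.txt that covers these cases; the paths and the sitemap URL are hypothetical and would need to match your own site:

User-agent: *
# Keep internal search result pages out of the crawl
Disallow: /search/
# Keep a folder of PDF files from being crawled
Disallow: /pdfs/

# Tell search engines where the sitemap is located
Sitemap: https://example.com/sitemap.xml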
How to create a robots.txt file
If you have not yet created a robots.txt file for your website or blog, then you should make one very soon, because it is going to be very useful for you in the future. Follow these instructions to create it (a small sketch follows the list):
- First create a text file and save it as robots.txt. For this you can use Notepad on Windows or TextEdit on a Mac, and then save it as a plain text file.
- Now upload it to your website's root directory. This is the root-level folder, sometimes called "htdocs", and it comes right after your domain name.
- If you use subdomains, you need to create a separate robots.txt file for each subdomain.
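As a minimal sketch, assuming you run it inside the folder that will be uploaded as your site's root, the same plain text file can also be created with a few lines of Python; the two rules written here are only a permissive placeholder:

# Write a minimal robots.txt as a plain text file.
# These placeholder rules let every robot crawl everything.
lines = [
    "User-agent: *",  # applies to all robots
    "Disallow:",      # an empty Disallow blocks nothing
]
with open("robots.txt", "w", encoding="ascii", newline="\n") as f:
    f.write("\n".join(lines) + "\n")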
What is the Syntax of Robots.txt?
In robots.txt we use some syntax, which we really need to know about:
• User-Agent: names the robots to which the rules that follow apply (e.g. "Googlebot").
• Disallow: blocks bots from the pages or directories you do not want them to access. (The files or paths to block are written after Disallow.)
• Noindex: with this, the search engine will not index the pages you do not want indexed. (Note that this directive is unofficial and most search engines do not support it.)
• Use a blank line to separate each User-Agent/Disallow group, but note that there should be no blank lines within a group (that is, between the User-agent line and its last Disallow).
• The hash symbol (#) can be used to write comments inside a robots.txt file: everything after a # is ignored. Comments can take up a whole line or the end of a line.
• Directories and filenames are case-sensitive: "private", "Private", and "PRIVATE" are all quite different to search engines.
Let's understand this with the help of an example. Here are some notes about it:
• The robot "Googlebot" has no disallow statement written for it, so it is free to go anywhere.
• The whole site is closed to "msnbot".
• All robots (other than Googlebot) are not permitted to view the /tmp/ directory or files and directories called /logs (for example tmp.htm, /logs, or logs.php), as the comments in the file below explain.
User-agent: Googlebot
Disallow:
User-agent: msnbot
Disallow: /
# Block all robots from tmp and logs directories
User-agent: *
Disallow: /tmp/
Disallow: /logs  # for directories and files called logs
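To check that the file above really behaves as described, here is a minimal sketch using Python's standard urllib.robotparser module; the robot name "SomeBot" is made up here to stand in for any robot other than Googlebot and msnbot:

import urllib.robotparser

# The example robots.txt from above, as a list of lines
rules = [
    "User-agent: Googlebot",
    "Disallow:",
    "",
    "User-agent: msnbot",
    "Disallow: /",
    "",
    "# Block all robots from tmp and logs directories",
    "User-agent: *",
    "Disallow: /tmp/",
    "Disallow: /logs",
]

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules)

# Googlebot's Disallow is empty, so it may crawl anything
print(rp.can_fetch("Googlebot", "/tmp/file.html"))  # True
# msnbot is blocked from the entire site
print(rp.can_fetch("msnbot", "/index.html"))        # False
# Any other robot is blocked from /tmp/ and /logs
print(rp.can_fetch("SomeBot", "/tmp/file.html"))    # False
print(rp.can_fetch("SomeBot", "/logs"))             # False
print(rp.can_fetch("SomeBot", "/about.html"))       # True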

