Your robots.txt file determines which portions of your site search engines may crawl, so you need to decide which sections should be open to crawlers and which should be blocked.
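As a minimal illustration, a robots.txt file like the following (the paths are hypothetical) allows crawling of most of a site while blocking a private directory:

```
User-agent: *
Disallow: /private/
Allow: /

Sitemap: https://www.example.com/sitemap.xml
```

The `User-agent: *` line applies the rules to all crawlers; more specific groups (e.g. `User-agent: Googlebot`) can override them for individual bots.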
Creating and maintaining a proper robots.txt file can be difficult. Many sites don’t include one at all, and in a large robots.txt file, locating the directive that is blocking a particular URL from being crawled can be tricky. To make this process easier, Google has recently updated the robots.txt testing tool in Google Webmaster Tools.
You can find the updated testing tool in Webmaster Tools within the Crawl section:
This section displays your most recent robots.txt file and lets you enter URLs to check whether they can be crawled. The tool also allows you to edit your robots.txt file and test the changes; however, you still must upload the new version to your server for your changes to take effect. Google provides more information on robots.txt directives and how they are processed on its developers site.
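You can perform the same kind of URL check locally with Python’s standard-library robots.txt parser. This is a rough sketch of what the testing tool does (the rules and URLs below are hypothetical, and Google’s own parser handles some directives differently than the standard library):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents to test against
rules = """User-agent: *
Disallow: /private/
Allow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Check whether specific URLs may be crawled
for url in [
    "https://www.example.com/private/report.html",
    "https://www.example.com/products.html",
]:
    allowed = parser.can_fetch("Googlebot", url)
    print(f"{'allowed' if allowed else 'blocked'}: {url}")
```

Running a check like this against a proposed robots.txt before uploading it is a quick sanity test, though the Webmaster Tools tester remains the authoritative check for how Googlebot will interpret your file.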
In addition, you can view older versions of your robots.txt file and see when Google’s crawlers were blocked from crawling portions of your site. For instance, if Google encounters a server error when fetching your robots.txt file, it won’t crawl your site.
If warnings or errors are displayed for your site, Google recommends double-checking your robots.txt file. You can also use this check in conjunction with other Webmaster Tools features, such as the “Fetch as Google” tool, to render important pages on your site. If that tool reports blocked URLs, you can run a robots.txt test to locate the directive causing the block and remedy the situation. It is common for older robots.txt files to block JavaScript, CSS, and mobile content; once you have found these issues, fixing them is often fairly easy.
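A quick way to spot this kind of problem is to check your page assets against your robots.txt rules. The sketch below uses a hypothetical legacy robots.txt that blocks asset directories, and the standard-library parser to flag blocked files (real directive names may vary from site to site):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical older robots.txt that blocks script and style directories
legacy_rules = """User-agent: *
Disallow: /css/
Disallow: /js/
"""

parser = RobotFileParser()
parser.parse(legacy_rules.splitlines())

# Assets a rendered page depends on (example URLs)
assets = [
    "https://www.example.com/css/site.css",
    "https://www.example.com/js/app.js",
    "https://www.example.com/index.html",
]

for url in assets:
    if not parser.can_fetch("Googlebot", url):
        print(f"BLOCKED: {url}")  # these directives likely need removing
```

Removing the `Disallow: /css/` and `Disallow: /js/` lines (or adding more specific `Allow` rules) would let Googlebot render the page as users see it.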
Google’s updated robots.txt testing tool should make it faster and more straightforward to maintain your robots.txt files.