Search

From Things and Stuff Wiki
Revision as of 04:20, 13 March 2017 by Milk (talk | contribs) (→‎Systems)


General

Services

Google


DuckDuckGo

Wolfram Alpha

Other

robots.txt

  • https://en.wikipedia.org/wiki/Robots_exclusion_standard - The robots exclusion standard, also known as the robots exclusion protocol or simply robots.txt, is a standard used by websites to communicate with web crawlers and other web robots. The standard specifies how to inform the web robot about which areas of the website should not be processed or scanned. Robots are often used by search engines to categorize web sites. Not all robots cooperate with the standard; email harvesters, spambots and malware robots that scan for security vulnerabilities may even start with the portions of the website where they have been told to stay out. The standard is different from, but can be used in conjunction with, Sitemaps, a robot inclusion standard for websites.
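As a rough illustration of how a cooperating crawler might apply these rules, the sketch below parses a robots.txt file with Python's standard urllib.robotparser module. The robots.txt content, the crawler names (SomeCrawler, ExampleBot) and the example.com URLs are made up for the example and do not come from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: ExampleBot
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A generic crawler falls under the "*" rules: only /private/ is off limits.
generic_index = parser.can_fetch("SomeCrawler", "https://example.com/index.html")
generic_private = parser.can_fetch("SomeCrawler", "https://example.com/private/data.html")

# ExampleBot has its own entry that disallows the whole site.
examplebot_index = parser.can_fetch("ExampleBot", "https://example.com/index.html")

print(generic_index, generic_private, examplebot_index)
```

This only models a well-behaved robot. Nothing in the file is enforced on the server side, which is why badly behaved robots have to be blocked by other means.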

Bad robots (blocked with csf):

  • bingbot - 207.46.13.0/24, 157.55.39.0/24
  • majestic-12 - 136.243.103.165
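To check whether a given client address falls inside one of the ranges listed above, something like the following sketch could be used. It relies only on Python's standard ipaddress module and is independent of csf itself; the BLOCKED table and the blocked_by helper are hypothetical names introduced for this example.

```python
import ipaddress

# Ranges copied from the list above; the dict layout is an assumption.
BLOCKED = {
    "bingbot": ["207.46.13.0/24", "157.55.39.0/24"],
    "majestic-12": ["136.243.103.165"],
}


def blocked_by(ip: str):
    """Return the robot name whose range contains ip, or None if no range matches."""
    addr = ipaddress.ip_address(ip)
    for name, ranges in BLOCKED.items():
        for cidr in ranges:
            # A bare address such as 136.243.103.165 is treated as a /32 network.
            if addr in ipaddress.ip_network(cidr):
                return name
    return None


print(blocked_by("207.46.13.42"))
print(blocked_by("192.0.2.1"))
```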

Alerts

Services

SEM/SEO

Google

  • Google Tag Manager lets you add and update your website tags, easily and for free, whenever you want, without bugging the IT folks. It gives marketers greater flexibility, and lets webmasters relax and focus on other important tasks.
  • Google Keyword Planner is like a workshop for building new Search Network campaigns or expanding existing ones. You can search for keyword and ad group ideas, get historical statistics, see how a list of keywords might perform, and even create a new keyword list by multiplying several lists of keywords together. A free AdWords tool, Keyword Planner can also help you choose competitive bids and budgets to use with your campaigns.

News

Articles etc.

Sitemaps

  • Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site.
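A minimal sketch of generating such a file with Python's standard xml.etree.ElementTree module is shown below. The build_sitemap helper and the example.com page data are assumptions made for illustration; the element names (urlset, url, loc, lastmod, changefreq, priority) and the namespace follow the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a sitemap XML string from a list of dicts.

    Each dict needs a "loc" key; "lastmod", "changefreq" and "priority"
    are optional and are written only when present.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = page[tag]
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages, for illustration only.
pages = [
    {"loc": "https://example.com/", "lastmod": "2017-03-13",
     "changefreq": "weekly", "priority": "0.8"},
    {"loc": "https://example.com/about"},
]

sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

A file written to disk would normally also begin with an XML declaration, and its location can be advertised to crawlers with a Sitemap: line in robots.txt.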

Tools

JavaScript

Bad

Other

Distributed

  • YaCy - Distributed web search service

Other