Internet search engines: review of popular and little-known search engines
Introduction
Few people can now imagine the Internet without search, search results, and the information retrieval systems (IRS) that organize it all. Yet not long ago, all the information on the Internet fit into a few directories whose names are still well known (DMOZ, Yahoo).
Today the volume of information on the Internet is so huge that no catalogue can hold it. To process and store information and to organize search, powerful software products have been created, and continue to be created, which we call search engines (SE). Each search engine has its own databases and its own algorithms for processing, searching, ranking and displaying information.
What Internet search engines are
The following academic definition can be given. A search engine is a set of programs and hardware for organizing user search on the Internet: in response to a text query, the user receives a list of relevant results (results that match the query).
The results are presented as a list of links to the information sources, each with a brief description (preview) and sometimes a photo.
As a first example, recall the world search leader “Google” and the leader of Runet search, “Yandex”. Besides these, about a dozen more search engines exist, which we will discuss below.
Opinion: Google, Yandex and other search engines do not generate (produce) content; they aggregate (accumulate) it, and for the most part it is other people’s content. It is worth remembering that using someone else’s content to build your own traffic and monetize it could be characterized as “piracy”, though in practice, of course, it is not treated as such.
Rating
- Yandex and Google share the first two places: about 49% and 45%;
- Third place: Mail.ru search, about 3%;
- Other search engines hover below 1%.
For comparison, here are the statistics from my own Google Analytics:
- yandex/organic 40.26%
- google/organic 38.93%
- mail.ru/organic 0.60%
- rambler/organic 0.52%
- bing/organic 0.12%
The statistics are inexorable: Yandex search is used most of all, and if you consider 3% a good result next to 45%, then Mail.ru search can be called the third most popular.
In this regard, talk about the popularity of search engines other than Yandex and Google can be put down to superstition, and promoting sites specifically in those other search engines does not deserve attention.
How search engines work
The question of how search engines work is asked about as often as the question of what color the sky is, and the answer is about as short: search engines collect information on the Internet, process it, rank it, and return it to the user in response to a search query.
The theory of Internet search is far more extensive and cannot be presented in one article. However, the main points will be useful to us:
Internet search engines do not store documents; that is, they do not download complete documents into their own repositories.
IRSs use the Internet itself as a decentralized document repository. Search engines periodically crawl the Internet, select the information they need according to their algorithms, and place part of it in their database. This leads to several problems:
- Information retrieval systems do not use all the information on the Internet, but only part of it;
- Internet information changes frequently. About 1.5 million pages are added per day, hence the possibility of “empty” results;
- There are a large number of duplicates (duplicate content). Unfortunately, I don’t have exact data on duplicates, and the reported figure of 25% seems too high;
- There is a lot of advertising, which is also bypassed by search engines;
- “Wandering” of search robots on the network greatly increases the load on resources (does not apply to search engines);
- Most sites are commercial (about 83%) and have little informational value.
For these and some other reasons, the vast majority of Internet information retrieval systems use a keyword search scheme (search engines), rather than a classic search scheme based on information classification.
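The duplicate-content problem from the list above can be illustrated with a minimal sketch. It assumes that two pages count as duplicates when their text is identical after lowercasing and collapsing whitespace; real search engines use more tolerant near-duplicate techniques that are not described here. The `PAGES` dictionary and both function names are hypothetical.

```python
import hashlib
import re

def fingerprint(text):
    """Hash the text after lowercasing and collapsing whitespace."""
    normalized = re.sub(r"\s+", " ", text.strip().lower())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def find_duplicates(pages):
    """Group URLs whose normalized text is identical."""
    groups = {}
    for url, text in pages.items():
        groups.setdefault(fingerprint(text), []).append(url)
    # Keep only the groups that contain more than one URL.
    return [urls for urls in groups.values() if len(urls) > 1]

# Hypothetical sample data standing in for crawled pages.
PAGES = {
    "https://example.com/1": "Buy cheap  tickets online",
    "https://example.com/2": "How search engines work",
    "https://example.com/3": "buy cheap tickets\nonline",
}

print(find_duplicates(PAGES))
```

Exact hashing only catches copies that differ in case or spacing; a page with one changed word would get a different fingerprint.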
Features of keyword search
Despite the changing algorithms of search engines, whose advertising tries to convince us that machines are becoming smarter and more understanding, the basis of the work of search engines is keyword search.
The keyword search scheme I prefer breaks the work of Internet search engines into three stages: finding new documents (search robots Spider + Crawler), indexing the detected documents (Indexer), and executing the user query (Search Engine Results Engine). The names of the search robots used at each stage are given in brackets.
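The three stages (finding documents, indexing them, executing a query) can be sketched in a few lines. This is only an illustration under stated assumptions: the “web” is a hypothetical in-memory dictionary instead of real crawling, the index is a plain inverted index from words to URLs, and a query matches only pages that contain every query word. None of the function names come from a real search engine.

```python
from collections import defaultdict

# Hypothetical in-memory "web": URL -> page text (stands in for real crawling).
PAGES = {
    "https://example.com/a": "Search engines index documents by keyword",
    "https://example.com/b": "A crawler finds new documents on the web",
    "https://example.com/c": "Cooking recipes for the weekend",
}

def crawl(pages):
    """Stage 1 (Spider + Crawler): yield every discovered document."""
    for url, text in pages.items():
        yield url, text

def build_index(documents):
    """Stage 2 (Indexer): map each lowercase word to the URLs containing it."""
    index = defaultdict(set)
    for url, text in documents:
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Stage 3 (results engine): return URLs that contain every query word."""
    words = query.lower().split()
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)

index = build_index(crawl(PAGES))
print(search(index, "documents"))
print(search(index, "new documents"))
```

The sketch returns matches in alphabetical order of URL; ranking by relevance is a separate step that it deliberately leaves out.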
As I said, most search engines do not copy the full text of documents into their database. Instead, when a document is indexed, a search image of it is created. To organize a search by keywords, the indexing robot builds the image of the document using the so-called derived method; that is, the document image contains a title and a set of keywords.
However, it can be stated with reasonable confidence that all information retrieval systems pay attention to the following:
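A document image of this kind (a title plus a set of keywords derived from the text) might look like the sketch below. It assumes that keywords are simply the most frequent words that are not in a small stop-word list; how real indexers choose keywords is not stated in this article, and `document_image` and `STOP_WORDS` are hypothetical names.

```python
import re
from collections import Counter

# Assumed stop-word list; real systems use much larger, language-specific ones.
STOP_WORDS = {"a", "an", "and", "the", "of", "on", "in", "to", "is", "by", "for"}

def document_image(title, body, max_keywords=5):
    """Return a search image: the title plus the most frequent body terms."""
    words = re.findall(r"[a-z0-9]+", body.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    keywords = [word for word, _ in counts.most_common(max_keywords)]
    return {"title": title, "keywords": keywords}

image = document_image(
    "Search engines",
    "Search engines crawl the web. Search engines index pages and rank pages.",
    max_keywords=3,
)
print(image)
```

Only the image is kept, so the full text is not needed to answer a query; the trade-off is that words outside the keyword set cannot be found.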
- Presence of a keyword in the document;
- Presence of a keyword in the URL or domain;
- Presence of a keyword in the subtitle;
- Total number of keywords on the page (density, %);
- Presence of keywords in the description;
- Which web links lead to this page;
- Which internal links are on this page.
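Several of the on-page signals from this list can be checked with a short sketch. It assumes a single-word keyword, defines density as the keyword’s share of all words in the body text, and leaves the link signals out because they require data about other pages. The function name and the sample values are hypothetical.

```python
import re

def keyword_signals(keyword, url, subtitles, description, body):
    """Report simple on-page signals for one single-word keyword."""
    kw = keyword.lower()
    words = re.findall(r"\w+", body.lower())
    occurrences = words.count(kw)
    # Density: occurrences of the keyword as a percentage of all body words.
    density = 100.0 * occurrences / len(words) if words else 0.0
    return {
        "in_url": kw in url.lower(),
        "in_subtitle": any(kw in s.lower() for s in subtitles),
        "in_description": kw in description.lower(),
        "occurrences": occurrences,
        "density_percent": round(density, 2),
    }

signals = keyword_signals(
    keyword="search",
    url="https://example.com/search-engines",
    subtitles=["How search engines work"],
    description="An overview of popular engines",
    body="Search engines crawl pages. Each search uses an index.",
)
print(signals)
```

How much weight each signal carries is not public, so the sketch only reports the signals and does not combine them into a score.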
Page ranking
To conclude the theory, page ranking is worth mentioning. Page ranking in SERPs is most often discussed in the context of relevance: search engines must order search results so that they match the search query as closely as possible. As Yandex writes, nothing should be lost (completeness of the results) and nothing unnecessary should be found (accuracy of the results). You see every day how this works out in practice.
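Completeness and accuracy correspond to what information retrieval usually calls recall and precision. The sketch below computes both for one query, assuming we already know which documents are truly relevant (in practice that judgment is the hard part). The document identifiers are made up for illustration.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for one query.

    precision (accuracy): share of returned documents that are relevant.
    recall (completeness): share of relevant documents that were returned.
    """
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical example: 4 documents returned, 3 documents actually relevant.
p, r = precision_recall(
    returned=["d1", "d2", "d3", "d4"],
    relevant=["d1", "d2", "d5"],
)
print(f"precision={p:.2f} recall={r:.2f}")
```

Here two of the four returned documents are relevant, and two of the three relevant documents were found, so something unnecessary was found and something was lost at the same time.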
Conclusion
- Internet search engines are complex software products, the work of which is supported by thousands of specialists and enormous material resources.
- Search engine algorithms are kept secret, although the underlying focus of algorithm updates is publicly available and bears proper names.
- Despite the different approaches to generating search results, all search engines are based on the general principles of page indexing, which to this day remain basic for promotion.
Yandex search engine
A popular Runet search engine that often takes first place in popularity. According to statistics from 2009, Yandex constantly crawls 15 million pages of the Runet, processing 140 thousand GB of text data and 2.1 billion pictures, of which 1.6 billion are unique.
The name Yandex was coined in 1993, and the yandex.ru search engine was launched in 1997. The word Yandex does not mean anything by itself, although it is generally accepted that it is a transformation of the word “Index” or of the phrase “yet another indexer”. Today, Yandex.Search processes a quarter of a billion requests a day, and if it were not so intrusive, it would be my favorite search engine.
Search Yandex
https://yandex.ru/: Yandex search covers the Internet and takes the user’s region into account. You can search by images, videos, maps, news, blogs, products and dictionaries.
For fine-grained searches, there is a search language here (https://yandex.ru/support/search/query-language/).
Google search engine
In the Google search engine, search is organized as a general search without topics (main search) and as searches by section: pictures, news, maps, videos, shopping, books, air tickets, finance.
There are settings:
- Safe search. Allows you to block inappropriate content and sexual images from Google search results. This feature does not guarantee 100% protection, but it hides most of such content.
- Number of results per page (default 10).
- Personal results. Find links, pictures and videos on Google that your friends have shared with you on social networks.
- Region selection. The default is the current region.
- Languages. You can specify the search language.
- Advanced Search. Allows you to search using advanced parameters.
- Tools. Here you can select the search language, specify the time the information appeared, and select an exact match or the entire search result.
Mail search engine
https://go.mail.ru/. Here search covers the Internet (general search), videos and pictures. There is a separate search for applications for mobile devices.
Bing (https://www.bing.com/?scope=web&FORM=Z9LH). General search plus search by pictures, videos, news and maps.
Yahoo search in Russian (https://ru.search.yahoo.com/). Pure search without advertising. Searches the Internet, pictures and news, with a filter for when the information was added.
Other search engines
- DuckDuckGo (https://duckduckgo.com/) Smart search.
- Pipl (https://pipl.com/) Search for people in the USA.
- Findsounds (http://www.findsounds.com/)