An information filtering system is a system that removes redundant or unwanted information from an information stream using (semi)automated or computerized methods prior to presentation to a human user. Its main goal is to manage information overload and to increase the semantic signal-to-noise ratio. To do this, the user's profile is compared to some reference characteristics. These characteristics may originate from the information item (the content-based approach) or from the user's social environment (the collaborative filtering approach).
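The content-based approach can be illustrated with a minimal sketch. It assumes, purely for illustration, that both the user's profile and each item are plain text, that they are compared as bag-of-words vectors using cosine similarity, and that an item is kept when the similarity reaches a threshold. The function names, the threshold value and the sample data are hypothetical and are not taken from any particular system.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term counts for a text (lowercased, split on whitespace)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def content_filter(profile_text, items, threshold=0.3):
    """Keep only the items whose content is similar enough to the profile."""
    profile = term_vector(profile_text)
    return [item for item in items
            if cosine(profile, term_vector(item)) >= threshold]


# Hypothetical profile and information stream, for illustration only.
profile = "python machine learning tutorial"
stream = [
    "new machine learning tutorial in python",
    "celebrity gossip and fashion news",
    "learning to cook pasta",
]
kept = content_filter(profile, stream)
```

The threshold sets how aggressive the filter is: a higher value removes more of the stream, at the risk of discarding items the user would have wanted.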
Whereas in information transmission signal processing filters are used against syntax-disrupting noise on the bit level, the methods employed in information filtering act on the semantic level.
The range of machine methods employed builds on the same principles as those used for information extraction. A notable application can be found in the field of email spam filters. Thus, it is not only the information explosion that necessitates some form of filters, but also inadvertently or maliciously introduced pseudo-information.
On the presentation level, information filtering takes the form of user-preferences-based newsfeeds, etc.
Recommender systems and content discovery platforms are active information filtering systems that attempt to present to the user information items (movies, television, music, books, news, web pages) the user is interested in. These systems add information to the flow reaching the user, as opposed to removing information from it. Recommender systems typically use a collaborative filtering approach, or a combination of the collaborative and content-based approaches.
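A minimal sketch of collaborative filtering follows. It assumes a user-based variant in which users are compared by cosine similarity over the items they have both rated, and unseen items are ranked by a similarity-weighted average of other users' ratings. The ratings table and all names are hypothetical, and real recommender systems use considerably more elaborate models.

```python
import math

# Hypothetical ratings (user -> {item: rating from 1 to 5}), for illustration only.
ratings = {
    "alice": {"film_a": 5, "film_b": 4, "film_c": 1},
    "bob": {"film_a": 5, "film_b": 5, "film_d": 4},
    "carol": {"film_a": 1, "film_c": 5, "film_e": 2},
}


def similarity(u, v):
    """Cosine similarity over the items that both users have rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)


def recommend(user, ratings, top_n=2):
    """Rank items the user has not rated by similarity-weighted average rating."""
    totals, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue  # users with nothing in common contribute nothing
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue  # only score items the user has not seen yet
            totals[item] = totals.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    predicted = {item: totals[item] / weights[item] for item in totals}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]


recommendations = recommend("alice", ratings)
```

Unlike a content-based filter, this sketch never inspects the items themselves; it relies only on the behaviour of other users.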
Before the advent of the Internet, there were already several methods of filtering information; for instance, governments could control and restrict the flow of information within a country.
On the other hand, filtering also takes place when information is selected on behalf of customers such as readers of books, magazines and newspapers, radio listeners and TV viewers. This filtering operation is also present in schools and universities. With the advent of the Internet, anyone can publish whatever they wish at low cost. This considerably increases the amount of less useful information and, as a consequence, makes quality information harder to find. This problem prompted the development of new filtering methods.
A filtering system of this kind consists of several tools that help people find the most valuable information, so that the limited time a person can dedicate to reading, listening or viewing is directed to the most interesting and valuable documents. These filters are also used to organize and structure information in a correct and understandable way, as well as to group incoming email messages. They are especially necessary for the results returned by Internet search engines. Filtering functions improve continually, making the retrieval of Web documents and messages more efficient.
One of the criteria used in filtering is whether the knowledge is beneficial or harmful. In the latter case, the task of information filtering is to reduce or eliminate the harmful information.
A learning system consists, as a general rule, of three basic stages:
- First, a solver that provides solutions to a defined set of tasks.
- Second, an evaluation stage that analyzes the results produced by the first stage.
- Third, an acquisition module whose output is knowledge that is used by the solver of the first stage.
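The three stages above can be sketched as a small feedback loop. This is only an illustration under stated assumptions: the solver is taken to be a keyword-weight classifier, the evaluation stage compares its output with labeled user feedback, and the acquisition stage applies a simple mistake-driven (perceptron-style) weight update. None of these choices comes from the text, and the function names and example data are hypothetical.

```python
from collections import defaultdict


def solve(weights, text):
    """Stage 1 (solver): label a text as relevant (True) or not (False)."""
    score = sum(weights[word] for word in text.lower().split())
    return score > 0


def evaluate(weights, examples):
    """Stage 2 (evaluation): return the examples the solver gets wrong."""
    return [(text, label) for text, label in examples
            if solve(weights, text) != label]


def acquire(weights, mistakes, step=1.0):
    """Stage 3 (acquisition): turn mistakes into updated weights for the solver."""
    for text, label in mistakes:
        delta = step if label else -step
        for word in text.lower().split():
            weights[word] += delta
    return weights


# Hypothetical labeled feedback: True means the user wanted to see the item.
examples = [
    ("open source database release", True),
    ("win a free prize now", False),
    ("database performance tips", True),
    ("free prize inside", False),
]

weights = defaultdict(float)
for _ in range(10):  # repeat the solve / evaluate / acquire cycle
    mistakes = evaluate(weights, examples)
    if not mistakes:
        break
    acquire(weights, mistakes)
```

Each pass feeds the knowledge produced by the acquisition stage back into the solver, which is the loop the three stages describe.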
Currently, the problem is not so much finding the best way to filter information as having these systems learn the information needs of users, automating not only the process of filtering but also the construction and adaptation of the filter. Fields such as statistics, machine learning, pattern recognition and data mining provide the basis for developing such filters. For the learning process to be carried out, training examples are needed, which can be obtained through feedback from ordinary users.
As data is entered, the system incorporates new rules. To judge whether these rules generalize beyond the training data, the system's ability to correctly predict the categories of new information must be measured. This step is simplified by setting aside part of the training data as a separate set, called “test data”, which is used to measure the error rate. As a general rule, it is important to distinguish between types of errors (false positives and false negatives). For example, in the case of an aggregator of content for children, the two kinds of error do not carry the same weight: letting unsuitable content through is more serious than discarding some suitable content.
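The use of test data and the distinction between error types can be sketched as follows. The example assumes a hand-written toy filter for a children's content aggregator, so nothing is actually fitted on the training portion here; a learned filter would be trained on it. The block list, the labeled examples and the cost weights are hypothetical values chosen only for illustration.

```python
import random

BLOCKED = {"violence", "gambling"}  # hypothetical block list


def predict(text):
    """Toy filter: allow a text unless it contains a blocked word."""
    return not (set(text.lower().split()) & BLOCKED)


def split(examples, test_fraction=0.25, seed=0):
    """Shuffle labeled examples and hold out a fraction as test data."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def error_report(predict, data):
    """Count both error types; 'positive' means the filter lets the item through."""
    fp = sum(1 for text, suitable in data if predict(text) and not suitable)
    fn = sum(1 for text, suitable in data if not predict(text) and suitable)
    n = len(data)
    return {"false_positives": fp, "false_negatives": fn,
            "error_rate": (fp + fn) / n if n else 0.0}


def weighted_cost(report, fp_cost=5.0, fn_cost=1.0):
    """Assumed costs: passing unsuitable content is worse than blocking good content."""
    return fp_cost * report["false_positives"] + fn_cost * report["false_negatives"]


# Hypothetical labeled data: True means the item is suitable for children.
labeled = [
    ("cartoon about friendly animals", True),
    ("documentary with graphic violence", False),
    ("online gambling promotion", False),
    ("history of violence prevention programs", True),  # suitable, but blocked
    ("horror film trailer", False),                     # unsuitable, but passes
    ("counting songs for toddlers", True),
    ("science experiments at home", True),
    ("casino advertisement", False),                    # unsuitable, but passes
]

train, test = split(labeled)
test_report = error_report(predict, test)    # error rate on held-out data
full_report = error_report(predict, labeled)  # counts over all examples
```

Reporting the two error counts separately, rather than a single error rate, makes it possible to tune the filter toward the kind of mistake that matters less in a given application.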
Fields of use
Nowadays, there are many techniques for developing information filters, some of them reaching error rates lower than 10% in various experiments. Among these techniques are decision trees, support vector machines, neural networks, Bayesian networks, linear discriminants and logistic regression. At present, these techniques are used in different applications, not only in the web context but also in areas as varied as voice recognition, classification in telescopic astronomy and evaluation of financial risk.
See also
- Information literacy
- Kalman filter
- Filter bubble
- Information explosion
- Information overload
- Information society
- Artificial Intelligence