Learning from Web: Review of Approaches

04/13/2005
by Vitaly Schetinin, et al.

Knowledge discovery is defined as the non-trivial extraction of implicit, previously unknown, and potentially useful information from given data. Knowledge extraction from web documents deals with unstructured, free-format documents whose number is enormous and rapidly growing. Artificial neural networks are well suited to knowledge discovery from web documents because trained networks can classify the learning and testing examples that represent the text mining domain more accurately and easily. However, neural networks that consist of a large number of weighted connections and activation units often produce text classification models that are hard to understand. The same problem applies to the more powerful recurrent neural networks, which employ feedback links from hidden or output units to their input units. Because of these feedback links, recurrent neural networks are able to take the context within a document into account. To make neural networks useful for data mining, self-organizing neural network techniques of knowledge extraction have been explored and developed. Self-organization principles were used to create an adequate neural-network structure and to reduce the dimensionality of the features used to describe text documents. These principles are of interest because they can reduce neural-network redundancy and considerably facilitate knowledge representation.
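
The role of feedback links can be illustrated with a minimal sketch of an Elman-style recurrent layer. This is not the method reviewed in the paper, only an illustration under stated assumptions: the function name `elman_forward`, the layer sizes, the random weights, and the one-hot "word" vectors are all hypothetical. It shows that the hidden state from the previous step is fed back as extra input, so the final state depends on word order and not only on the last word.

```python
import math
import random


def elman_forward(sequence, w_in, w_rec, bias):
    """Run a sequence of input vectors through a simple recurrent layer.

    The hidden state from step t-1 is fed back as extra input at step t,
    which is how earlier words can influence the representation of later ones.
    """
    hidden = [0.0] * len(bias)
    for x in sequence:
        new_hidden = []
        for j in range(len(bias)):
            total = bias[j]
            # Contribution of the current input vector.
            total += sum(w_in[j][i] * x[i] for i in range(len(x)))
            # Contribution of the feedback links from the previous hidden state.
            total += sum(w_rec[j][k] * hidden[k] for k in range(len(hidden)))
            new_hidden.append(math.tanh(total))
        hidden = new_hidden
    return hidden


def random_matrix(rows, cols, rng):
    """Hypothetical helper: small random weights, for illustration only."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_inputs, n_hidden = 3, 4  # assumed toy sizes
w_in = random_matrix(n_hidden, n_inputs, rng)
w_rec = random_matrix(n_hidden, n_hidden, rng)
bias = [0.0] * n_hidden

# Two toy "documents" with the same one-hot word vectors in a different order.
doc_a = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
doc_b = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]

state_a = elman_forward(doc_a, w_in, w_rec, bias)
state_b = elman_forward(doc_b, w_in, w_rec, bias)
```

With the feedback weights `w_rec` set to zero, the layer forgets everything except the last input, which is one way to see that the context sensitivity comes from the feedback links alone.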
