Research on Several Problems in Text Retrieval

Information retrieval (IR) technology aims to identify and acquire relevant information from a collection of information, and it plays an important role in study and scientific research. As the Internet becomes ever more widely used and the quantity of information grows sharply, IR has become an efficient means of exploiting all kinds of information resources and of acquiring and absorbing information quickly and comprehensively. This thesis investigates several technologies related to information retrieval, including document processing, text classification, and query optimization. The main results of the dissertation are as follows:

1. Feature selection in text classification

In this thesis, we introduce the concepts of absolute reliability, relative reliability, and compositive reliability, and propose a feature selection algorithm based on mutual information reliability. The algorithm combines the correlation between a term and a class with the difference in the term across all classes, i.e., the reliability of the maximum mutual information among classes. Experiments show that, compared with the basic mutual information function, the algorithm based on mutual information reliability effectively improves precision, recall, and the F1 measure. Furthermore, we apply normalization to several traditional feature selection functions and perform local feature selection based on these functions. Experiments show that both normalized feature selection and local feature selection improve classification precision to some degree.

2. Multiclass classification

It is common to set a threshold for each class to handle the case where a text may belong to several classes: when the similarity between the text and a class is above that class's threshold, the text is assigned to that class. In this thesis, we study the determination of
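
As a rough illustration only, the two baseline ideas mentioned in this abstract can be sketched in Python: the basic (pointwise) mutual information score between a term and a class, and the per-class threshold rule for assigning a text to several classes. This is not the thesis's reliability-based algorithm, whose definitions are not given here. The function names, the document-count arguments, and the example similarity values are all assumptions introduced for this sketch.

```python
import math


def mutual_information(n_tc, n_t, n_c, n):
    """Basic pointwise mutual information between a term t and a class c.

    n_tc: number of documents in class c that contain t
    n_t:  number of documents that contain t
    n_c:  number of documents in class c
    n:    total number of documents
    Computes log( P(t, c) / (P(t) * P(c)) ) from raw counts.
    """
    if n_tc == 0:
        # The term never occurs in the class: the score is unbounded below.
        return float("-inf")
    return math.log((n_tc * n) / (n_t * n_c))


def assign_classes(similarities, thresholds):
    """Return every class whose similarity is above that class's threshold."""
    return [c for c, s in similarities.items() if s > thresholds[c]]


# Hypothetical example values, chosen only to show the call shape.
example_similarities = {"sports": 0.8, "politics": 0.3, "tech": 0.6}
example_thresholds = {"sports": 0.5, "politics": 0.5, "tech": 0.6}
example_labels = assign_classes(example_similarities, example_thresholds)
```

With these example values only "sports" is assigned, because "tech" is equal to its threshold rather than above it. A term that is independent of a class gets a mutual information score of zero, and a term concentrated in the class gets a positive score.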


© 2004-2009 Latest-Science-Articles.com - All Rights Reserved Worldwide.