Project Report and Technical Documentation

Thomas Jund, Andrew Mustun, Laurent Cohn

24th May 2004

Version 1.0

Abstract

In this paper we present quaneko, a tool for efficiently finding data on the local computer system. This document is the technical specification and description of the tool. Please note that it is not a user manual; the user manual is available on the project web site.

Engineering an efficient index, which is the heart of the project, is a challenging task. Besides good performance for search queries and fast indexing, the tool needs flexible handling of the file types used in a typical modern office environment. Adding support for an individual file format has to be as simple as possible and must not require any programming skills. Unlike other search tools such as the built-in Windows Search, quaneko can be configured to parse any file format.

After reading this document you will know which indexing system was implemented in quaneko and how it was developed. To learn how to use the tool, we recommend reading the user manual mentioned above. More information and file downloads for quaneko are available online.

Contents

1 Introduction
1.1 Motivation
1.2 Task Description
1.3 Objective and Structure of this Document

2 Problem Analysis
2.1 User Requirements
2.1.1 User Groups
2.1.2 User's Goals
2.1.3 Existing Solutions
2.2 Usability Requirements
2.2.1 Common Requirements
2.2.2 GUI
2.2.3 CLI
2.2.4 API
2.3 Supportability Requirements
2.3.1 Operating Systems
2.3.2 Programming Language and Libraries

3 Design
3.1 Architecture Overview
3.2 Core Library

4 Filter Module
4.1 Input
4.2 Output

5 Stemming Module

6 Parser Module

7 Index Handler
7.1 Word Register
7.2 File Register
7.3 Direct Index
7.4 Inverted Index
7.5 Array Index

8 Optimization
8.1 Combining Different Types of Indexes

9 Settings
9.1 File Format
9.2 Settings Reference

10 Testing Documentation
10.1 Application Testing
10.2 Performance Testing
10.3 Memory leaks

11 Installation
11.1 Installation of Binaries
11.2 Installation from Sources
11.2.1 Unix, Linux, Mac OS X
11.2.2 Windows (32bit)

12 Project Management
12.1 Project Initiation
12.1.1 Choosing a License
12.1.2 Hosting
12.2 Documentation
12.2.1 Source Code Documentation
12.2.2 Technical Documentation
12.2.3 User Manual
12.3 Choosing a Development Platform
12.4 Choosing a Programming Language
12.5 Choosing a GUI Toolkit
12.6 Version Control System
12.7 Generation of Executables
12.8 Project Organization
12.8.1 Process Model
12.8.2 Project Responsibilities and Deliverables
12.9 Milestones
12.10 Schedule
12.11 Management
12.11.1 Risk Management

13 Conclusion
13.1 Parsing
13.2 Indexing
13.3 Room for Improvements

A C++ Application Programming Interface
A.1 qnk Namespace Reference
A.1.1 Detailed Description
A.1.2 Function Documentation

B Glossary

C References

D Index

E About the Authors
