
SELECT

Telematics RE4008

REPORT D2.2 version 1.3

Protocol specifications

Written by:

Jacob Palme,

Michel Claude

Johan Kaers

Christopher Lueg

David Mason

Andras Micsik

Massimo Vanocchi

Richard Wheeler

Contractual Date of Delivery: 31 May 1999

Last Revision: 24 Jun 2000

Nature of the Deliverable: SP**

Deliverable Type: PU*

Workpackage WP2

Task S2

Responsibility: DSV

*Type: PU-public, LI-limited, RP-restricted

**Nature: PR-Prototype, RE-Report, SP-Specification, TO-Tool, OT-Other

Table of contents

1. Table of contents 2

2. Executive summary 4

3. Terminology issues 4

4. Implementation issues 5

5. Protocol elements summary table 5

6. File name structure on the sel.nu server 6

7. Handling of anonymous ratings 7

8. Style sheet information in XML encodings 8

9. Submission points 8

10. Protocol elements full specifications 9

10.1. Validation of XML Encodings 9

10.2. Ratings 9

10.2.1. The DTD for an atomic rating 9

10.3. Get-Service-Description-List (XML) 11

10.3.1. Query format (get service description-list): 11

10.3.2. Response format (get service description-list): 12

10.4. Get-Service-Description 13

10.4.1. Query format (get service-description): 13

10.4.2. Response format (get-service-description): 14

10.5. Set-Service-Description (not yet ready) 14

10.5.1. Query format (set-service-description) 14

10.5.2. Response format (set-service-description) 14

10.6. Send-Rating 14

10.6.1. Transmit-Format (send-rating): 15

10.6.2. Response format (send-rating-response): 16

10.7. Set-Profile 19

10.7.1. Transmit format (set-profile): 19

10.7.2. The XML DTD for the user profile 20

10.7.3. Response format (set-profile): 22

10.8. Get-Profile 24

10.8.1. Query format (get-profile): 25

10.8.2. Response format (get-profile): 26

10.9. Login 28

10.9.1. Query format (login): 29

10.9.2. Response format (login): 31

10.10. Logout 31

10.10.1. Query format (logout): 32

10.10.2. Response format (logout): 32

10.11. Get-Atomic-Ratings 33

10.11.1. Query format (get-atomic-ratings): 33

10.11.2. Response format (get-atomic-ratings-response): 35

10.12. Simple-Search Operation 38

10.12.1. Query format (simple-search-query): 39

10.12.2. Response format (simple-search-response): 42

10.13. Advanced-Search Operation (Not yet ready) 44

10.13.1. Query format (advanced-search-query): 44

10.13.2. Response format (advanced-search-response): 45

10.14. Evaluate Operation 45

10.14.1. Query format (evaluate-query): 46

10.14.2. Response format (evaluate-response): 48

10.15. Exchange-Ratings-Data (not yet ready) 52

10.15.1. Query format (exchange-ratings-data): 52

10.15.2. Response format (exchange-ratings-data): 53

11. References 54

12. Appendix A: Summary of the Euroseek Search Language 55

12.1.1. Simple search 56

12.1.2. Phrase search 56

12.1.3. Search for names 56

12.1.4. Combined search, AND 57

12.1.5. Combined search, OR 57

12.1.6. Exclusion 57

12.1.7. Grouped search 58

12.1.8. Field search, host 58

12.1.9. URI search 59

12.1.10. Region 59

12.1.11. Domains 59

12.1.12. Language 59

13. Appendix B: Query language translation to URI string 59

14. Appendix C: A word macro to extract the XML encodings 60

15. Appendix D: NNTP Authentication 61

16. Appendix E: List of test files 65

17. Appendix F: A short introduction to XML 66

18. Appendix G: Issues for further study 67

19. Appendix H: Major differences between version 1.1 and version 1.2 68

20. Appendix I: The SELECT service description 68

20.1. Example 71

21. Appendix J : The SELECT Agent protocol 73

Entry points 74

XML Extensions 74

Privacy and security policy 76

22. Appendix K : Protocol Implementation Status 77

Executive summary

The SELECT rating system is a system for storing and using ratings on Internet resources. Ratings can be provided by ordinary readers of the resource, by expert readers, by computer linguistic analysis of the documents, or by observation of user behaviour. This specification describes the protocols needed between different program modules to provide the SELECT service. Protocols are specified to find out how documents are rated, to send ratings, to register a rater, to search for rated resources, to evaluate a list of resources (get their ratings), to find new items with good ratings, and to exchange ratings data between SELECT servers.

Terminology issues

The term “derived rating” is used in this document for what other documents call “general rating”.

Implementation issues

The protocols in this specification are somewhat more general-purpose than the functionality specified in the SELECT functional specification [17]. An implementor can therefore choose either to implement only the functionality in the functional specification, as a special case of these protocols, or to implement the full functionality described in this protocol specification.

Protocol elements summary table

|Name |Task |Client(s) |Page |

|Get service-description-list |Find out which rating services are handled by this server |Input reader ratings / Another SELECT version 1.0 server / A filtering process, a user or manager |9 |

|Get-service-description |Find out which rating descriptors are handled by this rating service |Input reader ratings / Another SELECT version 1.0 server / A filtering process, a user or manager |13 |

|Set-service-description |Specify and modify a service description, for example by adding new rating descriptors. Maybe this can be done manually, by filling in an HTML form, and does not require any other protocol support? |A filtering service manager |14 |

|Send-rating |Send a rating on a resource |Input reader ratings / Automatic rating agents |14 |

|Set-profile |Self-register a rater with a server, as well as registering someone else as a rater for a closed server, or modifying the profile of an existing user. |Input reader ratings, a user or manager |19 |

|Get-profile |Get the profile of another user, subject to access controls. |Input reader ratings, a user or manager |24 |

|Login |Establish credentials for a user |All of the above |24 |

|Logout |Waive credentials for a user |All of the above |31 |

|Get-atomic-ratings |Get the ratings made by one or more named users on one or more resources. |All of the above |33 |

|Simple-Search |Make a search for rated resources, HTML search query form |Rating search client, a user |33 |

|Advanced-Search |Make a search for resources, XML query form |Rating search client, NLP module (to find items which need NLP ratings) |44 |

|Evaluate |Get the ratings for a list of resources |Rating search client, news client, news server, a user |42 |

|Exchange-ratings-data |Mirror ratings data between two SELECT version 1.0 servers |One SELECT version 1.0 server |52 |

An introduction to the XML and DTD formats, which are used throughout this specification, is given in Appendix F: A short introduction to XML on page 66.

File name structure on the sel.nu server

Here is a suggested file structure for the sel.nu server:

|URL |Content |

| |Repository of XML format specifications (DTDs) for SELECT version 1. |

| |XML format for the list of services. |

| |XML format for the description of one SELECT service. |

| |XML format for the send-rating operation. |

| |XML format for the responses of the send-rating operation. |

| |XML format for the evaluate query request. |

| |XML format for the response of the evaluate operation. |

| |List of SELECT services in version 1 of SELECT, see Appendix I: The SELECT service description on page 68. |

| |Version 1 of the select general service |

| |Description of descriptors common to several SELECT services. |

| |Description of the general SELECT service. The general service is for everyone, not for specialised groups. |

| |Entry point for incoming non-anonymous (registered or pseudonymous) ratings to the select general service. |

| |Entry point for incoming anonymous ratings to the select general service. |

| |Entry point for the web-based search operation. |

| |Entry point for the evaluate operation. |

| |Version 1 of the select ISCN service |

| |Description of the special SELECT service for ISCN. |

Handling of anonymous ratings

Ratings can be either identified or anonymous.

Not all SELECT services allow anonymous ratings.

Anonymous ratings are fully anonymous: no raterid identifying the rater is specified.

Anonymous ratings are sent to a different URL (see pages 7 and 8) than the URL containing /id/, which is used for identified ratings. For anonymous ratings, the “raterid” has the special value “anonymous” (see page 34).

When combining atomic ratings into derived ratings, different weights may be given to identified and anonymous ratings, including the weight zero for anonymous ratings.

When retrieving atomic ratings, you will get identified ratings only for yourself. Other ratings are not returned.
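The weighting of identified and anonymous ratings can be sketched as follows. This is only an illustration, assuming a weighted mean as the combination rule; the function name, the default weight values and the (kind, value) tuple layout are hypothetical and are not defined by this specification.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical default weights. The specification only states that
# identified and anonymous ratings may be weighted differently and that
# the weight of anonymous ratings may be zero.
DEFAULT_WEIGHTS: Dict[str, float] = {"identified": 1.0, "anonymous": 0.25}


def derived_rating(
    atomic: Iterable[Tuple[str, float]],
    weights: Dict[str, float] = DEFAULT_WEIGHTS,
) -> Optional[float]:
    """Combine (rater_kind, value) pairs into one weighted mean.

    Returns None when no rating carries a non-zero weight.
    """
    total = 0.0
    weight_sum = 0.0
    for kind, value in atomic:
        w = weights.get(kind, 0.0)  # unknown kinds are ignored
        total += w * value
        weight_sum += w
    return total / weight_sum if weight_sum > 0 else None
```

With the weight of anonymous ratings set to zero, only identified ratings contribute to the derived value.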

Style sheet information in XML encodings

The XML encodings produced by SELECT agents may contain style sheet information. An agent which does not use this information must be capable of receiving it and should ignore it.

Such style sheet information may be:

(a) A style sheet reference in the processing instruction head of an XML document.

(b) A style sheet reference in the DTD file (not valid today, August 1999, but may become valid in the future).

Example: The following two XML documents are semantically equal:

|Version 1: |Version 2: |

| | |

| | |

| | |
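How an agent might ignore a style sheet reference of type (a) can be sketched as follows. The sketch uses the standard xml-stylesheet processing instruction; the element name "rating" and the style sheet file name are placeholders for illustration only and are not taken from the SELECT DTDs.

```python
import xml.etree.ElementTree as ET

# Placeholder documents: identical except for the style sheet reference.
WITH_STYLE = (
    '<?xml version="1.0"?>\n'
    '<?xml-stylesheet type="text/xsl" href="example.xsl"?>\n'
    "<rating>5</rating>"
)
WITHOUT_STYLE = '<?xml version="1.0"?>\n<rating>5</rating>'


def content(xml_text: str) -> tuple:
    """Return (tag, text) of the root element.

    The default ElementTree parser drops processing instructions,
    so the style sheet reference has no effect on the result.
    """
    root = ET.fromstring(xml_text)
    return root.tag, root.text
```

Both documents yield the same parsed content, which is the sense in which they are semantically equal for an agent that does not use style sheets.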

Submission points

Below are shown the entry points for access to the SELECT general service. For a specialised service, the word “general” below should be replaced by the subdirectory for that service.

|Operation |Service |Submission point |

|Get-Service-Description-List |All services | |

|Get-Service-Description |General service | |

|Send-Rating |General service | |

| | |Note: For use for ratings supplied by identified raters. |

|Send-Rating |General service | |

| | |Note: For use for ratings supplied by identified raters. |

|Send-Rating |General service | |

| | |Note: For use for ratings supplied anonymously. A different URL for entering ratings is needed to avoid the identified user cookie being sent with anonymous ratings. |

|Set-Profile |General service | |

|Get-Profile |General service |? |

|Login |General service |? |

|Logout |General service |? |

|Get-Atomic-Ratings |General service | |

|Simple-Search |General service |? |

|Advanced-Search |General service | |

|Evaluate |General service | |
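The reason for a separate anonymous entry point can be illustrated with the prefix rule used for cookie paths (as in RFC 6265). This is a simplified sketch: the cookie path is the one used in the login example of this specification, while the anonymous path shown is hypothetical, since the actual URL is not given here.

```python
def cookie_path_matches(cookie_path: str, request_path: str) -> bool:
    """Simplified cookie path test: would a cookie with this Path
    attribute be sent for a request to request_path?"""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # The prefix must end at a "/" boundary.
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


ID_COOKIE_PATH = "/v1.0/general/id/"                 # Path from the login example
IDENTIFIED_ENTRY = "/v1.0/general/id/input-ratings"  # from the send-rating example
ANONYMOUS_ENTRY = "/v1.0/general/an/input-ratings"   # hypothetical anonymous path
```

A client applying this rule sends the session cookie to the identified entry point but not to an entry point outside the /id/ subtree.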

Protocol elements full specifications

10.1 Validation of XML Encodings

The XML code in this specification has been validated using the XML validation service at .

10.2 Ratings

Ratings are collections of descriptors of resources, which can be used as a basis for filtering.

10.2.1 The DTD for an atomic rating

|Explanation |Format of information sent () |

| | |

|Start of attribute list for atomic-rating | |

|The value for one rating descriptor | |

| | |

10.3 Get-Service-Description-List (XML)

The Get service-description-list operation retrieves a list of SELECT version 1.0 service descriptions and their URIs, but does not retrieve the actual service descriptions. This is not necessarily a list of all SELECT services in the world. It will usually be a list of the SELECT services on this particular host, or a list of services recommended by the manager of this host.

10.3.1 Query format (get service description-list):

An HTTP GET operation is performed on a URI established to return SELECT version 1.0 service descriptions.

Example:

This example requests the service description list:

|Explanation |Information sent |

|Get the file named "v1.0/select-service-descriptions". |GET /v1.0/select-service-descriptions HTTP/1.1 |

|From the HTTP server "sel.nu" port 80. |Host: sel.nu |

|Only files in the format application/xml are accepted. |Accept: application/xml |

|Request is sent from a client which is, or simulates, Netscape 4.5. |User-Agent: Mozilla/4.5 |

|This user has connected to this server before, and a cookie identifies the session. |Cookie: session="1234567890123456" |
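A client could assemble the request shown in the table roughly as follows. This is a minimal sketch using the Python standard library; the function name is hypothetical, and the request is only built, not sent.

```python
from urllib.request import Request

# Host and path are taken from the example table above.
SERVICE_LIST_URL = "http://sel.nu/v1.0/select-service-descriptions"


def build_service_list_request(session: str = "") -> Request:
    """Build (but do not send) the get service-description-list request."""
    headers = {
        "Accept": "application/xml",   # only XML responses are accepted
        "User-Agent": "Mozilla/4.5",   # value used in the example
    }
    if session:
        # Session cookie obtained from an earlier login operation.
        headers["Cookie"] = f'session="{session}"'
    return Request(SERVICE_LIST_URL, headers=headers, method="GET")
```

Passing the returned object to urllib.request.urlopen would perform the actual HTTP GET.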

10.3.2 Response format (get service description-list):

|Explanation |Information sent |

|Standard reply header |HTTP … |

| | |

| | |

| | |

| | |

| | |

| | |

| | |

| | |

| | |

|DTD of replied message | |

| | |

| | |

| | |

| | |

10.4 Get-Service-Description

Summary: The Get service-description operation queries a SELECT version 1.0 server to get a description of some services. The main component of this description is a list of the descriptors and scales used by the service.

Access control: None.

Input data: The names of the services.

Output data: A description of the service, and a list of the descriptors supported for ratings in that group.

Base protocol: HTTP combined with XML.

10.4.1 Query format (get service-description):

An HTTP GET operation is performed on a certain URI.

Example (get service-description):

This example retrieves the service description at the URI:

get-service-descriptions

|Explanation |Information sent |

|Get the file named “general/select-service-description” which contains a description of the SELECT version 1.0 general service. The SELECT version 1.0 general service is a service available to everyone, as distinct from services for special user groups. |GET /v1.0/get-service-descriptions HTTP/1.1 |

|Get the file from the HTTP server "sel.nu" port 80. |Host: sel.nu |

|Only files in the format application/xml are accepted. |Accept: application/xml |

|Indicates which client sent this operation. Mozilla/4.5 indicates that the client is either Netscape version 4.5, or another client which prefers to simulate Netscape 4.5. |User-Agent: Mozilla/4.5 |

|This user has connected to this server before, and a cookie identifies the session. |Cookie: session="1234567890123456" |

|Request of SELECT test service | |

| | |

| | |

| | |

| | |

| | |

| | |

|DTD of message | |

| | |

| | |

| | |

10.4.2 Response format (get-service-description):

The response is an XML [12], [13] resource, containing a SELECT version 1.0 service description. The XML Resource Type Declaration for this XML page is:

See Appendix I: The SELECT service description on page 68.

10.5 Set-Service-Description (not yet ready)

Summary: The set service-description operation is used to modify a service description, for example by adding new rating descriptors. Maybe this can be done manually, by filling in an HTML form, and does not require any protocol support?

Issues:

Access control:

Input data:

Output data:

Base protocol:

10.5.1 Query format (set-service-description)

10.5.2 Response format (set-service-description)

10.6 Send-Rating

Summary: The send-rating operation sends one or more ratings to a SELECT version 1.0 server. This operation can be used for explicit ratings provided by users, for implicit ratings derived by observing user behaviour, and for ratings derived through automatic analysis of documents using NLP methods.

Access control: If the rater is not identified by a cookie (created by a login operation), the rating is treated as anonymous, and the user is instructed either to log in first or to send the ratings to the separate entry point for anonymous ratings. Some SELECT servers may not accept anonymous ratings.

Input data: Information about the rated resource, the rater and the rating values.

Output data: Acceptance or rejection.

Base protocol: XML transported through HTTP.

10.6.1 Transmit-Format (send-rating):

The send-rating operation is an HTTP POST operation whose body is an XML resource containing the rating. Below is a POST sent to the input-ratings entry point for identified raters of the general service:

|Explanation |Information sent |

|Connect to the SELECT server. The URI used identifies the rating service to which this rating is sent. |POST /v1.0/general/id/input-ratings HTTP/1.1 |

|To the HTTP server "sel.nu" port 80. |Host: sel.nu |

|Only files in the format application/xml are accepted. |Accept: application/xml |

|Request is sent from a client which is, or simulates, Netscape 4.5. |User-Agent: Mozilla/4.5 |

|The format of the query is XML. |Content-Type: Application/xml |

|This user has connected to this server before, and a cookie identifies the session. |Cookie: session="1234567890123456" |

The body of the query is an XML [12], [13] resource. The XML Resource Type Declaration for this XML resource is as follows. Note that no Rater-ID is included, because this ID can be derived from the cookie. Likewise, no rating-service-description is referenced, because the URI to which the rating is sent implies a particular rating service.

|Explanation |Format of information sent () |

|Reference to data structure defined in a separate DTD file. | |

|A list of ratings are sent. | |

|Import DTD from separate DTD file atomic-rating.dtd. |%atomic-rating; |

Example (send-rating):

|Explanation |Information sent () |

|HTTP header |POST /v1.0/general/id/input-ratings HTTP/1.1 |

| |Host: sel.nu |

| |Accept: application/xml |

| |User-Agent: Mozilla/4.5 |

| |Content-Type: Application/xml |

| |Cookie: session="1234567890123456" |

|A blank line to mark the end of the HTTP header | |

|Identifies that this is in XML format | |

|References the Resource Type Declaration (DTD) file | |

| | |

|Start with information about the resource rated and about | |

|First rating descriptor | |

|Second rating descriptor, note that decimal values are | |

|Third rating descriptor | |

|Fourth rating descriptor | |

|End of data | |

10.6.2 Response format (send-rating-response):

The response is an XML [12], [13] resource. The XML Resource Type Declaration for this XML resource is:

|Explanation |Format of information sent () |

|The evaluation are returned, one rating | |

|service at a time. | |

|Whether all the rating labels were | |

|If only some of the rating values were | |

|rejected, this element is used to list the | |

|rejected rating values. | |

| | |

Refusal reasons for the send-rating operation

The following refusal reasons may be used in rejecting a send-rating operation by a SELECT server:

|Refuse-reason |Explanation |

|none |Operation was not rejected. |

|bad-syntax |Wrong syntax of HTTP header or XML data sent. |

|missing-info |Mandatory information missing from sent data. |

|unknown-descriptors |Trying to store a rating for a descriptor not supported by this server. |

|wrongtype |Wrong type of a descriptor value, for example text for a descriptor which must have a numerical value. |

|wrong-competence |This rater is not allowed to send ratings with this competence to this service. |

|wrong-trust |This rater is not allowed to send ratings with this trust to this service. |

|wrong-rater-type |This service does not accept ratings of this rater-type from this user. |

|wrong-context |This service does not accept ratings with this context from this user. |

|access-control |This user is not allowed to send ratings to this service. (There are no operations in this specification to give people access rights. Some SELECT services may want to give only certain people the right to perform various operations, such as send ratings. How to do this is not described in this specification.) |

|not-logged-in |You performed an operation which requires login, but were not logged in. |

|other |Other errors |
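A client might represent the refusal reasons in the table above as an enumeration. The token strings are taken from the table; the class and function names are hypothetical, and mapping unrecognised tokens to "other" is an assumption of this sketch, not a rule of the specification.

```python
from enum import Enum


class RefuseReason(Enum):
    """Refuse-reason tokens listed for the send-rating operation."""
    NONE = "none"
    BAD_SYNTAX = "bad-syntax"
    MISSING_INFO = "missing-info"
    UNKNOWN_DESCRIPTORS = "unknown-descriptors"
    WRONGTYPE = "wrongtype"
    WRONG_COMPETENCE = "wrong-competence"
    WRONG_TRUST = "wrong-trust"
    WRONG_RATER_TYPE = "wrong-rater-type"
    WRONG_CONTEXT = "wrong-context"
    ACCESS_CONTROL = "access-control"
    NOT_LOGGED_IN = "not-logged-in"
    OTHER = "other"


def parse_refuse_reason(token: str) -> RefuseReason:
    """Map a token to the enum; unknown tokens become OTHER (assumption)."""
    try:
        return RefuseReason(token.strip())
    except ValueError:
        return RefuseReason.OTHER


def was_rejected(reason: RefuseReason) -> bool:
    """The value "none" means the operation was not rejected."""
    return reason is not RefuseReason.NONE
```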

Example 1 (positive send-rating response):

|Explanation |Information sent () |

|HTTP response header |HTTP/1.1 200 OK |

| |Content-Length: 569 |

| |Content-Type: application/xml |

| |Server: Select 1.0 |

| |Date: 7 July 1999 19:58:23 +0200 |

|A blank line to indicate the end of the HTTP header | |

|Identifies that this is in XML format | |

|References the Resource Type Declaration (DTD) file ||

|specifying the syntax for this XML resource. | |

|Accepted is default. | |

Example 2 (negative send-rating response):

|Explanation |Information sent () |

|HTTP response header. |HTTP/1.1 200 OK |

| |Content-Length: 569 |

| |Content-Type: application/xml |

| |Server: Select 1.0 |

| |Date: 7 July 1999 19:58:23 +0200 |

|A blank line to indicate the end of the HTTP header. | |

|Identifies that this is in XML format. | |

|References the Resource Type Declaration (DTD) file ||

|specifying the syntax for this XML resource. | |

|All ratings were not accepted. | |

|Rating-label rejected, this server does not accept ratings| |

|End of send-rating-response. | |

10.7 Set-Profile

Summary: The set-profile operation can be used for a rater to register him/herself (for services which allow this) and can be used by administrators to register raters (for services which do not allow self-registration). It can also be used to modify existing registrations.

Issues: The format of interest-profile is not specified. The format of reward-account is not specified.

Access control: The profile of a person is not modifiable by other people, only by that person him/herself, by an agent for that person, or by certain certified SELECT processes, which will not divulge the profile to other people. A SELECT administrator may also assume super-user privileges and perform this operation on anyone.

Input data: User identification and some profile attributes to be set or changed.

Output data: Accepted or rejected.

Base protocol: XML

10.7.1 Transmit format (set-profile):

The set-profile operation is an HTTP POST operation, whose body is an XML resource containing the profile, sent to the profiles cgi-script in the server for this particular rating service. Example: “”.

Note that a user, who is registered in more than one rating service, has a separate profile and a separate cookie for each of them.

|Explanation |Information sent |

|Connect to the SELECT server |POST /v1.0/general/id/set-profile HTTP/1.1 |

|To the HTTP server "sel.nu" port 80. |Host: sel.nu |

|Only files in the format application/xml are accepted. |Accept: application/xml |

|Request is sent from a client which is, or simulates, Netscape 4.5. |User-Agent: Mozilla/4.5 |

|The format of the query is XML. |Content-Type: Application/xml |

|This user has connected to this server before, and a cookie identifies the session. |Cookie: session="1234567890123456" |

The body of the operation is an XML [12], [13] resource. The XML Resource Type Declaration for this XML resource is:

|Explanation |Format of information sent |

| |() |

|Reference to data structure defined in a separate DTD file. | |

|The set-profile consists of a profile plus two attributes. | |

|Start of XML attribute list. | |

|Profile is taken from the external ENTITY declared in the first row. For further information, see section 10.7.2. |%profile; |

Note: Not all attributes of a user profile are settable by ordinary users (example: no-of-docs-rated). Such attributes should not be included when setting a profile.

A pseudonym should include the domain name of the SELECT server. Thus, if a user wants the pseudonym foobar, the user should request the pseudonym foobar@sel.nu when connecting to any of the SELECT servers at sel.nu. Note that this means that the same pseudonym is not allowed in more than one SELECT service if all the services are on the same server. The SELECT server must check suggested pseudonyms in a database which is common to all SELECT services on a particular host.
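The pseudonym rule can be sketched as follows. The class and function names are hypothetical, the in-memory set merely stands in for the database common to all SELECT services on a host, and replacing any domain part supplied by the user with the server's own domain is an assumption of this sketch.

```python
def qualify_pseudonym(requested: str, server_domain: str) -> str:
    """Return the pseudonym with the server's domain name appended."""
    local = requested.split("@", 1)[0].strip()
    if not local:
        raise ValueError("empty pseudonym")
    return f"{local}@{server_domain}"


class PseudonymRegistry:
    """One registry shared by every SELECT service on a host."""

    def __init__(self, server_domain: str) -> None:
        self.server_domain = server_domain
        self._taken = set()  # stand-in for the host-wide database

    def register(self, requested: str) -> str:
        """Reserve a pseudonym, or raise ValueError if it is in use."""
        full = qualify_pseudonym(requested, self.server_domain)
        if full in self._taken:
            raise ValueError(f"pseudonym already in use: {full}")
        self._taken.add(full)
        return full
```

Because all services on the host share one registry, a second request for foobar is refused regardless of which service receives it.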

10.7.2 The XML DTD for the user profile

Profile is an XML [12], [13] resource. The XML Resource Type Declaration for profile:

|Explanation |Format of information sent () |

| | |

| | |

|Language code of languages understood by this user in priority| |

|order. May be repeated once for every language. Language codes| |

|are taken from RFC 1766 and ISO 639. | |

|Keywords specified by this user to identify his/her interests.| |

|Keywords automatically derived by observation of this user to | |

|identify his/her interests. | |

|This syntax is preliminary. We may assign a more complex | |

|syntax to this later on, with defined subelements and | |

|structure like a set of instructions in some filtering | |

|language. | |

|List of previous queries made by this user. May influence | |

|filtering procedure. | |

|To be defined. | |

Example (set-profile):

|Explanation |Information sent () |

|HTTP header |POST /v1.0/general/id/set-profile HTTP/1.1 |

| |Host: sel.nu |

| |Accept: application/xml |

| |User-Agent: Mozilla/4.5 |

| |Content-Type: Application/xml |

| |Cookie: session="012345678901234354" |

|A blank line to mark the end of the HTTP header | |

|Identifies that this is in XML format | |

|References the Resource Type Declaration (DTD) file| |

|Setting the profile for someone else. | |

|Start of the profile to be set. | |

|Embedded elements |standards |

| |computers |

| |fiction |

| |psychiatry |

| | Do not filter away any document containing "IETF" |

|End of set-profile | |

10.7.3 Response format (set-profile):

|Explanation |Format of information sent () |

|The evaluation are returned, one rating service at a time. | |

|References the Resource Type Declaration (DTD) file ||

|specifying the syntax for this XML resource. | |

|The set-profile was accepted. | |

Example 2 (negative set-profile response):

|Explanation |Information sent () |

|HTTP response header. |HTTP/1.1 200 OK |

| |Content-Length: 569 |

| |Content-Type: application/xml |

| |Server: Select 1.0 |

| |Date: 7 July 1999 19:58:23 +0200 |

|A blank line to indicate the end of the HTTP header. | |

|Identifies that this is in XML format | |

|References the Resource Type Declaration (DTD) file ||

|specifying the syntax for this XML resource. | |

|The set-profile was not fully accepted. | |

10.8 Get-Profile

Summary: The get-profile operation can be used to get the profile settings for a particular user in a particular SELECT server. It can be used to retrieve a profile and then, to the extent this is allowed, send in a modified profile using the set-profile operation. Filtering agents may use get-profile to get information used in the filtering for a certain user. ML algorithms may automatically modify a user's profile, and the profile may indicate limits on what ML algorithms may do to it. (Example: “ML may not filter out any articles in newsgroup X, since it is very important to me”.)

Issues: Date, caching!

Access control: The profile of a person is not accessible by other people, only by that person him/herself, by an agent for that person, or by certain certified SELECT processes, which will not divulge the profile to other people. A SELECT administrator may also assume super-user privileges and perform this operation on anyone.

Input data: Identification of the user or search-info for the user, whose profile is wanted.

Output data: The profile of this user, or a rejection error.

Base protocol: application/x-www-form-urlencoded for the request, XML for the response.

10.8.1 Query format (get-profile):

The get-profile operation is an HTTP GET operation, with an HTML form

[pic]

The HTML behind this form might be:

SELECT Login

Get SELECT User Info

Fill in either an e-mail address or a search string:

The e-mail address of the user:

Search string:

Response format: HTML, XML

Example: “”.

Example (get-profile):

|Explanation |Information sent |

|Connect to the SELECT server |GET /v1.0/general/id/profiles?raterid=jpalme HTTP/1.1 |

|To the HTTP server "sel.nu" port 80 |Host: sel.nu |

|Only files in the format application/xml are accepted. |Accept: application/xml |

|Request is sent from a client which is, or simulates, Netscape 4.5. |User-Agent: Mozilla/4.5 |
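The query string in this example follows the application/x-www-form-urlencoded rules named as the base protocol. A minimal sketch of building such a URL is given below; the host, path and the raterid parameter come from the example, while the function name is hypothetical.

```python
from urllib.parse import urlencode

# Host and path from the get-profile example.
PROFILES_URL = "http://sel.nu/v1.0/general/id/profiles"


def get_profile_url(raterid: str) -> str:
    """Build the get-profile URL with a form-urlencoded query string."""
    return f"{PROFILES_URL}?{urlencode({'raterid': raterid})}"
```

Using urlencode rather than plain string concatenation ensures that characters such as spaces or "&" in the identifier are escaped correctly.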

10.8.2 Response format (get-profile):

|Explanation |Format of information sent () |

|Reference to data structure defined in a separate | |

|The evaluation are returned, one rating service at | |

|a time. | |

| | |

|Profile is taken from the external ENTITY declared in the first row. For further information, see section 10.7.2. The profile may be incomplete if only some attributes are retrievable for this requestor (that is, if you get the profile of someone other than yourself). |%profile; |

| | |

| | |

Example 1 (positive get-profile response):

|Explanation |Information sent () |

|HTTP response header. |HTTP/1.1 200 OK |

| |Content-Length: 569 |

| |Content-Type: application/xml |

| |Server: Select 1.0 |

| |Date: 7 July 1999 19:58:23 +0200 |

|A blank line to indicate the end of the HTTP | |

|header. | |

|Identifies that this is in XML format. | |

|References the Resource Type Declaration (DTD) file| |

|specifying the syntax for this XML resource. | |

|The get-profile was accepted. | |

|Start of the profile returned. | |

|Embedded elements |standards |

| |computers |

| |fiction |

| |psychiatry |

| | Do not filter away any document containing "IETF" |

|End of get-profile | |

Example 2 (negative get-profile response):

|Explanation |Information sent () |

|HTTP response header. |HTTP/1.1 200 OK |

| |Content-Length: 569 |

| |Content-Type: application/xml |

| |Server: Select 1.0 |

| |Date: 17 August 1999 12:22:46 +0200 |

|A blank line to indicate the end of the HTTP header. | |

|Identifies that this is in XML format | |

|References the Resource Type Declaration (DTD) file | |

|The get-profile did not succeed. | |

| | |

| |You cannot do this without first logging in. |

| | |

10.9 Login

Summary: The login operation is used to identify a user, and cause a cookie value to be set, which allows this user to perform certain access-controlled operations during the validity time of this cookie.

Standard NNTP authentication is described in Appendix D: NNTP Authentication on page 61.

Access control: E-mail address and password. Is this needed also if you only plan to send in pseudonymous ratings?

Input data: User identification by either e-mail address or pseudonym plus password.

Output data: Acceptance or rejection.

Base protocol: application/x-www-form-urlencoded for the request, HTML or XML for the response.

10.9.1 Query format (login):

The same as if the user has filled in the following HTML form:

[pic]

The HTML of which might be:

SELECT Login

SELECT Login

Your e-mail address or pseudonym:

Your password:

Response format: HTML, XML

Example (login):

|Explanation |Information sent |

|Connect to the SELECT server |GET /v1.0/general/id/login?e-mail-address=jpalme@dsv.su.se&password=select HTTP/1.1 |

|To the HTTP server "sel.nu" port 80 |Host: sel.nu |

|Only files in the format application/xml are accepted. |Accept: application/xml |

|Request is sent from a client which is, or simulates, Netscape 4.5. |User-Agent: Mozilla/4.5 |

10.9.2 Response format (login):

|Explanation |Format of information sent () |

|Response to a login | |

|If ok | |

Example (login response):

|Explanation |Information sent () |

|HTTP header |HTTP/1.1 200 OK |

| |Date: Sun, 25 Jul 1999 13:32:18 +0200 |

| |Server: Apache/1.2.4 |

| |Last-Modified: Sun, 25 Jul 1999 13:32:18 +0200 |

| |ETag: "437e5-98-3531f2e3" |

| |Content-Length: 152 |

| |Accept-Ranges: bytes |

| |Connection: close |

| |Content-Type: application/xml |

|Set the cookie. |Set-cookie: session="1234567890123456";Domain="sel.nu";Path="/v1.0/general/id/" |

|A blank line to mark the end of the HTTP | |

|header | |

|Identifies that this is in XML format | |

|References the Resource Type Declaration | |

|XML resource. | |

|Start and end of login-response for a | |
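A client can extract the session value and its scope from the Set-cookie header of the login response roughly as follows. This is a sketch using Python's standard http.cookies module; the function name is hypothetical.

```python
from http.cookies import SimpleCookie


def parse_session_cookie(set_cookie_value: str) -> dict:
    """Parse the value of a Set-cookie header carrying the session cookie."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    morsel = jar["session"]
    return {
        "session": morsel.value,       # the session identifier
        "domain": morsel["domain"],    # host scope of the cookie
        "path": morsel["path"],        # URL subtree the cookie is sent to
        "max-age": morsel["max-age"],  # empty string when not present
    }
```

The same function can be applied to the logout response in section 10.10.2, where Max-age="0" tells the client to discard the cookie.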

10.10 Logout

Summary: The logout operation removes the cookie, which gave the user privileges to perform certain commands in logged-in state.

Issues:

Access control:

Input data:

Output data:

Base protocol:

10.10.1 Query format (logout):

The same as if a user clicks on an HTML link:

Log out

Example (logout):

|Explanation |Information sent () |

|An ordinary HTTP connection |GET /v1.0/general/id/logout HTTP/1.1 |

|To the HTTP server "sel.nu" port 80. |Host: sel.nu |

|Only files in the format application/xml are accepted. |Accept: application/xml |

|Request is sent from a client which is, or simulates, Netscape 4.5. |User-Agent: Mozilla/4.5 |

|This user has connected to this server before, and a cookie identifies the session. |Cookie: session="1234567890123456" |

10.10.2 Response format (logout):

|Explanation |Format of information sent () |

|The evaluations are returned, one rating service at a time. | |

|Whether all the rating labels were accepted, or some of | |

Example (logout response):

|Explanation |Information sent () |

|HTTP header |HTTP/1.1 200 OK |

| |Date: Sun, 25 Jul 1999 13:32:18 +0200 |

| |Server: Apache/1.2.4 |

| |Last-Modified: Sun, 25 Jul 1999 13:32:18 +0200 |

| |ETag: "437e5-98-3531f2e3" |

| |Content-Length: 152 |

| |Accept-Ranges: bytes |

| |Connection: close |

| |Content-Type: application/xml |

|Max-age="0" resets the cookie. |Set-cookie: session="1234567890123456";Domain="sel.nu";Path="/v1.0/general/id/";Max-age="0" |

|A blank line to mark the end of the HTTP header | |

|Identifies that this is in XML format | |

|References the Resource Type Declaration (DTD) file| |

|Start and end of login-response for a rejected login. | |
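
The cookie handling in the login and logout responses can be sketched as follows. This is a minimal illustration in Python and not part of the SELECT specification: the helper names `parse_set_cookie` and `is_logout` are hypothetical, and the parser only handles the simple `name="value";Attr="value"` form used in the examples.

```python
def parse_set_cookie(header_value: str) -> dict:
    """Split a Set-cookie value into a dict of lower-cased names and unquoted values."""
    fields = {}
    for part in header_value.split(";"):
        name, _, value = part.strip().partition("=")
        if name:
            fields[name.lower()] = value.strip().strip('"')
    return fields

def is_logout(fields: dict) -> bool:
    """A Max-age of 0 tells the client to discard the session cookie."""
    return fields.get("max-age") == "0"

login = parse_set_cookie(
    'session="1234567890123456";Domain="sel.nu";Path="/v1.0/general/id/"'
)
logout = parse_set_cookie(
    'session="1234567890123456";Domain="sel.nu";Path="/v1.0/general/id/";Max-age="0"'
)

# Header a client would send on later requests while logged in.
cookie_header = f'Cookie: session="{login["session"]}"'
print(cookie_header, is_logout(login), is_logout(logout))
```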

10.11 Get-Atomic-Ratings

Summary: The get-atomic-ratings operation retrieves atomic ratings done by one or more named raters on one or more resources. It can be used by a user agent to find out if this user has already rated this resource. It might also be used in peer rating, where person A wants to find items rated highly by named individuals B and C.

Access control: The ratings made by a certain user can only be seen by that user, i.e. after logging in as that user. A person may, however, specify in his/her personal profile that other people can see his/her ratings. Get-atomic-ratings on a list of people may only be done in the following cases: (i) all the people have specified in their profile that their ratings may be seen by other people, or (ii) the requestor is a certified filtering agent which will not divulge the personal ratings to a person, or (iii) the list of users is larger than ten; in this case, the atomic ratings are returned without identification of who made which rating.
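
The access rules above can be sketched as a small decision function. This is only an illustration under stated assumptions: the `Rater` class, the function name and the three result strings are hypothetical, and the order in which the cases are tested is not prescribed by this specification.

```python
from dataclasses import dataclass

@dataclass
class Rater:
    user_id: str
    ratings_public: bool  # profile setting: other people may see my ratings

def atomic_ratings_access(requestor_id, raters, requestor_is_certified_agent=False):
    """Return "identified", "anonymous" or "denied" for a get-atomic-ratings request."""
    # A user can always see his or her own ratings.
    if all(r.user_id == requestor_id for r in raters):
        return "identified"
    # (i) every listed person has made his or her ratings public, or
    # (ii) the requestor is a certified filtering agent.
    # Raters who are the requestor count as visible (an assumption).
    if requestor_is_certified_agent or all(
        r.ratings_public or r.user_id == requestor_id for r in raters
    ):
        return "identified"
    # (iii) more than ten people: ratings are returned without rater identity.
    if len(raters) > 10:
        return "anonymous"
    return "denied"

alice = Rater("alice", ratings_public=False)
bob = Rater("bob", ratings_public=True)
print(atomic_ratings_access("carol", [alice, bob]))
```

The sketch assumes a non-empty list of raters, as required by the input data description.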

Input data: A URI for the rated resource, and a list of one or more people, whose atomic ratings on this resource are wanted.

Output data: A list of atomic ratings, with or without identification of who made them, or an error code.

Base protocol: XML.

10.11.1 Query format (get-atomic-ratings):

The get-atomic-ratings query is an HTTP POST operation, whose body is an XML resource containing the query, sent to the SELECT server at the path given in the request line below.

Note: You must be logged in to perform this operation, even if you are only going to retrieve anonymous ratings.

|Explanation |Information sent |

|Connect to the SELECT server |POST /v1.0/general/id/get-ratings HTTP/1.1 |

|To the HTTP server "sel.nu" port 80. |Host: sel.nu |

|Only files in the format application/xml are accepted. |Accept: application/xml |

|Request is sent from a client which is, or simulates, Netscape 4.5. |User-Agent: Mozilla/4.5 |

|The format of the query is XML. |Content-Type: Application/xml |

|This user has connected to this server before, and a cookie identifies the session. |Cookie: session="1234567890123456" |
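
For illustration, a client might assemble such a POST request as in the Python sketch below. The function name `build_post_request` and the placeholder body are hypothetical; the real body is the XML query defined for this operation. The `Content-Length` header is not shown in the example table and is added here as an assumption, since an HTTP/1.1 server needs it (or chunked encoding) to find the end of the body.

```python
def build_post_request(path: str, body: bytes, session: str, host: str = "sel.nu") -> bytes:
    """Assemble an HTTP/1.1 POST carrying an XML body and the session cookie."""
    headers = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        "Accept: application/xml",
        "User-Agent: Mozilla/4.5",
        "Content-Type: application/xml",
        # Not in the example table; added so the server can delimit the body.
        f"Content-Length: {len(body)}",
        f'Cookie: session="{session}"',
    ]
    return "\r\n".join(headers).encode("ascii") + b"\r\n\r\n" + body

# Placeholder body only; it stands in for the real XML query document.
placeholder_body = b'<?xml version="1.0"?>'
raw = build_post_request("/v1.0/general/id/get-ratings", placeholder_body, "1234567890123456")
print(raw.decode("ascii"))
```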

The body of the query is an XML [12], [13] resource. The XML Resource Type Declaration for this XML resource is:

|Explanation |Format of information sent |

| |() |

| | |

|Start of attribute list for get-atomic-ratings | |

|Each URI to be evaluated is a free text field containing the URI of the resource to be evaluated. | |

|Start of attribute list for rater. | |

|Identify whose ratings are requested. | |

|Start of attribute list for rater. Omitted if you want all ratings, made by | |

|List of requested rating descriptors. If no list is specified, this means that all available ratings are requested. | |

|Start of attribute list for rater | |

Example of a body (get-atomic-ratings):

|Explanation |Information sent () |

|Start | |

| | |

| | |

|List of locations for which ratings are retrieved | |

|Raters, whose ratings are requested | |

| | |

|Which rating labels are requested. | |

| | |

|End of get-atomic-ratings | |

10.11.2 Response format (get-atomic-ratings-response):

The response is an XML [12], [13] document. The XML Resource Type Declaration for this XML resource is:

|Explanation |Format of information sent () |

|Reference to data structure defined in a separate DTD file. | |

|Import DTD from separate DTD file atomic-rating.dtd. |%atomic-rating; |

|The evaluations are returned, one rating service at a time. | |

|If only some of the settings were accepted, here is a list of those not accepted. The #PCDATA can contain a human-readable description of the refusal reason in the preferred language of the user doing the registration (not always the language of the user being registered). | |

|XML attributes for refuse-reason. | |

Example 1 (get-atomic-ratings-response):

Note: This response is sent in the case where the ISCN server had no ratings for any of the resources requested, so that only ratings from the SELECT general ratings server are returned.

|Explanation |Information sent |

|HTTP response header |HTTP/1.1 200 OK |

| |Content-Length: 569 |

| |Content-Type: application/xml |

| |Server: Select 1.0 |

| |Date: 7 July 1999 19:58:23 +0200 |

|A blank line to indicate the end of the HTTP header | |

|Identifies that this is in XML format | |

|References the Resource Type Declaration (DTD) file specifying the syntax for this XML resource. | |

|Start of get-atomic-ratings-response for one resource. | |

|First rating returned | |

|First rating descriptor | |

|Second rating descriptor | |

|Third rating descriptor | |

|End of data | |

|Second rating returned | |

|First rating descriptor | |

|Second rating descriptor | |

|Third rating descriptor | |

|End of data | |

| | |

Example 2 (get-atomic-ratings-response rejection):

|Explanation |Information sent |

|HTTP response header |HTTP/1.1 200 OK |

| |Content-Length: 569 |

| |Content-Type: application/xml |

| |Server: Select 1.0 |

| |Date: 7 July 1999 19:58:23 +0200 |

|A blank line to indicate the end of the HTTP header | |

|Identifies that this is in XML format | |

|References the Resource Type Declaration (DTD) file specifying | |

|Start of server list | |

|Start of ratings for one resource to be rated | |

|End of evaluate-response report and end of file | |

10.12 Simple-Search Operation

Summary: Find web pages satisfying a query and which are highly rated.

Access control: No access control for basic rating. Rating based on a particular user's interests and values may be available only if preceded by a login operation for this particular user.

Input data: The user specifies the query by filling in a query form. When the personalised checkbox is unchecked, simple search is always made on the general-rating derived descriptor. When the personalised checkbox is checked, the rating is made using a default personal-rating derived descriptor, which returns different values for each user. If the user is unknown, a personalised search will return an error message.

Output data: An HTML page or an XML document with a list of found pages sorted according to rating and relevance.

Base protocol: HTML application/x-www-form-urlencoded for the request, and HTML or XML for the response.

10.12.1 Query format (simple-search-query):

The simple-search query is an HTTP GET operation with the query after "?" in the URI. The query format is the same as that used by EuroSeek, and is further documented in [16] and in Appendix A: Summary of the Euroseek Search Language (Taken From [16]) on page 55.

The query is the same as would be sent with the following HTML form (which is mainly a simplified version of the Euroseek search form, with “context” and “only unseen” added):

[pic]

If the user checks “Use my interest profile” or “Use my keywords”, then that user can, but need not, fill in a “Search query”. If the user does not fill in a “Search query” but checks “Only unseen” together with “Use my interest profile” or “Use my keywords”, the result is a search for highly rated new information that this user has not yet seen. Note that by checking “News”, a search for news articles is done, and the result may be presented on the web even though the rating of these articles was done through a newsreader and not through a web interface.

“Peer search” means a search in which higher weight is given to ratings provided by people with interests and values similar to your own.

HTML code behind the form above:

SELECT Search Query

Search:

Internet

Select directory

News

Only unseen

Search query:

Limit to Country:

Limit to Language:

Any

Cymraeg

Dansk

Deutsch

English

Español

Français

Italiano

Magyar

Nederlands

Norsk

Português

Suomi

Svenska

Result format:

HTML

XML

Peer search

Max no of docs:

Context:

general

business

leisure

shopping

research

politics

Use my interest profile

Use my keywords

Use only manual keywords and profile

Example of query string:

(filter OR "SELECT rating") AND EU&domain=world&language=world

which with URI encoding will become:

search=internet&search=select&search=news&query=%28filter+OR+%22SELECT+rating%22%29+AND+EU&Search=Search&textfield=&lang=world&resultformat=html&context=yes&business=yes&leisure=yes&shopping=yes&research=yes&politics=yes
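
The encoding step can be reproduced with a standard form-encoding routine. The Python sketch below is an illustration only; the field names and values are read off the encoded example above, and the variable names are hypothetical.

```python
from urllib.parse import urlencode

# Form fields in the order of the encoded example; the repeated "search"
# field corresponds to several ticked checkboxes.
form_fields = [
    ("search", "internet"), ("search", "select"), ("search", "news"),
    ("query", '(filter OR "SELECT rating") AND EU'),
    ("Search", "Search"), ("textfield", ""), ("lang", "world"),
    ("resultformat", "html"), ("context", "yes"), ("business", "yes"),
    ("leisure", "yes"), ("shopping", "yes"), ("research", "yes"),
    ("politics", "yes"),
]

# urlencode applies application/x-www-form-urlencoded rules:
# spaces become "+", and characters such as ( ) " are percent-encoded.
encoded = urlencode(form_fields)
print(encoded)
```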

Example of a simple-search query

Query is sent to the following URL for the SELECT general service:

|Explanation |Information sent |

|Connect to the SELECT server. "format" can be either "xml" or "html" and specifies in which format the response is to be delivered. |GET /v1.0/general/simple-search?search=internet&search=select&search=news&query=%28filter+OR+%22SELECT+rating%22%29+AND+EU&Search=Search&textfield=&lang=world&resultformat=html&context=yes&business=yes&leisure=yes&shopping=yes&research=yes&politics=yes HTTP/1.1 |

|To the HTTP server "sel.nu" port 80. |Host: sel.nu |

|Only files in the format application/xml are accepted. |Accept: application/xml |

|Request is sent from a client which is, or simulates, Netscape 4.5. |User-Agent: Mozilla/4.5 |

|This user has connected to this server before, and a cookie identifies the session. |Cookie: session="1234567890123456" |

10.12.2 Response format (simple-search-response):

The simple-search response can be in either XML or HTML format depending on the request. If no format was specified in the request, HTML is the default format. The response contains a list of resources matching the query and sorted by rating-value.
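
The two rules in this paragraph (HTML as the default format, results sorted by rating value) might look as follows on the server side. This Python sketch rests on several assumptions: the function names and the dictionary layout of a hit are hypothetical, the sort is taken to be highest rating first, and an unrecognised format value is treated like a missing one.

```python
def choose_format(resultformat=None):
    """HTML is the default when the request names no (known) format."""
    return resultformat if resultformat in ("html", "xml") else "html"

def order_results(hits):
    """Sort found resources by rating value, best first (assumed direction)."""
    return sorted(hits, key=lambda hit: hit["rating"], reverse=True)

hits = [
    {"uri": "http://example.org/a", "rating": 3.5},
    {"uri": "http://example.org/b", "rating": 4.8},
    {"uri": "http://example.org/c", "rating": 1.0},
]
ordered = order_results(hits)
print(choose_format(), [hit["uri"] for hit in ordered])
```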

The XML Resource Type Declaration for this XML resource is:

|Explanation |Format of information sent () |

|The evaluations are returned, one rating service at a time. | |

| | |

| | |

|If only some of the rating values were rejected, this element is used to list the rejected rating values. The #PCDATA contains the summary or keywords or some other description of the found resource. | |

| | |

Example 1 (positive simple-search response):

|Explanation |Information sent |

|HTTP response header |HTTP/1.1 200 OK |

| |Content-Length: 569 |

| |Content-Type: application/xml |

| |Server: Select 1.0 |

| |Date: 7 July 1999 19:58:23 +0200 |

|A blank line to indicate the end of the HTTP header | |

|Identifies that this is in XML format | |

|References the Resource Type Declaration (DTD) file | |

| | |

| | |

| |An overview of flowers found in Kenya. |

| | |

| | |

| |An overview of flowers found in Kiwi. |

| | |

| | |

Example 2 (negative simple-search response):

|Explanation |Information sent |

|HTTP response header. |HTTP/1.1 200 OK |

| |Content-Length: 569 |

| |Content-Type: application/xml |

| |Server: Select 1.0 |

| |Date: 7 July 1999 19:58:23 +0200 |

|A blank line to indicate the end of the HTTP header. | |

|Identifies that this is in XML format. | |

|References the Resource Type Declaration (DTD) file | |

|All ratings were not accepted. | |

|Rating-label rejected, this server does not accept ratings in the leisure context. |You are not allowed to make this search. |

|End of simple-search-response. | |

10.13 Advanced-Search Operation (Not yet ready)

Summary: Find web pages satisfying a query and which are highly rated.

Access control: No access control for basic rating. Rating based on a particular user's interests and values may be available only if preceded by a login operation for this particular user.

Input data: Some general-purpose search format, based on SQL or some other search language. The advanced search should in particular accommodate the needs of other modules.

Required functionality:

1. It should be possible to search on all derived and atomic ratings. Example of use: The NLP modules need a way of getting a list of which documents are to be rated by the NLP modules. Can this be done through a variant of the advanced-search operation?

2. It should be possible to retrieve all ratings on resources with a particular author, including ratings with a particular author sent to a particular newsgroup.

Output data: An HTML page or an XML document with a list of found pages sorted according to rating and relevance.

Base protocol: HTML application/x-www-form-urlencoded for the request, and HTML or XML for the response.

10.13.1 Query format (advanced-search-query):

The advanced-search query is an HTTP POST operation, whose body is an XML resource containing the profile, sent to the profiles cgi-script in the server for this particular rating service. Example: “”.

|Explanation |Information sent |

|Connect to the SELECT server |POST /v1.0/general/id/search HTTP/1.1 |

|To the HTTP server "sel.nu" port 80. |Host: sel.nu |

|Only files in the format application/xml are accepted. |Accept: application/xml |

|Request is sent from a client which is, or simulates, Netscape 4.5. |User-Agent: Mozilla/4.5 |

|The format of the query is XML. |Content-Type: Application/xml |

|This user has connected to this server before, and a cookie identifies the session. |Cookie: session="1234567890123456" |

The body of the operation is an XML [12], [13] resource. The XML Resource Type Declaration for this XML resource is:

|Explanation |Format of information sent () |

|Not yet ready | |

Example of an advanced-search query

|Explanation |Format of information sent () |

|HTTP header |POST /v1.0/general/id/advanced-search HTTP/1.1 |

| |Host: sel.nu |

| |Accept: application/xml |

| |User-Agent: Mozilla/4.5 |

| |Content-Type: Application/xml |

| |Cookie: session="012345678901234354" |

|A blank line to mark the end of the HTTP header | |

|Identifies that this is in XML format | |

|References the Resource Type Declaration (DTD) file specifying the syntax for this XML resource. | |

|Not yet ready | |

10.13.2 Response format (advanced-search-response):

The response format for the advanced-search is the same as the response format for the simple search, described in section 10.12.2.

10.14 Evaluate Operation

Summary: Get the ratings for a list of URIs.

Access control: No access control for basic rating. Rating based on a particular user's interests and values may be available only if preceded by a login operation for this particular user.

Input data: A list of URIs and a list of services. For each service, a list of derived rating labels is given. Note that only derived ratings, not atomic ratings, can be found with this operation. If N URIs, M services and V label types are listed, then N x M x V rating labels are returned.
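
The label count can be checked by enumerating the combinations. The Python sketch below is an illustration only: the function name and the example URIs are hypothetical, while the service and label names are those used in the evaluate-query example. It assumes that the same V label types are requested from every service and that a label exists for every combination, so the product is best read as an upper bound.

```python
from itertools import product

def expected_label_keys(uris, services, label_types):
    """Enumerate every (URI, service, label type) combination in the query."""
    return list(product(uris, services, label_types))

keys = expected_label_keys(
    ["http://example.org/a", "http://example.org/b"],  # N = 2 URIs
    ["general", "iscn"],                               # M = 2 services
    [                                                  # V = 3 label types
        "select-reader-quality-rating",
        "select-reader-interest-rating",
        "keywords",
    ],
)
print(len(keys))  # N x M x V = 2 x 2 x 3 = 12
```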

Output data: A list of rating labels.

Base protocol: HTTP and XML.

Issue: Is a “streaming” version of this operation needed? By streaming is meant a version in which the URIs to process are sent to the server in parallel with the server returning responses, so that responses for the first URIs are returned before the last URIs have been sent to the server for evaluation.

10.14.1 Query format (evaluate-query):

The evaluate query is an HTTP POST operation, whose body is an XML resource containing the query, sent to the SELECT server at the path given in the request line below.

|Explanation |Information sent |

|Connect to the SELECT server |POST /v1.0/general/evaluator HTTP/1.1 |

|To the HTTP server "sel.nu" port 80. |Host: sel.nu |

|Only files in the format application/xml are accepted. |Accept: application/xml |

|Request is sent from a client which is, or simulates, Netscape 4.5. |User-Agent: Mozilla/4.5 |

|The format of the query is XML. |Content-Type: Application/xml |

|This user has connected to this server before, and a cookie identifies the session. |Cookie: session="1234567890123456" |

The body of the query is an XML [12], [13] resource. The XML Resource Type Declaration for this XML resource is:

|Explanation |Format of information sent () |

|A list of locations to be evaluated, followed by a list of services to evaluate these locations. The returned response will be L x S rating labels, if L is the number of locations and S the number of services. | |

|Start of attribute list for evaluate-query. | |

|Each URI to be evaluated is a free text field containing the URI of the resource to be evaluated. | |

|Each service description is a free text field containing the URI of the service | |

|Start of attribute list for service | |

|List of requested descriptors. If no list is specified, this means that all available descriptors are requested. Only derived ratings can be requested, not atomic ratings. | |

|Start of attribute list for label | |

|Instead of listing the labels to be retrieved, it is possible to just specify the name of a collection, to retrieve the labels specified in this collection. The collection must be a collection specified in the service-description of the service used. | |

| | |

Example of a body (evaluate-query):

|Explanation |Information sent () |

|Start | |

| | |

| | |

|List of locations to be evaluated | |

| | |

|List of services whose evaluations are | |

|descriptors requested are listed. For the |select-reader-quality-rating |

|general service, reader-quality and |select-reader-interest-rating |

|reader-interest-ratings are requested, for |keywords |

|the iscn service, all available descriptors| |

|are requested. | |

| | |

| | |

| | |

|End of evaluate-query. | |

10.14.2 Response format (evaluate-response):

The response is an XML [12], [13] document. The XML Resource Type Declaration for this XML resource is:

|Explanation |Format of information sent () |

|The evaluations are returned, one rating service at a time. | |

|Start of list of attributes for the | |

| | |

| | |

| | |

|URI of the rated resource. | |

|Start of a ratings label. Note: If no label is available, then no labels are specified. | |

|Start of attribute list for label | |

Example 1 (evaluate response):

Note: This response is sent in the case where the ISCN server had no ratings for any of the resources requested, so that only ratings from the SELECT general ratings server are returned.

|Explanation |Information sent |

|HTTP response header |HTTP/1.1 200 OK |

| |Content-Length: 569 |

| |Content-Type: application/xml |

| |Server: Select 1.0 |

| |Date: 7 July 1999 19:58:23 +0200 |

|A blank line to indicate the end of the HTTP header | |

|Identifies that this is in XML format | |

|References the Resource Type Declaration (DTD) file specifying the syntax for this XML resource. | |

|Start of evaluate-response for one resource. | |

|Start of ratings for one resource to be rated | |

|One rating descriptor value for this resource | |

|Another rating descriptor value | |

|Another rating descriptor value | |

|One rating descriptor value for this resource | |

|End of list of all labels for this resource. | |

|Start of ratings for one resource to be rated | |

|One rating descriptor value | |

|One rating descriptor value | |

|End of list of all labels for this resource. | |

|End of evaluate-response report and end of file | |

Example 2 (evaluate response rejection):

|Explanation |Information sent |

|HTTP response header |HTTP/1.1 200 OK |

| |Content-Length: 569 |

| |Content-Type: application/xml |

| |Server: Select 1.0 |

| |Date: 7 July 1999 19:58:23 +0200 |

|A blank line to indicate the end of the HTTP header | |

|Identifies that this is in XML format | |

|References the Resource Type Declaration (DTD) file specifying | |

|Start of server list | |

|Start of ratings for one resource to be rated | |

|End of evaluate-response report and end of file | |

10.15 Exchange-Ratings-Data (not yet ready)

Summary: This operation is used between two SELECT servers, in order to replicate information in their databases.

Issues:

Access control:

Input data:

Output data:

Base protocol:

10.15.1 Query format (exchange-ratings-data):

Example (replicate-ratings):

|Explanation |Information sent |

|Not ready | |

10.15.2 Response format (exchange-ratings-data):

|Explanation |Format of information sent |

| |() |

|Not ready | |

Example (replicate-ratings response):

|Explanation |Information sent |

|Not ready | |

Extensions between protocol and functional specs

Here is a list of possible differences between the protocol specs in this document and the functional and architecture specs [17], [18].

One SELECT server keeps a list of the different SELECT services and their descriptions and URLs.

Each service description can be provided in more than one natural language, and has associated with it a natural language tag.

Each service description has associated with it a URI of the service description. This URI can also be used to identify this particular SELECT service.

Each service description stores the e-mail address of its maintainer.

Each service description stores whether pseudonymous ratings are permitted in this service.

Each service description stores which security level is used and required in that service.

One service can “import” all the rating descriptors of another service, and then add to them.

There is a special kind of rating descriptor called a “derived” descriptor. These descriptors are computed by the server, by combining different people's different ratings on a resource. For each derived descriptor, the method of deriving it is defined.

Each rating has one of the five basic formats “Boolean”, “numerical”, “words”, “text” or “date”.

Each rating has a rating-engine.

Birthyear instead of age.

Automatic and manual keywords in the user profile are separated.

Cookie lifetime is a user-settable number of seconds in the user profile.

Users are identified by their e-mail address. They can also, if they so want, have a pseudonym, or they can have only a pseudonym and no real identifiable identity. Fully anonymous ratings can also be allowed by some services.

Keywords in the user profile will be split into two sets of keywords, those manually set and those implicitly derived.

Format for “interest profile” is not yet defined.

The user can choose whether to get results as a formatted HTML page, or as an XML page which might be further formatted by a local proxy before being displayed to the user.

A Usenet news user can get his/her news server to add additional attributes to article heads, indicating their computed rating. The newsreader can then use these for filtering.

References

|[1] |Rating Services and Rating Systems (and Their Machine Readable Descriptions) |

|[2] |PICS Distribution Label Syntax and Communication Protocols . |

|[3] |RFC 2109 HTTP State Management Mechanism. D. Kristol, L. Montulli. February 1997. |

|[4] |RFC 2068 Hypertext Transfer Protocol -- HTTP/1.1. R. Fielding, J. Gettys, J. Mogul, H. Frystyk, T. Berners-Lee. January 1997. |

| | |

|[5] |RFC 1945 Hypertext Transfer Protocol -- HTTP/1.0. T. Berners-Lee, R. Fielding & H. Frystyk. May 1996. |

| | |

|[6] |HTML 4.0 Specification, W3C Recommendation, by Dave Raggett, Arnaud Le Hors and Ian Jacobs. |

|[7] |RFC 2110 MIME E-mail Encapsulation of Aggregate Documents, such as HTML (MHTML). J. Palme, A. Hopmann. March 1997. |

| | |

|[8] |RFC 2045-2049 Multipurpose Internet Mail Extensions (MIME). N. Freed & N. Borenstein, November 1996. |


|[9] |RFC 821 Simple Mail Transfer Protocol. J. Postel, August 1982. |

|[10] |RFC 822 Standard for the format of ARPA Internet text messages. D. Crocker, August 1982 |

|[11] |RFC 1738 Uniform Resource Locators (URL). T. Berners-Lee et al, December 1994. |

|[12] |Extensible Markup Language (XML) 1.0. W3C REC-xml-19980210, T. Bray, J. Paoli, C.M. Sperberg-McQueen. |

| | |

|[13] |A Technical Introduction to XML, N. Walsh, Oct 1998, |

|[14] |PICS-NG Metadata Model and Label Syntax, O. Lassila, |

|[15] |Resource Description Framework (RDF) Schema Specification, |

|[16] |Euroseek Search Wizard, |

|[17] |SELECT, Telematics Application Programme, RE4008, Deliverable 2.1 Draft, Functional Specifications Report, by Roland Alton-Scheidl and Richard Wheeler. |

|[18] |SELECT System Architecture, by Richard Wheeler |

Appendix A: Summary of the Euroseek Search Language

(Taken From [16])

|Type of search |Operator |Action |Example |

|Simple search |single word |finds a word |tennis |

|Phrase search |"quotation marks " |words that must appear together |"computer games" |

|Search for names |Capital Letters |indicates proper nouns |Charlie Chaplin |

|Combined search, AND |AND |both words searched for |tennis AND sport |

|Combined search, OR |OR |one or the other, or both words searched for |tennis OR sport |

|Exclusion |minus sign - |excludes a word or a phrase |tennis -racket |

|Grouped search |parenthesis ( ) |grouping |(tennis OR hockey) AND sport |

|Field search, host: |host: |finds documents on a host computer |host: |

|URI search |search for an URI |finds a specific site | |

12.1.1 Simple search

The easiest way to search. Just type a word in the Search form and click on the Search button.

Example:

Query: tennis

Response: Documents containing the word tennis.

1. Click on the link 'tennis' above.

2. Click the Search button.

3. Look in your browser for the responses.

4. Make your own Simple search!

12.1.2 Phrase search

Quotation marks are useful when you are searching for words that must appear together. If you leave out the quotation marks, EuroSeek will automatically use the operator AND to find documents with both words in them. (Limited support for longer phrases, e.g. "to be or not to be")

Example:

Query: "computer games"

Response: Documents in which the words "computer games" appear next to each other.

1. Click on the link 'computer games' above.

2. Click the Search button.

3. Look in your browser for the responses.

4. Make your own Phrase search! Type two words within quotation marks, in the Search form and click the Search button.

12.1.3 Search for names

The search is case insensitive, but has an automatic recognition of capitalised names.

Example:

Query: Charlie Chaplin

Response: Documents in which the name Charlie Chaplin is mentioned.

1. Click on the link 'Charlie Chaplin' above.

2. Click the Search button.

3. Look in your browser for the responses.

4. Make your own Search for names! Type a name in the Search form and click the Search button.

12.1.4 Combined search, AND

Use AND to find documents that include both of the search words.

Example:

Query: tennis AND sport

Response: Documents containing both of the words tennis and sport.

1. Click on the link 'tennis AND sport' above.

2. Click the Search button.

3. Look in your browser for the responses.

4. Make your own Combined search with AND! Type two words with AND in between them in the Search form, and click the Search button.

12.1.5 Combined search, OR

Use OR to find documents that include either or both of the search words.

Example:

Query: tennis OR sport

Response: Documents containing either tennis, sport or both.

1. Click on the link 'tennis OR sport' above.

2. Click the Search button.

3. Look in your browser for the responses.

4. Make your own Combined search with OR! Type two words with OR in between them in the Search form, and click the Search button.

12.1.6 Exclusion

Use the - (minus) sign for words that must not be present in the results. Do not type a space between the - (minus) sign and the word that follows.

Example:

Query: tennis -racket

Response: Documents containing the word tennis but not racket.

1. Click on the link 'tennis -racket' above.

2. Click the Search button.

3. Look in your browser for the responses.

4. Make your own Exclusion search! Type any word in the Search form, followed by a space and a - (minus) sign in front of another word which you don't want to be present in your responses. Click the Search button.

12.1.7 Grouped search

You may use the operators AND, OR and - (minus) for combining words into more complex queries. You can also group queries using parentheses.

Example:

Query: (tennis OR hockey) AND sport

Response: Documents containing the word sport and any or both of the words tennis and hockey.

1. Click on the link '(tennis OR hockey)AND sport' above.

2. Click the Search button.

3. Look in your browser for the responses.

4. Make your own Grouped search! Click on the examples below and fill in your own words in the Search form, but don't leave out the parentheses.

More examples:

golf(-clubs)(-course)(-balls)

tennis AND (hockey OR sport)

(tennis AND hockey) -sport

( (tennis AND hockey) -sport) OR golf

12.1.8 Field search, host

Limits the search to a specified host computer.

Example:

Query: host:

Response: Documents under the host:.

1. Click on the link 'host:' above.

2. Click the Search button.

3. Look in your browser for the responses.

4. Make your own Field search with host:! Type host: followed by a web address in the Search form, as in the example above. Click the Search button.

12.1.9 URI search

Finds documents with a specific word or phrase in the URI.

This search is useful when you are looking for a specific site.

Example:

Query:

Response: Documents whose URIs begin with

1. Click on the link '' above.

2. Click the Search button.

3. Look in your browser for the responses.

4. Make your own URI search! Type http:// followed by a web address in the Search form, as in the example above, and click the Search button.

12.1.10 Region

A single European country, or a larger region like Scandinavia or America can be specified.

12.1.11 Domains

Search can be restricted to a certain category of organisation, such as Companies, Universities, Military, Government, Organizations or Network.

12.1.12 Language

Search can be restricted to a certain language.

Appendix B: Query language translation to URI string

query-string = phrase ['+AND+'/'+OR+' phrase]

%28Kapsch+AND+personal%29+OR+Wien
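
The example string above is what standard form encoding produces for the query `(Kapsch AND personal) OR Wien`: spaces around the operators become `+` and the parentheses become `%28` and `%29`. The Python sketch below shows the translation in both directions; the plain-text form of the query is inferred from the encoded example.

```python
from urllib.parse import quote_plus, unquote_plus

query = "(Kapsch AND personal) OR Wien"

# Translate the query language string to its URI form.
encoded = quote_plus(query)
print(encoded)

# Decoding reverses the translation on the server side.
decoded = unquote_plus(encoded)
print(decoded)
```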

Appendix C: A Word macro to extract the XML encodings

Here is a Microsoft Word (Visual Basic for Applications) macro, which will convert the table format of the XML specifications into plain text files.

Sub ConvertTable()
    ActiveDocument.Tables(1).ConvertToText Separator:=wdSeparateByParagraphs
    WordBasic.StartOfDocument
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    With Selection.Find
        .Text = ""
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    WordBasic.StartOfDocument
    With Selection.Find
        .Text = ""
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    WordBasic.StartOfDocument
    With Selection.Find
        .Text = ""
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    WordBasic.StartOfDocument
    Selection.Find.Execute Replace:=wdReplaceAll
    With Selection.Find
        .Text = ""
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
End Sub

Appendix D: NNTP Authentication

The text below is copied from Common NNTP Extensions, draft-ietf-nntpext-imp-01.txt by S. Barber, December 1997.

3.1 AUTHINFO

AUTHINFO is used to inform a server about the identity of

a user of the server. In all cases, clients must provide

this information when requested by the server. Servers are

not required to accept authentication information that is

volunteered by the client. Clients must accommodate servers that

reject any authentication information volunteered by the client.

There are three forms of AUTHINFO in use. The original version,

an NNTP v2 revision called AUTHINFO SIMPLE and a more recent

version which is called AUTHINFO GENERIC.

3.1.1 Original AUTHINFO

AUTHINFO USER username

AUTHINFO PASS password

The original AUTHINFO is used to identify a specific entity

to the server using a simple username/password combination.

It first appeared in the UNIX reference implementation.

When authorization is required, the server will send a 480

response requesting authorization from the client. The

client must enter AUTHINFO USER followed by the username.

Once sent, the server will cache the username and may send

a 381 response requesting the password associated with that

username. Should the server request a password using the 381

response, the client must enter AUTHINFO PASS followed by

a password and the server will then check the authentication

database to see if the username/password combination is valid.

If the combination is valid or if no password is required,

the server will return a 281 response. The client should then

retry the original command to which the server responded with

the 480 response. The command should then be processed by

the server normally. If the combination is not valid, the server

will return a 502 response.

Clients must provide authentication when requested by the server.

It is possible that some implementations will accept authentication

information at the beginning of a session, but this was not the

original intent of the specification. If a client attempts to

reauthenticate, the server may return 482 response indicating

that the new authentication data is rejected by the server.

The 482 code will also be returned when the AUTHINFO commands

are not entered in the correct sequence (like two AUTHINFO

USERs in a row, or AUTHINFO PASS preceding AUTHINFO USER).

All information is passed in cleartext.

When authentication succeeds, the server will create an email

address for the client from the user name supplied in the

AUTHINFO USER command and the hostname generated by a reverse

lookup on the IP address of the client. If the reverse lookup

fails, the IP address, represented in dotted-quad format, will

be used. Once authenticated, the server shall generate a Sender:

line using the email address provided by authentication if it

does not match the client-supplied From: line. Additionally,

the server should log the event, including the email address.

This will provide a means by which subsequent statistics generation

can associate newsgroup references with unique entities - not

necessarily by name.

3.1.1.1 Responses

281 Authentication accepted

381 More authentication information required

480 Authentication required

482 Authentication rejected

502 No permission
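The command sequence described in 3.1.1 can be illustrated with a small server-side state machine. The following Python sketch is not taken from the quoted draft; the class name AuthinfoSession, the password table and the use of 501 for malformed lines are assumptions made for illustration only.

```python
class AuthinfoSession:
    """Server-side state for one connection (original AUTHINFO only)."""

    def __init__(self, passwords):
        # Hypothetical credential table: username -> password,
        # where None means that no password is required.
        self.passwords = passwords
        self.pending_user = None    # username cached after AUTHINFO USER
        self.authenticated = False

    def handle(self, line):
        """Return the numeric response code for one AUTHINFO command line."""
        parts = line.split(None, 2)
        if len(parts) != 3 or parts[0].upper() != "AUTHINFO":
            return 501              # syntax error (assumed code choice)
        keyword, argument = parts[1].upper(), parts[2]
        if self.authenticated:
            return 482              # reauthentication attempt is rejected
        if keyword == "USER":
            if self.pending_user is not None:
                self.pending_user = None
                return 482          # two AUTHINFO USERs in a row
            if argument in self.passwords and self.passwords[argument] is None:
                self.authenticated = True
                return 281          # no password required
            self.pending_user = argument
            return 381              # password requested
        if keyword == "PASS":
            if self.pending_user is None:
                return 482          # AUTHINFO PASS preceding AUTHINFO USER
            user, self.pending_user = self.pending_user, None
            if self.passwords.get(user) == argument:
                self.authenticated = True
                return 281          # combination is valid
            return 502              # combination is not valid
        return 501


session = AuthinfoSession({"alice": "secret"})
print(session.handle("AUTHINFO USER alice"))   # prints: 381
print(session.handle("AUTHINFO PASS secret"))  # prints: 281
```

After a 281 response, a client would retry the command that originally triggered the 480 response.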

3.1.2 AUTHINFO SIMPLE

AUTHINFO SIMPLE

user password

This version of AUTHINFO was part of a proposed NNTP V2

specification, which was started in 1991 but never completed,

and is implemented in some servers and clients. It is a

refinement of the original AUTHINFO and provides the same

basic functionality, but the sequence of commands is much

simpler.

When authorization is required, the server sends a 450 response

requesting authorization from the client. The client must enter

AUTHINFO SIMPLE. If the server will accept this form of

authentication, the server responds with a 350 response. The

client must then send the username followed by one or more

space characters followed by the password. If accepted, the

server returns a 250 response and the client should then

retry the original command to which the server responded

with the 450 response. The command should then be processed

by the server normally. If the combination is not valid,

the server will return a 452 response.

Note that the response codes used here were part of the

proposed NNTP V2 specification and are violations of RFC 977.

It is recommended that this command not be implemented, but

use either or both of the other forms of AUTHINFO if such

functionality is required.

3.1.2.1 Responses

250 Authorization accepted

350 Continue with authorization sequence

450 Authorization required for this command

452 Authorization rejected

3.1.3 AUTHINFO GENERIC

AUTHINFO GENERIC authenticator arguments...

AUTHINFO GENERIC is used to identify a specific entity to the

server using arbitrary authentication or identification

protocols. The desired protocol is indicated by the

authenticator parameter, and any number of parameters can

be passed to the authenticator.

When authorization is required, the server will send a 480

response requesting authorization from the client. The

client should enter AUTHINFO GENERIC followed by the

authenticator name, and the arguments if any. The authenticator

and arguments must not contain the sequence "..".

The server will attempt to engage the server end authenticator,

similarly, the client should engage the client end authenticator.

The server end authenticator will then initiate authentication

using the NNTP sockets (if appropriate for that authentication

protocol), using the protocol specified by the authenticator name.

These authentication protocols are not included in this document,

but are similar in structure to those referenced in RFC 1731[8]

for the IMAP-4 protocol.

If the server returns 501, this means that the authenticator

invocation was syntactically incorrect, or that AUTHINFO

GENERIC is not supported. The client should retry using the

AUTHINFO USER command.

If the requested authenticator capability is not found, the

server returns the 503 response code.

If there is some other unspecified server program error, the

server returns the 500 response code.

The authenticators converse using their protocol until complete.

If the authentication succeeds, the server authenticator will

terminate with a 281, and the client can continue by reissuing

the command that prompted the 480. If the authentication fails,

the server will respond with a 502.

The client must provide authentication when requested by the

server. The server may request authentication at any

time. Servers may request authentication more than once

during a single session.

When the server authenticator completes, it provides to the

server (by a mechanism herein undefined) the email address

of the user, and potentially what the user is allowed to

access. Once authenticated, the server shall generate a Sender:

line using the email address provided by the authenticator

if it does not match the user-supplied From: line. Additionally,

the server should log the event, including the user's

authenticated email address (if available). This will provide

a means by which subsequent statistics generation can

associate newsgroup references with unique entities - not

necessarily by names.

3.1.3.1 Responses

281 Authentication succeeded

480 Authentication required

500 Command not understood

501 Command not supported

502 No permission

503 Program error, function not performed

nnn authenticator-specific protocol.
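How a client might react to the AUTHINFO GENERIC response codes listed above can be summarised as a lookup table. This Python sketch is only an illustration of the quoted text; the function name next_step and the action names are hypothetical.

```python
def next_step(code: int) -> str:
    """Map an AUTHINFO GENERIC response code to a client-side action."""
    actions = {
        281: "retry-original-command",      # authentication succeeded
        500: "report-server-error",         # unspecified server program error
        501: "fall-back-to-authinfo-user",  # bad syntax or GENERIC unsupported
        502: "authentication-failed",       # no permission
        503: "authenticator-not-found",     # requested capability missing
    }
    # Any other code belongs to the authenticator-specific conversation.
    return actions.get(code, "continue-authenticator-protocol")


print(next_step(501))  # prints: fall-back-to-authinfo-user
```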

Appendix E: List of test files

The following test files can be found at: . Since the “select.nu” server is not yet operating, some of the links in these files refer temporarily to “http://cmc.dsv.su.se/select/xml/” instead of to “select.nu”.

atomic-rating.dtd

common-service-description.dtd

common-service-description.xml

euroseek-search.html

evaluate-query.dtd

evaluate-query.xml

evaluate-response-1.xml

evaluate-response-2.xml

evaluate-response.dtd

evaluate-result-1.xml

flower-service-description.xml

general-service-description.xml

get-atomic-ratings-1.xml

get-atomic-ratings-2.xml

get-atomic-ratings-response-1.xml

get-atomic-ratings-response-2.xml

get-atomic-ratings-response.dtd

get-atomic-ratings.dtd

get-atomic-ratings.xml

get-profile-response-1.xml

get-profile-response-2.xml

get-profile-response.dtd

get-profile.html

login-response.dtd

login-response.xml

login.html

logout-response.dtd

logout-response.xml

profile.dtd

search-form.html

send-rating-1.xml

send-rating-2.xml

send-rating-response.dtd

send-rating.dtd

service.dtd

services.dtd

services.xml

set-profile-response-1.xml

set-profile-response-2.xml

set-profile-response.dtd

set-profile-result-1.xml

set-profile.dtd

set-profile.xml

simple-search-response-1.xml

simple-search-response-2.xml

simple-search-response.dtd

Appendix F: A short introduction to XML

Here is a very short introduction to the XML features used in this document.

|Explanation |Format and syntax specification (in XML terminology, this is the Document Type Definition, DTD) |Example of actual usage |

|Header | | |

|The element demo consists of one or more “flower-description” elements | | |

|EMPTY means that there can be no elements inside this element | |/> |

|End tag for “demo” | | |

The declaration <!ELEMENT foo (bar1*, bar2+, bar3?)> means that the element foo can have three kinds of subelements: bar1 can occur zero, one or more times, bar2 one or more times, and bar3 zero or one time. So, for example, <foo><bar1/><bar2/></foo> would be legal.
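The occurrence rules for foo described above (bar1 zero or more times, bar2 one or more times, bar3 at most once, in that order) behave like a regular expression over the child element names. The following Python sketch checks them by hand, since the standard xml.etree.ElementTree parser does not validate against a DTD; the function name foo_is_valid is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# The content model (bar1*, bar2+, bar3?) written as a regular expression
# over the child element names, each followed by one space.
CONTENT_MODEL = re.compile(r"(bar1 )*(bar2 )+(bar3 )?")


def foo_is_valid(xml_text: str) -> bool:
    """Check that the root is <foo> and that its children follow the model."""
    root = ET.fromstring(xml_text)
    if root.tag != "foo":
        return False
    children = "".join(child.tag + " " for child in root)
    return CONTENT_MODEL.fullmatch(children) is not None


print(foo_is_valid("<foo><bar1/><bar2/><bar2/><bar3/></foo>"))  # prints: True
print(foo_is_valid("<foo><bar1/></foo>"))                       # prints: False
```

The second document is rejected because bar2 must occur at least once.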

Appendix G: Issues for further study

These are items which are not needed for the base system implementations, and which may be revised in the light of experience from the first implementation efforts.

1. The Advanced Search facility should be specified, based on query by example, SQL or some other standard query language methodology. This will be done later in the project by SZTAKI.

2. The format of the personal interest profile and keywords is not ready. In particular, should there be a split between the profile set by the user him/herself and the profile set by automatic methods, such as ML algorithms applied to the user's ratings and behaviour? Also, to what extent should this profile be specified in a formal, logical language, like "If newsgroup is alt.culture.sweden then do not filter away anything"? In the first implementations, we will just use a simple set of unordered keywords as the personal profile.

3. Is security enough? Do we need more security features? If so, which and how?

4. Is there a need for NNTP versions of some or all of the operations? See for example page 14 and page 24.

5. Is a streaming version needed for the evaluate operation (page 42)?

6. Privacy and security issues for Get-Atomic-Ratings (page 33).

7. Is more needed for ML support?

8. Is more needed for NLP support?

9. Exchange-Ratings-Data not ready. No great priority.

10. Set-Service-Description not ready. No great priority. Can be done using local or web-based interface.

11. Is more needed for thesauri support?

Appendix H: Major differences between version 1.1 and version 1.2

1. Pseudonymous ratings are now allowed; anonymous and non-anonymous ratings are kept.

2. “Collection” functionality has been removed.

Appendix I: The SELECT service description

A SELECT service description file contains

• A list of services

• Per service

  • Administrative information about the service accessible on the server

    • Name

    • Maintainer

    • Website about the service

    • Textual description in natural language (possibly in multiple languages)

  • A list of categories

  • Per category

    • A textual description of the category (possibly in multiple languages)

    • A name for the category

    • Rater type (human or computer generated rating)

    • The datatype of the ratings for this category (a label, keyword, value or derived category)

    • Depending on the datatype

      • Value:

        How the category should be displayed on screen (“none” if not possible)

        The calculation method used to calculate the instant rating value of a resource for this category

        A minimum and maximum value for the rating values in this category

      • Label:

        How the category should be displayed on screen (“none” if not possible)

        The calculation method used to calculate the instant rating value of a resource for this category

        A list of labels for the category. Per label

        • A value that corresponds to the description contained in the textual or iconic labels.

        • A list of textual and/or iconic labels (possibly in multiple languages)

      • Derived:

        The calculation method used to calculate the instant rating value of a resource for this category

        A number of categories from which the value of an instant rating of this category is derived.

        A list of labels that describe how to map the value of an instant rating of this category back to natural language. Per label

        • A minimum and maximum value. If the rating falls between these two values, the associated textual label is selected.

      • Keyword:

        The calculation method used to calculate the instant rating value of a resource for this category

  • A list of imported categories: categories of other services that are imported into this service

  • The classname of the Java class that starts the agents associated with the service

All this translated to the XML document type definition looks like this:

[DTD table: contents not preserved]
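The rule for derived categories described above, where a rating value is mapped back to a textual label through minimum and maximum bounds, can be sketched as follows. This Python example is an assumption-laden illustration, not the SELECT implementation: the class names, the field names and the inclusive treatment of both bounds are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RangeLabel:
    minimum: float
    maximum: float
    text: str


@dataclass
class DerivedCategory:
    name: str
    source_categories: List[str]   # categories the value is derived from
    labels: List[RangeLabel]

    def label_for(self, value: float) -> Optional[str]:
        """Return the label whose range contains value, or None."""
        for label in self.labels:
            # Both bounds are treated as inclusive (an assumption).
            if label.minimum <= value <= label.maximum:
                return label.text
        return None


quality = DerivedCategory(
    name="quality",
    source_categories=["accuracy", "style"],
    labels=[
        RangeLabel(0.0, 3.0, "poor"),
        RangeLabel(3.01, 7.0, "average"),
        RangeLabel(7.01, 10.0, "good"),
    ],
)
print(quality.label_for(8.5))  # prints: good
```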

20.1 Example

An example of all this is the service description file of the SELECT test server:

[Example service description listing: contents not preserved]

Appendix J : The SELECT Agent protocol

The advanced SELECT platform now supports an agent architecture that makes it easy to integrate collaborative or information filtering algorithms. The agents can run in the Select server or on a client machine over the network. They can perform tasks such as maintaining data structures that speed up collaborative filtering algorithms, performing collaborative filtering, notifying other agents of a change in the Select database, or using machine learning at the client to derive useful user-profile information. The agents can communicate with one another and with the Select server using an extension of the original XML Select protocol. They can request certain services to be executed or can send a notification about the occurrence of a certain event.

The introduction of agents meant that the protocol had to be extended to support them. This appendix gives an overview of the necessary extensions.

Entry points

The following new entry points have been added to the Select server:

|Entry |Function |

|/agent/register-agent.xml |Register an agent with the AgentList |

|/agent/deregister-agent.xml |Deregister an agent from the AgentList |

|/agent/list-agents.xml |Request a remote AgentList |

In addition to this, every remote agent and the server have the following entry points:

|Entry |Function |

|/agent/request/”name” |Request a service from an agent |

| |“name” is the name of the agent |

|/agent/notify/”name” |Notify an agent of an event. |

| |“name” is the name of the agent |
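The per-agent entry points in the table above follow a simple pattern. As a minimal sketch, the path for a given agent could be built as shown below; the function name agent_entry_point is hypothetical, and percent-encoding the agent name is an assumption that the specification does not state.

```python
from urllib.parse import quote


def agent_entry_point(kind: str, agent_name: str) -> str:
    """Build the path of the 'request' or 'notify' entry point of an agent."""
    if kind not in ("request", "notify"):
        raise ValueError("kind must be 'request' or 'notify'")
    # Percent-encode the name so it is safe as one path segment (assumption).
    return f"/agent/{kind}/{quote(agent_name, safe='')}"


print(agent_entry_point("request", "profile-learner"))
# prints: /agent/request/profile-learner
```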

XML Extensions

This is an overview of the new XML document type definitions.

Registering an agent with the server-side AgentList.

|Register Agent Request (register-agent.dtd) |

| | |

| | |

| |name of the notification |

| |type of notification |

| |depends on the type |

| | |

|Register Agent Reply (register-agent-reply.dtd) |

| | |

| | |

| | |

Deregister an agent from the server-side AgentList

|Deregister Agent Request (deregister-agent.dtd) |

| | |

| | |

| | |

|Deregister Agent Reply (deregister-agent-reply.dtd) |

| | |

| |Positive or negative outcome of the deregister operation |

| | |

Get a remote AgentList for use in a remote agent

|List Agents Request (list-agents-request.dtd) |

| | |

| |A list of one or more session identifiers |

| | |

| | |

|List Agents Reply (list-agents.dtd) |

| | |

| |A list of data for one or more agents |

| | |

| | |

| | |

Notify a remote agent of the occurrence of a certain event

|Notify Agent Request (notify-agent.dtd) |

| | |

| | |

| | |

| | |

Request an agent for some service

|Agent Request (agent-request.dtd) |

| | |

| | |

| |parameters = “raterid” |

|Agent Reply (agent-reply.dtd) |

| | |

| | |

| |agent) |

Privacy and security policy

When dealing with an architecture where agents can run on the Select server or client machines and can communicate and request services of one another, it is necessary to have a system in place that limits the access to the agents in order to avoid abuse of machine resources (server or client) or the exposure of personal profile or other database information to unauthorized persons.

First of all, remote agents can only be registered and controlled by users that are registered with the Select Server. The user first logs in and receives a session-id. This ID is associated with every agent the user registers with the AgentList. It’s used for authentication in a number of cases:

• A remote agent only accepts requests if the requesting agent knows this agent’s session id.

• When a remote agent requests a remote AgentList, it must include the session-id(s) of the agents it wants access to. This way, the server only returns information about agents for which the remote agent has proven its authority of access.

Local agents can be “private” or “public”. Private local agents do not accept requests from remote agents, only from other local agents. Public local agents can accept requests from remote agents. Both local agent types can themselves request information from remote agents.

• A public local agent only accepts requests when the session-id associated with the request is present in the server. This means that the session that created the agent has not ended yet.
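The access rules above can be read as a single decision function. The following Python sketch is one possible interpretation, not the SELECT implementation: the Agent class, its field names and the function accepts_request are hypothetical, and it assumes that private local agents accept requests from other local agents without a further session check.

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class Agent:
    name: str
    session_id: str        # session-id of the user who registered the agent
    location: str          # "remote" or "local"
    public: bool = False   # only meaningful for local agents


def accepts_request(target: Agent, presented_session_id: str,
                    requester_is_remote: bool,
                    active_sessions: Set[str]) -> bool:
    """Decide whether target accepts a request under the rules above."""
    if target.location == "remote":
        # A remote agent requires the requester to know its session-id.
        return presented_session_id == target.session_id
    if target.public:
        # A public local agent requires the session-id associated with the
        # request to be present in the server.
        return presented_session_id in active_sessions
    # A private local agent only accepts requests from other local agents.
    return not requester_is_remote


private_agent = Agent("index-builder", "s3", "local", public=False)
print(accepts_request(private_agent, "s9", True, {"s3"}))  # prints: False
```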

Appendix K : Protocol Implementation Status

Here we provide an overview of the implementation status of the SELECT protocol.

Get-Service-Description-List

Completely implemented

Get-Service-Description

Completely implemented

Set-Service-Description

Not specified or implemented yet.

Send-Rating

Completely implemented except for the

rater-competence

rater-trust

message-id

fields of the rating. These are not used anywhere.

Set-Profile

Implemented except for

• Pseudonym-related functionality. Pseudonyms are never used in the present system.

• Multiple profiles per user. Every user has one profile for all the services; it contains the user's unique data (name, password, …, languages spoken, reward account) and some data for the SELECT test service filtering algorithms (keywords, …).

• The reward account, which is never used.

• Partial updates. A profile must always be replaced as a whole; single fields cannot be changed separately.

Get-Profile

Completely implemented

Login & Logout

Completely implemented

Get Atomic Ratings

Implemented, except that only the first combination of rater/URL/category is used in the lookup.

All other requested combinations are ignored.

Simple-Search Operation

Completely implemented, but the search query is a regular expression. It is matched against the keywords stored with the rated URIs, and the URIs with the best instant ratings are returned.
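The behaviour described for the Simple-Search operation can be sketched as follows. This is a hedged Python illustration only: the function name simple_search, the in-memory dictionary of rated URIs, the case-insensitive matching and the limit parameter are assumptions, not details of the SELECT server.

```python
import re
from typing import Dict, List, Tuple


def simple_search(pattern: str,
                  rated: Dict[str, Tuple[List[str], float]],
                  limit: int = 10) -> List[str]:
    """Return the best-rated URIs having a keyword that matches pattern.

    rated maps each URI to (keywords, instant rating).
    """
    regex = re.compile(pattern, re.IGNORECASE)  # case-insensitivity assumed
    hits = [
        (rating, uri)
        for uri, (keywords, rating) in rated.items()
        if any(regex.search(keyword) for keyword in keywords)
    ]
    # Highest instant rating first.
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return [uri for _, uri in hits[:limit]]


rated_uris = {
    "http://a.example/": (["tennis", "sport"], 4.0),
    "http://b.example/": (["golf"], 9.0),
    "http://c.example/": (["table tennis"], 7.5),
}
print(simple_search("tennis", rated_uris))
# prints: ['http://c.example/', 'http://a.example/']
```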

Evaluator

Implemented except for the

personal

context

sort

collection-name

fields, which are ignored and not used anywhere.
