The System for Universal Media Searching (SUMS) is a first step towards a System for Universal Multi-Media Access (SUMMA). Underlying this new approach are seven basic concepts (Simplicity, Questions, Levels, Strategies, Choices, Categories and Universals-Particulars) and three premises. A description of these will clarify why and how SUMS differs from existing Internet solutions.
Many systems speak of simplicity through phrases such as Keep It Stupid and Simple (KISS). The general result of this quest is to reduce the functional dimensions of searching to a single field and hope that users will be happy or at least distracted. Meanwhile, these same systems continue the Mediaeval tradition of horror vacui. A major site is often equated with overwhelming the user with so many choices at once that it takes several minutes to determine which choice might lead to what one is seeking. Moreover, these systems typically see no problem in filling the remainder of the screen with banner advertisements, so that one is constantly distracted from one’s goal.
SUMS is concerned with a different kind of simplicity which in technical jargon is called cognitive ergonomics. The assumption is that the mind only has a certain amount of capacity for attention. If one is searching for something then the tools used to search for something should be as invisible as possible. In this way one retains a maximum of attention for one’s goal. For this reason less on the screen is often more because we can concentrate on what counts. This may go completely against the maxims of advertising gurus but if our goal is education and learning we should not be distracted by those whose goal is selling and buying.
This commitment to simplicity need not entail a commitment to Keep it Stupid and Simple (KISS). The challenge is to attain simplicity without abandoning or compromising intelligence. On the contrary, a strategic simplicity allows intelligence to focus on what is important without being distracted by unnecessary information overload. In essence, the simplicity of SUMS is to strip away distractions without reducing the intelligence, focus and depth of the search. How can such simplicity be achieved? This is the role of the other functions of SUMS.
Search engines today typically promise to find us answers directly using search and retrieval techniques. On the surface they are very successful. Google typically provides thousands and even millions of hits. This kind of success is a problem in itself. No sane user can browse through a million hits. In the early days of browsers and search engines the challenge was finding enough hits. Today the challenge lies in limitation, which, as Goethe recognized long ago, is a true sign of a Master (Der Meister zeigt sich in der Begrenzung). To meet this challenge SUMS focuses in the first instance on questions rather than answers. SUMS uses the five basic questions of reporters plus Why. It does so in at least five distinct ways.
Present day search engines depart from an assumption that every user is potentially searching for everything. On the contrary, most of us are searching for something specific or things about some specific problem or series of problems. If we identify why we are searching we are effectively saying also the topics for which we are not searching.
In the case of beginners SUMS asks users to choose from one of 10 goals: Everyday, Emergency, Business, Education, Environment, Government, Health, Legal, Leisure and Religion.
For example, if our goal is Religion then we are searching for churches and religious places and we are not usually searching for business locales, hospitals, government buildings, legal offices etc. If our goal is Education we are typically not searching for the above-mentioned institutions and are searching instead for schools, colleges and universities.
Hence knowing a user’s goal helps a) to define that for which they are searching and b) immediately eliminates a great deal of that for which they are not searching.
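The eliminating effect of a goal can be illustrated with a minimal sketch. The mapping below is hypothetical: only the pairings hinted at in the text are filled in (for instance, that hospitals belong under Health is an assumption), and the sample places are invented.

```python
# Hypothetical mapping from a beginner's goal to the kinds of places it implies.
# Only pairings suggested by the text are listed; this is not the SUMS data model.
GOAL_PLACES = {
    "Religion": {"church", "religious place"},
    "Education": {"school", "college", "university"},
    "Business": {"business locale"},
    "Health": {"hospital"},
    "Government": {"government building"},
    "Legal": {"legal office"},
}


def filter_by_goal(places, goal):
    """Keep only the places whose kind matches the user's stated goal."""
    wanted = GOAL_PLACES.get(goal, set())
    return [name for name, kind in places if kind in wanted]


# Invented sample data, for illustration only.
places = [
    ("St Mary's", "church"),
    ("City Hospital", "hospital"),
    ("Town Hall", "government building"),
    ("Trinity College", "college"),
]

religion_hits = filter_by_goal(places, "Religion")
education_hits = filter_by_goal(places, "Education")
print(religion_hits, education_hits)
```

Choosing one goal both selects the relevant kinds of places and silently drops everything else, which is the double effect described above.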
Once we know the why of the search it is wise to learn the how. If the user’s goal is religion it is important to know whether they are searching for Christian, Hebrew, Islamic, Hindu or Buddhist religions. Next it is useful to know whether this search has temporal limits (e.g. the period 1800-1850) and spatial limits (e.g. only in England, in Europe etc.). All this helps to limit the range of the questions What and Who, which are the normal entry points in library catalogues.
For more advanced users with a defined topic one can proceed more directly. One can use the questions Who, What, Where and When simultaneously. For instance, if I type only Who=Leonardo, I am potentially searching for everything Leonardo wrote. If I type Who=Leonardo + What=optics, I limit the search to what he wrote on optics. If I then add + Where=Milan and When=1505-1508, I am searching only for a small amount of material by Leonardo on optics in Milan from 1505-1508. Using SUMS this process is reduced to a single search using four fields.
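The four-field search can be sketched as a filter in which every supplied field narrows the result and every omitted field leaves it unrestricted. This is a minimal sketch, not the SUMS implementation: the `Record` type, the `search` function and the sample records are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A hypothetical catalogue entry with one value per question."""
    who: str
    what: str
    where: str
    year: int
    title: str


def search(records, who=None, what=None, where=None, when=None):
    """Return the records that match every field that was supplied.

    `when` is an inclusive (start_year, end_year) pair. A field left as
    None places no restriction on the search.
    """
    results = []
    for r in records:
        if who is not None and r.who != who:
            continue
        if what is not None and r.what != what:
            continue
        if where is not None and r.where != where:
            continue
        if when is not None and not (when[0] <= r.year <= when[1]):
            continue
        results.append(r)
    return results


# Invented sample records, for illustration only.
RECORDS = [
    Record("Leonardo", "optics", "Milan", 1507, "note A"),
    Record("Leonardo", "optics", "Florence", 1504, "note B"),
    Record("Leonardo", "anatomy", "Milan", 1507, "note C"),
    Record("Alberti", "optics", "Florence", 1435, "note D"),
]

everything = search(RECORDS, who="Leonardo")
narrowed = search(RECORDS, who="Leonardo", what="optics",
                  where="Milan", when=(1505, 1508))
print(len(everything), [r.title for r in narrowed])
```

With Who alone the sample yields three records; adding What, Where and When reduces it to one, all within a single call.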
The six large questions in the upper part of the screen also have another function. They help identify the main goal of the search. Who indicates a biographical focus; What indicates a subject focus; Where indicates a spatial focus on a given area or place; When indicates a temporal focus on a given period, time or date. Once this main focus is identified then the questions in the classes of Choices help further to delimit the range of the interest. Even if my focus is Who, i.e. biographical, say in Goethe, I may still be interested in his work in a specific place, e.g. Weimar (i.e. Where), at a given time (i.e. When).
The notion of asking questions is as old as the history of answers. The Internet is fully aware that Frequently Asked Questions (FAQs) are a useful tool in approaching retrieval. Companies such as Ask Jeeves have commercialized this idea.
Instead of creating long lists of questions, SUMS poses questions implicitly through a small series of choices, usually in the form of one-word statements. Underlying this approach is a simple strategy. Questions have a certain order depending on the depth to which one delves. For example, it is of no use to ask which edition a person wants before they have decided which book they wish to find. Similarly, details of a book’s size only become relevant when we know its title and edition. SUMS’ approach to simplicity does not remove complexity. Instead SUMS confronts the user with one aspect at a time, in order that the complexity of the whole does not overwhelm one at any time.
Once one has found the book, painting or object for which one is looking there are a series of further questions to be asked concerning its quantity (size, weight etc.), quality (paperback, hardbound, leatherbound etc), space (places where distributed) and time (how long the edition remained on the shelves etc.), which further delimit knowledge about the object. We shall return to this in section 6 below.
Even ten years ago there were serious concerns whether web “crawlers” could ever hope to crawl through all of existing knowledge on the World Wide Web. The assumption of search engines today tends to be that since the technology is fast enough to “meta-crawl” everything, they should present us with everything. The good news is that this brings enormous results. The bad news, as suggested earlier, is that the results are so enormous that we cannot cope with them.
To meet this challenge SUMS begins with a distinction between a) pointers to knowledge, b) the object of knowledge and c) analyses and commentaries about that object. Pointers to knowledge are the equivalent of a virtual reference room. They entail 1) Terms as in classifications, thesauri, ontologies; 2) Definitions as in dictionaries; 3) Explanations as in encyclopaedias; 4) Titles as in library catalogues, book catalogues, bibliographies; and 5) Partial Contents in the form of abstracts, reviews, tables of contents. Knowledge entails 6) the full contents of a book, painting or other object.
In addition, analyses and commentaries, which were traditionally classed as secondary literature, entail 7) Internal Analyses when the study focuses on a single text, painting or object as in a monograph or catalogue raisonné; 8) External Analyses when the book or object is compared with others; 9) Restorations when the object has been “repaired”; and 10) Reconstructions when a ruin has been fully rebuilt. SUMS thus arrives at 10 levels of knowledge.
These Levels again lead to further distinctions. For instance, Level 8: External Analysis leads to basic comparisons:
3. Related Drawings
4. Related Paintings
5. Other Paintings by Author
6. Other Paintings by Theme
7. Other Media
8. Further Analyses
Further Analyses allows one to make external analyses in terms of Who (Persons), What (Objects), Where (Places), When (Events), How (Instructions) and Why (Reasons). Many early attempts at virtual museums aimed only at offering an example of Level 6: Full Contents. Many Internet examples address one of the other levels in isolation. For instance there are a number of sites on dictionaries, on encyclopaedias and on reconstructions.
SUMS distinguishes itself from these existing efforts in two fundamental ways. First, it is concerned with links between the levels of knowledge, i.e. how a term is or can be linked with dictionary definitions, encyclopaedia explanations, titles, full contents etc. Second, SUMS is concerned with presenting only the level of detail that is necessary at a given stage. For instance, while one is dealing with pointers to knowledge (especially levels 1-3), before one has committed oneself to a given title, there is no need for the analysis levels (7-10). Hence the levels not only clarify where one is, they allow us to delimit the amount of knowledge with which we need to deal at any given moment. This is another example of how SUMS simplifies the amount of material without ‘dumbing down’ the process. Indeed, by providing the user with simple amounts at the right stage of their search, SUMS can increase the complexity of the approach as a whole, while simplifying the individual steps.
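The ten levels and the rule of showing only what a given stage requires can be sketched as follows. The enumeration follows the numbering given in the text; the function names and the exact staging rule (hide levels 7-10 until a title has been chosen) are assumptions made for illustration.

```python
from enum import IntEnum


class Level(IntEnum):
    """The ten levels of knowledge, numbered as in the text."""
    TERMS = 1
    DEFINITIONS = 2
    EXPLANATIONS = 3
    TITLES = 4
    PARTIAL_CONTENTS = 5
    FULL_CONTENTS = 6
    INTERNAL_ANALYSES = 7
    EXTERNAL_ANALYSES = 8
    RESTORATIONS = 9
    RECONSTRUCTIONS = 10


def group(level):
    """Classify a level as a pointer, the object itself, or an analysis."""
    if level <= Level.PARTIAL_CONTENTS:
        return "pointers"
    if level == Level.FULL_CONTENTS:
        return "object"
    return "analyses"


def visible_levels(title_chosen):
    """Assumed staging rule: analyses appear only once a title is chosen."""
    if title_chosen:
        return list(Level)
    return [lv for lv in Level if group(lv) != "analyses"]


before = visible_levels(title_chosen=False)
after = visible_levels(title_chosen=True)
print([lv.name for lv in before])
print(len(after))
```

The full set of levels never shrinks; only the subset offered at a given moment does, which is how the individual steps stay simple while the whole remains complex.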
Another important ingredient in the process of searching has to do with the scope and depth that one wishes to search. Sometimes we are searching very specifically for one word or one term. Sometimes we are searching for the words surrounding that term and need to understand the contexts associated with that term. As we widen the scope of the terms for which we are searching we need also to widen the scope of the sources consulted. To meet this challenge, SUMS introduces seven basic strategies.
1. Guided Strategy is the simplest of these. As the term suggests this is a mode for beginners who wish to have the equivalent of a guided tour as they make their search for a specific word.
2. Direct Strategy allows the user to search directly for a given term in a database devoted to that term.
3. Personal Terms Strategy assumes that the user is interested not only in a given term but also in the terms surrounding it, which have been identified by an expert in the field. The search now expands from one word to a cluster of words, which may readily range from 50 to 500 or more. The search may begin with a given database but these same words can be used as a basic vocabulary for searching in other libraries, museums, archives etc.
Once we choose a term we are given a number of further options in the choices box.
4. Database Fields Strategy. To know what to search for it is very useful to have clues as to what one can hope to find. One important clue, especially as one moves to external databases lies in knowing the fields under which facts have been collected. This offers a fourth strategy within SUMS.
5. Subject Headings Strategy uses the categories of library subject headings to increase the scope of the controlled vocabulary, which can be used in searching for terms related to the topic at hand. This entails access via simple alphabetical lists.
6. Standard Classification Strategy uses the relations of a standard classification such as the Library of Congress for further searching. Whereas the Subject Headings Strategy (5) typically entails alphabetical lists this has a tree structure which makes it easier to recognize related fields.
7. Multiple Classifications Strategy extends this principle to a number of major classification systems. In our demonstration users are provided with a master list of the arts and mathematics sections of these library classification systems. When we choose a term from this list, those systems which deal with the term are highlighted. Clicking on one of these takes us to the term within the tree of that classification system. From there we can go directly to libraries, which use this classification and find the holdings available under that term.
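The highlighting step of the Multiple Classifications Strategy can be sketched as a lookup across several term lists. Everything here is hypothetical: apart from the Library of Congress, which the text names, the system names are placeholders and all term lists are invented.

```python
# Hypothetical excerpts of which terms each classification system covers.
# "System B" and "System C" are placeholders; the term sets are invented.
SYSTEMS = {
    "Library of Congress": {"perspective", "optics", "geometry"},
    "System B": {"perspective", "geometry"},
    "System C": {"optics"},
}


def master_list(systems):
    """Merge the terms of every system into one alphabetical master list."""
    return sorted(set().union(*systems.values()))


def highlight(systems, term):
    """Return the names of the systems that deal with the chosen term."""
    return sorted(name for name, terms in systems.items() if term in terms)


terms = master_list(SYSTEMS)
optics_systems = highlight(SYSTEMS, "optics")
print(terms)
print(optics_systems)
```

In a fuller version each highlighted name would lead into that system's tree at the chosen term; here the lookup stops at identifying which systems to offer.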
Small lists of appropriate questions are an essential feature of SUMS. At a meta-level there are 10 basic choices, namely: 1) Access; 2) Learning; 3) Levels; 4) Media; 5) Quality; 6) Quantity; 7) Questions; 8) Space; 9) Time; 10) Tools. Each of these classes of Choices leads to a series of further choices. The first two classes of Choices (Access and Learning) serve to identify the user’s age range, level of education and preferences, which serve as an a priori orientation before one begins one’s search. Classes 3 and 4 (Levels and Media) give one access to different levels of knowledge in different media. These classes serve primarily to orientate the user with respect to pointers to knowledge in the form of reference works and to arrive at the full texts of different works.
The five classes 5-9 (Quality; Quantity; Questions; Space; Time) have two purposes:
Class 10 of the Choices (Tools) provides basic tools for editing, commenting on existing knowledge and creating new knowledge.
Whereas the introductory strategies of SUMS are concerned with pointers to knowledge in the form of virtual reference works, the more advanced levels focus on full texts and their analysis. To achieve this, SUMS builds on Aristotle’s work. Aristotle identified 10 basic categories for knowledge: 1) Substance, plus nine further categories: 2) Quantity; 3) Quality; 4) Relation; 5) Place; 6) Time; 7) Position; 8) State; 9) Action and 10) Passion (Affection). Three of these categories, namely Quality, Quantity and Time, recur as classes 5, 6 and 9 of Choices. In Aristotle’s approach, Relation is limited to what Perreault would call Ordinal: Comparative relations. In SUMS this becomes a subset of Quality. In Aristotle, Place and Position are separate categories. In SUMS these become subsets of Space. In Aristotle, State, Action and Passion are three separate categories. In SUMS these become subsets of Questions: Why (figure 1). As one advances from SUMS to SUMMA and shifts from pragmatic claims to versions striving to approximate reality and truth, the Aristotelian categories are shifted in alignment with the six basic questions (cf. figure 2).
|Substance: Divisio||Subsumptive: Division||Levels: Term||What|
|Substance: Partitio||Subsumptive: Partition||Levels: Term||What|
Figure 1. Correspondences between Aristotle’s categories, Perreault’s relators and SUMMA.
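The correspondences described in the text can be written out as a simple lookup table. This is a minimal sketch: the dictionary records only the pairings stated above (Substance is placed under Levels following the surviving rows of figure 1), and the helper function is an illustrative assumption.

```python
# Aristotle's ten categories mapped to the SUMS class of Choices that,
# according to the text, absorbs each one.
ARISTOTLE_TO_SUMS = {
    "Substance": "Levels",    # per the figure 1 rows (Levels: Term)
    "Quantity": "Quantity",
    "Quality": "Quality",
    "Relation": "Quality",    # subset of Quality
    "Place": "Space",         # subset of Space
    "Time": "Time",
    "Position": "Space",      # subset of Space
    "State": "Questions",     # subset of Questions: Why
    "Action": "Questions",    # subset of Questions: Why
    "Passion": "Questions",   # subset of Questions: Why
}


def categories_under(choice):
    """List the Aristotelian categories folded into one SUMS class."""
    return [cat for cat, ch in ARISTOTLE_TO_SUMS.items() if ch == choice]


space = categories_under("Space")
questions = categories_under("Questions")
print(space, questions)
```

Ten categories collapse into six classes of Choices, which shows how SUMS reuses the Aristotelian scheme while keeping its own lists short.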
|Levels (Term)||Space||Time||Quantity, Quality||Questions|
Figure 2. Correspondences between questions in SUMS/SUMMA & relators of Perreault.
[Figure 3 diagram not reproduced. Surviving labels: (Not in Space-Time); Type/Kind (is a); Whole/Part (has a); Space-Time Horizon.]
Figure 3. Six questions linked with universals/particulars and Perreault’s relators.
In the longer term this points towards a basic reorganisation of knowledge, which has been discussed elsewhere. A fundamental quality of this approach will be to organise knowledge in terms of its basic relations as outlined by Perreault, using the same framework of simple questions established by SUMS (figure 3). This will be the goal of SUMMA.
SUMMA (System for Universal Multi-Media Access)
SUMS is concerned with searching in existing knowledge which has already been classed by librarians and other knowledge experts. By contrast, SUMMA is concerned with knowledge structures that represent the frontiers of our knowledge today. SUMMA entails searching, organising, archiving and creating knowledge. Ultimately SUMMA is concerned with comparing ontologies and world-views and with reaching new syntheses. SUMMA adds three further levels to the strategies of SUMS.
8) Comparative Ontologies
Not yet functional is a further level which will combine classifications in new ways.
The approach of the UIA based on Dr Dahlberg’s matrix offers an example of an entry into this level. Whereas strategy levels 5-7 are concerned with purely pragmatic classifications of knowledge as they exist in library systems and other collections, this level is concerned with classifications of knowledge that represent the frontiers of knowledge today. It accepts that there may be more than one such explanation and is thus concerned with bridging methods to compare them.
9) Historical World Views
Today’s electronic systems assume that our contemporary world-view is the only viable system. Earlier cultures had other world-views, which organised their knowledge and facts in other ways. SUMMA aims to orientate users as they move from one world-view to another. This strategy helps to understand how facts, ideas and concepts have evolved over time.
10) Synthesis and Creation
A tenth strategy is concerned with collaborative work in arriving at new syntheses of knowledge and creating new knowledge. Here three-dimensional navigation will become particularly useful.
Three Premises of SUMS
i) SUMS distinguishes clearly between transitory and enduring knowledge. Search engines such as Yahoo attempt this by providing a category for Reference. This covers basic categories such as dictionaries and encyclopaedias but overlooks how each subject has its own reference works. SUMS addresses the problem by identifying different levels of knowledge to distinguish between objective and subjective dimensions (see Levels below). In addition choices in quality and quantity help establish whether materials are accepted only locally or internationally.
ii) SUMS effectively creates the equivalent of standardised style-sheets through a coherent presentation interface, which is also a set of criteria for searching. One of the fundamental problems of the Internet in its present state is the complete absence of an accepted style-sheet. Hence one author lists articles under Publications, a second under Periodicals, a third under Journals. This makes things almost impossible to find because there are no common headings for finding materials which have things in common. Some of the efforts at metadata address these problems but will not solve them. Imagine trying to index a book or journal in which every author followed their own eccentric ways of documentation with respect to footnotes, headings etc. For this reason publishers impose a common style-sheet, such as the Modern Language Association (MLA) or Chicago rules, when preparing a work for publication. In SUMS the lists of choices provide a standard set of terms for searching and presenting facts, the equivalent of a controlled authority list.
iii) SUMS separates operating space from presentation space. In most programs the space for making decisions and operations is not separated clearly from the presentation space. In Windows, for instance, some decisions are above the workspace, others are below, while subsidiary lists continually impinge on the workspace. SUMS separates these spaces by creating a specific area for lists of choices (see below).
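The controlled authority list of premise ii) can be illustrated with a minimal sketch that maps the variant headings mentioned there onto one preferred term. The choice of "Publications" as the preferred heading, and the function name, are assumptions made for illustration.

```python
# Hypothetical authority list: variant headings mapped to one preferred term.
# The preferred heading "Publications" is an assumption, not taken from SUMS.
AUTHORITY = {
    "publications": "Publications",
    "periodicals": "Publications",
    "journals": "Publications",
}


def normalise(heading):
    """Return the preferred heading, or the cleaned input if it is unknown."""
    cleaned = heading.strip()
    return AUTHORITY.get(cleaned.lower(), cleaned)


headings = ["Journals", " periodicals ", "Publications", "Lectures"]
normalised = [normalise(h) for h in headings]
print(normalised)
```

Once three authors' headings resolve to the same term, their articles can be found under a single common heading, which is the effect a publisher's style-sheet achieves in print.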