Node Classification in Open Hypermedia System

4 Jan 2010 | CDDL | 18 min read
We propose a node classification technique that helps users identify different node types, thereby providing navigational assistance.

ABSTRACT

Hypermedia systems can be found everywhere on the Internet, from online learning websites to social networking websites. Their rich link structure offers great navigational freedom, but this freedom also becomes a problem and contributes to the ‘Lost in Hyperspace’ (LIH) problem.

Most research in HCI (Human-Computer Interaction) has focused on the usability of hypermedia systems from the user-interface perspective, but little has been done to address the usability of their underlying link structure.

In this paper we investigate issues that affect the usability of hypermedia structures. We (i) investigate the problems encountered by hypermedia users, for instance the ‘LIH’ problem; (ii) explain human memory organisation and mental models; (iii) argue which hypermedia structures are better for organising information; (iv) cite techniques for discovering hierarchies in a hypermedia structure; and (v) propose our node classification technique, which helps users identify different node types, thereby providing navigational assistance.

KEYWORDS

Hypermedia Structure, Usability, Hierarchical Structure, Node Classification.

1.     Introduction

 
“Hypertext” refers to a set of information nodes connected by a set of links. When the information stored in the nodes is not only text but also images, audio, animation and video, the system is known as “hypermedia”. Hypermedia structure refers to the underlying link structure between the nodes.

The main strength of hypermedia systems is that they have a flexible structure and give users a great deal of freedom to browse and interact with the information they contain [31]. It is precisely this freedom that gives rise to many problems; one that seems inherent in hypermedia systems is that users tend to lose their way in the maze of information within the system. This is commonly referred to as being ‘Lost in Hyperspace’ (LIH).

Nielsen describes ‘LIH’ as “one of the major usability problems with hypertext” [30] and suggests that, to help users navigate the hyperspace, assistance must be provided so that they can understand and recognise their present location in the overall structure. A logical progression is to apply the techniques used to organise physical space to electronic space, so that users can apply the navigational skills they learnt in physical space to the electronic environment.

In the domain of urban design, Lynch concluded that people represent their physical environment as mental (cognitive) maps. He found that subjects frequently recall particular features, such as nodes (hubs), landmarks (recognisable points of reference in the larger space), paths (routes between locations) and edges (boundaries), and that these are used effectively for navigation in the physical environment [26]. To capitalise on this, several researchers have examined the extent to which navigational strategies from the physical environment can be applied to the electronic environment. In particular, Botafogo et al. [5] and Modjeska & Marsh [28] tried to identify landmarks within the electronic environment.

To facilitate navigation in an electronic environment, the environment must be organised into a structure. The question then arises: which organising structure is best suited to structuring a knowledge domain? In this paper we argue that a hierarchical structure is the best choice for organising an information domain.

The remainder of this paper is structured as follows. Section 2 surveys the ‘LIH’ problem, the major problem faced by hypermedia users. Section 3 explores different hypermedia structures. Section 4 describes an experiment in which we analysed websites. Section 5 extends the ideas from Section 4 to propose our node classification technique, and Section 6 concludes.

2.     Lost In Hyperspace

Users often lack an intuitive sense of location and orientation within the hypermedia structure: they do not know where they currently are in the overall structure or which is the best way to proceed from there. The computer science literature refers to this as the ‘LIH’ problem.

Foltz and Davis [19] attribute the ‘LIH’ problem to a lack of informed design of the space and to insufficient navigational affordances, particularly those needed to describe and to locate the environment.

The problem appears to be one of unfamiliarity with the hypermedia structure rather than with the information content [31].

  In an experiment, Elm & Woods [17] gave users of a hypertext system a set of information retrieval tasks and found that the degree of ‘LIH’ experienced by the subjects was independent of their level of expertise in the information domain.

Their study outlined three different forms of being lost:

  1. Not knowing where to go next.
  2. Knowing where to go, but not knowing how to get there.
  3. Not knowing where they are in the overall structure.

The ‘LIH’ problem appears to be a navigation-related problem. To tackle it efficiently, navigational strategies used in the physical environment must be extended and adapted to assist navigation in electronic environments.

3.     Hypermedia Structures

How, then, can a hypermedia system be structured to minimise cognitive overhead and maximise coherence?

Many possible structures have been suggested. Three typical structures used in experiments are “hotwords”, “hierarchical maps” and “spider maps”. “Hotwords” are simply links embedded in the body of the text. “Hierarchical maps” provide a graphical representation of the information in a hierarchical structure similar to an organisational chart. “Spider maps”, also called “concept maps”, provide a more detailed graphical representation of the information with more extensive non-linear cross-linking, similar to a spider web. Beasley & Waugh [2] found a significant difference in perceived disorientation, with subjects in the “hotwords” condition reporting the most disorientation and subjects in the “hierarchical maps” condition the least.

Another study comparing “hotwords” with “hypermap” structures revealed no difference in recall but found that the group using “hotwords” reported feeling significantly more confused and frustrated [33]. One explanation of why “hotwords” are perceived as confusing and frustrating may be that they demand additional cognitive resources from the user. For instance, users have to remember long trails of links in order to get explanations of unfamiliar words in the knowledge domain.

In another study, Modjeska & Marsh [28] conducted experiments on websites representing two structural types (strongly and weakly hierarchical) and two sizes (small and large). Their experiment produced a number of preliminary results. At the website scale, structure had a significant effect on user navigation, with more nodes accessed on strongly hierarchical sites than on weakly hierarchical ones. Website structure also had a significant effect on user perceptions, with strongly hierarchical sites perceived as smaller than weakly hierarchical ones. They concluded that strongly hierarchical structures appear more usable than weakly hierarchical ones.

3.1     Discovering hierarchies in Hypermedia Structure

  

We have cited several studies which suggest that a hierarchical structure is best suited to helping users form a mental model of the system. A method for discovering hierarchies in an existing hypermedia system is therefore required, so that any new node can be found automatically.

Botafogo et al. [5] report that authors of hypermedia systems are encouraged to create hierarchical structures, but that during writing the hierarchy is lost through the inclusion of cross-reference links. They suggested ways of recovering lost hierarchies and finding new ones. They also suggested helping authors by using metrics to capture useful properties of the hypermedia structure.

To find a hierarchy in a structure, two tasks must be performed: first the root node must be identified, and then hierarchical and cross-referential links must be distinguished. They recommend the following as the fundamental properties of the root node:

 1. It has to reach almost every node in the hypermedia.

 2. The distance from the root to any other node should not be too large. If the distance from the root to a node is very large, there will be a navigational problem: users will have to go through a long path before reaching the desired information.

 3. Finally, the root must have a reasonable number of children.

A breadth-first search (BFS) algorithm can be applied from the root node in order to differentiate hierarchical and cross-referential links in the hypermedia structure.
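As a rough illustration of the BFS step, the sketch below treats a link as hierarchical when it is the link through which BFS first reaches a node, and as cross-referential otherwise. This is one common reading of the approach, not necessarily the exact procedure of Botafogo et al. The class name `LinkClassifier`, its method names and the adjacency-array representation are assumptions made for this example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical helper: names and graph representation are assumptions.
final class LinkClassifier {

    /**
     * Runs BFS from root over a directed graph, where adjacency[u] lists
     * the nodes that u links to. Returns the BFS-tree parent of every node;
     * parent[root] == root, and unreachable nodes keep the value -1.
     */
    static int[] bfsParents(int[][] adjacency, int root) {
        int[] parent = new int[adjacency.length];
        Arrays.fill(parent, -1);
        parent[root] = root;
        Deque<Integer> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            int u = queue.remove();
            for (int v : adjacency[u]) {
                if (parent[v] == -1) {   // first time v is reached
                    parent[v] = u;       // u -> v becomes a tree link
                    queue.add(v);
                }
            }
        }
        return parent;
    }

    /** A link is hierarchical if it is the BFS-tree link into its target. */
    static boolean isHierarchical(int[] parent, int from, int to) {
        return from != to && parent[to] == from;
    }
}
```

For a four-page site with links 0→1, 0→2, 1→3, 2→3 and 3→0, starting from page 0, the links 0→1, 0→2 and 1→3 come out as hierarchical, while 2→3 and 3→0 are cross-referential.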

 

4.     Our Experiment

We conducted an initial experiment to extend the approach of Botafogo et al. The purpose of the experiment was to encode the hypermedia structure into a distance matrix, so that we could help users identify landmarks within the structure and make additional observations that could aid navigation in the hypermedia structure.

An HTML parser (written in Java) was used to crawl through the links on each web page and capture the complete link structure of a website in a distance matrix. Figure 2 shows an example that illustrates the idea.

Fig 2. An example graph (a) with its “distance matrix” (b) and “converted distance matrix” (c). In (c), infinity is replaced with a large value, K = 4 (a conversion constant greater than any single entry in the table).
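The two matrices can be sketched in a few lines of Java. The example below is not the crawler used in the experiment; it assumes the link structure has already been captured as adjacency arrays, computes shortest link distances with one BFS per node, and then replaces unreachable entries with a caller-supplied conversion constant K. The class name `DistanceMatrix` and the use of `Integer.MAX_VALUE` to stand for infinity are choices made for this illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical helper: names and graph representation are assumptions.
final class DistanceMatrix {

    /** Marker for "no path"; plays the role of infinity in the matrix. */
    static final int INF = Integer.MAX_VALUE;

    /** dist[s][t] = fewest links to follow from s to t, or INF if none. */
    static int[][] distances(int[][] adjacency) {
        int n = adjacency.length;
        int[][] dist = new int[n][n];
        for (int s = 0; s < n; s++) {
            Arrays.fill(dist[s], INF);
            dist[s][s] = 0;
            Deque<Integer> queue = new ArrayDeque<>();
            queue.add(s);
            while (!queue.isEmpty()) {
                int u = queue.remove();
                for (int v : adjacency[u]) {
                    if (dist[s][v] == INF) {          // not reached yet
                        dist[s][v] = dist[s][u] + 1;
                        queue.add(v);
                    }
                }
            }
        }
        return dist;
    }

    /** Converted distance matrix: every INF entry is replaced by k. */
    static int[][] convert(int[][] dist, int k) {
        int[][] converted = new int[dist.length][];
        for (int i = 0; i < dist.length; i++) {
            converted[i] = dist[i].clone();
            for (int j = 0; j < converted[i].length; j++) {
                if (converted[i][j] == INF) {
                    converted[i][j] = k;
                }
            }
        }
        return converted;
    }
}
```

The caller is responsible for choosing K larger than every finite distance. For a made-up four-node graph with links 0→1, 0→3 and 1→2, the largest finite distance is 2, so K = 4 satisfies that condition.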

The experiment was designed to count the number of in-links (links from other pages pointing to the current web page) and out-links (links pointing to other web pages). Web pages with the highest number of out-links were classified as index pages, and web pages with the highest number of in-links were classified as reference pages.

The experiment successfully identified the index and reference pages of the captured websites. However, identifying index and reference pages alone does not provide sufficient help for users to navigate the structure efficiently, especially in large and complex hypermedia structures.

The basic idea is that a book comes with a certain set of expectations, or schema, of what a book is and how it is organised. We know that we can look in the index to find the location of information. As familiarity with the text grows, the reader becomes more familiar with the various landmarks in the text and the relationships between them. The reader is in effect building up a mental model of the text, based on orientation cues.

These orientation cues, which are so important for constructing a mental model, are absent in many hypermedia systems [31].

In line with the above, we learnt from the experiment that a more comprehensive node classification is required to identify the various types of nodes in a hypermedia system, to make the structure more comprehensible and to assist the user in wayfinding activities within the structure.
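The counting step can be illustrated with a minimal sketch. It assumes the site has already been crawled into adjacency arrays, where `adjacency[u]` lists the pages that page `u` links to; the class name `LinkCounts` and its methods are hypothetical, and ties are resolved in favour of the lowest page number purely for simplicity.

```java
// Hypothetical helper: names and graph representation are assumptions.
final class LinkCounts {

    /** Number of out-links per page (links pointing to other pages). */
    static int[] outLinks(int[][] adjacency) {
        int[] out = new int[adjacency.length];
        for (int u = 0; u < adjacency.length; u++) {
            out[u] = adjacency[u].length;
        }
        return out;
    }

    /** Number of in-links per page (links from other pages to this one). */
    static int[] inLinks(int[][] adjacency) {
        int[] inCount = new int[adjacency.length];
        for (int[] targets : adjacency) {
            for (int v : targets) {
                inCount[v]++;
            }
        }
        return inCount;
    }

    /** Index of the largest count; first one wins on ties, -1 if empty. */
    static int indexOfMax(int[] counts) {
        int best = -1;
        for (int i = 0; i < counts.length; i++) {
            if (best == -1 || counts[i] > counts[best]) {
                best = i;
            }
        }
        return best;
    }
}
```

With this helper, `indexOfMax(outLinks(adjacency))` gives a candidate index page and `indexOfMax(inLinks(adjacency))` a candidate reference page.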

5.     Node Classification

  

The classification must produce more categories than just index and reference nodes, so that every node’s position in the overall structure, and possible usability problems within the hypermedia structure, can be analysed.

We classified nodes into the following categories:

Self-referencing Node:

These nodes contain links at the top and are divided into parts, such that each link at the top points to a different part of the same node. They can be easily identified by a crawler because their links contain # to denote self-referencing.

Dead End Node:

These nodes do not have any out-links.

Index Node:

This type of node contains a large number of out-links, and its distance to other nodes is relatively small. Index nodes are also among the few nodes that can reach most of the nodes in the site. The site map of a website is a good example.

Entry Selection Node:

This type of node was seen at the start of a website, where visitors choose either their country of residence or their language. It usually points to different parts (sub-structures) of a hypermedia; its out-links point to the roots of those sub-structures.

Information Node:

These nodes have more in-links than out-links. The number of out-links may be very low or 0 (dead-end nodes can also be information nodes).

Applet Node:

These nodes contain applet tags.

Data Entry Node:

These nodes contain a number of form tags, which suggests that the user can enter data (a user response). In addition, they have few out-links, e.g. a Submit button.

Search Node:

These nodes contain a search mechanism.

File Download Node:

These are pages with file downloads, such as pdf or exe files.
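The categories above could be turned into simple rules along the following lines. This is only a sketch under several assumptions: the `PageFeatures` fields are hypothetical values a crawler might record, the two numeric thresholds are invented for illustration (the article gives none), the index rule looks at out-links only and ignores distance, and entry selection nodes are left out because no automatic test for them is described. A page may receive several labels, which matches the remark that a dead-end node can also be an information node.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical rule-based classifier; all names and thresholds are assumptions.
final class NodeClassifier {

    enum NodeType {
        SELF_REFERENCING, DEAD_END, INDEX, INFORMATION,
        APPLET, DATA_ENTRY, SEARCH, FILE_DOWNLOAD
    }

    /** Per-page features a crawler might record. */
    static final class PageFeatures {
        int inLinks;          // links from other pages to this page
        int outLinks;         // links from this page to other pages
        int fragmentLinks;    // same-page links whose target contains '#'
        int formCount;        // number of form tags on the page
        boolean hasApplet;    // page contains an applet tag
        boolean hasSearchBox; // page offers a search mechanism
        int downloadLinks;    // links to files such as pdf or exe
    }

    // Invented cut-offs, for illustration only.
    static final int INDEX_MIN_OUT_LINKS = 20;
    static final int DATA_ENTRY_MAX_OUT_LINKS = 5;

    /** Returns every category whose rule the page satisfies. */
    static Set<NodeType> classify(PageFeatures p) {
        EnumSet<NodeType> types = EnumSet.noneOf(NodeType.class);
        if (p.fragmentLinks > 0) types.add(NodeType.SELF_REFERENCING);
        if (p.outLinks == 0) types.add(NodeType.DEAD_END);
        if (p.outLinks >= INDEX_MIN_OUT_LINKS) types.add(NodeType.INDEX);
        if (p.inLinks > p.outLinks) types.add(NodeType.INFORMATION);
        if (p.hasApplet) types.add(NodeType.APPLET);
        if (p.formCount > 0 && p.outLinks <= DATA_ENTRY_MAX_OUT_LINKS) {
            types.add(NodeType.DATA_ENTRY);
        }
        if (p.hasSearchBox) types.add(NodeType.SEARCH);
        if (p.downloadLinks > 0) types.add(NodeType.FILE_DOWNLOAD);
        return types;
    }
}
```

Returning a set rather than a single label keeps the rules independent of one another, so a page with seven in-links, no out-links and one download link is reported as a dead end, an information node and a file download node at the same time.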

6.     Conclusions and Further Work

 

In this paper we have discussed the “Lost in Hyperspace” (LIH) problem, the major problem faced by users of hypermedia systems. We have cited previous research which establishes that it is a navigation-related problem. The solution is to enhance navigation in hyperspace by employing techniques used for navigation in the physical environment. To provide navigational assistance, the hypermedia system must give users a sense of distance, direction and their current location in the overall structure, similar to a geographical map of a region. In brief, identifying landmarks in a hypermedia structure would provide a simplified view of the structure, reduce cognitive overhead and assist navigation.

To get a better understanding of the problem, we conducted an experiment based on the approach of Botafogo et al. to identify landmarks, such as root nodes, within a hypermedia system.

One of the key findings of our experiment was that identifying root nodes alone does not solve the LIH problem. We believe that the LIH phenomenon can be minimised by classifying nodes into several classes, for instance self-referencing nodes, data-entry nodes, etc. This might help users remember what kind of node they are looking for, and a search filter could then be applied based on the node class the user requires.

Another key finding was that there should be different classes of links. Each class could be represented by a different colour as feedback to the user, and this would also make crawling easier, because the crawler could be programmed to handle each link class differently.

Furthermore, we believe that once the landmarks have been identified, additional metadata could be associated with each of them to list its important attributes, such as name, number of children, creation date, last modification date and number of visits.

These landmark attributes would have to be kept up to date whenever the metadata changes, but if this process can be automated, crawling would become much quicker, because we would only have to read the metadata of all the landmarks within the hypermedia system and report it to the authors in real time. This would provide very useful information to hypermedia authors.

The number of visits can be displayed as a heat map.

Limitation

Our current crawler cannot handle secure sockets or authentication mechanisms. We also do not currently parse the cascading style sheets attached to a web page.

Advantages

The current implementation can pick up broken URLs quickly and can produce a graphical representation of a website’s link structure. We can merge landmark attribute data with the website link structure; for instance, the number of visits to a particular landmark could be represented as a heat map on the website map.

Possible Application

The crawler can be used to build the website link map, and a visual tool (under development) can display this map to users, together with their current location in the overall structure and their possible next moves within it.

For future work, we plan to conduct an experiment with real users to demonstrate the effectiveness of the node classification proposed in this paper.

We are working on a tool which will show a simplified view of a hypermedia system to authors and users. Our next article will be based on this tool.

7.     REFERENCES

[1] Asahi T, Turo D and Shneiderman B, Visual decision-making: Using treemaps for the Analytic Hierarchy Process. Proceedings of CHI'95 Conference, Denver, Colorado, USA, 1995.

[2] Beasley, R.E., & Waugh, M.L. “Cognitive mapping architectures and hypermedia disorientation: An empirical study,” Journal of Educational Multimedia and Hypermedia, (4:2/3), 1995, pp. 239-255.

[3] Berk, E. and J. Devlin (1991). What is hypertext? In E. Berk and J. Devlin (Eds.), Hypertext / Hypermedia Handbook, pp. 3-7. McGraw-Hill, New York.

[4] Bonfigli M, Casadei G, Salomoni P. Adaptive Intelligent Hypermedia using XML, Proceedings of SAC 2000 - ACM Symposium on Applied Computing, Villa Olmo, Como, Italy, March 2000, Vol. 2, pp. 922-926.

[5] Botafogo Rodrigo, Rivlin Ehud and Shneiderman Ben, “Structural Analysis and Useful Metrics.” ACM Transactions on Information Systems, Vol. 10, No. 2, April 1992, pp. 142-180.

[7] Calvi, L, De Bra, P. Improving the Usability of Hypertext Courseware through Adaptive Linking, Proceedings of the Eighth ACM Conference on Hypertext, pp. 224-225, Washington DC, 1997.

[8] Catledge L, & Pitkow J. “Characterizing Browsing Strategies in the World Wide Web”, In the Journal of Computer Networks and ISDN Systems, Vol 27, pp 1065-1073, Elsevier Science, 1995.

[9] Charney, D. “Comprehending non-linear text: The role of discourse cues and reading strategies”. In Proceedings of the Hypertext ’87 Conference (Charlotte, N.C., Nov. 13-15, 1987). ACM, New York, 1987, pp. 109-120.

[10] Chen, S. Y. (2002) A Cognitive Model for Non-linear Learning in Hypermedia Programmes. British Journal of Educational Technology. 33(4), 453-464.

[11] Collins, A M., & Quillian, M.R (1969). Retrieval time for semantic memory. Journal of Verbal Learning and Verbal Behaviour, 8, 240-247.

[12] Conklin, J. Hypertext: An introduction and survey. IEEE Computer, 20, 9 (Sept. 1987), 17–40.

[13] Darken, P., Sibert, J. 2001, Wayfinding Strategies and Behaviors in Large Virtual Worlds. In Proceedings of the 10th international conference on WWW. Hong Kong, 326 – 333.

[14] David Canter, Rod Rivers, and Graham Storrs, “Characterizing User Navigation through Complex Data Structures.” In Behavior and Information Technology, Vol. 4, No. 2, pp. 93-102. 1985.

[15] Dix, Finlay, Abowd, Beale: Human Computer Interaction, Prentice Hall, U.K., 1998

[16] Ehud Rivlin and Rodrigo Botafogo and Ben Shneiderman. “Navigating in hyperspace: designing a structure-based toolbox”. Communications of the ACM, Vol. 37, No.2, pages: 87-96, 1994.

[17] Elm, W.C., Woods, D.D., 1985. Getting lost: a case study in interface design. Proceedings of the Human Factors Society, ACM Press, 927-931.

[18] Fillion F., Boyle C. Important issues in hypertext documentation usability. Proceedings of the 9th annual international conference on Systems documentation, Chicago, Illinois, United States, pages: 59–66, 1991.

[19] Foltz, M. A. and Davis, R. (1999). Design principles for navigable information spaces. Summarizes material found in Designing Navigable Information Spaces.

[20] Griffin J, Randolph G. Web Experience And Hypermedia Structure In Online Learning, Proceedings of the seventh annual consortium on Computing in small colleges midwestern conference, 2000, pp. 44 – 53.

[21] Halasz F.G., Moran, T.P., and Trigg, R.G. “NoteCards in a nutshell”. In Proceedings of the ACM CHI + GI’87 (Toronto, Ont., Apr. 5-9, 1987). ACM, New York, 1987, pp. 45-52

[22] HARARY, F., NORMAN, R. Z., AND CARTWRIGHT, D. Structural Models: An Introduction to the Theory of Directed Graphs. Wiley, New York, 1965

[23] Hongjing Wu, Erik de Kort, Paul De Bra. Design Issues for General-Purpose Adaptive Hypermedia Systems,

[24] Johnson-Laird, P.N. Mental models. In M.I. Posner, Ed., Foundations of Cognitive Science. MIT Press, Cambridge, MA, 1989, 469–499.

[25] KAUFMANN, A. Graphs, dynamic programming and finite games. In Mathematics in Science and Engineering 36. Academic Press, New York, 1967.

[26] Kevin Lynch, (1960). The Image of the City. Cambridge, Massachusetts: The MIT Press.

[27] Lokuge, I., Gilbert, S. A., and Richards, W. 1996. Structuring information with mental models: A tour of Boston. In Proceedings of SIGCHI ‘96, 413-419. New York: ACM.

[28] Modjeska D, Marsh A. Structure and Memorability of Web Sites, Working Paper in the Department of Industrial Engineering, Toronto, University of Toronto, 1997.

[29] Nielsen J, Usability Engineering, Academic Press, UK, 1993

[30] NIELSEN, J. The art of navigating through hypertext. CACM 33, 3 (March 1990), 296–310.

[31] Otter, M. & Johnson, H. (2001): Lost in hyperspace: metrics and mental models. Interacting with Computers 13 (1), 1-40

[32] Reed, W.M., & Oughton, J.M. “Computer Experience and Internal-based Hypermedia Navigation”. Journal on Computing in Education, 30(i), 1997.

[33] Reynolds, S.B., & Dansereau, D.F. “The knowledge hypermap: An alternative to hypertext,” Computers in Education, (14), 1990, pp. 409-416.

[34] Riding, R., & Rayner, S.G. (1998). Cognitive styles and learning strategies. David Fulton Publisher, London.

[35] Rudolph P. Darken and John L. Sibert, “Wayfinding Strategies and Behaviors in Large Virtual Worlds.” In Human Factors in Computing Systems: Proceedings of CHI '96. New York: ACM, 1996.

[36] Searleman, A., & Herrman, D. Memory from a Broader Perspective, McGraw-Hill Press, New York, 1994

[37] Sougata Mukherjea and James D. Foley, “Showing the Context of Nodes in the World-Wide Web.” In Human Factors in Computing Systems: Proceedings of CHI '95 (Companion Volume). New York: ACM, 1995.

[38] Sougata Mukherjea and James D. Foley. “Showing the Context of Nodes in the World-Wide Web.” In Human Factors in Computing Systems: Proceedings of CHI '95 (Companion Volume). New York: ACM, 1995.

[39] Theng, Y.L., "Lost in hyperspace? A look at four viable approaches," HCI'95 Adjunct Proceedings, U.K., 1995.

[40] Thuring, M, Hannemann, J, & Haake, J. M. Hypermedia and cognition: designing for comprehension, Communications of the ACM, (38:8), 1995, pp. 57-66.

[41] Thüring, M., Haake, J.M., and Hannemann, J. What’s ELIZA doing in the Chinese Room? Incoherent hyperdocuments–and how to avoid them. In Proceedings of Hypertext ’91. ACM Press, New York, 1991, pp. 161-177.

[42] Tomek and H. Maurer. Helping the user to select a link. Hypermedia, 4(2):111-122, June 1992.

[43] Van Dijk, T.A., and Kintsch, W. Strategies of discourse Comprehension. Academic Press, Orlando, 1983.

[44] Woodhead, N., Hypertext and Hypermedia: Theory and Applications, Sigma Press, U.K., 1991.

License

This article, along with any associated source code and files, is licensed under The Common Development and Distribution License (CDDL)