The following is an excerpt from Keeping found things found: The study and practice of personal information management, by William Jones. It is reprinted here with permission from Morgan Kaufmann, a division of Elsevier; Copyright 2008. Read the excerpt below or download a free .pdf of "Personal information management: Finding information -- again."
This chapter excerpt is about personal information management, which, on a basic level, is not so different from the more thoroughly covered enterprise information management. We have included this chapter excerpt in SearchDataManagement.com's Chapter Download Library because the fundamentals behind collecting, organizing, archiving and finding information are somewhat comparable for both corporate and personal data.
4.1 Starting out
"Where is it? I know it's here somewhere. Where on earth did I put it?" "It" can be the car keys or a pair of shoes. "It" can be an information item such as an email message or a tax-related document. Certainly some of the more painful, memorable failures of personal information management relate to failures to find information that "I know is here . . . somewhere." In my [personal information management] seminars, people report efforts to find information -- especially paper documents such as a title to an automobile they wish to sell or a birth certificate or a passport -- that may extend over a period of a week or more. Finding over such a period of time may be a collaborative effort involving various members of a household or an office work team. Failures to find can be a real source of discord as frustration mounts and accusations and recriminations are exchanged ("What did you do with it?").
If finding activities, with their focus on the location, output, and use of information, represent an endpoint in personal information management, their study is also a natural place to begin taking a closer look at [personal information management]. The finding stage makes apparent, sometimes painfully so, larger failings in a person's practice of [personal information management].
We are aware of larger acts of finding especially when these involve significant time and creative effort to complete or when these involve a conscious search as supported, for example, by a web service or a desktop facility. But large portions of a day can be consumed in many smaller acts of finding -- the many look-ups required to fill out a form, for example, or the repeated references to a calendar in an effort to schedule a meeting. Though the many smaller acts of finding may each take only a little time, together they can still add up to a significant portion of a day.
In this chapter, we will see that finding is as much about interaction as about end result. There are two senses in which this is true. First, we can find the targeted information and yet still feel a sense of failure unless the process of finding the information was reasonably short, pleasurable, and trouble-free. Second, even though our focus is on the targeted information, we may gain considerable benefit from the incidental interactions with information along the way to this targeted information. To take a simple example, if a person checks her calendar to see when today's staff meeting will take place, she may happen to notice that another meeting is also scheduled for later today. More generally, the path taken to targeted information can be a source of serendipitous discovery. Finding is about the journey as well as the destination.
These topics are explored in this chapter as we move through the following sections:
- Getting oriented. Research on finding, as a personal information management activity, is placed in the context of a larger field of research on information seeking. The challenges of finding and the opportunities for tool support vary according to whether we've experienced the items we seek before and whether these are in a store that we own and that is (nominally at least) under our control. The focus in this chapter is primarily on efforts to find (re-find) information that we've experienced before and that is inside our personal space of information (PSI), in a store that we control. This section also considers the essential movement that underlies finding activities: from a current need to the access and use of information that meets this need.
- Everyday finding: Death by a thousand look-ups. We consider the many small acts of finding that can add up to a significant proportion of a day's time -- and its frustrations.
- Finding is multistep. Any act of finding involves an interplay between recall and recognition. We recall something about the item we are trying to find that might help to narrow the scope of a scan to recognize the item (e.g., in a folder or search results listing). Also, the finding process overall must often repeat several times so that a complete set of items is assembled. Finally, we must remember to find in the first place. Finding can fail because of the failure of any of these steps.
- The limitations in ideal dialogs of finding. We consider an ideal finding dialog with the computer (i.e., with a computer-based search tool) to be one that is much like the dialog we might expect to have with a well-trained human assistant. The computer can make use of anything we can recall. The computer orders and represents candidate items to aid us in our recognition of the item we seek. Support of a dialog like that between two people has many advantages but also one fundamental limitation: we may not always know or be able to express the things we need to find.
- Wayfinding through the PSI. We explore wayfinding as an alternate, complementary metaphor to finding as a dialog. The wayfinding metaphor gives emphasis to finding as a journey through the PSI -- from a need and the situation prompting this need to information and then back again. As with any journey, a journey through the PSI can be serendipitous, yielding useful information we did not expect and would not have thought to ask for.
What is information finding? We begin with Wilson's definition for information seeking (2000):
This definition emphasizes the teleological or purposeful nature of finding: finding to satisfy a goal. Certainly goals matter. Failure to reach the goal, failure to find the sought-for information, is frustrating, costly, and memorable. For example, Sellen and Harper (2002) review studies suggesting that the average manager spends three hours a week looking for documents that have been misfiled.
But in a large number of everyday instances of finding, the end -- if defined as eventually getting the information -- is not in doubt. The information will be found one way or another, sooner or later. The bigger questions are how, and how long it will take. Can we find information easily and in the natural course of our efforts to get things done? Or is finding a separate, time-consuming, disruptive experience that takes us away from the other activities of a day?
As a supplement to Wilson's definition, this chapter takes the following slant on the finding activities of [personal information management]:
Certainly the efforts of people to re-find information that they have previously saved involve interactions with the PSI. But a person's PSI is also invariably involved in efforts to find new information from a public space of information. The need behind a search on the Web for hotels in Montreal, for example, may be triggered by the glance at a personal calendar indicating that the trip to Montreal is only a month away. The search itself may be made easier, if ever so slightly, by having a favorite search service referenced on the home page -- a part of the PSI. Or the web page may be quickly accessed through a web address that is specified by a keystroke or two and completed by an "auto-complete" suggestion of the browser. The information the browser uses to complete the address is also part of the PSI.
Information, once found, is often applied to an item in the PSI such as an email message or a document being created. Also, the information itself or a path to it may be kept in the PSI so that it can stay found -- easier and more likely to be re-accessed later. The "marks" (impacts) of information found may eventually be on, or in, the person in the form of new understanding or new internal knowledge structures. In the meantime, however, people can at least hope that the marks of information found are in their PSIs.
4.2.1 Finding vs. re-finding in public stores vs. private stores
Activities of information finding can be placed in one of four quadrants in Table 4.1 according to two senses of "personal" as described in Chapter 2:
- Is the information controlled ("owned") by us? Most of us control (in principle) the information on the hard drive of our personal computer. We can move or delete it (though many times we're properly reluctant to do so for fear of creating problems). Most of us control very little if any information on the Web. We can't move or delete it.
- Have we experienced (seen) the information before?
Activities of finding in each quadrant are important to a practice of personal information management. Each quadrant presents its own challenges and its own opportunities for tool support.
A. Re-finding information we control (and have seen before)
Getting back to "my" information is especially important in [personal information management]. We took the trouble to keep it and sometimes to create it in the first place. We're not done with it yet and may never be. We may have partially read an email message, for example, or a document, with the intention to read more thoroughly later. Some information -- articles, contacts, or a spreadsheet with passwords -- may be general reference to which we return repeatedly. Desktop search facilities can help, especially if these are integrative in their ability to search across multiple information forms, and even more so as these tools are increasingly integrated into our use of other tools. Generalized support for automatic completions and fill-ins can also help in the reuse of smaller pieces of information such as email addresses, phone numbers, and account numbers (see the "What next for tool development" sidebar). Also relevant are efforts to automate the tagging and grouping of information items by task and efforts to situate information items in the context of the planning of a larger project.
Table 4.1 An event of finding can vary according to where targeted information is and whether the person is trying to find this information again
|The information is...|Controlled by us|Not controlled by us|
|---|---|---|
|Seen before by us|A|B|
|Not seen before by us|D|C|
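The quadrants of Table 4.1 amount to a two-way classification of any act of finding. As a minimal sketch in Python (the function name `finding_quadrant` and its two boolean parameters are illustrative assumptions, not terms from the book), the mapping could be written as:

```python
def finding_quadrant(seen_before: bool, controlled_by_us: bool) -> str:
    """Return the Table 4.1 quadrant letter for an act of finding."""
    if seen_before:
        # Top row of the table: re-finding.
        return "A" if controlled_by_us else "B"
    # Bottom row of the table: finding information not seen before.
    return "D" if controlled_by_us else "C"


# Re-finding a document saved on our own hard drive falls in quadrant A.
print(finding_quadrant(seen_before=True, controlled_by_us=True))
```

The sketch treats both questions as yes-or-no, whereas the text notes that control is often only nominal; a real classification would likely be fuzzier.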
B. Re-finding information on the Web
We also return to information on the Web. In a study covering six weeks of web visits for 23 participants, Tauscher and Greenberg (1997a, 1997b) found that there was a 58 percent likelihood that the next page seen by a person was a page the person had already accessed at some point in the past. Jones et al. (2003) found that people use a variety of different methods and supporting tools for returning to Web information. Especially popular are "do-nothing" methods that require no keeping forethought. These methods include (1) clicking through hyperlinks from a familiar starting point such as a web portal or home page, (2) searching again, and (3) relying on "auto-complete" facilities that suggest completions to a partially typed web address, where completions are drawn from web addresses for pages previously visited.
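The third "do-nothing" method, address auto-completion, can be illustrated with a small Python sketch. It is only an assumption about how such a facility might work, not a description of any particular browser: the function name `suggest_completions`, the frequency-based ordering, and the sample history are all hypothetical.

```python
from collections import Counter


def suggest_completions(partial, visited, limit=3):
    """Suggest previously visited addresses that start with `partial`.

    Addresses visited more often come first; ties are broken
    alphabetically so that the result is deterministic.
    """
    counts = Counter(visited)
    matches = [url for url in counts if url.startswith(partial)]
    matches.sort(key=lambda url: (-counts[url], url))
    return matches[:limit]


# Hypothetical browsing history that includes repeat visits.
history = [
    "example.com/hotels",
    "example.com/hotels",
    "example.com/maps",
    "example.org/news",
]
print(suggest_completions("example.com", history))
```

Because the suggestions come entirely from the person's own visit history, the facility needs no keeping forethought: the history accumulates as a side effect of ordinary browsing.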
C. Finding new information (not seen before) on the Web and other public stores
There is an extensive body of work on information seeking and information retrieval that applies especially to events in the C quadrant. It is important here to note that there is a strong personal component even in efforts to find new information, never before experienced, from a public store such as the Web. For example, our efforts to find information may be directed by an outline or a to-do list that we maintain in our personal space of information. Access to new information items may be through a query that we maintain in our personal space as a bookmark or even as a list of words we keep in written form (or "in mind"). Much more can be done to use existing personal information in efforts to find new information from a public space. Equally important, much more can be done to situate our searches on the Web with respect to the informational situations in the PSI that prompt these searches -- a topic of further exploration throughout this book.
D. Finding new information that we control
The amount of information we (ostensibly) control continues to increase along with increases in the capacities of storage devices we own. When we use a desktop search facility, we may be surprised by what we find -- by what we "have" already. One challenge in tool support, discussed later in this chapter, is to call a person's attention to information he or she has already and that may be relevant to the current situation -- and to do this without becoming a nuisance.
Note that two other chapters in the book stand in different, complementary relationships to the current chapter. Problems experienced during finding often originate as earlier failures of keeping and organizing, as explored in Chapter 5. Searching technology, as explored in Chapter 11, can support finding in ways less dependent on careful prior keeping and organization of information.
What are the differences, really?
How much do differences in where and how we find information matter as long as we get the information we need? Certainly there's a difference between new finding and re-finding. If people have a specific item in mind, their search is more focused. They have memories from previous encounters that they can use (or ought to be able to use) in order to narrow the scope of the current search.
What about differences between finding information we control vs. finding the information "out there" on the Web and other public repositories? We may increasingly have the experience of finding new information inside our PSI in a store that we control. Such information, though newly found, is often information we ought to have experienced (that is, known was there to be found) or might want to have experienced, even if we haven't -- so far. Our reaction, for example, to the discovery of an unread email message sent to us by a friend a year or two ago -- even if only an "fyi" pointer to a web site of possible interest -- is likely different from our reaction had we discovered a pointer to the same web site in someone's blog instead. The email message is directed to us personally.
Most important, the experience of failing to re-find an information item is different when the targeted item resides in a store we control vs. the Web or some other public store. We're less surprised when a Web page visited yesterday is not available today. An access failure could be for any number of reasons beyond our control -- frustrating, to be sure, but "these things happen."
Do we generally show the same kind of equanimity when information under our control (nominally at least) can't be re-found? Leave aside documents we have authored which may represent many hours of our own work. Consider, instead, an article written by someone else which could just as easily be found on the Web as on a local hard drive. The failure to find this article inside the PSI often stands for something much larger than the loss of the information in the article itself. Failure can come to represent a larger failure in our lives . . . a loss of control. "Where on earth is it? Am I losing my mind, too?" We take it personally.
This chapter's discussion will focus primarily on finding in quadrant A -- where the effort is to find (re-find) information we've already experienced from a store that is under our control.
4.2.2 From need to information (and back again)
We're almost done with this section's brief orientation. The remaining task is to consider two points in relation to this chapter's subtitle.
- The journey from need to information is round-trip. Once information is found, it is either applied -- "used" -- in the situation that prompted its retrieval, or it is possibly kept for use later. Activities of finding and re-finding need to be situated in this larger context.
- Our needs change with every step we take. Our own understanding of a current need, as reflected, for example, in our descriptions or in our seeking behavior, is constantly changing. Many changes in our understanding of a need are brought about by the actions of finding (seeking) itself and by the information retrieved.
The journey from need to information is round-trip
Information management and use are interwoven. This central theme of the book encourages us to think beyond just the location and access of information. Information revealed through browsing or "web surfing" or as referenced in a results listing returned by a search query is subject to several different keeping decisions. Is this information useful? Now, for a current need, or later for an anticipated need? If later, does anything need to be done now to ensure its availability later on? Do reminders need to be set? Should the information or a pointer to this information be kept?
But considerations of the return trip apply even if information is used immediately and then discarded. How is the information used? Is the information sent out in an email message or used in a document? Is information used as is, or are steps taken first to interpret, make sense of, and integrate the information into a larger document?
We might dream of a high-fidelity Copy & paste (or Drag & drop) in which the information found is copied into a new information item without loss of formatting. Better, the reference to this information is also copied. Portions of this reference can automatically be included, for example, in a document's bibliography. Other portions of the reference might be included in a hyperlink that makes it easy to get back to the source for more information as needed and possibly with the excerpted information highlighted in context. In some cases, we might even want to subscribe to updates in the content of web sites from which information is excerpted.
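One way to picture such a reference-preserving paste is as a small data structure in which the excerpt travels together with its source. The Python sketch below is purely illustrative: the `Excerpt` and `Document` classes, their fields, and the bibliography format are assumptions made for this example, not an existing tool or API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Excerpt:
    """A copied passage that carries a reference to its source."""
    text: str
    source_title: str
    source_url: str


@dataclass
class Document:
    """A document that records where each pasted passage came from."""
    body: list = field(default_factory=list)
    bibliography: list = field(default_factory=list)

    def paste(self, excerpt: Excerpt) -> None:
        """Insert the excerpt text and list its source once."""
        self.body.append(excerpt.text)
        entry = f"{excerpt.source_title} <{excerpt.source_url}>"
        if entry not in self.bibliography:
            self.bibliography.append(entry)


# Hypothetical usage: two passages pasted from the same source.
doc = Document()
doc.paste(Excerpt("Rooms from 90 euros.", "Hotel guide", "https://example.com/hotels"))
doc.paste(Excerpt("Breakfast included.", "Hotel guide", "https://example.com/hotels"))
print(doc.bibliography)
```

Keeping the source URL with the excerpt is also what would make the return link possible, since the same field could back a hyperlink to the original.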
Why stop here? Why not also situate an act of finding itself with respect to an informational context? For example, we see a line item in a spreadsheet budget and send an email to our group's financial person for clarification -- an act of finding. Why not record this act on an optional overlay to the budget's display? Later, when we want to review the response to our inquiry, we then have two ways to return to the email we sent and its responses: either go back to the context (e.g., the spreadsheet budget) that prompted us to send the email, or try to access email responses from the context-free jumble of the inbox. Which would you pick?
Or, as another example, we see a web site describing a new product and are moved to search for blogs giving commentary on the product. Rather than carefully saving useful results in a separate document or through a separate bookmarking facility, why not save them as an overlay to the web site that prompted us to search?
Alas, current support for the "return trip" -- from information found back to the situation that prompted its finding -- still falls far short of what we might hope for. Finding tools such as search facilities and email applications still function more as worlds unto themselves rather than as an integral part of our informational context. As we use these tools to access needed information, we're still mostly on our own in our efforts to return to the informational context prompting the finding in the first place. And, as Karger (2007) notes, the transfer of information from the source back to the current situation through Copy & paste (Drag & link) is often still text-only. Or worse, we get information in paper form or as a scanned image and must then either transcribe or attempt to use character recognition software.
The larger point here is that we need to think of finding activities as part of a larger journey: from need to information and then back again, to need and the use of information to meet this need. Doing so raises practical questions that are easily overlooked if the focus is only on finding the information itself.
Needs change with every step we take
Even our effort to describe a need creates a refinement in our understanding of it and, sometimes, leads us to abandon the need altogether. "I want to see information on Paris's nicest hotels. . . . On second thought, maybe I'd rather stay at a little pension in a quiet neighborhood."
Belkin (1993) notes, "There is by now a substantial literature, from both theoretical and empirical perspectives, on the non-specifiability of information 'needs'" (p. 59). Several models of information seeking move beyond what Belkin refers to as "the standard view of IR" (information retrieval) in which a person approaches an information system with a well-specified query and "the major issues of concern" are "the representation of texts, and of queries, and techniques for the comparison of text and query representation" (p. 56). Interactions with an information system can be characterized as a process of negotiation, a dialog, or a process in which a person's knowledge changes through interactions with the information retrieved, leading, in turn, to a reassessment of information need.
Especially evocative is M. J. Bates's berry-picking model of search (1989), depicted in Figure 4.1, to account for situations in which search interest and the expression of this interest evolve over time as a function of results returned by previous searches. For example, a search for "hotels in Paris" may return one result for a bed and breakfast inn in Paris. The person might then have the thought, "Hmmm, maybe I would like staying at a bed and breakfast better than at a large hotel," which is then reflected in a follow-up search for "bed and breakfast inns, Paris." The new query reflects a shift in interest -- affected by previously returned results but not simply an attempt to search within these results.
The concept of information need is itself subject to many interpretations. In the dialog between student and teacher, for example, who has a better understanding of the student's information needs with respect to an assigned term paper? Taking a behaviorist's position, Wilson (1981, 2005) suggests abandoning the concept of "information need" altogether in favor of only focusing on observable information-seeking behavior. Dervin (1992) refers instead to "sense-making" activities as motivated by a person's perception of a gap in his or her understanding of a situation.
Figure 4.1 The berry-picking model describes a situation in which a person's search query (Qn) wanders as a function of results returned from previous queries.
However, it is difficult to assess the effectiveness of various finding activities or the tools designed to support them without making some assumptions concerning the motivation behind and desired outcomes of these activities.
Rather than abandoning the notion of need, it may be better to acknowledge that the assessment of need changes radically by person and in the same person with every step taken. The student now perceives only a need to do the bare minimum in research required to get a passing grade so that he can have more time for soccer practice. He may perceive a different need in 10 years when he is looking for a job. Even now he may be persuaded by the teacher to do more work, or maybe he will continue researching just for the fun of learning more about the term paper topic.
No matter what the need (or motivation), even if the student does the bare minimum, he will face the same basic problems of finding. Moreover, problems of finding arise in many everyday situations having little to do with canonical information-seeking situations where the search is for information in a library or on the Web. It's time to talk about everyday finding.