Textpresso User Guide

[Advanced Retrieval, 680K]


[Simple Retrieval] [Contents] [Fact Extraction]









Advanced Retrieval

The Advanced Retrieval search engine is a powerful Textpresso tool for real-time information retrieval from the C. elegans literature. Like Simple Retrieval, it combines keywords with ontology categories when performing searches. In addition, Advanced Retrieval lets the user specify boolean operators, ontology category subgroups, and the number of occurrences of an ontology category or keyword in the search. Used properly, Advanced Retrieval facilitates the formulation of queries that are closer to semantic questions than those of any other search engine for biological literature. A well-designed Advanced Retrieval query allows the user to ask questions such as "What genetic expression affects Ras signaling?". To make optimal use of the tool, the user must have a good understanding of the meaning of each ontology category and its subgroups.

A few notes before we start:

  1. The first row in the query table must be filled out; otherwise no matches will be returned (this prevents the user from performing just a keyword search with the Advanced Retrieval tool). Any other rows left empty will not affect the query.
  2. The Advanced Retrieval search engine processes rows from top to bottom. In other words, the matches of the first row are found first; the system then applies the boolean operator of the next filled row (if one is present), matches that row against the results returned so far, and so on.
  3. It is possible to generate some very large returns with the Advanced Retrieval tool. The goldturtle server is set to time out after 5 minutes, so, until we get a better server, try queries with large returns at times of low traffic (e.g. at night) for best results.
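
The top-to-bottom processing described in notes 1 and 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Textpresso's actual implementation: the row shape, the `run_rows` function, and the `OR` and `NOT` operators are hypothetical (the guide itself only names `AND` explicitly), and each row is reduced to a simple predicate over a sentence.

```python
from typing import Callable, List, Optional, Set, Tuple

# Hypothetical row shape: (boolean operator, predicate over one text unit).
# The operator on the first row is ignored because nothing precedes it.
Row = Tuple[str, Callable[[str], bool]]


def run_rows(units: List[str], rows: List[Optional[Row]]) -> Set[str]:
    """Evaluate query rows top to bottom against a list of text units."""
    # An empty first row returns no matches, as note 1 states.
    if not rows or rows[0] is None:
        return set()

    _, first_predicate = rows[0]
    result = {unit for unit in units if first_predicate(unit)}

    # Rows left empty (None) are skipped and do not affect the query.
    for row in rows[1:]:
        if row is None:
            continue
        operator, predicate = row
        matches = {unit for unit in units if predicate(unit)}
        if operator == "AND":
            result &= matches
        elif operator == "OR":      # assumed operator, not named in the guide
            result |= matches
        elif operator == "NOT":     # assumed operator, not named in the guide
            result -= matches
        else:
            raise ValueError(f"unknown operator: {operator}")
    return result


sentences = [
    "let-60 encodes a Ras protein.",
    "Ras signaling controls vulval induction.",
    "lin-3 is expressed in the anchor cell.",
]
rows = [
    ("", lambda s: "ras" in s.lower()),        # first row: keyword "Ras"
    None,                                       # empty row, ignored
    ("NOT", lambda s: "vulval" in s.lower()),   # applied to the result so far
]
print(run_rows(sentences, rows))
```

Because each row is combined with the result accumulated so far, the order of the rows matters whenever different operators are mixed.
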
The output of the search is displayed on a "Summary Page", which summarizes details of the publications that contain matches. The user selects which matches they would like to view, and the results are shown on the "Results Page".

Read on to find out how to use this powerful tool!

Advanced Retrieval search features

1. Search Buttons

Clicking on the "Search!" button initiates a search with the entered parameters. There are three other buttons. The "Load last query!" button remembers the last user query and loads it again. The "Clear last query" button sets the query table back to its default values. The "Undo current changes!" button allows the user to undo the last change they made to their search.



2. Text Corpus Selection and Author/Year Search

The user has the option to choose to search any combination of the "Titles", "Abstracts" and "Papers" by selecting the boxes beside these options.

3. Sentence Search vs Publication Search

The user has the option to match the search parameters within individual sentences or across entire publications. Note: this only applies when "Abstracts" and/or "Papers" are selected as the text to search.
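
The difference between the two scopes can be illustrated with a small Python sketch. It assumes a publication is just a list of sentences and that matching is a plain case-insensitive substring test; the `matches` function and its `scope` values are hypothetical names, and the real engine matches ontology categories as well as keywords.

```python
from typing import List


def matches(publication: List[str], terms: List[str], scope: str) -> bool:
    """Return True if all terms are found within the chosen scope."""
    if scope == "sentence":
        # Every term must occur together in at least one single sentence.
        return any(
            all(term in sentence.lower() for term in terms)
            for sentence in publication
        )
    if scope == "publication":
        # Terms may be spread over different sentences of the publication.
        text = " ".join(publication).lower()
        return all(term in text for term in terms)
    raise ValueError(f"unknown scope: {scope}")


paper = [
    "let-60 encodes a Ras protein.",
    "Vulval induction requires signaling.",
]
print(matches(paper, ["ras", "vulval"], "sentence"))
print(matches(paper, ["ras", "vulval"], "publication"))
```

In this toy case the two terms never share a sentence, so the sentence search finds nothing while the publication search reports a match. Sentence searches are therefore the stricter of the two.
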

4. Setting the number of category and keyword rows in the query table

The user may change the number of ontology category rows and keyword rows in the query table. The default number (three rows of each) may be changed on the customization page. Alternatively, to set the number of rows temporarily, use the "Reproduce query table" button at the bottom of the Advanced Retrieval search page, setting the category and keyword row numbers as desired.

5. The query table features



Search Summary Page Features






1. Matches Information

This is where the number of matches for the search parameters is displayed. Whether the system is searching sentences or publications, the total number of sentences containing the search parameters is displayed. Also shown is the total number of publications that contain one or more hits.

2. Summary Display Controls

The summary page is returned with ten summaries displayed per page (this is the default; the number of summaries displayed per page can be customized on the "Customization" page). The summary display controls allow the user to display any page by selecting it from the drop-down menu and pressing the "Display" button. Alternatively, the user can navigate the summary pages using the "Previous" and "Next" buttons.
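
As a rough illustration of how the page menu relates to the number of summaries, the following Python sketch splits a list of summaries into pages of ten. The function names are hypothetical and the code is not taken from Textpresso; only the default of ten per page comes from the text above.

```python
import math
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def page_count(total_items: int, per_page: int = 10) -> int:
    """Number of pages needed to show all items."""
    return math.ceil(total_items / per_page)


def get_page(items: Sequence[T], page: int, per_page: int = 10) -> List[T]:
    """Return the items shown on a 1-based page number."""
    last_page = max(1, page_count(len(items), per_page))
    if not 1 <= page <= last_page:
        raise IndexError(f"page {page} is outside 1..{last_page}")
    start = (page - 1) * per_page
    return list(items[start:start + per_page])


summaries = [f"summary {i}" for i in range(1, 26)]
print(page_count(len(summaries)))   # pages offered in the drop-down menu
print(get_page(summaries, 3))       # the last, partially filled page
```

With 25 summaries the menu would offer three pages, and the third page holds only the remaining five summaries.
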

3. Email Settings

The user can opt to have the summary page sent to them via email. To do this, the user must enter their email address in the text box and press the "E-mail" button. If the "include matches" option is selected, the email will also contain the resulting matches from the search. Beware, this can result in very large emails!

4. "View all matches" Button

Clicking this button brings the user to a results page containing all the results for a given search.

5. Abstract Expansion Buttons

By default (and for the sake of clarity) only the first two sentences of an abstract are displayed in the "Abstract" column. Pressing the "Expand abstract" button will display the full abstract in the column. The full abstract can be collapsed again by pressing the "Collapse abstract" button.

6. "View matches" Button

Clicking this button brings the user to a results page containing the results for that particular publication.

7. "PDF" Button

Clicking on the "PDF" button displays the PDF version of that publication (only available to Caltech users).

8. "Related articles" Button

This button links out to the PubMed page of citations that are related to that particular publication.

9. "Results in PDF" Button

Clicking on the "Results in PDF" button brings the user to a web page where they can opt to download all the resulting hits for their query in PDF format. (This may take a few minutes, depending on the number of resulting hits.)



Result Page Features






1. Query Display

At the top of the results page the search query is displayed. Note that the different search parameters are displayed in different colors as a visual aid.

2. Results Display

The results display is returned with ten matches displayed per page (this is the default; the number of matches displayed per page can be customized on the "Customization" page). The display controls allow the user to display any page by selecting it from the drop-down menu and pressing the "Display" button. Alternatively, the user can navigate the results pages using the "Previous" and "Next" buttons.

3. Publication Identifier

The "File ID" identifies the publication from which the match comes, according to WormBase abstract nomenclature. The type of publication is displayed in parentheses after the File ID, e.g. Abstract.

4. Sentence Identifier

The "Sentence ID" specifies the sentence number of the match in the publication.

5. Search Matches

The matching sentences are displayed in boldface font. For the sake of context, the sentences that surround the match in the publication may also be displayed (the default number is ten; the number of surrounding sentences displayed per match can be customized on the "Customization" page).
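
The way a match is shown inside its surrounding sentences can be sketched as a window around the matching sentence. The Python below is a minimal illustration, not Textpresso code. In particular, it assumes that the configured number counts sentences on each side of the match, which the guide does not specify, and `with_context` is a hypothetical name.

```python
from typing import List, Tuple


def with_context(
    sentences: List[str], match_index: int, context: int = 10
) -> List[Tuple[int, str, bool]]:
    """Return (sentence number, text, is_match) for a match and its neighbours.

    Assumption: `context` is the number of sentences shown on EACH side.
    """
    start = max(0, match_index - context)
    end = min(len(sentences), match_index + context + 1)
    return [
        (i, sentences[i], i == match_index)   # is_match marks the bold sentence
        for i in range(start, end)
    ]


publication = [f"Sentence {i}." for i in range(30)]
for number, text, is_match in with_context(publication, match_index=2, context=3):
    print(number, "**" + text + "**" if is_match else text)
```

The window is clipped at the start and end of the publication, so a match near the first sentence shows fewer preceding sentences.
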

6. Links to WormBase

Some words and terms in the matching sentences will link to their corresponding report pages in WormBase, a database repository for the biology and genome of C. elegans.



Performing a Text Search

Below is an example of a text search using the Advanced Retrieval tool in Textpresso. Please see the "Examples" section for many more examples of Textpresso searches.

What genetic expression affects Ras signaling?

A number of ontology classes are employed to formulate a query that asks, "What genetic expression affects Ras signaling?". In the Textpresso Ontology, terms that indicate expression are contained in the Biological Process ontology class and have the attribute value "expression". Therefore, in the first row, where the ontology class "Biological Process" is chosen, the attribute "biosynthesis" with the value "expression" is assigned to the ontology class. The next row specifies that the match must also contain (by selecting the AND operator) one or more ("greater than 0") directly named genes. The selection of the ontology class "Effect" in the third ontology row defines the relationship between the two entities: the named gene and "Ras", which is entered as a keyword in the fifth row to be matched exactly. The inclusion of one or more occurrences of a "Pathway" class term in the fourth row serves to refine the search to Ras signaling. Substituting a different "relationship" ontology class in the third row, such as "involvement" or "purpose", produces subtly different queries: "How is genetic expression involved in Ras signaling?" and "What role does genetic expression play in Ras signaling?" respectively. The differing results from each of these three subtle variations are shown below:




What genetic expression affects Ras signaling?







How is genetic expression involved in Ras signaling?







What role does genetic expression play in Ras signaling?
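
For readers who find it easier to follow the query table as data, the five rows of the Ras query described above, and the two variants obtained by swapping the relationship class, can be written down in Python as follows. This is a hedged sketch only: the `QueryRow` fields and the `with_relationship` helper are hypothetical, the category name `"gene"` is a placeholder for the directly named genes category, and the `AND` operator on rows three to five is an assumption, since the guide names the operator only for the second row.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class QueryRow:
    operator: Optional[str]           # None on the first row
    kind: str                         # "category" or "keyword"
    value: str                        # ontology class name or keyword text
    attribute: Optional[str] = None   # e.g. "biosynthesis=expression"
    min_occurrences: int = 1          # "greater than 0" in the query table
    exact: bool = False               # exact keyword matching


RAS_QUERY: List[QueryRow] = [
    QueryRow(None, "category", "Biological Process",
             attribute="biosynthesis=expression"),
    QueryRow("AND", "category", "gene"),       # placeholder category name
    QueryRow("AND", "category", "Effect"),     # operator assumed
    QueryRow("AND", "category", "Pathway"),    # operator assumed
    QueryRow("AND", "keyword", "Ras", exact=True),
]


def with_relationship(query: List[QueryRow], relationship: str) -> List[QueryRow]:
    """Return a copy of the query with the third row's class swapped."""
    variant = list(query)
    variant[2] = replace(variant[2], value=relationship)
    return variant


involved_in = with_relationship(RAS_QUERY, "involvement")
role_of = with_relationship(RAS_QUERY, "purpose")
print([row.value for row in involved_in])
print([row.value for row in role_of])
```

Only the third row differs between the three queries; every other row, including the exact keyword match on "Ras", stays the same.
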




