Example Searches in Textpresso







Examples


If you have any comments or would like to see examples of more searches please email Textpresso to let us know!

Examples of Advanced Retrieval Searches

A few words before we start...

The Textpresso Advanced Retrieval search engine is a powerful tool for finding facts in the C. elegans literature. Its interface lets you use the full depth of the Textpresso ontologies and refine queries with boolean operators, keywords, and the number of times a term or keyword must occur. It may take a little practice to use this feature optimally. Referring to the Textpresso ontology category definitions is often helpful. The examples below demonstrate what the system can do. We hope you have fun trying out queries. These examples and the corresponding User Guide section should provide a solid introduction to this tool. The Textpresso team is also on hand to answer any questions you might have along the way. Finally, if you feel you have constructed a particularly useful query that we have not included on the examples page, please let us know so that we can add it! Thanks.

Some quick pointers:



Now for some examples!




Finding out about your favorite gene

Example: eat-4

What does eat-4 encode?
This search utilizes two ontology categories and a keyword. The ontology category "biological process" is selected in the first row. A subgroup of the biological process ontology category, TYPE: "biosynthesis", VALUE: "expression", is selected from the subgroup menu. This subgroup of the biological process ontology contains only terms that indicate genetic expression ("over-express", "encodes" etc.). The ontology category "molecular function" is selected in the second row. A subgroup of the "molecular function" category, TYPE: "protein", VALUE: "yes", is selected from the subgroup menu. This subgroup contains terms that are protein names. Further, a specification value of "all" is selected so that both specifically named proteins ("UNC-60", "DNA ligase" etc.) and proteins that are referred to indirectly (the word "protein") will be matched. The keyword "eat-4" is entered in the third row, where the "exact match" box is checked. The default boolean operator "AND" and the default value of "greater than one occurrences" are applied.
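For readers who think in code, the logic of this query can be sketched roughly as follows. This is only an illustration, not Textpresso's implementation: the toy lexicon holds just the example terms quoted above, the function names are hypothetical, the test sentences are invented, and the occurrence threshold is assumed to mean "at least once per row".

```python
import re

# Toy lexicon: only the example terms quoted in the text, NOT the real ontology.
LEXICON = {
    "biological process / expression": ["over-express", "encodes"],
    "molecular function / protein": ["UNC-60", "DNA ligase", "protein"],
}

def count_term(term, sentence):
    """Count whole-term, case-insensitive occurrences of term in sentence."""
    pattern = r"(?<![\w-])" + re.escape(term) + r"(?![\w-])"
    return len(re.findall(pattern, sentence, flags=re.IGNORECASE))

def count_category(category, sentence):
    """Count occurrences of any term belonging to a (toy) category."""
    return sum(count_term(term, sentence) for term in LEXICON[category])

def matches(sentence, categories, keywords, min_count=1):
    """AND semantics (assumed): every row must occur at least min_count times."""
    rows = [count_category(c, sentence) for c in categories]
    rows += [count_term(k, sentence) for k in keywords]
    return all(n >= min_count for n in rows)

CATEGORIES = list(LEXICON)

# Invented sentences, for illustration only.
hit = matches("eat-4 encodes a protein.", CATEGORIES, ["eat-4"])
miss = matches("eat-4 mutants were examined.", CATEGORIES, ["eat-4"])
```

The second sentence is rejected because it contains the keyword but no term from either category, which is the behaviour the "AND" operator is described as giving.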



The two diagrams below show parts of the results page for this search. In the first, Textpresso has returned the published homology from the first paper to characterize the eat-4 gene (cgc3349 - Lee et al., Journal of Neuroscience, vol. 19, 159-167 (1999)).










In what cells is eat-4 expressed?
This search utilizes two ontology categories and a keyword. A subgroup of the biological process ontology category, TYPE: "biosynthesis", VALUE: "expression", is selected from the subgroup menu. This subgroup of the biological process ontology contains only terms that indicate genetic expression ("over-express", "encode" etc.). A subgroup of the "Cell or cell group" category, TYPE: "type", VALUE: "name" is selected from the subgroup menu. This subgroup contains terms that are cell and cell group names. Further, a specification value of "named" is selected so that only specifically named cells and cell groups will be matched ("AVM", "neurons"). The keyword "eat-4" is entered in the third keyword row, where the "exact match" box is checked. The default boolean operator "AND" and the default value of "greater than one occurrences" are applied.



The diagram below shows part of the results page for this search. The search pinpoints, with high precision, sentences in the Textpresso corpus of full-text papers that discuss the expression of eat-4 in various cells. In fact, 10 of the 11 sentences returned for this search answered the question posed. The third sentence, seen below (cgc3038), talks about where eat-4 is not expressed: "... EAT-4 is not expressed in PVC neurons ....".






What are the published phenotypes for eat-4 mutants?
This search utilizes two ontology categories and a keyword. The ontology category "phenotype" is selected in the first row. A specification value of "all" is selected so that specifically named phenotypes ("Egl", "fertilization defective") and indirectly named phenotypes (the word "phenotype") will be matched. The ontology category "allele" is selected in the second row. A specification value of "all" is selected so that specifically named alleles ("n2559", "e307") and alleles that are referred to indirectly (the word "allele") are matched. The keyword "eat-4" is entered in the third keyword row, where the "exact match" box is checked. The default boolean operator "AND" and the default value of "greater than one occurrences" are applied.



The diagram below shows part of the results page for this search. The search was designed to look for phenotypes of eat-4 resulting from mutations to the gene (thus the category "allele" was selected). Here we see two distinct phenotypes, multivulval and wild type, caused by eat-4(ky5) and eat-4(ad572) respectively.






Example: lin-12

What genes are downstream of the lin-12 signaling pathway?
This search utilizes two ontology categories and a keyword. The ontology category "gene" is selected in the first row. A specification value of "named" is selected so that only specifically named genes ("let-60", "lin-12" etc.) will be matched. The ontology category "pathway" is selected in the second row. Two subgroups of the "pathway" category are selected. The first subcategory, TYPE: "type", VALUE: "molecular" is selected from the subgroup menu. A second subcategory, TYPE: "course", VALUE: "downstream" is also selected. This is done by pressing the control key (Ctrl) as you click on the second subcategory. NOTE: ONLY ONE VALUE FOR EACH SUBCATEGORY MAY BE SELECTED. The combination of these subgroups matches terms such as ("cascade", "downstream" etc.). The keyword "lin-12" is entered in the third keyword row, where the "exact match" box is checked. The default boolean operator "AND" and the default value of "greater than one occurrences" are applied.



The diagram below shows part of the results page for this search. This query demonstrates the precision of the Advanced Retrieval search engine. The parameter "pathway" was refined to include only those terms which indicated downstream events at a molecular level. Thus, at least two genes, emb-5 and sel-10 are quickly identified from the literature as acting downstream of the lin-12 signaling pathway.










At what stage(s) in C. elegans development is lin-12 expressed?
This search utilizes three ontology categories and a keyword. The ontology category "life stage" is selected in the first row with a specification value of "all" so that life stages that are specifically named ("1-cell embryo") and/or indirectly referred to ("adult") will be matched. The ontology category "time relation" is selected in the second row with no subcategories selected so that it will match any term in the time relation category ("early", "temporal" etc.). The ontology category "biological process" is selected in the third row. A subgroup of the biological process ontology category, TYPE: "biosynthesis", VALUE: "expression", is selected from the subgroup menu. This subgroup of the biological process ontology contains only terms that indicate genetic expression ("encode", "over-express" etc.). The keyword "lin-12" is entered in the fourth row, where the "exact match" box is checked. The default boolean operator "AND" and the default value of "greater than one occurrences" are applied.



The diagram below shows part of the results page for this search. By combining both the "life stage" and "time relation" ontology categories, one can search for when during C. elegans development a particular event occurs. Below are results from three separate publications (cgc2972, cgc1670, cgc1909) that indicate that lin-12 is expressed during embryonic and post-embryonic development, and specifically during the development of the ventral nerve cord (cgc2972).






Example: lin-2

Does lin-2 undergo alternative splicing and, if so, what are the products?
This search utilizes one ontology category and four keywords. The ontology category "biological process" is selected in the first row. A subgroup of the biological process ontology category, TYPE: "biosynthesis", VALUE: "translation", is selected from the subgroup menu. This subgroup of the biological process ontology contains only terms that are associated with protein translation ("translate", "transplice" etc.). The keyword "lin-2a" is entered in the second row, where the "exact match" box is checked. In rows three, four and five the keywords "lin-2b", "lin-2c" and "lin-2d" are entered respectively, with the "exact match" box checked in each row. In the last three rows the boolean operator is set to "OR". The default value of "greater than one occurrences" is applied.



The diagram below shows part of the results page for this search. A trick is used in this search to look for splice variants of the lin-2 gene. Taking advantage of the xxx-1a, xxx-1b, xxx-1c etc. naming convention for splice variants of a given gene and the "OR" boolean operator, one can search specifically for any mention of these splice variants in a sentence. This has an advantage over other search engines where one may search for splice variants of a gene using a wild card (e.g. lin-2*), which in addition to any lin-2 splice variants may also return results matching lin-22 or lin-26, for example. Thus Textpresso facilitates the formulation of exact queries. Please note, however, that the use of the boolean operator "OR" means that the "biological process" category is not required in the returned hits.
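The difference between a wild card and a set of exact keywords can be sketched in a few lines of Python. This is an illustration only: it uses the standard-library fnmatch module as a stand-in for a generic wild-card search, and the candidate list simply collects the gene and splice-variant names mentioned above.

```python
import fnmatch

# Names mentioned in the text; the list itself is assembled for illustration.
candidates = ["lin-2a", "lin-2b", "lin-2c", "lin-2d", "lin-22", "lin-26"]

# A wild-card search for lin-2* also picks up the unrelated genes lin-22 and lin-26.
wildcard_hits = [name for name in candidates
                 if fnmatch.fnmatchcase(name, "lin-2*")]

# Exact keywords joined by "OR" match only the splice variants that were listed.
exact_keywords = {"lin-2a", "lin-2b", "lin-2c", "lin-2d"}
exact_hits = [name for name in candidates if name in exact_keywords]
```

Here wildcard_hits contains all six names, while exact_hits contains only the four lin-2 splice variants.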






Finding out about a particular biological process:

Example: Ras signaling

What genes are involved in Ras signaling?
This search utilizes four ontology categories and two keywords. The ontology category "biological process" is selected in the first row. The number of times that a "biological process" category term will be matched in a sentence is set to "greater than" "1". The ontology category "gene" is selected in the second row. A specification value of "named" is selected so that only specifically named genes ("let-60", "lin-12" etc.) will be matched. The ontology category "involvement" is selected in the third row. The ontology category "pathway" is selected in the fourth row. The keyword "ras" is entered in the fifth row, where the "exact match" box is checked. In row six the keyword "let-60" is entered with the "exact match" box checked and the boolean operator set to "NOT".



The diagram below shows part of the results page for this search. The combination of the "gene" and "involvement" categories with the keyword "ras" forms the core of this query. Further, two tricks are used in this search to increase the accuracy of the query. The first is the addition of the "pathway" category, which serves to refine the search to Ras signaling. Secondly, to reduce the number of false positives, the let-60 Ras gene is excluded from the search. From a sample of the results page below, we can identify at least four genes thought to be involved in Ras signaling (lin-39, sur-6, ksr-1 and sur-8).
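The effect of the "NOT" row can be pictured as a simple sentence filter. The sketch below is an assumption about the general idea, not Textpresso's code: the helper names are hypothetical and the two sentences are invented for illustration.

```python
import re

def has_keyword(keyword, sentence):
    """True if keyword occurs as a whole term (case-insensitive)."""
    pattern = r"(?<![\w-])" + re.escape(keyword) + r"(?![\w-])"
    return re.search(pattern, sentence, flags=re.IGNORECASE) is not None

def keep(sentence, required, excluded):
    """Keep a sentence that has every required keyword and no excluded one."""
    return (all(has_keyword(k, sentence) for k in required)
            and not any(has_keyword(k, sentence) for k in excluded))

# Invented sentences, for illustration only.
sentences = [
    "sur-8 acts in the ras pathway.",
    "let-60 ras acts in the same pathway.",
]

kept = [s for s in sentences if keep(s, required=["ras"], excluded=["let-60"])]
```

Only the first sentence survives, because the second mentions let-60 and is dropped by the "NOT" row.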






How does Ras signaling affect cell(s)?
This search utilizes three ontology categories and one keyword. The ontology category "cell and cell group" is selected in the first row. The ontology category "effect" is selected in the second row. The ontology category "pathway" is selected in the third row. The keyword "ras" is entered in the fourth row, where the "exact match" box is checked.



The diagram below shows part of the results page for this search. This time the category "effect" is used in combination with "cell and cell group". The same trick is used as above: the category "pathway" is specified with the keyword "ras", which refines the query to Ras signaling. The sample results show that the Ras signaling pathway appears to affect at least two different cell groups, the AWC neurons and the vulval precursor cells.





