 |
Browse or search for tools that you can use. More Info
 |
Search Tools
TAPoRware Date Finder (Plain)
plain-text |  |
| This tool extracts dates from a plain-text document. Dates can be limited to all dates, years, months, weeks, seasons, North American holidays or user-defined dates (e.g. specific month(s), week(s), season(s), holiday(s)).
Detailed Info -
TryIt
- Website
|
TAPoRware List Tags (HTML)
html |  |
| This service lists all the HTML tags in an HTML document.
Detailed Info -
TryIt
- Website
|
TAPoRware Link Extractor
html |  |
| This tool extracts all the URL links (except image sources) from the input HTML text. It also converts relative links to absolute links, so that users can click the links in the TAPoRware result page to open the related pages.
Detailed Info -
TryIt
- Website
|
TAPoRware Fixed Phrase
html, plain-text, tei, xml |  |
| This tool locates fixed phrases that contain a specific word and displays each located phrase in several different ways. (The Fixed Phrase tool was contributed by Academic Computing Services, New York University.)
Detailed Info -
TryIt
- Website
|
TAPoRware Date Finder (HTML)
html |  |
| This tool extracts dates from an HTML document. Dates can be limited to all dates, years, months, weeks, seasons, North American holidays or user-defined dates (e.g. specific month(s), week(s), season(s), holiday(s)).
Detailed Info -
TryIt
- Website
|
TAPoRware Date Finder (XML)
tei, xml |  |
| This tool extracts dates from an XML document. Dates can be limited to all dates, years, months, weeks, seasons, North American holidays or user-defined dates (e.g. specific month(s), week(s), season(s), holiday(s)).
Detailed Info -
TryIt
- Website
|
TAPoRware Find Collocates (HTML)
html |  |
| The collocation tool takes a word from the user and finds all of the words directly before and directly after it within the given context. Results can be listed alphabetically, by frequency, or by Z-score (a measure of how far, and in which direction, an item deviates from its distribution's mean, expressed in units of the distribution's standard deviation).
Detailed Info -
TryIt
- Website
|
TAPoRware Find Words - Co-occurrence (HTML)
html |  |
| The Co-occurrence tool looks for two words within a certain distance of one another. Given a primary and a secondary pattern, TAPoR searches the document for every place where the two patterns fall within the user-specified limit of words, sentences, or lines. If desired, the results can be narrowed to words found only within certain tags.
Detailed Info -
TryIt
- Website
|
TAPoRware Find Words - Concordance (Plain)
plain-text |  |
| The Concordance (Text) tool can find text anywhere in a text document. Results can also be viewed as a concordance, with the words, sentences, or lines surrounding each match.
Detailed Info -
TryIt
- Website
|
TAPoRware Find Words - Concordance (XML)
tei, xml |  |
| The Find Concordance (XML) tool can find text anywhere in an XML document using the Find Text tool. The search can be narrowed to specified elements or attributes, and all results are returned with a concordance of the surrounding words, sentences, lines, or elements.
Detailed Info -
TryIt
- Website
|
TAPoRware Acronym Finder
html, plain-text, tei, xml |  |
| This tool tries to find all possible acronyms and their original names in a submitted text. However, for some acronyms, such as IGO (Intergovernmental Organization), the original name cannot be identified.
Detailed Info -
TryIt
- Website
|
TAPoRware CAPs Finder
html, plain-text, tei, xml |  |
| This tool tries to find all the CAPs (capitalized words) in the submitted text. It lists phrases of more than one CAP first, followed by all single CAPs except the first word of each sentence.
Detailed Info -
TryIt
- Website
|
|
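The Z-score ordering mentioned for the Find Collocates tool can be illustrated with a short sketch. This is not TAPoRware's implementation: the function names `collocates` and `z_scores`, the `span` parameter, and the choice to standardize raw collocate counts against the population standard deviation are all assumptions made for illustration. It only applies the general definition quoted in the description (deviation from the mean, in units of standard deviation).

```python
from collections import Counter
from statistics import mean, pstdev


def collocates(tokens, target, span=1):
    """Count words within `span` positions before or after each `target` hit.

    Hypothetical helper; `span=1` mirrors "directly before and directly after".
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo = max(0, i - span)
        hi = min(len(tokens), i + span + 1)
        for j in range(lo, hi):
            if j != i:  # skip the target word itself
                counts[tokens[j]] += 1
    return counts


def z_scores(counts):
    """Standardize each count: (count - mean) / population standard deviation."""
    values = list(counts.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All collocates are equally frequent, so none deviates from the mean.
        return {word: 0.0 for word in counts}
    return {word: (count - mu) / sigma for word, count in counts.items()}


tokens = "the cat sat on the mat and the cat ran".split()
counts = collocates(tokens, "cat")
scores = z_scores(counts)
# List collocates from highest to lowest Z-score.
for word in sorted(scores, key=scores.get, reverse=True):
    print(f"{word}\t{counts[word]}\t{scores[word]:.3f}")
```

A positive score marks a collocate that appears more often than the average collocate of the target word; a negative score marks one that appears less often.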
 |
Text Gathering Tools
TAPoRware Extract Text (HTML)
html |  |
| Retrieves HTML text based on user-specified HTML tags.
Detailed Info -
TryIt
- Website
|
TAPoRware Extract Text (XML)
tei, xml |  |
| The Extract Text from XML Documents tool can extract the full body of text from an XML document. It can also pull the text from user-specified elements or attributes.
Detailed Info -
TryIt
- Website
|
|
 |
List and Statistical Tools
TAPoRware Tokenizer (HTML)
html |  |
| This tool splits an HTML document at specified points, or tokens. These tokens can be words, lines, sentences, and paragraphs, as well as certain characters, patterns, or tags. The results can be listed with the token removed, before the split, or after the split.
Detailed Info -
TryIt
- Website
|
Hyperpoet Frequencies
plain-text, tei |  |
| Lists the text frequencies for a user-specified URL.
Detailed Info -
TryIt
- Website
|
TAPoRware List Words (Plain)
plain-text |  |
| The List Words (Plain) tool can be used to list all of the words found within a given text document. The query results can be displayed alphabetically, by frequency, by order of appearance, or in reversed alphabetical order.
Detailed Info -
TryIt
- Website
|
TAPoRware List Words (XML)
tei, xml |  |
| The List Words (XML) tool can be used to list all words, or user-specified words, found within a specified element. The query results can be displayed alphabetically, by frequency, by order of appearance, or in reversed alphabetical order. If no element is specified, all words in the XML document are returned.
Detailed Info -
TryIt
- Website
|
TAPoRware List Words (HTML)
html |  |
| The List Words (HTML) tool can be used to list all words, or user-specified words, found within a specified tag. The query results can be displayed alphabetically, by frequency, by order of appearance, or in reversed alphabetical order. If no tag is specified, the 'body' tag is used.
Detailed Info -
TryIt
- Website
|
TAPoRware Summarizer (HTML)
html |  |
| Extracts statistical information, a list of high-frequency words, a concordance, etc.
Detailed Info -
TryIt
- Website
|
|
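The four result orderings described for the List Words tools can be sketched as follows. This is an illustration under stated assumptions, not the tools' own code: the function `list_words`, its `order` values, the lowercase letters-and-apostrophes tokenizer, and the reading of "reversed alphabetical order" as plain Z-to-A sorting are all hypothetical.

```python
import re
from collections import Counter


def list_words(text, order="frequency"):
    """Return (word, count) pairs for `text` in the requested order.

    Hypothetical helper. Words are lowercased runs of letters and apostrophes.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    if order == "alphabetical":
        keys = sorted(counts)
    elif order == "reverse":
        # Assumption: "reversed alphabetical" means Z-to-A.
        keys = sorted(counts, reverse=True)
    elif order == "appearance":
        # Counter keeps keys in first-insertion order (Python 3.7+).
        keys = list(counts)
    elif order == "frequency":
        # Most frequent first; ties broken alphabetically.
        keys = sorted(counts, key=lambda w: (-counts[w], w))
    else:
        raise ValueError(f"unknown order: {order!r}")
    return [(w, counts[w]) for w in keys]


sample = "b a b c a b"
for order in ("alphabetical", "frequency", "appearance", "reverse"):
    print(order, list_words(sample, order))
```

Restricting the listing to a specific element or tag, as the XML and HTML variants do, would amount to extracting that element's text first and passing it to the same function.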
 |
Visualization Tools
Voyeur Tools Cirrus (Word Cloud)
html, plain-text, tei, xml |  |
| Displays a word cloud visualization that emphasizes the highest frequency words.
Detailed Info -
TryIt
- Website
|
|
 |
Editing Tools
Neko Transformer
html |  |
| This is an HTML "tidying" tool based on the CyberNeko HTML Parser. Use Neko Transformer
to balance tags and "fix up many common mistakes that human (and computer) authors
make in writing HTML documents. NekoHTML adds missing parent elements; automatically
closes elements with optional end tags; and can handle mismatched inline element tags."
More information about Neko can be found at http://people.apache.org/~andyc/neko/
Detailed Info -
TryIt
|
HTML Entity Transformer
html |  |
| This is a transformation tool that reads the specified HTML document and converts all
HTML entities into their Unicode counterparts. The tool produces an HTML page once it
completes.
Detailed Info -
TryIt
|
MS Word Transformer
msword |  |
| This is a converter tool that extracts plain text from Microsoft Word documents using
the Jakarta POI library.
Detailed Info -
TryIt
|
Highlighter
html, msword, pdf |  |
| The Highlighter tool uses the Apache Lucene library to provide a KWIC (keyword-in-context) search. It highlights all occurrences of the specified query within a document.
Detailed Info -
TryIt
|
Diff Transformer
docbook, html, mep, plain-text, tarl, tei, xml |  |
| This tool compares two files and highlights the differences in each of them.
Detailed Info -
TryIt
|
PDF Transformer
pdf |  |
| This is a converter tool that extracts plain text from a PDF document.
The tool uses PDFBox, an open-source Java library for working with PDF documents. More
information about the library can be found at http://www.pdfbox.org.
Detailed Info -
TryIt
|
|
 |
Miscellaneous Tools
Voyeur Tools (full environment)
html, plain-text, tei, xml |  |
| This opens the text(s) in the full Voyeur Tools environment, using the default skin.
Detailed Info -
TryIt
- Website
|
|
|  |