 |
Browse or search for tools that you can use. More Info
 |
Search Tools
TAPoRware Find Collocates (Plain)
plain-text |  |
| Collocation tool takes a word from the user and returns all of
the words directly before and directly after it based on the given
context. Results are listed alphabetically, by frequency, or by Z-score
(an indication of how far and in what direction that item deviates from
its distribution's mean, expressed in units of its distribution's standard
deviation).
NOTE:
If you select the "Words" context with a long context length, or another context (Lines, Sentences, or Paragraphs),
it is very likely that the specified pattern appears more than once in each concordance entry. If this occurs, the collocates
will be counted more than once as well, so the collocate counts are not accurate. For the same reason, the Z-score values are not
accurate. We will find a way to fix this later.
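The description above can be illustrated with a minimal sketch. It is not TAPoRware's implementation: the function names, the word-matching regular expression, and the choice to score each collocate's count against the mean and standard deviation of all collocate counts are assumptions made for illustration.

```python
import re
from collections import Counter
from statistics import mean, pstdev

def collocates(text, target, window=1):
    """Count the words found within `window` positions of each occurrence of `target`."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, word in enumerate(words):
        if word == target:
            before = words[max(0, i - window):i]
            after = words[i + 1:i + 1 + window]
            counts.update(before + after)
    return counts

def z_scores(counts):
    """Score each count against the mean and standard deviation of all counts (assumed definition)."""
    values = list(counts.values())
    if not values:
        return {}
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return {word: 0.0 for word in counts}
    return {word: (count - mu) / sigma for word, count in counts.items()}

counts = collocates("the cat sat on the mat and the cat ran", "cat")
scores = z_scores(counts)
```

With a window wider than one word, neighbouring occurrences of the target share context, so the same collocate can be counted twice; this is the inaccuracy the note above describes.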
Detailed Info -
TryIt
- Website
|
TAPoRware Date Finder (Plain)
plain-text |  |
| This tool extracts dates from a plain-text document. Dates can be limited to all dates, years, months, weeks, seasons, North American holidays or user defined dates (e.g. specific month(s), week(s), season(s), holiday(s)).
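As a rough illustration of this kind of extraction, the sketch below uses regular expressions for three of the categories. The patterns, the year range, and the name find_dates are assumptions, not the tool's actual rules; words such as "May", "spring", or "fall" will also match in their non-date senses.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# Hypothetical patterns; a real date finder needs many more forms.
PATTERNS = {
    "years": r"\b(?:1[0-9]{3}|20[0-9]{2})\b",
    "months": rf"\b(?:{MONTHS})\b",
    "seasons": r"\b(?:spring|summer|autumn|fall|winter)\b",
}

def find_dates(text, kinds=("years", "months", "seasons")):
    """Return the matches for each requested kind of date expression."""
    return {kind: re.findall(PATTERNS[kind], text, flags=re.IGNORECASE)
            for kind in kinds}

found = find_dates("In March 1998 and again in the winter of 2003.")
```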
Detailed Info -
TryIt
- Website
|
TAPoRware Find Co-occurrence (Plain)
plain-text |  |
| Co-occurrence tool looks for two words a certain distance apart
from one another. By entering a primary and secondary pattern, TAPoR will
search the document for anywhere that the two patterns are within the
user-specified limits of words, sentences, or lines.
NOTE:
If you select the "Words" context with a long context length, or another context (Lines, Sentences, or Paragraphs),
it is very likely that the specified pattern appears more than once in each concordance entry. If this occurs, the collocates
will be counted more than once as well, so the collocate counts are not accurate. For the same reason, the Z-score values are not
accurate. We will find a way to fix this later.
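A minimal sketch of a word-distance co-occurrence search follows. It assumes lowercase regular-expression patterns matched against whole words and a symmetric distance limit; the function name and return format are hypothetical, and sentence or line limits are not covered.

```python
import re

def cooccurrences(text, primary, secondary, max_distance=5):
    """Return (i, j) word-index pairs where the two patterns match within max_distance words."""
    words = re.findall(r"\w+", text.lower())
    first = [i for i, w in enumerate(words) if re.fullmatch(primary, w)]
    second = [j for j, w in enumerate(words) if re.fullmatch(secondary, w)]
    return [(i, j) for i in first for j in second
            if i != j and abs(i - j) <= max_distance]

pairs = cooccurrences(
    "The king spoke and the queen listened while the king slept",
    "king", "queen", max_distance=4)
```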
Detailed Info -
TryIt
- Website
|
TAPoRware Synonym Finder
html, plain-text, xml |  |
| This tool uses the Roget's Interactive Thesaurus service to get the synonyms and antonyms of a given word.
Detailed Info -
TryIt
- Website
|
TAPoRware Date Finder (XML)
tei, xml |  |
| This tool extracts dates from an XML document. Dates can be limited to all dates, years, months, weeks, seasons, North American holidays or user defined dates (e.g. specific month(s), week(s), season(s), holiday(s)).
Detailed Info -
TryIt
- Website
|
TAPoRware Date Finder (HTML)
html |  |
| This tool extracts dates from an HTML document. Dates can be limited to all dates, years, months, weeks, seasons, North American holidays or user defined dates (e.g. specific month(s), week(s), season(s), holiday(s)).
Detailed Info -
TryIt
- Website
|
TAPoRware Find Collocates (HTML)
html |  |
| The collocation tool takes a word from the user and returns all of
the words directly before and directly after it based on the given context.
Results are listed alphabetically, by frequency, or by Z-score
(an indication of how far and in what direction that item deviates from
its distribution's mean, expressed in units of its distribution's standard
deviation).
Detailed Info -
TryIt
- Website
|
TAPoRware Find Collocates (XML)
tei, xml |  |
| Collocation tool takes a word from the user and returns all of
the words directly before and directly after it based on the given
context. The results are listed alphabetically, by frequency, or by
Z-score (an indication of how far and in what direction that item deviates
from its distribution's mean, expressed in units of its distribution's
standard deviation).
Detailed Info -
TryIt
- Website
|
TAPoRware Find Co-occurrence (HTML)
html |  |
| Co-occurrence tool looks for two words a certain distance apart
from one another. By entering a primary and secondary pattern, TAPoR will
search the document for anywhere the two patterns are within the
user-specified limits of words, sentences, or lines. If desired, the
results can be narrowed to include words only found within certain tags.
Detailed Info -
TryIt
- Website
|
TAPoRware Find Co-occurrence (XML)
tei, xml |  |
| Co-occurrence tool looks for two words a certain distance apart
from one another. By entering a primary and secondary pattern, TAPoR will
search the document for anywhere the two patterns are within the
user-specified limits of words/sentences/lines or surrounding elements.
Detailed Info -
TryIt
- Website
|
TAPoRware Find Words - Concordance (HTML)
html |  |
| Find Concordance (HTML) tool can find text anywhere in an HTML
document. The search can be narrowed to specified tags. All results are
returned with a concordance of either words, sentences, or lines.
Detailed Info -
TryIt
- Website
|
TAPoRware Find Words - Concordance (Plain)
plain-text |  |
| The Concordance (Text) tool can find text anywhere in a text document. The
search can also be used to view a concordance of either words, sentences,
or lines surrounding the result.
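A keyword-in-context listing of this kind can be sketched as follows, using a word-based context only. The function name, the whitespace tokenization, and the (left, match, right) tuple format are assumptions for illustration.

```python
import re

def concordance(text, pattern, context=3):
    """Return (left, match, right) tuples with `context` words on each side of every match."""
    words = text.split()
    lines = []
    for i, word in enumerate(words):
        if re.search(pattern, word, flags=re.IGNORECASE):
            left = " ".join(words[max(0, i - context):i])
            right = " ".join(words[i + 1:i + 1 + context])
            lines.append((left, word, right))
    return lines

hits = concordance("It was the best of times, it was the worst of times",
                   "best", context=2)
```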
Detailed Info -
TryIt
- Website
|
TAPoRware Find Words - Concordance (XML)
tei, xml |  |
| Find Concordance (XML) tool can find text anywhere in an XML document
using the Find Text tool. The search can be narrowed to specified elements
or attributes, and all results are returned with a concordance of either
words/sentences/lines or surrounding elements.
Detailed Info -
TryIt
- Website
|
TAPoRware Acronym Finder
html, plain-text, xml |  |
| This tool tries its best to find all possible acronyms and their original names in a submitted text. However, for some acronyms, such as IGO (Intergovernmental Organization), the original name cannot be identified.
Detailed Info -
TryIt
- Website
|
TAPoRware CAPs Finder
html, plain-text, xml |  |
| This tool tries to find all the CAPs (capitalized words) in the submitted text. It lists the phrases made up of more than one CAP, followed by every single CAP except the first word of each sentence.
Detailed Info -
TryIt
- Website
|
TAPoRware Words Distribution -- Weighted Centroid
html, plain-text, xml |  |
| This tool displays a circular graph based on word distribution data or the tablet word distribution data.
The text is divided up into an arbitrary number of units, which are positioned around the circumference of the circle in a clockwise sequence. The more times a word appears in a particular text unit, the closer the word will be to that unit in the circle. If a word appears an equal number of times in all units, it will be located in the centre of the circle.
Words are colour coded based on the number of times they appear in the text as a whole. Blue words have the highest word count. Rolling over a word will display lines representing its connections to the units. Clicking a word will keep its lines visible after you move the mouse off of it. Click the word again to remove the lines. The darker the line, the more times the word was found in that unit.
Additionally, all the words found in the graph are listed on the left side of the applet. There is a scroll bar for viewing the words, should they extend past the bottom of the applet. This list of words features the same rollover and clicking functionality as those found in the graph itself.
This tool uses the processing library.
* This tool requires the JRE (v1.4.2 and up) in order to work properly.
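The placement rule described above can be sketched as a count-weighted average of the unit positions. This is an illustration under assumptions: the applet's actual formula is not given here, and the starting point at the top of the circle, the unit radius, and the function names are hypothetical.

```python
import math

def unit_positions(n_units, radius=1.0):
    """Place n units clockwise around a circle, starting at the top (assumed layout)."""
    positions = []
    for k in range(n_units):
        angle = math.pi / 2 - 2 * math.pi * k / n_units
        positions.append((radius * math.cos(angle), radius * math.sin(angle)))
    return positions

def weighted_centroid(counts, radius=1.0):
    """Average the unit positions, weighting each by the word's count in that unit."""
    total = sum(counts)
    if total == 0:
        return (0.0, 0.0)
    points = unit_positions(len(counts), radius)
    x = sum(c * px for c, (px, _) in zip(counts, points)) / total
    y = sum(c * py for c, (_, py) in zip(counts, points)) / total
    return (x, y)

even = weighted_centroid([3, 3, 3, 3])    # lands at the centre, up to rounding
skewed = weighted_centroid([5, 0, 0, 0])  # lands on the first unit
```

A word spread evenly over all units ends up at the centre because the unit positions cancel out, which matches the behaviour the description gives.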
Detailed Info -
TryIt
- Website
|
TAPoRware Principal Component Analysis Tool (Plain)
plain-text |  |
| This tool uses principal component analysis to analyze the relations among words across the user-specified text units.
Detailed Info -
TryIt
- Website
|
|
 |
Text Gathering Tools
TAPoRware Extract Text (XML)
tei, xml |  |
| Extract Text from XML Documents tool can extract the full body
of text from an XML Document. This tool can also pull the text from
user-specified elements or attributes.
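A minimal sketch of this kind of extraction with Python's standard xml.etree.ElementTree module is shown below. The function name and parameters are hypothetical, and namespaced documents (such as many TEI files) would need qualified tag names.

```python
import xml.etree.ElementTree as ET

def extract_text(xml_string, element=None, attribute=None):
    """Return text from the whole document, from one element type, or from one attribute."""
    root = ET.fromstring(xml_string)
    if attribute is not None:
        # Values of the attribute wherever it occurs (optionally on one element type).
        return [el.get(attribute) for el in root.iter(element)
                if el.get(attribute) is not None]
    if element is not None:
        # Full text content of every matching element, nested markup included.
        return ["".join(el.itertext()).strip() for el in root.iter(element)]
    return ["".join(root.itertext()).strip()]

doc = '<text><head n="1">Title</head><p>First <hi>bold</hi> line.</p></text>'
paragraphs = extract_text(doc, element="p")
```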
Detailed Info -
TryIt
- Website
|
TAPoRware Googlizer
html, plain-text, xml |  |
| This tool calls the Google search engine directly and gives different results. Google's rules for search terms can be applied here directly.
Detailed Info -
TryIt
- Website
|
TAPoRware Extract Text (HTML)
html |  |
| Retrieves HTML text based on user-specified HTML tags.
Detailed Info -
TryIt
- Website
|
TAPoRware Text Aggregator
html, plain-text, xml |  |
| This tool aggregates multiple texts from different locations and in different formats into a single text. The source texts can be given as valid URLs, local files, or text you type in.
Detailed Info -
TryIt
- Website
|
|
 |
List and Statistical Tools
TAPoRware Summarizer (Plain)
plain-text |  |
| Extracts statistical information, a list of high-frequency words, a concordance, etc.
Detailed Info -
TryIt
- Website
|
TAPoRware Tokenizer (XML)
tei, xml |  |
| This tool splits an XML document at specified points, or tokens. These tokens can be words, lines, sentences, paragraphs, characters, patterns, or tags. The results can be listed with the token removed or preserved before or after the split.
Detailed Info -
TryIt
- Website
|
QMatrix
plain-text |  |
| This tool implements Raymond Queneau's matrix analysis of language with a given text. The results of the analysis include a breakdown of the text into formatives, signifiers, and bi-words.
N.B. In order to work with texts with accents, load your document to myTexts first, then select the document when you use QMatrix. Accented characters will not display properly but the tool will work otherwise.
Detailed Info -
TryIt
- Website
|
TAPoRware List Tags (HTML)
html |  |
| This service lists all the HTML tags in an HTML document.
Detailed Info -
TryIt
- Website
|
TAPoRware Tokenizer (HTML)
html |  |
| This tool splits an HTML document at specified points, or tokens. These tokens can be words, lines, sentences, and paragraphs, as well as certain characters, patterns, or tags. The results can be listed with the token removed, before the split, or after the split.
Detailed Info -
TryIt
- Website
|
TAPoRware Tokenizer (Plain)
plain-text |  |
| The Tokenize tool splits a text document at specified points, or
tokens. These tokens can be words, lines, sentences, and paragraphs, as
well as certain characters or patterns. The results can be listed with the
token removed, before the split, or after the split.
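The three output modes can be sketched as below. The reading of "before" (token kept at the end of the preceding piece) and "after" (token kept at the start of the following piece) is an interpretation, and the function name and the `keep` parameter are hypothetical. The pattern is assumed to contain no capturing groups of its own.

```python
import re

def tokenize(text, pattern, keep="removed"):
    """Split text at `pattern`, dropping the token or keeping it on one side of the split."""
    pieces = re.split(f"({pattern})", text)  # odd positions hold the matched tokens
    chunks, tokens = pieces[0::2], pieces[1::2]
    if keep == "removed":
        out = chunks
    elif keep == "before":
        out = [chunk + token for chunk, token in zip(chunks, tokens + [""])]
    elif keep == "after":
        out = [token + chunk for token, chunk in zip([""] + tokens, chunks)]
    else:
        raise ValueError("keep must be 'removed', 'before', or 'after'")
    return [piece for piece in out if piece]

parts = tokenize("a,b,c", ",", keep="before")
```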
Detailed Info -
TryIt
- Website
|
Hyperpoet Frequencies
plain-text, tei |  |
| Lists text frequencies for a user-specified URL.
Detailed Info -
TryIt
- Website
|
Test List Words Tool
plain-text |  |
| description
Detailed Info -
TryIt
- Website
|
TAPoRware List Word Pairs
html, plain-text, xml |  |
| This tool will list word pairs in a corpus based on different criteria.
Detailed Info -
TryIt
- Website
|
TAPoRware List Elements (XML)
tei, xml |  |
| List XML Elements tool is used to display all of the elements
contained in an XML document. This tool also allows the user to count all
instances of an element and to view the structure/hierarchy of the
document. It also provides a variety of tools for listing attributes and
attribute values.
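Counting elements and printing the hierarchy can be sketched with the standard xml.etree.ElementTree module. The function names and the indented outline format are assumptions for illustration, not the tool's output format.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def element_counts(xml_string):
    """Count how many times each element name occurs in the document."""
    root = ET.fromstring(xml_string)
    return Counter(el.tag for el in root.iter())

def outline(xml_string):
    """Return the element hierarchy as indented lines, one element per line."""
    root = ET.fromstring(xml_string)
    lines = []

    def walk(el, depth):
        lines.append("  " * depth + el.tag)
        for child in el:
            walk(child, depth + 1)

    walk(root, 0)
    return lines

doc = "<body><div><p/><p/></div><div/></body>"
counts = element_counts(doc)
tree = outline(doc)
```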
Detailed Info -
TryIt
- Website
|
TAPoRware List Words (HTML)
html |  |
| List Words (HTML) tool can be used to list all or user specified words
found within a specified tag. The query results can be displayed
alphabetically, by frequency, by order of appearance, or in reversed
alphabetical order. If no tag is specified, the 'body' tag is used.
Detailed Info -
TryIt
- Website
|
TAPoRware List Words (XML)
tei, xml |  |
| List Words (XML) tool can be used to list all or user specified words
found within a specified element. The query results can be displayed
alphabetically, by frequency, by order of appearance, or in reversed
alphabetical order. If no element is specified, all words in the XML
document will be returned.
Detailed Info -
TryIt
- Website
|
TAPoRware Texts Comparator
html, plain-text, xml |  |
| The tool compares two user submitted texts. It performs basic statistics on each text and lists the words and counts side by side.
Detailed Info -
TryIt
- Website
|
TAPoRware List Words (Plain)
plain-text |  |
| List Words (Plain) tool can be used to list all of the words found
within a given text document. The query results can be displayed
alphabetically, by frequency, by order of appearance, or in reversed
alphabetical order.
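The four display orders can be sketched as follows. The word-matching regular expression, the tie-breaking rule for frequency, and the reading of "reversed alphabetical order" as Z-to-A are assumptions; the function name is hypothetical.

```python
import re
from collections import Counter

def list_words(text, order="alphabetical"):
    """Return (word, count) pairs in one of four display orders."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(words)  # keys keep their order of first appearance
    if order == "alphabetical":
        keys = sorted(counts)
    elif order == "reverse":
        keys = sorted(counts, reverse=True)
    elif order == "frequency":
        keys = sorted(counts, key=lambda w: (-counts[w], w))
    elif order == "appearance":
        keys = list(counts)
    else:
        raise ValueError("unknown order: " + order)
    return [(word, counts[word]) for word in keys]

by_frequency = list_words("the cat and the hat", order="frequency")
```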
Detailed Info -
TryIt
- Website
|
|
 |
Visualization Tools
TAPoRware Pattern Distribution (XML)
tei, xml |  |
| This tool displays the distribution of a user-specified pattern over different text units, in different formats.
Detailed Info -
TryIt
- Website
|
TAPoRware Pattern Distribution (HTML)
html |  |
| This tool visually displays pattern distribution in a user-selected text unit.
Detailed Info -
TryIt
- Website
|
TAPoRware Pattern Distribution (Plain)
plain-text |  |
| This tool displays pattern distribution over a plain-text string, in different formats.
Detailed Info -
TryIt
- Website
|
TAPoRware Visual Collocator
html, plain-text, xml |  |
| The Visual Collocator displays collocates of words using a graph layout. Words which share similar collocates will be drawn together in the graph, producing new insight into the text. Any word can be double-clicked to fetch its collocates. Any word can be removed from the graph, and new words can be added using the text field. Additionally, words can be made "sticky", then dragged around to new positions, creating a user defined layout.
This tool uses the prefuse library.
* This tool requires the JRE (v1.4.2 and up) in order to work properly.
Detailed Info -
TryIt
- Website
|
TAPoRware Word Cloud
html, plain-text, xml |  |
| This tool counts the words in the text and displays them in a font and color based on their count. The order and the number of words are specified by the user.
Detailed Info -
TryIt
- Website
|
TAPoRware Word Brush
html, plain-text, xml |  |
| 'Word Brush' allows the user to paint with words extracted from an online document.
This tool uses Java to display results. The current applet was compiled using Java 1.4.2_08 so you will need at least this version of the Java plug-in for your browser if you wish to use it.
Detailed Info -
TryIt
- Website
|
TAPoRware Raining Words
html, plain-text, xml |  |
| 'Raining Words' is designed to display high frequency words such that high frequency words are rendered larger and move more slowly than words with lower frequencies. The source can be plain text, XML or HTML.
If the source is XML or HTML, the processed text will be limited to the context of user given element.
It is then filtered using a stop-word list. The resulting text is then scanned for the top 20 high frequency words.
This tool uses Java to display results. The current applet was compiled using Java 1.4.2_08 so you will need at least this version of the Java plug-in for your browser if you wish to use it.
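The filtering and ranking steps, and the "larger and slower" rule, can be sketched as below. The stop-word list is a tiny illustrative sample, and the size and speed formulas and all names are assumptions rather than the applet's actual values.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; the tool's own list is not given here.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that"}

def top_words(text, n=20, stop_words=STOP_WORDS):
    """Drop stop words, then return the n most frequent remaining words."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common(n)

def display_params(ranked, min_size=12, max_size=48):
    """Give frequent words a larger size and a lower speed (assumed linear scaling)."""
    if not ranked:
        return {}
    top = ranked[0][1]
    return {word: {"size": min_size + (max_size - min_size) * count / top,
                   "speed": top / count}
            for word, count in ranked}

ranked = top_words("the rain in Spain and the rain in Maine", n=2)
params = display_params(ranked)
```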
Detailed Info -
TryIt
- Website
|
TAPoRware Hypergraph (XML)
xml |  |
| Uses the hypergraph package to draw the XML document's structure.
Detailed Info -
TryIt
- Website
|
Humanist Trends Viewer (Voyeur)
plain-text |  |
| This tool is intended to help view word trends in the Humanist Discussion Group archive. The archives have been scraped from the web, segmented into yearly volumes, and stripped down to only the plain text from the bodies of the email messages (see the TADA Archives of the Humanist Discussion Group for more information on acquiring and processing the archives).
Detailed Info -
TryIt
- Website
|
|
 |
Editing Tools
XSL Transformer
docbook, mep, tarl, tei, xml |  |
| This is a transformation tool that applies the specified XSL stylesheet to the selected XML file.
Detailed Info -
TryIt
|
TEI Transformer
tei |  |
| This is a transformation tool that uses a set of TEI stylesheets to convert a TEI document into HTML.
For more information see http://www.tei-c.org/Stylesheets/teic/.
Detailed Info -
TryIt
|
Neko Transformer
html |  |
| This is an HTML "tidying" tool which is based on the CyberNeko HTML Parser. Use Neko Transformer
to balance tags and "fix up many common mistakes that human (and computer) authors
make in writing HTML documents. NekoHTML adds missing parent elements; automatically
closes elements with optional end tags; and can handle mismatched inline element tags."
More information about Neko can be found at http://people.apache.org/~andyc/neko/
Detailed Info -
TryIt
|
HTML Entity Transformer
html |  |
| This is a transformation tool that reads the specified HTML document and converts all
HTML entities into their Unicode counterparts. The tool produces an HTML page once it
completes.
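Entity conversion of this kind can be sketched with Python's standard html module. This is an illustration, not the tool's implementation: the function name is hypothetical, and leaving the markup-significant named entities (&amp;amp;, &amp;lt;, &amp;gt;, &amp;quot;, &amp;apos;) untouched is an assumption made so that the page stays well-formed.

```python
import html
import re

# Named entities that must stay escaped for the markup to remain valid (assumed policy).
KEEP = {"&amp;", "&lt;", "&gt;", "&quot;", "&apos;"}

def entities_to_unicode(markup):
    """Replace character entities with Unicode characters, except those in KEEP."""
    def replace(match):
        entity = match.group(0)
        return entity if entity in KEEP else html.unescape(entity)
    return re.sub(r"&(?:#\d+|#[xX][0-9a-fA-F]+|\w+);", replace, markup)

converted = entities_to_unicode("<p>caf&eacute; &amp; cr&#232;me &#xE0; la carte</p>")
```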
Detailed Info -
TryIt
|
Raw Entity Transformer
xml |  |
| This is a transformation tool that reads the specified XML document and converts all
entities in the document into their Unicode counterparts. The tool outputs the resulting
XML in its raw form. As a result, most browsers will not be able to display the
result properly. Please save the tool invocation results locally to see the XML output.
Detailed Info -
TryIt
|
Pretty Entity Transformer
xml |  |
| This is a transformation tool that reads the specified XML document and converts all
entities in the document into their Unicode counterparts. The tool output is then converted to
a pretty-printed HTML with an XSL stylesheet. This tool is not suitable for processing large
XML files due to the limitations imposed by the XSL transformation. Please use Raw Entity Transformer
if it is necessary to transform large XML files.
Detailed Info -
TryIt
|
MS Word Transformer
msword |  |
| This is a converter tool that extracts plain text from Microsoft Word documents using
the Jakarta POI library.
Detailed Info -
TryIt
|
Highlighter
html, msword, pdf |  |
| The Highlighter tool uses the Apache Lucene library to provide a KWIC search. It highlights all occurrences of the specified query within a document.
Detailed Info -
TryIt
|
Diff Transformer
docbook, html, mep, plain-text, tarl, tei, xml |  |
| This tool compares two files and highlights the differences in each of them.
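A line-by-line comparison of this kind can be sketched with Python's standard difflib module. The unified-diff output and the function name are choices made for illustration; the tool itself highlights differences rather than printing a diff.

```python
import difflib

def diff_lines(old_text, new_text):
    """Return unified-diff lines; '-' marks lines only in old_text, '+' lines only in new_text."""
    return list(difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(),
        fromfile="old", tofile="new", lineterm=""))

changes = diff_lines("one\ntwo\nthree", "one\n2\nthree")
```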
Detailed Info -
TryIt
|
PDF Transformer
pdf |  |
| This is a converter tool that extracts plain text from a PDF document.
The tool uses PDFBox, an open source Java PDF library for working with PDF. More
information about the library can be found at http://www.pdfbox.org.
Detailed Info -
TryIt
|
|
 |
Miscellaneous Tools
LiteMorph
plain-text |  |
| LiteMorph
Detailed Info -
TryIt
- Website
|
TAPoRware XML Transformer
xml |  |
| Provide your XML and XSL files, and this tool will transform the XML into the output specified in your XSL file. However, because the output is pre-configured as HTML, if your XSL's target format is XML, you need to view the source of the output to see the XML.
Detailed Info -
TryIt
- Website
|
TokenX
xml |  |
| A text visualization, analysis, and play tool
Detailed Info -
TryIt
- Website
|
|
|  |