Module tf.cheatsheet
A. Advanced API
Initialization, configuration, meta data, and linking
A = use('org/repo')-
start up and load a corpus from a repository and deliver its API.
See tf.about.usefunc.
A.hoist(globals())-
Make the API handles F, E, T, L etc. available in the global scope.
A.load(features)-
Load an extra bunch of features.
A.featureTypes(show=True)-
show for which types each feature is defined
A.showContext(...)-
show app settings
A.header(allMeta=False)-
show colophon
A.showProvenance(...)-
show provenance of code and data
A.webLink(n, ...)-
hyperlink to node n on the web
A.flexLink("pages") A.flexLink("tut")-
hyperlink to app tutorial and documentation
A.isLoaded(features=None)-
Show information about loaded features
A.footprint()-
Show memory footprint per feature
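Example: a minimal start-up sequence, sketched for the small example corpus annotation/banks (any org/repo works the same way):

    from tf.app import use

    A = use('annotation/banks')   # download (if needed) and load the corpus
    A.hoist(globals())            # make F, E, T, L, N, S available globally
    A.isLoaded()                  # report which features are loaded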
Displaying
A.specialCharacters()-
show all hard-to-type characters in the corpus in a widget
A.showFormats()-
show all text formats and their definitions
A.dm(markdownString)-
display markdown string in notebook
A.dh(htmlString)-
display HTML string in notebook
A.method(option1=value1, option2=value2, ...)-
Many of the following methods accept these options as keyword arguments; see tf.advanced.options.
A.displayShow(...)-
show display options
A.displayReset(...)-
reset display options
A.displaySetup(...)-
set up display options
A.table(results, ...)-
plain rendering of tuple of tuples of nodes
A.plainTuple(tup, ...)-
plain rendering of tuple of nodes
A.plain(node, ...)-
plain rendering of node
A.show(results, ...)-
pretty rendering of tuple of tuples of nodes
A.prettyTuple(tup, ...)-
pretty rendering of tuple of nodes
A.pretty(node, ...)-
pretty rendering of node
A.unravel(node, ...)-
convert a graph to a tree
A.getCss()-
get the complete CSS style sheet for this app
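Example: a sketch of how display options combine with the rendering methods; the node number is hypothetical and condensed is one of the options in tf.advanced.options:

    A.displaySetup(condensed=True)   # options for all subsequent display calls
    A.pretty(1)                      # pretty rendering of node 1
    A.plain(1)                       # plain rendering of the same node
    A.displayReset()                 # back to the default options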
Search (high level)
A.search(...)-
search, collect and deliver results, report number of results
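Example: a sketch of a search round trip; the node type word and feature letters in the template are assumptions that depend on the corpus:

    results = A.search('''
    word letters~th
    ''')                       # runs the query, reports the number of results
    A.table(results, end=3)    # plain table of the first three results
    A.show(results, end=1)     # pretty rendering of the first result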
Sections and Structure
A.nodeFromSectionStr(...)-
lookup node for section heading
A.sectionStrFromNode(...)-
lookup section heading for node
A.structureStrFromNode(...)-
lookup structure heading for node
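Example: a hedged sketch of the heading/node round trip; the heading string is corpus-dependent and purely illustrative:

    n = A.nodeFromSectionStr('1 1')   # e.g. chapter 1, verse 1 in some corpus
    print(A.sectionStrFromNode(n))    # back to the heading string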
Volumes and collections
See also tf.about.volumes.
A.getVolumes()-
list all volumes of this dataset
A.extract(volumes, ...)-
export volumes based on a volume specification
A.collect(volumes, ...)-
collect several volumes into a new collection
Export to Excel
A.export(results, ...)-
export formatted data
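Example: a sketch, assuming results comes from an earlier A.search(); the tab-separated output opens directly in Excel:

    results = A.search('word')               # assumption: node type word exists
    A.export(results, toFile='results.tsv')  # write the results as TSV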
Logging
A.dm(markdownString)-
display markdown string in notebook
A.dh(htmlString)-
display HTML string in notebook
A.version-
version number of the data of the corpus.
The following methods also work with TF. instead of A.:
A.banner-
banner of the TF program.
A.isSilent()-
report the verbosity of TF
A.silentOn(deep=False)-
make TF (deeply) silent from now on.
A.silentOff()-
make TF talkative from now on.
A.setSilent(silent)-
set the verbosity of TF.
A.indent(level=None, reset=False)-
Sets up indentation and timing of following messages
A.info(msg, tm=True, nl=True, ...)-
informational message
A.warning(msg, tm=True, nl=True, ...)-
warning message
A.error(msg, tm=True, nl=True, ...)-
error message
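Example: a sketch of the logging helpers in a typical processing job:

    A.indent(reset=True)        # reset the timer and the indentation level
    A.info('starting the job')  # timed informational message
    A.indent(level=1)
    A.info('a subtask')
    A.warning('a suspicious value')
    A.indent(level=0)
    A.error('aborting', tm=False)   # message without a time stamp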
N. F. E. L. T. S. C. Core API
N. Nodes
Read about the canonical ordering here: tf.core.nodes.
N.walk()-
generator of all nodes in canonical ordering
N.sortNodes(nodes)-
sorts nodes in the canonical ordering
N.otypeRank[nodeType]-
ranking position of nodeType
N.sortKey(node)-
defines the canonical ordering on nodes
N.sortKeyTuple(tup)-
extends the canonical ordering on nodes to tuples of nodes
N.sortKeyChunk(node)-
defines the canonical ordering on node chunks
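Example: a sketch of walking and sorting; the node numbers are hypothetical:

    for n in N.walk():              # every node exactly once, canonical order
        pass

    nodes = {100001, 3, 57}         # an arbitrary bunch of node numbers
    ordered = N.sortNodes(nodes)    # the same nodes, canonically ordered
    pairs = sorted([(3, 5), (1, 2)], key=N.sortKeyTuple)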
F. Node features
Fall()-
all loaded feature names (node features only)
F.fff.v(node)-
get value of node feature fff
F.fff.s(value)-
get nodes where feature fff has value
F.fff.freqList(...)-
frequency list of values of fff
F.fff.items(...)-
generator of all entries of fff as mapping from nodes to values
F.fff.meta-
meta data of feature fff
Fs('fff')-
identical to F.fff, usable if name of feature is variable
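Example: a sketch of node feature access, with fff instantiated to an assumed feature letters:

    value = F.letters.v(1)              # value of feature letters for node 1
    nodes = F.letters.s('everything')   # all nodes with exactly that value
    for (val, freq) in F.letters.freqList()[0:10]:
        print(val, freq)                # the ten most frequent values
    same = Fs('letters').v(1)           # the same, with a dynamic feature name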
Special node feature otype
Maps nodes to their types.
F.otype.v(node)-
get type of node
F.otype.s(nodeType)-
get all nodes of type nodeType
F.otype.sInterval(nodeType)-
gives start and ending nodes of nodeType
F.otype.items(...)-
generator of all (node, type) pairs.
F.otype.meta-
meta data of feature otype
F.otype.maxSlot-
the last slot node
F.otype.maxNode-
the last node
F.otype.slotType-
the slot type
F.otype.all-
sorted list of all node types
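Example: a sketch of otype in action; the node type word is an assumption:

    slotType = F.otype.v(1)      # type of node 1 (always the slot type)
    words = F.otype.s('word')    # all nodes of type word
    print(F.otype.maxSlot, F.otype.maxNode)
    print(F.otype.all)           # all node types in the dataset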
E. Edge features
Eall()-
all loaded feature names (edge features only)
E.fff.f(node)-
get value of feature fff for edges from node
E.fff.t(node)-
get value of feature fff for edges to node
E.fff.freqList(...)-
frequency list of values of fff
E.fff.items(...)-
generator of all entries of fff as mapping from edges to values
E.fff.b(node)-
get value of feature fff for edges from and to node
E.fff.meta-
all meta data of feature fff
Es('fff')-
identical to E.fff, usable if name of feature is variable
Special edge feature oslots
Maps nodes to the set of slots they occupy.
E.oslots.items(...)-
generator of all entries of oslots as mapping from nodes to sets of slots
E.oslots.s(node)-
set of slots linked to node
E.oslots.meta-
all meta data of feature oslots
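Example: a sketch of edge feature access; oslots is always present, while sim is an assumed ordinary edge feature and n a hypothetical node:

    n = 100000
    slots = E.oslots.s(n)     # the slots occupied by node n
    heads = E.sim.f(n)        # edges that depart from n
    tails = E.sim.t(n)        # edges that arrive at n
    both = E.sim.b(n)         # edges in both directions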
L. Locality
L.i(node, otype=...)-
go to intersecting nodes
L.u(node, otype=...)-
go one level up
L.d(node, otype=...)-
go one level down
L.p(node, otype=...)-
go to adjacent previous nodes
L.n(node, otype=...)-
go to adjacent next nodes
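Example: a sketch of moving around from a slot node; sentence and word are assumed node types:

    w = 1                              # a hypothetical slot node
    s = L.u(w, otype='sentence')[0]    # the sentence that contains w
    words = L.d(s, otype='word')       # all words inside that sentence
    after = L.n(w)                     # adjacent nodes after w
    before = L.p(w)                    # adjacent nodes before w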
T. Text
T.text(node, fmt=..., ...)-
give formatted text associated with node
Sections
Rigid 1 or 2 or 3 sectioning system
T.sectionTuple(node)-
give tuple of section nodes that contain node
T.sectionFromNode(node)-
give section heading of node
T.nodeFromSection(section)-
give node for section heading
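Example: a sketch of text serialization and the section round trip; text-orig-full is the conventional default format and the heading tuple is corpus-dependent:

    print(T.text(n))                        # text of node n, default format
    print(T.text(n, fmt='text-orig-full'))  # an explicitly named format
    head = T.sectionFromNode(n)             # e.g. ('Genesis', 1, 1)
    n2 = T.nodeFromSection(head[0:2])       # node of the whole chapter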
Structure
Flexible multilevel sectioning system
T.headingFromNode(node)-
give structure heading of node
T.nodeFromHeading(heading)-
give node for structure heading
T.structureInfo()-
give summary of dataset structure
T.structure(node)-
give structure of node and all in it.
T.structurePretty(node)-
pretty print structure of node and all in it.
T.top()-
give all top-level structural nodes in the dataset
T.up(node)-
gives parent of structural node
T.down(node)-
gives children of structural node
S. Search (low level)
Preparation
S.search(query, limit=None)-
Query the TF dataset with a template
S.study(query, ...)-
Study the query in order to set up a plan
S.showPlan(details=False)-
Show the search plan resulting from the last study.
S.relationsLegend()-
Catalog of all relational devices in search templates
Fetching results
S.count(progress=None, limit=None)-
Count the results, up to a limit
S.fetch(limit=None, ...)-
Fetches the results, up to a limit
S.glean(tup)-
Renders a single result into something human readable.
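Example: a sketch of the low-level workflow; the template again assumes corpus-specific names:

    query = '''
    sentence
      word letters~th
    '''
    S.study(query)               # compile the template into a plan
    S.showPlan(details=True)     # inspect the plan
    S.count(progress=100)        # count results, reporting every 100
    for tup in S.fetch(limit=10):
        print(S.glean(tup))      # human-readable rendering of one result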
Implementation
S.tweakPerformance(...)-
Set certain parameters that influence the performance of search.
C. Computed data components.
Access to pre-computed data: Computeds.
All components have just one useful attribute: .data.
Call()-
all pre-computed data component names
Cs('ccc')-
identical to C.ccc, usable if name of component is variable
C.levels.data-
various statistics on node types
C.order.data-
the canonical order of the nodes (tf.core.nodes)
C.rank.data-
the rank of the nodes in the canonical order (tf.core.nodes)
C.levUp.data-
feeds the Locality.u() function
C.levDown.data-
feeds the Locality.d() function
C.boundary.data-
feeds the Locality.p() and Locality.n() functions
C.characters.data-
frequency list of characters in a corpus, separately for all the text formats
C.sections.data["sec1"] C.sections.data["sec2"]-
feeds the section part of tf.core.text
C.sections.data["seqFromNode"] C.sections.data["nodeFromSeq"]-
maps tuples of heading nodes to their corresponding tuples of sequence numbers and vice versa. Only if there are 3 section levels.
C.structure.data-
feeds the structure part of tf.core.text
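Example: a sketch of peeking into precomputed data:

    print(Call())             # names of all precomputed components
    order = C.order.data      # all nodes in canonical order
    rank = C.rank.data        # rank of each node in that order
    levels = C.levels.data    # statistics per node type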
TF. Dataset
Loading
TF = Fabric(locations=dirs, modules=subdirs, volume=None, collection=None, silent="auto")-
Initialize the API on a work, or on a single volume or collection of a work, from explicit directories. Use use() instead wherever you can. See also tf.about.volumes.
TF.isLoaded(features=None)-
Show information about loaded features
TF.explore(show=True)-
Get features by category, loaded or unloaded
TF.loadAll(silent="auto")-
Load all loadable features.
TF.load(features, add=False)-
Load a bunch of features from scratch or additionally.
TF.ensureLoaded(features)-
Make sure that features are loaded.
TF.makeAvailableIn(globals())-
Make the members of the core API available in the global scope
TF.ignored-
Which features have been overridden.
TF.footprint()-
Show memory footprint per feature
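Example: a minimal sketch of the bare Fabric interface, assuming the data lives in a local directory; prefer use() whenever possible:

    from tf.fabric import Fabric

    TF = Fabric(locations='~/text-fabric-data/annotation/banks/tf/0.2')
    api = TF.load('letters')          # otype and oslots are always loaded
    api.makeAvailableIn(globals())    # F, E, T, L, N, S in the global scope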
Volumes
See also tf.about.volumes.
TF.getVolumes()-
list all volumes of this dataset
TF.extract(volumes, ...)-
export volumes based on a volume specification
TF.collect(volumes, ...)-
collect several volumes into a new collection
Saving and Publishing
TF.save(nodeFeatures={}, edgeFeatures={}, metaData={}, ...)-
Save a bunch of newly generated features to disk.
A.publishRelease(increase, message=None, description=None, ...)-
Commit the dataset repo, tag it, release it, and attach the complete zipped data to it.
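Example: a sketch of saving a newly computed node feature; the feature name, its values, and the metadata are assumptions:

    nodeFeatures = {'isShort': {
        n: 1 for n in F.otype.s('word')
        if len(F.letters.v(n) or '') < 4
    }}
    metaData = {'isShort': {
        'valueType': 'int',
        'description': 'word has fewer than 4 letters',
    }}
    TF.save(nodeFeatures=nodeFeatures, metaData=metaData)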
Export to ZIP
cd ~/backend/org/repo && tf-zipall-
store the complete corpus data in a file complete.zip
A.zipAll()-
store the complete corpus data in a file complete.zip
from tf.app import collect; collect(backend, org, repo)-
same as A.zipAll() above, assuming the data is in a GitHub clone
Housekeeping
TF.version-
version number of TF.
TF.clearCache()-
clears the cache of compiled TF data
from tf.clean import clean
clean()-
clears the cache of compiled TF data
Volume support
TF datasets per volume or collection of a work.
See also tf.about.volumes.
from tf.volumes import getVolumes; getVolumes(volumeDir)-
List volumes in a directory.
from tf.volumes import extract; extract(work, volumes, ...)-
Extracts volumes from a work
from tf.volumes import collect; collect(volumes, work, ...)-
Collects several volumes into a new collection
Dataset Operations
from tf.dataset import modify; modify(source, target, ...)-
Modifies a TF dataset into a new TF dataset
from tf.dataset import Versions; Versions(api, va, vb, slotMap)-
Extends a slot mapping between versions of a TF dataset to a complete node mapping
Data Interchange
Custom node sets for search
from tf.lib import readSets
from tf.lib import writeSets
readSets(sourceFile)-
reads named sets from a file
writeSets(sets, destFile)-
writes named sets to a file
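Example: a sketch of the round trip; the set name, node numbers, and file name are hypothetical, and the named set can then be used as a node type in a search template via the sets parameter:

    from tf.lib import readSets, writeSets

    sets = {'shortword': {1, 2, 3}}     # a named set of node numbers
    writeSets(sets, 'mysets.tfx')       # save it to disk
    sets = readSets('mysets.tfx')       # read it back
    results = A.search('shortword', sets=sets)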
Export to Excel
A.export(results, ...)-
export formatted data
Interchange with external annotation tools
from convert.recorder import Recorder
Recorder()-
generate annotatable plain text and import annotations
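Example: a sketch of generating annotatable text with position data; sentence is an assumed node type:

    from tf.convert.recorder import Recorder

    rec = Recorder()
    for s in F.otype.s('sentence'):
        rec.start(s)            # material added from here belongs to node s
        rec.add(T.text(s))
        rec.end(s)
    rec.write('sentences.txt')  # text plus a .pos file mapping positions to nodes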
pandas export
A.exportPandas()-
export dataset as a pandas data frame
MQL interchange
TF.exportMQL(mqlDb, exportDir=None) A.exportMQL(mqlDb, exportDir=None)-
export loaded dataset to MQL
from tf.convert.mql import importMQL; TF = importMQL(mqlFile, saveDir)-
convert MQL file to TF dataset
Walker conversion
from tf.convert.walker import CV
cv = CV(TF)-
convert structured data to TF dataset
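Example: a toy walker sketch that invents its data on the spot; the output directory is an assumption and section configuration is omitted for brevity (see tf.convert.walker for the full contract):

    from tf.fabric import Fabric
    from tf.convert.walker import CV

    TF = Fabric(locations='output/tf/0.1')   # where the new dataset is written
    cv = CV(TF)

    def director(cv):
        s = cv.node('sentence')              # open a sentence node
        for letters in ('hello', 'world'):
            w = cv.slot()                    # make a new slot node
            cv.feature(w, letters=letters)   # assign a feature value to it
        cv.terminate(s)                      # close the sentence node

    good = cv.walk(
        director,
        slotType='word',
        otext={'fmt:text-orig-full': '{letters} '},
        generic={},
        intFeatures=set(),
        featureMeta={'letters': {'description': 'text of the word'}},
    )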
Exploding
from tf.convert.tf import explode
explode(inLocation, outLocation)-
explode TF feature files to straight data files without optimizations
TF App development
A.reuse()-
reload configuration data
from tf.advanced.find import loadModule
mmm = loadModule("mmm", *args)-
load specific module supporting the corpus app
~/mypath/myname/app/config.yaml-
settings for a TF App
Layered search
(these work on the command-line if TF is installed)
tf-make {dataset} {client} ship-
generate a static site with a search interface in client-side JavaScript and publish it to GitHub Pages. If {client} is left out, generate all clients that are defined for this dataset. Clients are defined in the app-{dataset} repo, under layeredsearch. More commands here.
tf-make {dataset} serve-
serve the search interfaces defined for {dataset} locally.
Annotation tools
(these work in the TF browser and in Jupyter Notebooks)
Named Entity Annotation
tf {org}/{repo} --tool=ner-
Starts the TF browser for the corpus in org/repo and opens the manual annotation tool.
NE = A.makeNer()-
Sets up the 'manual' annotation API for the corpus in A.
More info and examples in tf.about.annotate.
Command-line tools
(these work on the command-line if TF is installed)
tf {org}/{repo}-
Starts the TF browser for the corpus in org/repo.
tf-zipall-
Zips the TF dataset located by the current directory, with all its additional data modules, but only the latest version, so that it can be attached to a release on GitHub / GitLab.
tf-zip {org}/{repo}-
Zips the TF dataset in org/repo so that it can be attached to a release on GitHub / GitLab.
tf-nbconvert {inDirectory} {outDirectory}-
Converts notebooks in inDirectory to HTML and stores them in outDirectory.