Module tf.cheatsheet

A. Advanced API

Initialization, configuration, meta data, and linking

A = use('org/repo')

start up and load a corpus from a repository and deliver its API.

See tf.about.usefunc

A.hoist(globals())

Make the API handles F, E, T, L, etc. available in the global scope.

App.load()
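
A minimal startup sketch; annotation/banks is TF's small example corpus, and any org/repo corpus works the same way:

from tf.app import use

# download the corpus (first time) or load it from the local cache
A = use("annotation/banks")

# make F, E, T, L, N, S, etc. available as global names
A.hoist(globals())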

A.load(features)

Load an extra bunch of features.

App.load()

A.featureTypes(show=True)

show for which types each feature is defined

App.featureTypes()

A.showContext(...)

show app settings

showContext()

A.header(allMeta=False)

show colophon

header()

A.showProvenance(...)

show provenance of code and data

showProvenance()

A.webLink(n, ...)

hyperlink to node n on the web

webLink()

A.flexLink("pages")
A.flexLink("tut")

hyperlink to app tutorial and documentation

flexLink()

A.isLoaded(features=None)

Show information about loaded features

Api.isLoaded()

A.footprint()

Show memory footprint per feature

Api.footprint()


Displaying

A.specialCharacters()

show all hard-to-type characters in the corpus in a widget

specialCharacters()

A.showFormats()

show all text formats and their definitions

showFormats()

A.dm(markdownString)

display markdown string in notebook

dm()

A.dh(htmlString)

display HTML string in notebook

dh()

A.method(option1=value1, option2=value2, ...)

Many of the following methods accept these options as keyword arguments:

tf.advanced.options

A.displayShow(...)

show display options

displayShow()

A.displayReset(...)

reset display options

displayReset()

A.displaySetup(...)

set up display options

displaySetup()
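
A sketch of the display-option workflow; withNodes and condensed are option names from tf.advanced.options, but check your corpus app for the values it supports:

# set options for all subsequent display calls
A.displaySetup(withNodes=True, condensed=True)

# options can also be passed per call; n is some node
A.pretty(n, withNodes=False)

# restore the defaults
A.displayReset()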

A.table(results, ...)

plain rendering of a tuple of tuples of nodes

table()

A.plainTuple(tup, ...)

plain rendering of a tuple of nodes

plainTuple()

A.plain(node, ...)

plain rendering of node

plain()

A.show(results, ...)

pretty rendering of a tuple of tuples of nodes

show()

A.prettyTuple(tup, ...)

pretty rendering of a tuple of nodes

prettyTuple()

A.pretty(node, ...)

pretty rendering of node

pretty()

A.unravel(node, ...)

convert a graph to a tree

unravel()

A.getCss()

get the complete CSS style sheet for this app

getCss()


Search (high level)

A.search(...)

search, collect and deliver results, report number of results

search()
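
A minimal sketch; node types such as sentence and word are corpus-dependent:

# the template asks for words inside sentences
results = A.search("""
sentence
  word
""")

# pretty-render the first few results
A.show(results, end=3)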


Sections and Structure

A.nodeFromSectionStr(...)

lookup node for section heading

nodeFromSectionStr()

A.sectionStrFromNode(...)

lookup section heading for node

sectionStrFromNode()

A.structureStrFromNode(...)

lookup structure heading for node

structureStrFromNode()
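
A round-trip sketch; the heading format is corpus-dependent (here a Bible-like "book chapter:verse"):

n = A.nodeFromSectionStr("Genesis 1:1")   # heading -> node
A.sectionStrFromNode(n)                   # node -> "Genesis 1:1"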


Volumes and collections

See also tf.about.volumes.

A.getVolumes()

list all volumes of this dataset

Fabric.getVolumes()

A.extract(volumes, ...)

export volumes based on a volume specification

Fabric.extract()

A.collect(volumes, ...)

collect several volumes into a new collection

Fabric.collect()


Export to Excel

A.export(results, ...)

export formatted data

export()
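
A sketch that writes query results to a tab-separated file that Excel can open; toDir and toFile are assumed parameter names, check export():

results = A.search(query)
A.export(results, toDir="~/Downloads", toFile="results.tsv")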


Logging

A.dm(markdownString)

display markdown string in notebook

dm()

A.dh(htmlString)

display HTML string in notebook

dh()

A.version

version number of the data of the corpus.

Fabric.version

The following methods also work for TF. instead of A.:

A.banner

banner of the TF program.

Fabric.banner

A.isSilent()

report the verbosity of TF

Timestamp.isSilent()

A.silentOn(deep=False)

make TF (deeply) silent from now on.

Timestamp.silentOn()

A.silentOff()

make TF talkative from now on.

Timestamp.silentOff()

A.setSilent(silent)

set the verbosity of TF.

Timestamp.setSilent()

A.indent(level=None, reset=False)

Sets up indentation and timing of following messages

Timestamp.indent()
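
A sketch of nested, timed progress messages:

A.indent(reset=True)            # reset the timer at the outer level
A.info("start processing")
A.indent(level=1, reset=True)   # deeper level with its own timer
A.info("step 1 done")
A.indent(level=0)
A.info("all done")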

A.info(msg, tm=True, nl=True, ...)

informational message

Timestamp.info()

A.warning(msg, tm=True, nl=True, ...)

warning message

Timestamp.warning()

A.error(msg, tm=True, nl=True, ...)

error message

Timestamp.error()


N. F. E. L. T. S. C. Core API

N. Nodes

Read about the canonical ordering here: tf.core.nodes.

N.walk()

generator of all nodes in canonical ordering

Nodes.walk()

N.sortNodes(nodes)

sorts nodes in the canonical ordering

Nodes.sortNodes()

N.otypeRank[nodeType]

ranking position of nodeType

Nodes.otypeRank

N.sortKey(node)

defines the canonical ordering on nodes

Nodes.sortKey

N.sortKeyTuple(tup)

extends the canonical ordering on nodes to tuples of nodes

Nodes.sortKeyTuple

N.sortKeyChunk(node)

defines the canonical ordering on node chunks

Nodes.sortKeyChunk
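
A sketch of the canonical ordering in action; the node numbers are arbitrary examples:

# first three nodes in canonical order
firstThree = [n for n, _ in zip(N.walk(), range(3))]

# sort an arbitrary set of nodes canonically
ordered = sorted({100000, 1, 5}, key=N.sortKey)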


F. Node features

Fall()

all loaded feature names (node features only)

Api.Fall()

F.fff.v(node)

get value of node feature fff

NodeFeature.v()

F.fff.s(value)

get nodes where feature fff has that value

NodeFeature.s()

F.fff.freqList(...)

frequency list of values of fff

NodeFeature.freqList()

F.fff.items(...)

generator of all entries of fff as mapping from nodes to values

NodeFeature.items()

F.fff.meta

meta data of feature fff

NodeFeature.meta

Fs('fff')

identical to F.fff, usable if the name of the feature is variable

Api.Fs()
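
An access sketch; the feature name lex is hypothetical, substitute any loaded node feature of your corpus:

value = F.lex.v(n)            # value of feature lex for node n
nodes = F.lex.s(value)        # all nodes that have that value
top = F.lex.freqList()[0:10]  # ten most frequent values with their counts
same = Fs("lex").v(n)         # dynamic equivalent of F.lex.v(n)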


Special node feature otype

Maps nodes to their types.

F.otype.v(node)

get type of node

OtypeFeature.v()

F.otype.s(nodeType)

get all nodes of type nodeType

OtypeFeature.s()

F.otype.sInterval(nodeType)

gives the start and end nodes of nodeType

OtypeFeature.sInterval()

F.otype.items(...)

generator of all (node, type) pairs.

OtypeFeature.items()

F.otype.meta

meta data of feature otype

OtypeFeature.meta

F.otype.maxSlot

the last slot node

OtypeFeature.maxSlot

F.otype.maxNode

the last node

OtypeFeature.maxNode

F.otype.slotType

the slot type

OtypeFeature.slotType

F.otype.all

sorted list of all node types

OtypeFeature.all
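
An otype sketch that works in any corpus:

slotType = F.otype.slotType          # e.g. "word" in many corpora
for n in F.otype.s(slotType)[0:3]:   # first three slot nodes
    print(n, F.otype.v(n))           # node number and its type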


E. Edge features

Eall()

all loaded feature names (edge features only)

Api.Eall()

E.fff.f(node)

get value of feature fff for edges from node

EdgeFeature.f()

E.fff.t(node)

get value of feature fff for edges to node

EdgeFeature.t()

E.fff.freqList(...)

frequency list of values of fff

EdgeFeature.freqList()

E.fff.items(...)

generator of all entries of fff as mapping from edges to values

EdgeFeature.items()

E.fff.b(node)

get value of feature fff for edges from and to node

EdgeFeature.b()

E.fff.meta

all meta data of feature fff

EdgeFeature.meta

Es('fff')

identical to E.fff, usable if the name of the feature is variable

Api.Es()
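
An access sketch; the edge feature name mother is hypothetical (it occurs in some corpora):

outgoing = E.mother.f(n)   # nodes reached by mother-edges from n
incoming = E.mother.t(n)   # nodes with a mother-edge to n
both = E.mother.b(n)       # both directions combined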


Special edge feature oslots

Maps nodes to the set of slots they occupy.

E.oslots.items(...)

generator of all entries of oslots as mapping from nodes to sets of slots

OslotsFeature.items()

E.oslots.s(node)

set of slots linked to node

OslotsFeature.s()

E.oslots.meta

all meta data of feature oslots

OslotsFeature.meta


L. Locality

L.i(node, otype=...)

go to intersecting nodes

Locality.i()

L.u(node, otype=...)

go one level up

Locality.u()

L.d(node, otype=...)

go one level down

Locality.d()

L.p(node, otype=...)

go to adjacent previous nodes

Locality.p()

L.n(node, otype=...)

go to adjacent next nodes

Locality.n()
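
A navigation sketch; the node types are corpus-dependent:

s = L.u(w, otype="sentence")[0]   # the sentence embedding word w
ws = L.d(s, otype="word")         # all words inside that sentence
after = L.n(w, otype="word")      # adjacent word(s) after w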


T. Text

T.text(node, fmt=..., ...)

give formatted text associated with node

Text.text()


Sections

Rigid sectioning system with 1, 2, or 3 levels

T.sectionTuple(node)

give tuple of section nodes that contain node

Text.sectionTuple()

T.sectionFromNode(node)

give section heading of node

Text.sectionFromNode()

T.nodeFromSection(section)

give node for section heading

Text.nodeFromSection()
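
A sketch; section tuples like ("Genesis", 1, 1) presuppose a Bible-like corpus:

v = T.nodeFromSection(("Genesis", 1, 1))   # section -> node
T.text(v)                                  # formatted text of that section
T.sectionFromNode(v)                       # node -> ("Genesis", 1, 1)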


Structure

Flexible multilevel sectioning system

T.headingFromNode(node)

give structure heading of node

Text.headingFromNode()

T.nodeFromHeading(heading)

give node for structure heading

Text.nodeFromHeading()

T.structureInfo()

give summary of dataset structure

Text.structureInfo()

T.structure(node)

give structure of node and all in it.

Text.structure()

T.structurePretty(node)

pretty print structure of node and all in it.

Text.structurePretty()

T.top()

give all top-level structural nodes in the dataset

Text.top()

T.up(node)

gives parent of structural node

Text.up()

T.down(node)

gives children of structural node

Text.down()


S. Search (low level)

Preparation

S.search(query, limit=None)

Query the TF dataset with a template

Search.search()

S.study(query, ...)

Study the query in order to set up a plan

Search.study()

S.showPlan(details=False)

Show the search plan resulting from the last study.

Search.showPlan()

S.relationsLegend()

Catalog of all relational devices in search templates

Search.relationsLegend()


Fetching results

S.count(progress=None, limit=None)

Count the results, up to a limit

Search.count()

S.fetch(limit=None, ...)

Fetches the results, up to a limit

Search.fetch()

S.glean(tup)

Renders a single result into something human readable.

Search.glean()
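
A sketch of the low-level workflow; the template and node types are hypothetical:

query = """
sentence
  word
"""
S.study(query)                      # compile the query into a plan
S.showPlan(details=True)            # inspect the plan
S.count(progress=100, limit=1000)   # count results, reporting every 100
for tup in S.fetch(limit=10):       # get the first 10 results
    print(S.glean(tup))             # human-readable rendering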


Implementation

S.tweakPerformance(...)

Set certain parameters that influence the performance of search.

Search.tweakPerformance()


C. Computed data components

Access to pre-computed data: Computeds.

All components have just one useful attribute: .data.

Call()

all pre-computed data component names

Api.Call()

Cs('ccc')

identical to C.ccc, usable if name of component is variable

Api.Cs()
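
A peek sketch; every component exposes its content through .data:

levelData = C.levels.data       # statistics per node type
sameData = Cs("levels").data    # dynamic equivalent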

C.levels.data

various statistics on node types

levels()

C.order.data

the canonical order of the nodes (tf.core.nodes)

order()

C.rank.data

the rank of the nodes in the canonical order (tf.core.nodes)

rank()

C.levUp.data

feeds the Locality.u() function

levUp()

C.levDown.data

feeds the Locality.d() function

levDown()

C.boundary.data

feeds the Locality.p() and Locality.n() functions

boundary()

C.characters.data

frequency list of characters in a corpus, separately for all the text formats

characters()

C.sections.data["sec1"]
C.sections.data["sec2"]

feeds the section part of tf.core.text

sections()

C.sections.data["seqFromNode"]
C.sections.data["nodeFromSeq"]

maps tuples of heading nodes to their corresponding tuples of sequence numbers and vice versa. Only if there are 3 section levels.

sections()

C.structure.data

feeds the structure part of tf.core.text

structure()


TF. Dataset

Loading

TF = Fabric(locations=dirs, modules=subdirs, volume=None, collection=None, silent="auto")

Initialize the API on a whole work, a single volume, or a collection of volumes of a work, from explicit directories. Use use() instead wherever you can. See also tf.about.volumes.

Fabric

TF.isLoaded(features=None)

Show information about loaded features

Api.isLoaded()

TF.explore(show=True)

Get features by category, loaded or unloaded

FabricCore.explore()

TF.loadAll(silent="auto")

Load all loadable features.

FabricCore.loadAll()

TF.load(features, add=False)

Load a bunch of features from scratch or additionally.

FabricCore.load()

TF.ensureLoaded(features)

Make sure that features are loaded.

Api.ensureLoaded()

TF.makeAvailableIn(globals())

Make the members of the core API available in the global scope

Api.makeAvailableIn()

TF.ignored

Which features have been overridden.

Api.ignored

TF.footprint()

Show memory footprint per feature

Api.footprint()
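
A low-level loading sketch with hypothetical directories and feature names; prefer use() when an app exists for your corpus:

from tf.fabric import Fabric

TF = Fabric(locations="~/github/org/repo/tf", modules="1.0")
api = TF.load("feature1 feature2")   # space-separated feature names
api.makeAvailableIn(globals())       # now F, E, T, L, etc. are global names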


Volumes

See also tf.about.volumes.

TF.getVolumes()

list all volumes of this dataset

Fabric.getVolumes()

TF.extract(volumes, ...)

export volumes based on a volume specification

Fabric.extract()

TF.collect(volumes, ...)

collect several volumes into a new collection

Fabric.collect()

Saving and Publishing

TF.save(nodeFeatures={}, edgeFeatures={}, metaData={}, ...)

Save a bunch of newly generated features to disk.

FabricCore.save()
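
A save sketch with a hypothetical feature; a valueType entry in the metadata is required per feature:

nodeFeatures = {"myFeature": {1: "a", 2: "b"}}   # node -> value
metaData = {"myFeature": {"valueType": "str", "description": "demo"}}
TF.save(nodeFeatures=nodeFeatures, metaData=metaData)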

A.publishRelease(increase, message=None, description=None, ...)

Commit the dataset repo, tag it, release it, and attach the complete zipped data to it.

publishRelease()

Export to ZIP

cd ~/backend/org/repo
tf-zipall

store the complete corpus data in a file complete.zip

zipAll()

A.zipAll()

store the complete corpus data in a file complete.zip

zipAll()

from tf.app import collect
collect(backend, org, repo)

same as A.zipAll() above, assuming the data is in a GitHub clone

collect()


Housekeeping

TF.version

version number of TF.

Fabric.version

TF.clearCache()

clears the cache of compiled TF data

FabricCore.clearCache()

from tf.clean import clean
clean()

clears the cache of compiled TF data

tf.clean


Volume support

TF datasets per volume or collection of a work. See also tf.about.volumes.

from tf.volumes import getVolumes

getVolumes(volumeDir)

List volumes in a directory.

getVolumes()

from tf.volumes import extract

extract(work, volumes, ...)

Extracts volumes from a work

tf.volumes.extract

from tf.volumes import collect

collect(volumes, work, ...)

Collects several volumes into a new collection

tf.volumes.collect


Dataset Operations

from tf.dataset import modify

modify(source, target, ...)

Modifies a TF dataset into a new TF dataset

tf.dataset.modify

from tf.dataset import Versions

Versions(api, va, vb, slotMap)

Extends a slot mapping between versions of a TF dataset to a complete node mapping

tf.dataset.nodemaps


Data Interchange

from tf.lib import readSets
from tf.lib import writeSets
readSets(sourceFile)

reads named sets from a file

readSets()

writeSets(sets, destFile)

writes named sets to a file

writeSets()
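
A round-trip sketch; the file name and extension are hypothetical:

sets = {"mynodes": {1, 2, 3}}             # named sets of nodes
writeSets(sets, "~/Downloads/sets.tfx")   # write them to one file
sets = readSets("~/Downloads/sets.tfx")   # read them back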


Export to Excel

A.export(results, ...)

export formatted data

export()


Interchange with external annotation tools

from tf.convert.addnlp import NLPipeline
NLPipeline()

generate plain text, feed into NLP, ingest results

tf.convert.addnlp


from tf.convert.recorder import Recorder
Recorder()

generate annotatable plain text and import annotations

tf.convert.recorder


XML / TEI import

from tf.convert.xml import XML
X = XML(...)

convert XML source to a full-fledged TF dataset plus app, but no docs; plug in your own conversion code if you wish; see Greek New Testament

tf.convert.xml

from tf.convert.tei import TEI
T = TEI(...)

convert TEI source to full-fledged TF dataset plus app plus docs

tf.convert.tei


WATM export

from tf.app import use
from tf.convert.watm import WATM
A = use(...)
WA = WATM(A, ns, ...)
WA.makeText()
WA.makeAnno()
WA.writeAll()
WA.testAll()

convert TF dataset to text tokens and annotations in JSON format, for consumption by TextRepo/AnnoRepo of KNAW/HuC Digital Infrastructure. See Mondriaan Proeftuin, Suriano Letters, TransLatin Corpus.

tf.convert.watm

from tf.convert.watm import WATMS
W = WATMS(org, repo, backend, ns, ...)
W.produce()

convert series of TF datasets to WATM

WATMS


NLP import

In order to use this, install spaCy; see tf.tools.myspacy


from tf.convert.addnlp import addTokensAndSentences
newVersion = addTokensAndSentences(A)

add NLP output from spaCy to an existing TF dataset. See the docs for how this is broken down into separate steps.

tf.convert.addnlp


pandas export

A.exportPandas()

export dataset as pandas data frame

tf.convert.pandas


MQL interchange

TF.exportMQL(mqlDb, exportDir=None)
A.exportMQL(mqlDb, exportDir=None)

export loaded dataset to MQL

exportMQL()

from tf.convert.mql import importMQL

TF = importMQL(mqlFile, saveDir)

convert MQL file to TF dataset

importMQL()


Walker conversion

from tf.convert.walker import CV
cv = CV(TF)

convert structured data to TF dataset

tf.convert.walker
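
A minimal walker sketch; the director and all names in it are hypothetical stand-ins for real source data:

from tf.fabric import Fabric
from tf.convert.walker import CV

TF = Fabric(locations="output/tf")   # where the new dataset will be saved
cv = CV(TF)

def director(cv):
    s = cv.node("sentence")          # open a sentence node
    for letters in ("hello", "world"):
        w = cv.slot()                # make a slot (word) node
        cv.feature(w, letters=letters)
    cv.terminate(s)                  # close the sentence node

good = cv.walk(
    director,
    slotType="word",
    otext={"fmt:text-orig-full": "{letters} "},
    generic={"name": "demo"},
)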


Exploding

from tf.convert.tf import explode
explode(inLocation, outLocation)

explode TF feature files to straight data files without optimizations

explode()


TF App development

A.reuse()

reload configuration data

App.reuse()

from tf.advanced.find import loadModule
mmm = loadModule("mmm", *args)

load specific module supporting the corpus app

loadModule()

~/mypath/myname/app/config.yaml

settings for a TF App

tf.advanced.settings


Layered search

(these work on the command-line if TF is installed)

tf-make {dataset} {client} ship

generate a static site with a search interface in client-side JavaScript and publish it to GitHub pages. If {client} is left out, generate all clients that are defined for this dataset. Clients are defined in the app-{dataset} repo, under layeredsearch. More commands here.

tf.client.make.build

tf-make {dataset} serve

serve the search interfaces defined for {dataset} locally.

More commands in tf.client.make.build.


Annotation tools

(these work in the TF browser and in Jupyter Notebooks)

Named Entity Annotation

tf {org}/{repo} --tool=ner 

Starts the TF browser for the corpus in org/repo and opens the manual annotation tool.

tf.about.annotateBrowser

NE = A.makeNer()

Sets up the 'manual' annotation API for the corpus in A.

tf.ner.ner

More info and examples in tf.about.annotate.


Command-line tools

(these work on the command-line if TF is installed)

tf {org}/{repo}

Starts the TF browser for the corpus in org/repo.

tf.browser.start

tf-zipall

Zips the TF dataset in the current directory, with all its additional data modules, but only the latest version, so that it can be attached to a release on GitHub / GitLab.

zipAll() and tf.zip

tf-zip {org}/{repo}

Zips the TF dataset in org/repo so that it can be attached to a release on GitHub / GitLab.

tf.advanced.zipdata

tf-nbconvert {inDirectory} {outDirectory}

Converts notebooks in inDirectory to HTML and stores them in outDirectory.

tf.tools.nbconvert

tf-xmlschema analysis {schema}.xsd

Analyses an XML schema file and extracts meaningful information for processing the XML that adheres to that schema.

tf.tools.xmlschema

tf-fromxml

When run in a repo it finds an XML source and converts it to TF. The resulting TF data is delivered in the repo. There is a hook to put your own conversion code in.

tf.convert.xml

tf-fromtei

When run in a repo it finds a TEI source and converts it to TF. The resulting TF data is delivered in the repo.

tf.convert.tei

tf-addnlp

When run in the repo of a TF dataset, it runs spaCy and adds the NLP output to the dataset.

tf.convert.addnlp
