Module tf.convert.tf

Raw, unoptimised data from TF files

Functions

def explode(inPath, outPath)
def explode(inPath, outPath):
    """Explodes `.tf` files into non-optimised `.tf` files without metadata.

    An exploded `.tf` feature file is a TF file with explicit node specifiers
    and no optimizations.

    The format of each line is:

    **Node features**:

        node<tab>value

    If the value is None for a certain `node`, there will be no such line.

    **Edge features without values**:

        node<tab>node

    **Edge features with values**:

        node<tab>node<tab>value

    If the value is `None`, it will be left out, together with the preceding <tab>.
    This way, the empty string is distinguished from a `None` value.

    !!! caution "Ambiguity"
        In the resulting data file, all metadata is gone.
        It is not always possible to infer from the data alone what data type a feature
        has:

        `1<tab>2` could be a node feature assigning integer 2 to node 1, or string `2`
        to node 1.

        It could also be an edge feature assigning `None` to the node pair (1, 2).

    Parameters
    ----------
    inPath: string
        Source file(s).
        If pointing to a file, it should be a file containing TF feature data.
        If pointing to a directory, all `.tf` files in that directory will be exploded
        (non-recursively).
        The path may contain `~` which will be expanded to the user's home directory.
    outPath: string
        Destination of the exploded file(s).
        If pointing to a non-existing location, a file or directory will be created
        there, depending on whether `inPath` is a file or directory.
        If pointing to an existing directory, exploded file(s) will be put there.

    Returns
    -------
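    boolean | string
        Whether the operation was successful.
        If the input or output location is unusable, an error message
        (a string) is returned instead.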
    """

    inPath = normpath(inPath)
    outPath = normpath(outPath)
    inLoc = ex(inPath)
    outLoc = ex(outPath)
    if not dirExists(inLoc):
        return f"No such directory: `{inPath}`"

    isInDir = isDir(inLoc)
    outExists = dirExists(outLoc)
    isOutDir = isDir(outLoc) if outExists else None

    tasks = []

    if isInDir:
        with scanDir(inLoc) as sd:
            tasks = [
                (f"{inLoc}/{e.name}", f"{outLoc}/{e.name}")
                for e in sd
                if e.name.endswith(".tf") and e.is_file()
            ]
            if not tasks:
                return f"No .tf files in `{inPath}`"
        if outExists and not isOutDir:
            return f"Not a directory: `{outPath}`"
        if not outExists:
            dirMake(outLoc)
    else:
        if not isFile(inLoc):
            return f"Not a file: `{inPath}`"
        if outExists:
            if isOutDir:
                outFile = f"{outLoc}/{fileNm(inLoc)}"
            else:
                outFile = outLoc
        else:
            outDir = dirNm(outLoc)
            dirMake(outDir)
            outFile = outLoc

        tasks = [(inLoc, outFile)]

    msgs = []

    for (inFile, outFile) in sorted(tasks):
        result = _readTf(inFile)
        if type(result) is str:
            msgs.append(f"{ux(inFile)} => {ux(outFile)}:\n\t{result}")
            continue
        (data, valueType, isEdge) = result
        _writeTf(outFile, *result)

    good = True
    if msgs:
        for msg in msgs:
            thisGood = msg[0] != "X"
            (sys.stdout if thisGood else sys.stderr).write(f"{msg}\n")
            if not thisGood:
                good = False
    return good

Explodes .tf files into non-optimised .tf files without metadata.

An exploded .tf feature file is a TF file with explicit node specifiers and no optimizations.

The format of each line is:

Node features:

node<tab>value

If the value is None for a certain node, there will be no such line.

Edge features without values:

node<tab>node

Edge features with values:

node<tab>node<tab>value

If the value is None, it will be left out, together with the preceding <tab>. This way, the empty string is distinguished from a None value.
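
The line formats above can be illustrated with a minimal sketch. The helper names `node_line` and `edge_line` are hypothetical and not part of Text-Fabric, and the sketch ignores any escaping of special characters that the real writer may apply.

```python
def node_line(node, value):
    """Return the exploded line for a node feature, or None if there is no value."""
    if value is None:
        return None  # no line is written for this node
    return f"{node}\t{value}"


def edge_line(node_from, node_to, value=None):
    """Return the exploded line for an edge, with or without a value."""
    if value is None:
        # the value is left out together with the preceding tab
        return f"{node_from}\t{node_to}"
    return f"{node_from}\t{node_to}\t{value}"


print(repr(edge_line(1, 2)))      # '1\t2'   (value is None)
print(repr(edge_line(1, 2, "")))  # '1\t2\t' (value is the empty string)
```

The trailing tab in the second line is what keeps an empty string apart from a missing value.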

Ambiguity

In the resulting data file, all metadata is gone. It is not always possible to infer from the data alone what data type a feature has:

1<tab>2 could be a node feature assigning integer 2 to node 1, or string 2 to node 1.

It could also be an edge feature assigning None to the node pair (1, 2).
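
To make the ambiguity concrete, the following sketch lists the readings that a single exploded line admits when no metadata is available. The function `readings` is a hypothetical helper, not part of Text-Fabric, and for simplicity it treats only non-negative digit strings as possible integers.

```python
def readings(line):
    """List the plausible readings of one exploded line (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    node = int(fields[0])
    found = []
    if len(fields) == 2:
        second = fields[1]
        # a node feature with a string value is always possible
        found.append(("node feature", node, second))
        if second.isdigit():
            # the same text could be an integer value or a target node
            found.append(("node feature", node, int(second)))
            found.append(("edge feature", (node, int(second)), None))
    elif len(fields) == 3:
        target, value = int(fields[1]), fields[2]
        found.append(("edge feature", (node, target), value))
        if value.isdigit():
            found.append(("edge feature", (node, target), int(value)))
    return found


for reading in readings("1\t2"):
    print(reading)
```

For the line `1<tab>2` this yields three readings: a string-valued node feature, an integer-valued node feature, and an edge without a value.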

Parameters

inPath : string
Source file(s). If pointing to a file, it should be a file containing TF feature data. If pointing to a directory, all .tf files in that directory will be exploded (non-recursively). The path may contain ~ which will be expanded to the user's home directory.
outPath : string
Destination of the exploded file(s). If pointing to a non-existing location, a file or directory will be created there, depending on whether inPath is a file or directory. If pointing to an existing directory, exploded file(s) will be put there.
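
The way source and destination are paired can be sketched with the standard library alone. This is an approximation of the rules described above, not the library's code: `plan_tasks` is a hypothetical name, and it uses `pathlib` rather than Text-Fabric's own file helpers. Like the original, it creates missing destination directories as a side effect.

```python
from pathlib import Path


def plan_tasks(in_path, out_path):
    """Return (source, destination) pairs, or an error message string."""
    src = Path(in_path).expanduser()
    dst = Path(out_path).expanduser()
    if not src.exists():
        return f"No such file or directory: `{in_path}`"
    if src.is_dir():
        # directory input: every .tf file directly inside it, non-recursively
        names = sorted(
            p.name for p in src.iterdir() if p.is_file() and p.name.endswith(".tf")
        )
        if not names:
            return f"No .tf files in `{in_path}`"
        if dst.exists() and not dst.is_dir():
            return f"Not a directory: `{out_path}`"
        dst.mkdir(parents=True, exist_ok=True)
        return [(src / name, dst / name) for name in names]
    if dst.is_dir():
        # file input, existing directory output: keep the file name
        return [(src, dst / src.name)]
    # file input, file output: make sure the parent directory exists
    dst.parent.mkdir(parents=True, exist_ok=True)
    return [(src, dst)]
```

Returning either a list or a message string mirrors the mixed return style of the source shown above; a stricter design would raise an exception instead.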

Returns

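boolean | string
Whether the operation was successful. If the input or output location is unusable, an error message (a string) is returned instead.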