Module tf.convert.mql

MQL

Text-Fabric can exchange data with MQL: it can read and write MQL dumps. An MQL dump is a text file, like an SQL dump. It contains the instructions to create and fill a complete database.

Correspondence TF and MQL

After exporting a TF dataset to MQL, the resulting MQL database has the following properties with respect to the TF dataset it comes from:

  • the TF slots correspond exactly with the MQL monads and have the same numbers, provided the monad numbers in the MQL dump are consecutive. MQL does not require this. If there are gaps in the monad sequence, we fill the holes during conversion, so that the slots are tightly consecutive;
  • the TF nodes correspond exactly with the MQL objects and have the same numbers.
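
The hole-filling step can be sketched as follows. This is a minimal illustration, not the module's own code path: the function name `slots_from_monads` and the sample monad numbers are made up for the example.

```python
def slots_from_monads(monads):
    """Map possibly gapped monad numbers onto consecutive slot numbers.

    Hypothetical helper for illustration: slots start at 1 and follow
    the sorted order of the monads.
    """
    ordered = sorted(set(monads))
    return {monad: slot for (slot, monad) in enumerate(ordered, start=1)}


# Monads 3 and 6 are missing in this made-up sequence
mapping = slots_from_monads([1, 2, 4, 5, 7])
print(mapping)
```

After the mapping, every slot number from 1 up to the number of monads is in use, which is what "tightly consecutive" means here.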

Node features in MQL

The values of TF features are of two types, int and str, and they translate to corresponding MQL types integer and string. The actual values do not undergo any transformation.

That means that in MQL queries you must quote the value if the feature is a string feature. Only for a number feature may you omit the quotes:

[word sp='verb']
[verse chapter=1 and verse=1]
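
As a small illustration of the quoting rule, the sketch below renders feature conditions for an MQL query. The helper `feature_condition` is hypothetical and not part of Text-Fabric; it assumes that string values contain no single quotes.

```python
def feature_condition(name, value):
    """Render `name=value` for an MQL query, quoting only string values.

    Hypothetical helper; assumes string values contain no single quotes.
    """
    if isinstance(value, int):
        # number feature: no quotes needed
        return f"{name}={value}"
    # string feature: quotes are required
    return f"{name}='{value}'"


conditions = " and ".join(
    feature_condition(name, value) for (name, value) in [("chapter", 1), ("verse", 1)]
)
print(f"[word {feature_condition('sp', 'verb')}]")
print(f"[verse {conditions}]")
```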

Enumeration types

It is attractive to use enumeration types for the values of a feature wherever possible, because then you can query those features in MQL with IN and without quotes:

[chapter book IN (Genesis, Exodus)]

We will generate enumerations for eligible features.

Integer values can already be queried like this, even if they are not part of an enumeration. So we restrict ourselves to node features with string values. We put the following extra restrictions:

  • the number of distinct values is at most 1000
  • all values must be legal C names, in practice: starting with a letter, followed by letters, digits, or _. The letters can only be plain ASCII letters, uppercase and lowercase.

Features that comply with these restrictions will get an enumeration type. Currently, we provide no ways to configure this in more detail.
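
The eligibility test can be sketched like this. It is an illustration under assumptions, not the module's implementation: `is_enum_candidate` and the regular expression are made up here, and the limit check mirrors the comparison `len(fValues) > ENUM_LIMIT` in the source.

```python
import re

ENUM_LIMIT = 1000  # same value as the module constant
# Assumed pattern for a legal name: a letter, then letters, digits or "_"
C_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_enum_candidate(values, limit=ENUM_LIMIT):
    """Say whether the values of a string feature qualify for an enumeration."""
    distinct = set(values)
    if len(distinct) > limit:
        return False
    # every distinct value must be a string that is a legal name
    return all(
        isinstance(value, str) and C_NAME.fullmatch(value) is not None
        for value in distinct
    )


print(is_enum_candidate(["Genesis", "Exodus"]))
print(is_enum_candidate(["1_Samuel"]))
```

In this sketch a value such as `1_Samuel` disqualifies the whole feature, because it does not start with a letter.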

Instead of creating separate enumeration types for individual features, we collect all enumerated values for all those features into one big enumeration type.

The reason is that MQL considers equal values in different types as distinct values. If we had separate types, we could never compare values for different features.
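
A rough sketch of how the values of several features end up in one enumeration. The function `all_in_one_enum` and the sample features `sp` and `pdp` with their values are illustrative assumptions; the statement layout follows the `CREATE ENUMERATION` text that the module writes.

```python
from itertools import chain


def all_in_one_enum(enums, name="all_enum"):
    """Merge the value lists of all enumerated features into one statement."""
    # a value shared by several features appears only once
    values = sorted(set(chain.from_iterable(enums.values())))
    body = ",\n    ".join(f"{value} = {i}" for (i, value) in enumerate(values))
    return f"CREATE ENUMERATION {name} = {{\n    {body}\n}}\nGO\n"


statement = all_in_one_enum({"sp": ["verb", "subs"], "pdp": ["verb", "nmpr"]})
print(statement)
```

Because `verb` gets a single number in the shared type, a condition on `sp` and a condition on `pdp` refer to the same enumeration constant.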

There is no place for edge values in MQL. There is only one concept of feature in MQL: object features, which are node features. But TF edges without values can be seen as node features: nodes are mapped onto sets of nodes to which the edges go. And that notion is supported by MQL: edge features are translated into MQL features of type LIST OF id_d, i.e. lists of object identifiers.
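
To make the edge translation concrete, here is a minimal sketch that renders the targets of one node as a parenthesized list of identifiers, the form used for `LIST OF id_d` values. The name `edge_as_id_list` is made up, and sorting the targets is an extra choice in this sketch to get a stable result.

```python
def edge_as_id_list(targets):
    """Render the target nodes of an edge as an MQL list of object ids."""
    return "({})".format(",".join(str(node) for node in sorted(targets)))


# A made-up node with edges to the nodes 30, 10 and 20
print(edge_as_id_list({30, 10, 20}))
```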

Legal names in MQL

MQL names for databases, object types and features must be valid C identifiers (yes, the computer language C).

The requirements for names are:

  • start with a letter (ASCII, upper-case or lower-case)
  • continue with any sequence of ASCII upper/lower-case letters, digits, or underscores (_)
  • avoid being a reserved word in the C language

So, we have to change names coming from TF if they are invalid in MQL. We do that by replacing illegal characters by _, and, if the result does not start with a letter, we prepend an x. We do not check whether the name is a reserved C word.
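
The renaming rule can be sketched as below. This is not the `cleanName` helper that the module imports; `clean_name_sketch` is a hypothetical stand-in that only follows the description above, and like that description it does not check for reserved C words.

```python
import re

ILLEGAL = re.compile(r"[^A-Za-z0-9_]")  # anything that is not allowed in a name


def clean_name_sketch(name):
    """Replace illegal characters by "_" and make the name start with a letter."""
    cleaned = ILLEGAL.sub("_", name)
    if re.match(r"[A-Za-z]", cleaned) is None:
        cleaned = "x" + cleaned
    return cleaned


for name in ("g_word-utf8", "1st", "verse"):
    print(name, "=>", clean_name_sketch(name))
```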

With these provisos:

  • the given dbName corresponds to the MQL database name
  • the TF otypes correspond to the MQL object types
  • the TF features correspond to the MQL features

The MQL export is usually quite massive (500 MB for the Hebrew Bible). It can be compressed greatly, especially by the program bzip2.
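
If you prefer to compress from Python rather than with the command-line program, the standard library module `bz2` produces the same format. A minimal sketch, in which the function name `compress_mql` and the choice to write the result next to the original are assumptions:

```python
import bz2
import shutil


def compress_mql(mql_path):
    """Write a bzip2-compressed copy of an MQL dump and return its path."""
    bz_path = f"{mql_path}.bz2"
    with open(mql_path, "rb") as source, bz2.open(bz_path, "wb") as target:
        # copy in chunks, so a large dump is never held in memory as a whole
        shutil.copyfileobj(source, target)
    return bz_path
```

The original file is left in place; delete it yourself once you have checked the compressed copy.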

Existing database

If you try to import an MQL file in Emdros and a file or directory with the same name as the MQL database already exists, the import will fail spectacularly. So do not do that.

A good way to prevent clashes:

  • export the MQL to outside your text-fabric-data directory, e.g. to ~/Downloads;
  • before importing the MQL file, delete the previous copy.

Delete the existing copy, then import:

cd ~/Downloads
rm dataset ; mql -b 3 < dataset.mql
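
The cleanup can also be done from Python before the `mql` command is run. A minimal sketch, assuming you pass in the path of the database yourself; `remove_previous_copy` is a made-up helper name, and the import itself is still done with the `mql` command shown above.

```python
import os
import shutil


def remove_previous_copy(db_path):
    """Remove an existing file or directory at db_path.

    Returns True if something was removed, False if nothing was there.
    """
    if os.path.isdir(db_path) and not os.path.islink(db_path):
        shutil.rmtree(db_path)
        return True
    if os.path.lexists(db_path):
        os.remove(db_path)
        return True
    return False
```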
"""
# MQL

Text-Fabric can exchange data with [MQL](https://emdros.org):
it can read and write MQL dumps.
An MQL dump is a text file, like an SQL dump.
It contains the instructions to create and fill a complete database.

## Correspondence TF and MQL

After exporting a TF dataset to MQL, the resulting MQL database has the
following properties with respect to the TF dataset it comes from:

*   the TF *slots* correspond exactly with the MQL *monads* and have the same
    numbers, provided the monad numbers in the MQL dump are consecutive. MQL
    does not require this. If there are gaps in the monad sequence, we
    fill the holes during conversion, so that the slots are tightly consecutive;
*   the TF *nodes* correspond exactly with the MQL *objects* and have the same
    numbers.

## Node features in MQL

The values of TF features are of two types, `int` and `str`, and they translate
to corresponding MQL types `integer` and `string`. The actual values do not
undergo any transformation.

That means that in MQL queries you must quote the value if the feature is a
string feature. Only for a number feature may you omit the quotes:

```
[word sp='verb']
[verse chapter=1 and verse=1]
```

## Enumeration types

It is attractive to use enumeration types for the values of a feature wherever
possible, because then you can query those features in MQL with `IN` and without
quotes:

```
[chapter book IN (Genesis, Exodus)]
```

We will generate enumerations for eligible features.

Integer values can already be queried like this, even if they are not part of an
enumeration. So we restrict ourselves to node features with string values. We
put the following extra restrictions:

*   the number of distinct values is at most 1000
*   all values must be legal C names, in practice: starting with a letter,
    followed by letters, digits, or `_`. The letters can only be plain ASCII
    letters, uppercase and lowercase.

Features that comply with these restrictions will get an enumeration type.
Currently, we provide no ways to configure this in more detail.

Instead of creating separate enumeration types for individual features,
we collect all enumerated values for all those features into one
big enumeration type.

The reason is that MQL considers equal values in different types as
distinct values. If we had separate types, we could never compare
values for different features.

There is no place for edge values in
MQL. There is only one concept of feature in MQL: object features,
which are node features.
But TF edges without values can be seen as node features: nodes are
mapped onto sets of nodes to which the edges go. And that notion is supported by
MQL:
edge features are translated into MQL features of type `LIST OF id_d`,
i.e. lists of object identifiers.

!!! caution "Legal names in MQL"
    MQL names for databases, object types and features must be valid C identifiers
    (yes, the computer language C).

The requirements for names are:

*   start with a letter (ASCII, upper-case or lower-case)
*   continue with any sequence of ASCII upper/lower-case letters, digits, or
    underscores (`_`)
*   avoid being a reserved word in the C language

So, we have to change names coming from TF if they are invalid in MQL. We do
that by replacing illegal characters by `_`, and, if the result does not start
with a letter, we prepend an `x`. We do not check whether the name is a reserved
C word.

With these provisos:

*   the given `dbName` corresponds to the MQL *database name*
*   the TF *otypes* correspond to the MQL *object types*
*   the TF *features* correspond to the MQL *features*

The MQL export is usually quite massive (500 MB for the Hebrew Bible).
It can be compressed greatly, especially by the program `bzip2`.

!!! caution "Existing database"
    If you try to import an MQL file in Emdros and a file or directory with the
    same name as the MQL database already exists, the import will fail
    spectacularly. So do not do that.

A good way to prevent clashes:

*   export the MQL to outside your `text-fabric-data` directory, e.g. to
    `~/Downloads`;
*   before importing the MQL file, delete the previous copy.

Delete the existing copy, then import:

```sh
cd ~/Downloads
rm dataset ; mql -b 3 < dataset.mql
```
"""

import os
import re
from itertools import chain
from ..parameters import WARP, OTYPE, OSLOTS
from ..core.helpers import (
    cleanName,
    isClean,
    specFromRanges,
    rangesFromList,
    setFromSpec,
    nbytes,
    console,
    expanduser,
)
from ..core.timestamp import SILENT_D, silentConvert

# If a feature of type string has at most ENUM_LIMIT distinct values,
# an enumeration type will be created for it,
# provided all values of that feature are valid names for MQL.

ENUM_LIMIT = 1000

ONE_ENUM_TYPE = True


class MQL(object):
    def __init__(self, mqlDir, mqlName, tfFeatures, tmObj, silent=SILENT_D):
        self.silent = silentConvert(silent)
        tmObj.setSilent(silent)
        error = tmObj.error

        mqlDir = expanduser(mqlDir)
        self.mqlDir = mqlDir
        cleanDb = cleanName(mqlName)
        if cleanDb != mqlName:
            error(f'db name "{mqlName}" => "{cleanDb}"')
        self.mqlName = cleanDb
        self.tfFeatures = tfFeatures
        self.tmObj = tmObj
        self.enums = {}
        self._check()

    def write(self):
        silent = self.silent
        tmObj = self.tmObj
        error = tmObj.error
        info = tmObj.info
        indent = tmObj.indent

        if not self.good:
            return
        if not os.path.exists(self.mqlDir):
            try:
                os.makedirs(self.mqlDir, exist_ok=True)
            except Exception:
                error(f'Cannot create directory "{self.mqlDir}"')
                self.good = False
                return
        mqlPath = f"{self.mqlDir}/{self.mqlName}.mql"
        try:
            fm = open(mqlPath, "w", encoding="utf8")
        except Exception:
            error(f"Could not write to {mqlPath}")
            self.good = False
            return

        info(f"Loading {len(self.featureList)} features")
        for ft in self.featureList:
            fObj = self.features[ft]
            fObj.load(silent=silent)

        self.fm = fm
        self._writeStartDb()
        self._writeEnums()
        self._writeTypes()
        self._writeDataAll()
        self._writeEndDb()
        indent(level=0)
        info("Done")

    def _check(self):
        silent = self.silent
        tmObj = self.tmObj
        error = tmObj.error
        info = tmObj.info
        indent = tmObj.indent

        info(f"Checking features of dataset {self.mqlName}")

        self.features = {}
        self.featureList = []
        indent(level=1)
        for (f, fo) in sorted(self.tfFeatures.items()):
            if fo.method is not None or f in WARP:
                continue
            fo.load(metaOnly=True, silent=silent)
            if fo.isConfig:
                continue
            cleanF = cleanName(f)
            if cleanF != f:
                error(f'feature "{f}" => "{cleanF}"')
            self.featureList.append(cleanF)
            self.features[cleanF] = fo
        good = True
        for feat in (OTYPE, OSLOTS, "__levels__"):
            if feat not in self.tfFeatures:
                error(
                    "{} feature {} is missing from data set".format(
                        "Warp"
                        if feat in WARP
                        else "Computed"
                        if feat.startswith("__")
                        else "Data",
                        feat,
                    )
                )
                good = False
            else:
                fObj = self.tfFeatures[feat]
                if not fObj.load(silent=silent):
                    good = False
        indent(level=0)
        if not good:
            error("Export to MQL aborted")
        else:
            info(f"{len(self.featureList)} features to export to MQL ...")
        self.good = good

    def _writeStartDb(self):
        self.fm.write(
            """
CREATE DATABASE '{name}'
GO
USE DATABASE '{name}'
GO
""".format(
                name=self.mqlName
            )
        )

    def _writeEndDb(self):
        self.fm.write(
            """
VACUUM DATABASE ANALYZE
GO
"""
        )
        self.fm.close()

    def _writeEnums(self):
        tmObj = self.tmObj
        info = tmObj.info
        indent = tmObj.indent

        indent(level=0)
        info("Writing enumerations")
        indent(level=1)
        for ft in self.featureList:
            fObj = self.features[ft]
            if fObj.isEdge or fObj.dataType == "int":
                continue
            fMap = fObj.data
            fValues = sorted(set(fMap.values()))
            if len(fValues) > ENUM_LIMIT:
                continue
            eligible = all(isClean(fVal) for fVal in fValues)
            if not eligible:
                unclean = [fVal for fVal in fValues if not isClean(fVal)]
                console(
                    "\t{:<15}: {:>4} values, {} not a name, e.g. «{}»".format(
                        ft,
                        len(fValues),
                        len(unclean),
                        unclean[0],
                    )
                )
                continue
            self.enums[ft] = fValues

        if ONE_ENUM_TYPE:
            self._writeEnumsAsOne()
        else:
            for ft in sorted(self.enums):
                self._writeEnum(ft)
            indent(level=0)
            info(f"Written {len(self.enums)} enumerations")

    def _writeEnumsAsOne(self):
        tmObj = self.tmObj
        info = tmObj.info

        fValues = sorted(
            set(chain.from_iterable((set(fV) for fV in self.enums.values())))
        )
        if len(fValues):
            info(f"Writing an all-in-one enum with {len(fValues):>4} values")
            fValuesEnumerated = ",\n\t".join(
                "{} = {}".format(fVal, i) for (i, fVal) in enumerate(fValues)
            )
            self.fm.write(
                f"""
CREATE ENUMERATION all_enum = {{
    {fValuesEnumerated}
}}
GO
"""
            )

    def _writeEnum(self, ft):
        tmObj = self.tmObj
        info = tmObj.info

        fValues = self.enums[ft]
        if len(fValues):
            info(f"enum {ft:<15} with {len(fValues):>4} values")
            fValuesEnumerated = ",\n\t".join(
                f"{fVal} = {i}" for (i, fVal) in enumerate(fValues)
            )
            self.fm.write(
                f"""
CREATE ENUMERATION {ft}_enum = {{
    {fValuesEnumerated}
}}
GO
"""
            )

    def _writeTypes(self):
        def valInt(n):
            return str(n)

        def valStr(s):
            if "'" in s:
                return '"{}"'.format(s.replace('"', '\\"'))
            else:
                return "'{}'".format(s)

        def valIds(ids):
            return "({})".format(",".join(str(i) for i in ids))

        tmObj = self.tmObj
        error = tmObj.error
        info = tmObj.info
        indent = tmObj.indent

        self.levels = self.tfFeatures["__levels__"].data[::-1]
        indent(level=0)
        info(
            "Mapping {} features onto {} object types".format(
                len(self.featureList),
                len(self.levels),
            )
        )
        otypeSupport = {}
        for (otype, av, start, end) in self.levels:
            cleanOtype = cleanName(otype)
            if cleanOtype != otype:
                error(f'otype "{otype}" => "{cleanOtype}"')
            otypeSupport[cleanOtype] = set(range(start, end + 1))

        self.otypes = {}
        self.featureTypes = {}
        self.featureMethods = {}
        for ft in self.featureList:
            fObj = self.features[ft]
            if fObj.isEdge:
                dataType = "LIST OF id_d"
                method = valIds
            else:
                if fObj.dataType == "str":
                    dataType = 'string DEFAULT ""'
                    method = valInt if ft in self.enums else valStr
                elif fObj.dataType == "int":
                    dataType = "integer DEFAULT 0"
                    method = valInt
                else:
                    dataType = 'string DEFAULT ""'
                    method = valStr
            self.featureTypes[ft] = dataType
            self.featureMethods[ft] = method

            support = set(fObj.data.keys())
            for otype in otypeSupport:
                if len(support & otypeSupport[otype]):
                    self.otypes.setdefault(otype, []).append(ft)

        for otype in (cleanName(x[0]) for x in self.levels):
            self._writeType(otype)

    def _writeType(self, otype):
        self.fm.write(
            f"""
CREATE OBJECT TYPE
[{otype}
"""
        )
        for ft in self.otypes[otype]:
            fType = (
                "{}_enum".format("all" if ONE_ENUM_TYPE else ft)
                if ft in self.enums
                else self.featureTypes[ft]
            )
            self.fm.write(f"  {ft}:{fType};\n")
        self.fm.write(
            """
]
GO
"""
        )

    def _writeDataAll(self):
        tmObj = self.tmObj
        info = tmObj.info

        info(
            "Writing {} features as data in {} object types".format(
                len(self.featureList),
                len(self.levels),
            )
        )
        oslotsData = self.tfFeatures[OSLOTS].data
        self.oslots = oslotsData[0]
        self.maxSlot = oslotsData[1]
        for (otype, av, start, end) in self.levels:
            self._writeData(otype, start, end)

    def _writeData(self, otype, start, end):
        tmObj = self.tmObj
        info = tmObj.info
        indent = tmObj.indent

        fm = self.fm

        indent(level=1, reset=True)
        info(f"{otype} data ...")
        oslots = self.oslots
        maxSlot = self.maxSlot
        oFeats = self.otypes[otype]
        features = self.features
        featureMethods = self.featureMethods
        fm.write(
            """
DROP INDEXES ON OBJECT TYPE[{o}]
GO
CREATE OBJECTS
WITH OBJECT TYPE[{o}]
""".format(
                o=otype
            )
        )
        curSize = 0
        LIMIT = 50000
        t = 0
        j = 0
        indent(level=2, reset=True)
        for n in range(start, end + 1):
            oMql = """
CREATE OBJECT
FROM MONADS= {{ {m} }}
WITH ID_D={i} [
""".format(
                m=n
                if n <= maxSlot
                else specFromRanges(rangesFromList(oslots[n - maxSlot - 1])),
                i=n,
            )
            for ft in oFeats:
                method = featureMethods[ft]
                fMap = features[ft].data
                if n in fMap:
                    oMql += f"{ft}:={method(fMap[n])};\n"
            oMql += """
]
"""
            fm.write(oMql)
            curSize += len(bytes(oMql, encoding="utf8"))
            t += 1
            j += 1
            if j == LIMIT:
                fm.write(
                    """
GO
CREATE OBJECTS
WITH OBJECT TYPE[{o}]
""".format(
                        o=otype
                    )
                )
                info(
                    f"batch of size {nbytes(curSize):>20} with {j:>7} of {t:>7} {otype}s"
                )
                j = 0
                curSize = 0

        info(f"batch of size {nbytes(curSize):>20} with {j:>7} of {t:>7} {otype}s")
        fm.write(
            """
GO
CREATE INDEXES ON OBJECT TYPE[{o}]
GO
""".format(
                o=otype
            )
        )

        indent(level=1)
        info("{} data: {} objects".format(otype, t))


# MQL IMPORT

uniscan = re.compile(r"(?:\\x..)+")


def makeuni(match):
    """Make proper unicode of a text that contains byte escape codes
    such as backslash xb6
    """
    byts = eval('"' + match.group(0) + '"')
    return byts.encode("latin1").decode("utf-8")


def uni(line):
    return uniscan.sub(makeuni, line)


def tfFromMql(mqlFile, tmObj, slotType=None, otext=None, meta=None):
    """Generate TF from MQL

    Parameters
    ----------
    tmObj: object
        A `tf.core.timestamp.Timestamp` object
    mqlFile, slotType, otext, meta: various
        See `tf.core.fabric.Fabric.importMQL`
    """
    mqlFile = expanduser(mqlFile)
    error = tmObj.error

    if slotType is None:
        error("ERROR: no slotType specified")
        return (False, {}, {}, {})
    # parseMql returns nodeF before edgeF, and tfFromData expects that order
    (good, objectTypes, tables, nodeF, edgeF) = parseMql(mqlFile, tmObj)
    if not good:
        return (False, {}, {}, {})
    return tfFromData(tmObj, objectTypes, tables, nodeF, edgeF, slotType, otext, meta)


def parseMql(mqlFile, tmObj):
    info = tmObj.info
    error = tmObj.error

    info("Parsing mql source ...")
    fh = open(mqlFile, encoding="utf8")

    objectTypes = dict()
    tables = dict()

    edgeF = dict()
    nodeF = dict()

    curId = None
    curEnum = None
    curObjectType = None
    curTable = None
    curObject = None
    curValue = None
    curFeature = None
    seeObjects = False

    inObjectTypeFeatures = False

    STRING_TYPES = {"ascii", "string"}

    enums = dict()

    chunkSize = 1000000
    inThisChunk = 0

    good = True

    for (ln, line) in enumerate(fh):
        inThisChunk += 1
        if inThisChunk == chunkSize:
            info(f"\tline {ln + 1:>9}")
            inThisChunk = 0
        if line.startswith("CREATE OBJECTS WITH OBJECT TYPE") or line.startswith(
            "WITH OBJECT TYPE"
        ):
            comps = line.rstrip().rstrip("]").split("[", 1)
            curTable = comps[1]
            info(f"\t\tobjects in {curTable}")
            curObject = None
            if curTable not in tables:
                tables[curTable] = dict()
            seeObjects = True
        elif line == "CREATE OBJECT\n":
            curObject = None
            curObject = dict(feats=dict(), monads=None)
            curId = None
            seeObjects = True
        elif curEnum is not None:
            if line.startswith("}"):
                curEnum = None
                continue
            comps = line.strip().rstrip(",").split("=", 1)
            comp = comps[0].strip()
            words = comp.split()
            if words[0] == "DEFAULT":
                enums[curEnum]["default"] = uni(words[1])
                value = words[1]
            else:
                value = words[0]
            enums[curEnum]["values"].append(value)
        elif curObjectType is not None:
            if line.startswith("]"):
                curObjectType = None
                inObjectTypeFeatures = False
                continue
            if curObjectType is True:
                if line.startswith("["):
                    curObjectType = line.rstrip()[1:]
                    objectTypes[curObjectType] = dict()
                    info(f"\t\totype {curObjectType}")
                    inObjectTypeFeatures = True
                    continue
            if inObjectTypeFeatures:
                comps = line.strip().rstrip(";").split(":", 1)
                feature = comps[0].strip()
                fInfo = comps[1].strip()
                fCleanInfo = fInfo.replace("FROM SET", "")
                fInfoComps = fCleanInfo.split(" ", 1)
                fMQLType = fInfoComps[0]
                if len(fInfoComps) == 2:
                    fDefaultComps = fInfoComps[1].strip().split(" ", 1)
                    fDefault = fDefaultComps[1] if len(fDefaultComps) > 1 else None
                else:
                    fDefault = None
                if fDefault is not None and fMQLType in STRING_TYPES:
                    fDefault = uni(fDefault[1:-1])
                default = enums.get(fMQLType, {}).get("default", fDefault)
                ftype = (
                    "str"
                    if fMQLType in enums
                    else "int"
                    if fMQLType == "integer"
                    else "str"
                    if fMQLType in STRING_TYPES
                    else "int"
                    if fInfo == "id_d"
                    else "str"
                )
                isEdge = fMQLType == "id_d"
                if isEdge:
                    edgeF.setdefault(curObjectType, set()).add(feature)
                else:
                    nodeF.setdefault(curObjectType, set()).add(feature)

                objectTypes[curObjectType][feature] = (ftype, default)
                info(
                    "\t\t\tfeature {} ({}) =def= {} : {}".format(
                        feature, ftype, default, "edge" if isEdge else "node"
                    )
                )
        elif seeObjects:
            if curObject is not None:
                if line.startswith("]"):
                    objectType = objectTypes[curTable]
                    for (feature, (ftype, default)) in objectType.items():
                        if feature not in curObject["feats"] and default is not None:
                            curObject["feats"][feature] = default
                    tables[curTable][curId] = curObject
                    curObject = None
                    continue
                elif line.startswith("["):
                    name = line.rstrip()[1:]
                    if len(name):
                        curTable = name
                        if curTable not in tables:
                            tables[curTable] = dict()
                elif line.startswith("FROM MONADS"):
                    monads = (
                        line.split("=", 1)[1]
                        .replace("{", "")
                        .replace("}", "")
                        .replace(" ", "")
                        .strip()
                    )
                    curObject["monads"] = setFromSpec(monads)
                elif line.startswith("WITH ID_D"):
                    comps = line.replace("[", "").rstrip().split("=", 1)
                    curId = int(comps[1])
                elif line.startswith("GO"):
                    pass
                elif line.strip() == "":
                    pass
                else:
                    if curValue is not None:
                        toBeContinued = not line.rstrip().endswith('";')
                        if toBeContinued:
                            curValue += line
                        else:
                            curValue += line.rstrip().rstrip(";").rstrip('"')
                            curObject["feats"][curFeature] = uni(curValue)
                            curValue = None
                            curFeature = None
                        continue
                    if ":=" in line:
                        (featurePart, valuePart) = line.split("=", 1)
                        feature = featurePart[0:-1].strip()
                        valuePart = valuePart.lstrip()
                        isText = ':="' in line
                        toBeContinued = isText and not line.rstrip().endswith('";')
                        if toBeContinued:
                            # this happens if a feature value
                            # contains a new line
                            # we must continue scanning lines
                            # until we meet the end of the value
                            curFeature = feature
                            curValue = valuePart.lstrip('"')
                        else:
                            value = valuePart.rstrip().rstrip(";").strip('"')
                            curObject["feats"][feature] = (
                                uni(value) if isText else value
                            )
                    else:
                        error(f"ERROR: line {ln + 1}: unrecognized line -->{line}<--")
                        good = False
                        break
            else:
                if line.startswith("CREATE OBJECT"):
                    curObject = dict(feats=dict(), monads=None)
                    curId = None
        else:
            if line.startswith("CREATE ENUMERATION"):
                words = line.split()
                curEnum = words[2]
                enums[curEnum] = dict(default=None, values=[])
                info(f"\t\tenum {curEnum}")
            elif line.startswith("CREATE OBJECT TYPE"):
                curObjectType = True
                inObjectTypeFeatures = False
    info(f"{ln + 1} lines parsed")
    fh.close()
    for table in tables:
        info(f"{len(tables[table])} objects of type {table}")

    if len(tables) == 0:
        info("No objects found")
    return (good, objectTypes, tables, nodeF, edgeF)


def tfFromData(tmObj, objectTypes, tables, nodeF, edgeF, slotType, otext, meta):
    info = tmObj.info

    info("Making TF data ...")

    NIL = {"nil", "NIL", "Nil"}

    tableOrder = [slotType] + [t for t in sorted(tables) if t != slotType]

    iddFromMonad = dict()
    slotFromMonad = dict()

    nodeFromIdd = dict()
    iddFromNode = dict()

    nodeFeatures = dict()
    edgeFeatures = dict()
    metaData = dict()

    # metadata that ends up in every feature
    metaData[""] = meta.get("", {})
    distinctFeatures = chain(chain.from_iterable(nodeF.values()), chain.from_iterable(edgeF.values()))
    for f in distinctFeatures:
        metaInfo = meta.get(f, None)
        if metaInfo is not None:
            metaData[f] = metaInfo

    # the config feature otext
    metaData["otext"] = otext

    good = True

    info("Monad - idd mapping ...")
    for idd in tables.get(slotType, {}):
        monad = list(tables[slotType][idd]["monads"])[0]
        iddFromMonad[monad] = idd

    info("Removing holes in the monad sequence")
    # we set up a monad - slot mapping
    curSlot = 0
    otype = dict()
    for monad in sorted(iddFromMonad):
        curSlot += 1
        slotFromMonad[monad] = curSlot
        idd = iddFromMonad[monad]
        nodeFromIdd[idd] = curSlot
        iddFromNode[curSlot] = idd
        otype[curSlot] = slotType

    maxSlot = curSlot
    info(f"maxSlot={maxSlot}")

    info("Node mapping and otype ...")
    node = maxSlot
    for t in tableOrder[1:]:
        for idd in sorted(tables[t]):
            node += 1
            nodeFromIdd[idd] = node
            iddFromNode[node] = idd
            otype[node] = t

    nodeFeatures["otype"] = otype
    metaData["otype"] = dict(valueType="str")

    info("oslots ...")
    oslots = dict()
    for t in tableOrder[1:]:
        for idd in tables.get(t, {}):
            node = nodeFromIdd[idd]
            monads = tables[t][idd]["monads"]
            oslots[node] = {slotFromMonad[m] for m in monads}
    edgeFeatures["oslots"] = oslots
    metaData["oslots"] = dict(valueType="str")

    info("metadata ...")
    for t in nodeF:
        for f in nodeF[t]:
            ftype = objectTypes[t][f][0]
            metaData.setdefault(f, {})["valueType"] = ftype
    for t in edgeF:
        for f in edgeF[t]:
            metaData.setdefault(f, {})["valueType"] = "str"

    info("features ...")
    chunkSize = 100000
    for t in tableOrder:
        info(f"\tfeatures from {t}s")
        inThisChunk = 0
        thisTable = tables.get(t, {})
        for (i, idd) in enumerate(thisTable):
            inThisChunk += 1
            if inThisChunk == chunkSize:
                info(f"\t{i + 1:>9} {t}s")
                inThisChunk = 0
            node = nodeFromIdd[idd]
            features = tables[t][idd]["feats"]
            for (f, v) in features.items():
                isEdge = f in edgeF.get(t, set())
                if isEdge:
                    if v not in NIL:
                        edgeFeatures.setdefault(f, {}).setdefault(node, set()).add(
                            nodeFromIdd[int(v)]
                        )
                else:
                    nodeFeatures.setdefault(f, {})[node] = v
        info(f"\t{len(thisTable):>9} {t}s")

    return (good, nodeFeatures, edgeFeatures, metaData)

Functions

def makeuni(match)

Convert a text that contains byte escape codes, such as backslash xb6, into proper Unicode.

def makeuni(match):
    """Make proper unicode of a text that contains byte escape codes
    such as backslash xb6
    """
    byts = eval('"' + match.group(0) + '"')
    return byts.encode("latin1").decode("utf-8")
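
The escape handling above can be tried out in isolation. The following is a minimal sketch, not the module's own code: it avoids `eval` by decoding the hex digits directly, and the pattern `ESCAPES` and the names `decode_escapes` and `to_unicode` are assumptions made for illustration (the module's `uniscan` pattern is not shown in this section).

```python
import re

# Assumption: runs of \xNN escapes are what needs decoding; the module's
# own `uniscan` pattern may differ.
ESCAPES = re.compile(r"(?:\\x[0-9a-fA-F]{2})+")


def decode_escapes(match):
    """Turn a run such as \\xc3\\xa9 into the character it encodes in UTF-8."""
    hex_digits = match.group(0).replace("\\x", "")
    return bytes.fromhex(hex_digits).decode("utf-8")


def to_unicode(text):
    """Replace every run of byte escapes in text by its UTF-8 decoding."""
    return ESCAPES.sub(decode_escapes, text)


sample = r"caf\xc3\xa9"  # raw string: the backslashes are literal characters
result = to_unicode(sample)
```

Here the two escaped bytes `c3 a9` together form one UTF-8 character, which is why whole runs of escapes are matched and decoded at once rather than one escape at a time.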
def parseMql(mqlFile, tmObj)
def parseMql(mqlFile, tmObj):
    info = tmObj.info
    error = tmObj.error

    info("Parsing mql source ...")
    fh = open(mqlFile, encoding="utf8")

    objectTypes = dict()
    tables = dict()

    edgeF = dict()
    nodeF = dict()

    curId = None
    curEnum = None
    curObjectType = None
    curTable = None
    curObject = None
    curValue = None
    curFeature = None
    seeObjects = False

    inObjectTypeFeatures = False

    STRING_TYPES = {"ascii", "string"}

    enums = dict()

    chunkSize = 1000000
    inThisChunk = 0

    good = True

    for (ln, line) in enumerate(fh):
        inThisChunk += 1
        if inThisChunk == chunkSize:
            info(f"\tline {ln + 1:>9}")
            inThisChunk = 0
        if line.startswith("CREATE OBJECTS WITH OBJECT TYPE") or line.startswith(
            "WITH OBJECT TYPE"
        ):
            comps = line.rstrip().rstrip("]").split("[", 1)
            curTable = comps[1]
            info(f"\t\tobjects in {curTable}")
            curObject = None
            if curTable not in tables:
                tables[curTable] = dict()
            seeObjects = True
        elif line == "CREATE OBJECT\n":
            curObject = dict(feats=dict(), monads=None)
            curId = None
            seeObjects = True
        elif curEnum is not None:
            if line.startswith("}"):
                curEnum = None
                continue
            comps = line.strip().rstrip(",").split("=", 1)
            comp = comps[0].strip()
            words = comp.split()
            if words[0] == "DEFAULT":
                enums[curEnum]["default"] = uni(words[1])
                value = words[1]
            else:
                value = words[0]
            enums[curEnum]["values"].append(value)
        elif curObjectType is not None:
            if line.startswith("]"):
                curObjectType = None
                inObjectTypeFeatures = False
                continue
            if curObjectType is True:
                if line.startswith("["):
                    curObjectType = line.rstrip()[1:]
                    objectTypes[curObjectType] = dict()
                    info(f"\t\totype {curObjectType}")
                    inObjectTypeFeatures = True
                    continue
            if inObjectTypeFeatures:
                comps = line.strip().rstrip(";").split(":", 1)
                feature = comps[0].strip()
                fInfo = comps[1].strip()
                fCleanInfo = fInfo.replace("FROM SET", "")
                fInfoComps = fCleanInfo.split(" ", 1)
                fMQLType = fInfoComps[0]
                if len(fInfoComps) == 2:
                    fDefaultComps = fInfoComps[1].strip().split(" ", 1)
                    fDefault = fDefaultComps[1] if len(fDefaultComps) > 1 else None
                else:
                    fDefault = None
                if fDefault is not None and fMQLType in STRING_TYPES:
                    fDefault = uni(fDefault[1:-1])
                default = enums.get(fMQLType, {}).get("default", fDefault)
                ftype = (
                    "str"
                    if fMQLType in enums
                    else "int"
                    if fMQLType == "integer"
                    else "str"
                    if fMQLType in STRING_TYPES
                    else "int"
                    if fInfo == "id_d"
                    else "str"
                )
                isEdge = fMQLType == "id_d"
                if isEdge:
                    edgeF.setdefault(curObjectType, set()).add(feature)
                else:
                    nodeF.setdefault(curObjectType, set()).add(feature)

                objectTypes[curObjectType][feature] = (ftype, default)
                info(
                    "\t\t\tfeature {} ({}) =def= {} : {}".format(
                        feature, ftype, default, "edge" if isEdge else "node"
                    )
                )
        elif seeObjects:
            if curObject is not None:
                if line.startswith("]"):
                    objectType = objectTypes[curTable]
                    for (feature, (ftype, default)) in objectType.items():
                        if feature not in curObject["feats"] and default is not None:
                            curObject["feats"][feature] = default
                    tables[curTable][curId] = curObject
                    curObject = None
                    continue
                elif line.startswith("["):
                    name = line.rstrip()[1:]
                    if len(name):
                        curTable = name
                        if curTable not in tables:
                            tables[curTable] = dict()
                elif line.startswith("FROM MONADS"):
                    monads = (
                        line.split("=", 1)[1]
                        .replace("{", "")
                        .replace("}", "")
                        .replace(" ", "")
                        .strip()
                    )
                    curObject["monads"] = setFromSpec(monads)
                elif line.startswith("WITH ID_D"):
                    comps = line.replace("[", "").rstrip().split("=", 1)
                    curId = int(comps[1])
                elif line.startswith("GO"):
                    pass
                elif line.strip() == "":
                    pass
                else:
                    if curValue is not None:
                        toBeContinued = not line.rstrip().endswith('";')
                        if toBeContinued:
                            curValue += line
                        else:
                            curValue += line.rstrip().rstrip(";").rstrip('"')
                            curObject["feats"][curFeature] = uni(curValue)
                            curValue = None
                            curFeature = None
                        continue
                    if ":=" in line:
                        (featurePart, valuePart) = line.split("=", 1)
                        feature = featurePart[0:-1].strip()
                        valuePart = valuePart.lstrip()
                        isText = ':="' in line
                        toBeContinued = isText and not line.rstrip().endswith('";')
                        if toBeContinued:
                            # this happens if a feature value
                            # contains a new line
                            # we must continue scanning lines
                            # until we meet the end of the value
                            curFeature = feature
                            curValue = valuePart.lstrip('"')
                        else:
                            value = valuePart.rstrip().rstrip(";").strip('"')
                            curObject["feats"][feature] = (
                                uni(value) if isText else value
                            )
                    else:
                        error(f"ERROR: line {ln + 1}: unrecognized line -->{line}<--")
                        good = False
                        break
            else:
                if line.startswith("CREATE OBJECT"):
                    curObject = dict(feats=dict(), monads=None)
                    curId = None
        else:
            if line.startswith("CREATE ENUMERATION"):
                words = line.split()
                curEnum = words[2]
                enums[curEnum] = dict(default=None, values=[])
                info(f"\t\tenum {curEnum}")
            elif line.startswith("CREATE OBJECT TYPE"):
                curObjectType = True
                inObjectTypeFeatures = False
    info(f"{ln + 1} lines parsed")
    fh.close()
    for table in tables:
        info(f"{len(tables[table])} objects of type {table}")

    if len(tables) == 0:
        info("No objects found")
    return (good, objectTypes, tables, nodeF, edgeF)
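
The handling of a `FROM MONADS` line in the parser can be sketched on its own. This is an illustration under assumptions: `set_from_spec` is a hypothetical stand-in for the `setFromSpec` helper that the parser calls (its source is not shown here), and the sample line is made up.

```python
def set_from_spec(spec):
    """Hypothetical stand-in: expand a spec such as "1-3,7" into a set of ints."""
    result = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            (start, end) = part.split("-", 1)
            result.update(range(int(start), int(end) + 1))
        else:
            result.add(int(part))
    return result


# A made-up line in the shape the parser expects
line = "FROM MONADS= { 1-3, 7 }\n"

# Same clean-up steps as in the parser: drop braces and spaces
spec = (
    line.split("=", 1)[1]
    .replace("{", "")
    .replace("}", "")
    .replace(" ", "")
    .strip()
)
monads = set_from_spec(spec)
```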
def tfFromData(tmObj, objectTypes, tables, nodeF, edgeF, slotType, otext, meta)
def tfFromData(tmObj, objectTypes, tables, nodeF, edgeF, slotType, otext, meta):
    info = tmObj.info

    info("Making TF data ...")

    NIL = {"nil", "NIL", "Nil"}

    tableOrder = [slotType] + [t for t in sorted(tables) if t != slotType]

    iddFromMonad = dict()
    slotFromMonad = dict()

    nodeFromIdd = dict()
    iddFromNode = dict()

    nodeFeatures = dict()
    edgeFeatures = dict()
    metaData = dict()

    # metadata that ends up in every feature
    metaData[""] = meta.get("", {})
    distinctFeatures = chain(
        chain.from_iterable(nodeF.values()), chain.from_iterable(edgeF.values())
    )
    for f in distinctFeatures:
        metaInfo = meta.get(f, None)
        if metaInfo is not None:
            metaData[f] = metaInfo

    # the config feature otext
    metaData["otext"] = otext

    good = True

    info("Monad - idd mapping ...")
    for idd in tables.get(slotType, {}):
        monad = list(tables[slotType][idd]["monads"])[0]
        iddFromMonad[monad] = idd

    info("Removing holes in the monad sequence")
    # we set up a monad - slot mapping
    curSlot = 0
    otype = dict()
    for monad in sorted(iddFromMonad):
        curSlot += 1
        slotFromMonad[monad] = curSlot
        idd = iddFromMonad[monad]
        nodeFromIdd[idd] = curSlot
        iddFromNode[curSlot] = idd
        otype[curSlot] = slotType

    maxSlot = curSlot
    info(f"maxSlot={maxSlot}")

    info("Node mapping and otype ...")
    node = maxSlot
    for t in tableOrder[1:]:
        for idd in sorted(tables[t]):
            node += 1
            nodeFromIdd[idd] = node
            iddFromNode[node] = idd
            otype[node] = t

    nodeFeatures["otype"] = otype
    metaData["otype"] = dict(valueType="str")

    info("oslots ...")
    oslots = dict()
    for t in tableOrder[1:]:
        for idd in tables.get(t, {}):
            node = nodeFromIdd[idd]
            monads = tables[t][idd]["monads"]
            oslots[node] = {slotFromMonad[m] for m in monads}
    edgeFeatures["oslots"] = oslots
    metaData["oslots"] = dict(valueType="str")

    info("metadata ...")
    for t in nodeF:
        for f in nodeF[t]:
            ftype = objectTypes[t][f][0]
            metaData.setdefault(f, {})["valueType"] = ftype
    for t in edgeF:
        for f in edgeF[t]:
            metaData.setdefault(f, {})["valueType"] = "str"

    info("features ...")
    chunkSize = 100000
    for t in tableOrder:
        info(f"\tfeatures from {t}s")
        inThisChunk = 0
        thisTable = tables.get(t, {})
        for (i, idd) in enumerate(thisTable):
            inThisChunk += 1
            if inThisChunk == chunkSize:
                info(f"\t{i + 1:>9} {t}s")
                inThisChunk = 0
            node = nodeFromIdd[idd]
            features = tables[t][idd]["feats"]
            for (f, v) in features.items():
                isEdge = f in edgeF.get(t, set())
                if isEdge:
                    if v not in NIL:
                        edgeFeatures.setdefault(f, {}).setdefault(node, set()).add(
                            nodeFromIdd[int(v)]
                        )
                else:
                    nodeFeatures.setdefault(f, {})[node] = v
        info(f"\t{len(thisTable):>9} {t}s")

    return (good, nodeFeatures, edgeFeatures, metaData)
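
The step that removes holes in the monad sequence can be illustrated separately. The sketch below is not the function above but a reduced version of its idea: monads are sorted and numbered consecutively from 1, so gaps disappear. The name `slots_from_monads` and the sample data are assumptions for illustration.

```python
def slots_from_monads(monads):
    """Map each monad to a slot so that slots run 1, 2, 3, ... without gaps."""
    return {monad: slot for (slot, monad) in enumerate(sorted(monads), start=1)}


# Monads with holes: 3, 4 and 6..8 are missing
monads = [1, 2, 5, 9]
slotFromMonad = slots_from_monads(monads)

# An object that occupies monads 2..9 covers only the monads that exist
objectMonads = {2, 5, 9}
objectSlots = {slotFromMonad[m] for m in objectMonads}
```

As in `tfFromData`, the slots of a non-slot object are found by sending each of its monads through the mapping, so the object keeps covering the same words after renumbering.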
def tfFromMql(mqlFile, tmObj, slotType=None, otext=None, meta=None)

Generate TF from MQL

Parameters

tmObj : object
A Timestamp object
mqlFile, slotType, otext, meta : various
See `tf.core.fabric.Fabric.importMQL`
def tfFromMql(mqlFile, tmObj, slotType=None, otext=None, meta=None):
    """Generate TF from MQL

    Parameters
    ----------
    tmObj: object
        A `tf.core.timestamp.Timestamp` object
    mqlFile, slotType, otext, meta: various
        See `tf.core.fabric.Fabric.importMQL`
    """
    mqlFile = expanduser(mqlFile)
    error = tmObj.error

    if slotType is None:
        error("ERROR: no slotType specified")
        return (False, {}, {}, {})
    # parseMql returns nodeF before edgeF; tfFromData expects the same order
    (good, objectTypes, tables, nodeF, edgeF) = parseMql(mqlFile, tmObj)
    if not good:
        return (False, {}, {}, {})
    return tfFromData(tmObj, objectTypes, tables, nodeF, edgeF, slotType, otext, meta)
def uni(line)
def uni(line):
    return uniscan.sub(makeuni, line)

Classes

class MQL (mqlDir, mqlName, tfFeatures, tmObj, silent='terse')
class MQL(object):
    def __init__(self, mqlDir, mqlName, tfFeatures, tmObj, silent=SILENT_D):
        self.silent = silentConvert(silent)
        tmObj.setSilent(silent)
        error = tmObj.error

        mqlDir = expanduser(mqlDir)
        self.mqlDir = mqlDir
        cleanDb = cleanName(mqlName)
        if cleanDb != mqlName:
            error(f'db name "{mqlName}" => "{cleanDb}"')
        self.mqlName = cleanDb
        self.tfFeatures = tfFeatures
        self.tmObj = tmObj
        self.enums = {}
        self._check()

    def write(self):
        silent = self.silent
        tmObj = self.tmObj
        error = tmObj.error
        info = tmObj.info
        indent = tmObj.indent

        if not self.good:
            return
        if not os.path.exists(self.mqlDir):
            try:
                os.makedirs(self.mqlDir, exist_ok=True)
            except Exception:
                error(f'Cannot create directory "{self.mqlDir}"')
                self.good = False
                return
        mqlPath = f"{self.mqlDir}/{self.mqlName}.mql"
        try:
            fm = open(mqlPath, "w", encoding="utf8")
        except Exception:
            error(f"Could not write to {mqlPath}")
            self.good = False
            return

        info(f"Loading {len(self.featureList)} features")
        for ft in self.featureList:
            fObj = self.features[ft]
            fObj.load(silent=silent)

        self.fm = fm
        self._writeStartDb()
        self._writeEnums()
        self._writeTypes()
        self._writeDataAll()
        self._writeEndDb()
        indent(level=0)
        info("Done")

    def _check(self):
        silent = self.silent
        tmObj = self.tmObj
        error = tmObj.error
        info = tmObj.info
        indent = tmObj.indent

        info(f"Checking features of dataset {self.mqlName}")

        self.features = {}
        self.featureList = []
        indent(level=1)
        for (f, fo) in sorted(self.tfFeatures.items()):
            if fo.method is not None or f in WARP:
                continue
            fo.load(metaOnly=True, silent=silent)
            if fo.isConfig:
                continue
            cleanF = cleanName(f)
            if cleanF != f:
                error(f'feature "{f}" => "{cleanF}"')
            self.featureList.append(cleanF)
            self.features[cleanF] = fo
        good = True
        for feat in (OTYPE, OSLOTS, "__levels__"):
            if feat not in self.tfFeatures:
                error(
                    "{} feature {} is missing from data set".format(
                        "Warp"
                        if feat in WARP
                        else "Computed"
                        if feat.startswith("__")
                        else "Data",
                        feat,
                    )
                )
                good = False
            else:
                fObj = self.tfFeatures[feat]
                if not fObj.load(silent=silent):
                    good = False
        indent(level=0)
        if not good:
            error("Export to MQL aborted")
        else:
            info(f"{len(self.featureList)} features to export to MQL ...")
        self.good = good

    def _writeStartDb(self):
        self.fm.write(
            """
CREATE DATABASE '{name}'
GO
USE DATABASE '{name}'
GO
""".format(
                name=self.mqlName
            )
        )

    def _writeEndDb(self):
        self.fm.write(
            """
VACUUM DATABASE ANALYZE
GO
"""
        )
        self.fm.close()

    def _writeEnums(self):
        tmObj = self.tmObj
        info = tmObj.info
        indent = tmObj.indent

        indent(level=0)
        info("Writing enumerations")
        indent(level=1)
        for ft in self.featureList:
            fObj = self.features[ft]
            if fObj.isEdge or fObj.dataType == "int":
                continue
            fMap = fObj.data
            fValues = sorted(set(fMap.values()))
            if len(fValues) > ENUM_LIMIT:
                continue
            eligible = all(isClean(fVal) for fVal in fValues)
            if not eligible:
                unclean = [fVal for fVal in fValues if not isClean(fVal)]
                console(
                    "\t{:<15}: {:>4} values, {} not a name, e.g. «{}»".format(
                        ft,
                        len(fValues),
                        len(unclean),
                        unclean[0],
                    )
                )
                continue
            self.enums[ft] = fValues

        if ONE_ENUM_TYPE:
            self._writeEnumsAsOne()
        else:
            for ft in sorted(self.enums):
                self._writeEnum(ft)
            indent(level=0)
            info(f"Written {len(self.enums)} enumerations")

    def _writeEnumsAsOne(self):
        tmObj = self.tmObj
        info = tmObj.info

        fValues = sorted(
            set(chain.from_iterable((set(fV) for fV in self.enums.values())))
        )
        if len(fValues):
            info(f"Writing an all-in-one enum with {len(fValues):>4} values")
            fValuesEnumerated = ",\n\t".join(
                "{} = {}".format(fVal, i) for (i, fVal) in enumerate(fValues)
            )
            self.fm.write(
                f"""
CREATE ENUMERATION all_enum = {{
    {fValuesEnumerated}
}}
GO
"""
            )

    def _writeEnum(self, ft):
        tmObj = self.tmObj
        info = tmObj.info

        fValues = self.enums[ft]
        if len(fValues):
            info(f"enum {ft:<15} with {len(fValues):>4} values")
            fValuesEnumerated = ",\n\t".join(
                f"{fVal} = {i}" for (i, fVal) in enumerate(fValues)
            )
            self.fm.write(
                f"""
CREATE ENUMERATION {ft}_enum = {{
    {fValuesEnumerated}
}}
GO
"""
            )

    def _writeTypes(self):
        def valInt(n):
            return str(n)

        def valStr(s):
            if "'" in s:
                return '"{}"'.format(s.replace('"', '\\"'))
            else:
                return "'{}'".format(s)

        def valIds(ids):
            return "({})".format(",".join(str(i) for i in ids))

        tmObj = self.tmObj
        error = tmObj.error
        info = tmObj.info
        indent = tmObj.indent

        self.levels = self.tfFeatures["__levels__"].data[::-1]
        indent(level=0)
        info(
            "Mapping {} features onto {} object types".format(
                len(self.featureList),
                len(self.levels),
            )
        )
        otypeSupport = {}
        for (otype, av, start, end) in self.levels:
            cleanOtype = cleanName(otype)
            if cleanOtype != otype:
                error(f'otype "{otype}" => "{cleanOtype}"')
            otypeSupport[cleanOtype] = set(range(start, end + 1))

        self.otypes = {}
        self.featureTypes = {}
        self.featureMethods = {}
        for ft in self.featureList:
            fObj = self.features[ft]
            if fObj.isEdge:
                dataType = "LIST OF id_d"
                method = valIds
            else:
                if fObj.dataType == "str":
                    dataType = 'string DEFAULT ""'
                    method = valInt if ft in self.enums else valStr
                elif fObj.dataType == "int":
                    dataType = "integer DEFAULT 0"
                    method = valInt
                else:
                    dataType = 'string DEFAULT ""'
                    method = valStr
            self.featureTypes[ft] = dataType
            self.featureMethods[ft] = method

            support = set(fObj.data.keys())
            for otype in otypeSupport:
                if len(support & otypeSupport[otype]):
                    self.otypes.setdefault(otype, []).append(ft)

        for otype in (cleanName(x[0]) for x in self.levels):
            self._writeType(otype)

    def _writeType(self, otype):
        self.fm.write(
            f"""
CREATE OBJECT TYPE
[{otype}
"""
        )
        for ft in self.otypes[otype]:
            fType = (
                "{}_enum".format("all" if ONE_ENUM_TYPE else ft)
                if ft in self.enums
                else self.featureTypes[ft]
            )
            self.fm.write(f"  {ft}:{fType};\n")
        self.fm.write(
            """
]
GO
"""
        )

    def _writeDataAll(self):
        tmObj = self.tmObj
        info = tmObj.info

        info(
            "Writing {} features as data in {} object types".format(
                len(self.featureList),
                len(self.levels),
            )
        )
        oslotsData = self.tfFeatures[OSLOTS].data
        self.oslots = oslotsData[0]
        self.maxSlot = oslotsData[1]
        for (otype, av, start, end) in self.levels:
            self._writeData(otype, start, end)

    def _writeData(self, otype, start, end):
        tmObj = self.tmObj
        info = tmObj.info
        indent = tmObj.indent

        fm = self.fm

        indent(level=1, reset=True)
        info(f"{otype} data ...")
        oslots = self.oslots
        maxSlot = self.maxSlot
        oFeats = self.otypes[otype]
        features = self.features
        featureMethods = self.featureMethods
        fm.write(
            """
DROP INDEXES ON OBJECT TYPE[{o}]
GO
CREATE OBJECTS
WITH OBJECT TYPE[{o}]
""".format(
                o=otype
            )
        )
        curSize = 0
        LIMIT = 50000
        t = 0
        j = 0
        indent(level=2, reset=True)
        for n in range(start, end + 1):
            oMql = """
CREATE OBJECT
FROM MONADS= {{ {m} }}
WITH ID_D={i} [
""".format(
                m=n
                if n <= maxSlot
                else specFromRanges(rangesFromList(oslots[n - maxSlot - 1])),
                i=n,
            )
            for ft in oFeats:
                method = featureMethods[ft]
                fMap = features[ft].data
                if n in fMap:
                    oMql += f"{ft}:={method(fMap[n])};\n"
            oMql += """
]
"""
            fm.write(oMql)
            curSize += len(bytes(oMql, encoding="utf8"))
            t += 1
            j += 1
            if j == LIMIT:
                fm.write(
                    """
GO
CREATE OBJECTS
WITH OBJECT TYPE[{o}]
""".format(
                        o=otype
                    )
                )
                info(
                    f"batch of size {nbytes(curSize):>20} with {j:>7} of {t:>7} {otype}s"
                )
                j = 0
                curSize = 0

        info(f"batch of size {nbytes(curSize):>20} with {j:>7} of {t:>7} {otype}s")
        fm.write(
            """
GO
CREATE INDEXES ON OBJECT TYPE[{o}]
GO
""".format(
                o=otype
            )
        )

        indent(level=1)
        info("{} data: {} objects".format(otype, t))
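
The enumeration logic of the class can be sketched in a self-contained way. This is an approximation under assumptions: the limit and the name pattern are taken from the restrictions in the module description (fewer than 1000 distinct values, all of them legal C names), because the module's own `ENUM_LIMIT` and `isClean` are not shown in this section. The name `enum_statement` and the sample values are hypothetical.

```python
import re

# Assumptions based on the module description, not on the module's constants
ENUM_LIMIT = 1000
C_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def enum_statement(feature, values):
    """Return a CREATE ENUMERATION statement, or None if not eligible."""
    distinct = sorted(set(values))
    if not distinct or len(distinct) >= ENUM_LIMIT:
        return None
    if not all(C_NAME.fullmatch(value) for value in distinct):
        return None
    members = ",\n    ".join(
        f"{value} = {i}" for (i, value) in enumerate(distinct)
    )
    return f"CREATE ENUMERATION {feature}_enum = {{\n    {members}\n}}\nGO\n"


statement = enum_statement("sp", ["verb", "noun", "verb"])
rejected = enum_statement("gloss", ["to say", "house"])  # space: not a C name
```

A feature that yields a statement can be queried in MQL without quotes; a feature that yields `None` here would stay an ordinary string feature.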

Methods

def write(self)
def write(self):
    silent = self.silent
    tmObj = self.tmObj
    error = tmObj.error
    info = tmObj.info
    indent = tmObj.indent

    if not self.good:
        return
    if not os.path.exists(self.mqlDir):
        try:
            os.makedirs(self.mqlDir, exist_ok=True)
        except Exception:
            error(f'Cannot create directory "{self.mqlDir}"')
            self.good = False
            return
    mqlPath = f"{self.mqlDir}/{self.mqlName}.mql"
    try:
        fm = open(mqlPath, "w", encoding="utf8")
    except Exception:
        error(f"Could not write to {mqlPath}")
        self.good = False
        return

    info(f"Loading {len(self.featureList)} features")
    for ft in self.featureList:
        fObj = self.features[ft]
        fObj.load(silent=silent)

    self.fm = fm
    self._writeStartDb()
    self._writeEnums()
    self._writeTypes()
    self._writeDataAll()
    self._writeEndDb()
    indent(level=0)
    info("Done")