Changes for page AISOP domains
Last modified by Paul Libbrecht on 2025/04/17 21:43
Change comment:
There is no comment for this version
Summary
Page properties (1 modified, 0 added, 0 removed)
Details
- Page properties
- Content
@@ -38,15 +38,6 @@
 * the `l1-model` directory is the spacy model for the classifier for the l1-topics
 * the `l2-models` contains a directory for each l1-topic which contains a spacy model for the sub-labels of this l1-topic
 * any extra file or directory mentioned as link
-* the `tests.txt` file contains the test fragments so that the debug tool can be used right away, one line per fragment
-* the `log.txt` file contains the statistical output of the training and/or statistitics: one line per label, one column par dimension
-* optionally, any file used for development, documented by a `README.md` (see below)
+* any file used for development, documented by a `README.md`
 
 All paths of links used in the `about.json` and `pipeline.json` files can be resolved in a relative manner. For them to be recognized, we recommend to express relative paths with the syntax of starting with `./` as in `"logo":"./my-logo.svg"`. This allows the web-app to perform relative resolution in a secure way (not going outside of the domain directory except for known places) before it is given to the web-server or to the analysis scripts.
-
-While the README.md should be the main entry point for the source work for creating the domain, we propose the following folder names:
-
-- `source-content`: a collection of files (e.g. PDFs, pictures, texts, pptx, ...) that represent the source input from where an extraction is made
-- `extracts` is the result of the extraction process and is made of JSON files, one, or one folder, per source collection
-- `annotations` is the result of the annotations exported from prodigy in the form of JSONL files
-- moreover, instructions used and the log of all processes is visible in the `README.md` file
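
The relative resolution described for `./` links in `about.json` and `pipeline.json` could be sketched as follows. This is only an illustration under stated assumptions, not the web-app's actual code: the function name `resolve_in_domain` is hypothetical, and the "known places" exception mentioned in the page is deliberately left out because the page does not say what those places are.

```python
from pathlib import Path


def resolve_in_domain(domain_dir, link):
    """Resolve a './'-style link against domain_dir (hypothetical helper).

    Raises ValueError if the link does not start with './' or if the
    resolved path would end up outside of the domain directory.
    """
    if not link.startswith("./"):
        raise ValueError(f"not a relative link: {link!r}")
    base = Path(domain_dir).resolve()
    # Normalise '..' segments and symlinks before checking containment.
    target = (base / link).resolve()
    try:
        target.relative_to(base)
    except ValueError:
        raise ValueError(f"link escapes the domain directory: {link!r}") from None
    return target
```

With this sketch, a link such as `"./my-logo.svg"` resolves to a file directly inside the domain directory, while `"./../outside.txt"` is rejected before anything is handed to the web-server or the analysis scripts.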