Changes for page The AISOP recipe
Last modified by Paul Libbrecht on 2025/06/15 23:32
Change comment:
There is no comment for this version
Page properties (1 modified, 0 added, 0 removed)
@@ -9,10 +9,7 @@
 )))
 )))
 
-(% class="row" %)
-(((
-(% class="col-xs-12 col-sm-8" %)
-(((
+
 == Basic Terms ==
 
 The context of the AISOP-web-app usage is that of a course at a learning institution which typically has fixed students and fixed contents. A course can contain multiple courses or modules.
@@ -25,7 +25,6 @@
 
 ----
 
-(% class="wikigeneratedid" %)
 == 1) Data Preparation ==
 
 === 1.1: Make a Concept Map ===
@@ -40,8 +40,14 @@
 > Flow Charts
 > Programming
 > Programming Paradigm
-> Imperative Programming....
+> Imperative Programming
+>Data-Structure
+> ....
+>Operating System
+> ....
 
+We'll name this file labels-all-depths.txt. From this text file, extract a second text file containing only the top labels (in the extract above, only Algorithmization, Data-Structure and Operating System), named labels-depth1.txt.
+
 === 1.2: Extract Text of the Course Content ===
 
 In order for the topic recognition to work, a model needs to be trained that will recognize the words used by the students to denote one part or another of the course. This makes it possible to create relations between the concepts of the course and the paragraphs of the portfolio and to offer these in the interactive dashboards. The training is the result of annotating fragments of texts which, first, need to be extracted from their media, be they PDF files, PowerPoint slides, scanned texts or student works. These texts will not be shared, so that even protected material or texts carrying personal information can be used.
@@ -48,12 +48,24 @@
 
 Practically:
 
-* Assemble the documents
+* Make all documents accessible for you to open and browse (e.g. download them or obtain authorized access)
+* Install and launch the [[clipboard extractor>>https://gitlab.com/aisop/aisop-hacking/-/tree/main/aisop-clipboard-extractor?ref_type=heads]], which will gather the fragments in a text file
+* Go through all contents and copy each fragment. A fragment is expected to be the size of a paragraph, so this is what you should copy.
+* The extractor should have copied all the fragments into one file, which we shall call extraction.json.
+* The minimum content to extract is the complete set of slides and their comments. We recommend using past students' e-portfolios too. We have had rather good experience with about 1000 fragments for a course.
+* If interrupted, the process may create several JSON files. You can combine them using the [[merge-tool>>https://gitlab.com/aisop/aisop-hacking/-/tree/main/merge-json-files]].
 
 === 1.3: Annotate Text Fragments ===
 
-Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
+It is time to endow the fragments with topics so that we can recognize the topics of students' paragraphs. In AISOP, we have used the (commercial) [[prodigy>>https://prodi.gy/]] for this task in two steps, both of which iterate through all fragments to assign them topics.
 
+**The first step: top-level-labels:** This is the simple [["text classifier" recipe>>https://prodi.gy/docs/recipes#textcat]] of prodigy. We can invoke the following command for this: prodigy textcat.manual the-course-name ./fragments.jsonl ~-~-label labels-depth1.txt which will offer a web-interface on which each fragment is annotated with the (top-level) label. This web-interface can be left running for several days.
+
+**The second step is the hierarchical annotation** [[custom recipe>>https://gitlab.com/aisop/aisop-nlp/-/tree/main/hierarchical_annotation?ref_type=heads]] (link to become public soon): The same fragments are now annotated with the top-level annotation and all their children. E.g. using the command xxx
+
+The resulting data-set can be extracted out of prodigy using the db-out recipe, e.g. prodigy db-out the-course-name-l2 the-course-name-l2
+
+
 ----
 
 == 2) Deployment ==
@@ -93,5 +93,3 @@
 === 3.4 Gather Enhancements ===
 
 ... on the web-app, on the creation process, and on the course
-)))
-)))
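
Step 1.1 asks for a file labels-depth1.txt that contains only the top labels of labels-all-depths.txt. The page does not say how to produce it, so the following is only a minimal sketch. It assumes that the concept map is stored with one label per line and that depth is expressed by leading whitespace, so that a top label is a line without indentation. The function names top_level_labels and write_depth1 are hypothetical and not part of any AISOP tool.

```python
from pathlib import Path


def top_level_labels(lines):
    """Return the labels that carry no indentation (assumed to be depth 1)."""
    labels = []
    for raw in lines:
        line = raw.rstrip()
        if not line.strip():
            continue  # skip blank lines
        if line[0].isspace():
            continue  # indented line: assumed to be a child label
        labels.append(line.strip())
    return labels


def write_depth1(src="labels-all-depths.txt", dst="labels-depth1.txt"):
    """Read the full concept map and write only its top labels, one per line."""
    lines = Path(src).read_text(encoding="utf-8").splitlines()
    Path(dst).write_text("\n".join(top_level_labels(lines)) + "\n", encoding="utf-8")


if __name__ == "__main__":
    # Only runs the conversion when the input file is present.
    if Path("labels-all-depths.txt").exists():
        write_depth1()
```

If the concept map marks depth differently (for example with a prefix character), only the indentation check in top_level_labels needs to change.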