<
From version < 1.2 >
edited by Paul Libbrecht
on 2025/01/14 16:42
To version < 1.6 >
edited by AISOP Admin
on 2025/01/14 21:04
>
Change comment: There is no comment for this version

Summary

Details

Page properties
Title
... ... @@ -1,1 +1,1 @@
1 -the-AISOP-recipe
1 +The AISOP recipe
Author
... ... @@ -1,1 +1,1 @@
1 -XWiki.polx
1 +XWiki.AISOPAdmin
Content
... ... @@ -2,57 +2,108 @@
2 2  (((
3 3  (% class="container" %)
4 4  (((
5 -= My new article =
5 += The AISOP recipe =
6 6  
7 -Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed viverra enim quis tristique tincidunt. Morbi nec hendrerit mi. Mauris convallis tortor et justo gravida elementum. Mauris dictum imperdiet quam, quis sodales velit tempus varius. Ut convallis mi rutrum imperdiet eleifend. Ut diam sapien, iaculis facilisis nisl non, varius cursus eros. Praesent vitae ipsum molestie enim pulvinar semper nec a nisi.
7 +The AISOP web-app is a service built as the result of various training and configuration steps.
8 +This recipe explains how to extract the content fragments, annotate them, and create a model trained on them. This will let us create a pipeline and a seminar in which we can analyse portfolios.
8 8  )))
9 9  )))
10 10  
11 -(% class="row" %)
12 -(((
13 -(% class="col-xs-12 col-sm-8" %)
14 -(((
15 -= Paragraph 1 =
16 16  
17 -Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
13 +== Basic Terms ==
18 18  
19 -== Sub-paragraph ==
15 +The context of the AISOP web-app usage is that of a course at a learning institution, which typically has fixed students and fixed contents. A course can contain multiple courses or modules.
20 20  
21 -Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
17 +* **AISOP Web-app:** The nodeJS server that interfaces with the portfolio-composing system.
18 +* **Portfolio:** The content written by a student in order to represent his or her progress, learning and knowledge in a textual and graphical form. It is generally expressed in HTML and can be embedded in various web pages.
19 +* **Course-contents:** The set of slides, their annotations, the videos and handouts that are normally read by students and teachers.
20 +* **Analysis:** The set of programmes that recognize and measure the contents of a portfolio. Often also the name of the resulting interactive presentation (which can feature summaries or enriched portfolio views).
21 +* **Composition Platform:** A space where the portfolio is written, normally a web-space. In AISOP we have focussed on the classical e-portfolio composition platform Mahara (a PHP server).
22 22  
23 -== Sub-paragraph ==
23 +----
24 24  
25 -Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
25 +== 1) Data Preparation ==
26 26  
27 -=== Sub-sub paragraph ===
27 +=== 1.1: Make a Concept Map ===
28 28  
29 -Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
29 +Using a tool such as CMapTools, create a graphical concept map that represents the topics of the course. This concept map may already be familiar to the teachers and learners of this course as a way to show the paths through the content.
30 30  
31 +From the concept map, extract a .cxl file which carries the same information and will be presented on the web-page.
31 31  
32 -= Paragraph 2 =
33 +From the concept map, also extract a hierarchy of topics, assuming there are more than approximately 10 topics in the map. The hierarchy should be a text file with one label per line, where a label is indented to the right to mark it as a child of the label above it, as in the following example:
33 33  
34 -Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
35 +>Algorithmization
36 +> Flow Charts
37 +> Programming
38 +> Programming Paradigm
39 +> Imperative Programming
40 +>Data-Structure
41 +> ....
42 +>Operating System
43 +> ....
35 35  
36 -== Sub-paragraph ==
45 +We'll name this file labels-all-depths.txt. From this text file, extract a text file with only the top labels (in the extract above only Algorithmization, Data-Structure and Operating System), named labels-depth1.txt.
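
The extraction of the top labels can be sketched as follows. This is a minimal sketch, assuming that top-level labels start at the first column of labels-all-depths.txt and that children are indented with spaces or tabs; the function name top_level_labels is hypothetical and not part of any AISOP tool.

```python
from typing import Iterable, List


def top_level_labels(lines: Iterable[str]) -> List[str]:
    """Return the labels that start at column 0, i.e. the top-level topics."""
    labels = []
    for line in lines:
        text = line.rstrip("\n")
        if not text.strip():
            continue  # skip blank lines
        if text[0] in (" ", "\t"):
            continue  # indented line: a child topic, not a top-level label
        labels.append(text.strip())
    return labels


if __name__ == "__main__":
    # File names as used in this recipe.
    with open("labels-all-depths.txt", encoding="utf-8") as source:
        labels = top_level_labels(source)
    with open("labels-depth1.txt", "w", encoding="utf-8") as target:
        target.write("\n".join(labels) + "\n")
```

Keeping the filter as a pure function over lines makes it easy to check on a small sample before running it on the real hierarchy.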
37 37  
38 -Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
47 +=== 1.2: Extract Text of the Course Content ===
39 39  
40 -== Sub-paragraph ==
49 +In order for the topic recognition to work, a model needs to be trained that will recognize the words used by the students to denote one part or another of the course. This makes it possible to create relations between the concepts of the course and the paragraphs of the portfolio, and to offer these in the interactive dashboards. The training is the result of annotating fragments of text which first need to be extracted from their media, be they PDF files, PowerPoint slides, scanned texts or student works. These texts will not be shared, so even protected material or texts carrying personal information can be used.
41 41  
42 -Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
43 -)))
51 +Practically:
44 44  
53 +* Make all documents accessible for you to open and browse (e.g. download them or get the authorized accesses)
54 +* Install and launch the [[clipboard extractor>>https://gitlab.com/aisop/aisop-hacking/-/tree/main/aisop-clipboard-extractor?ref_type=heads]] which will gather the fragments in a text file
55 +* Go through all contents and copy each fragment. A fragment is expected to be the size of a paragraph so this is what you should copy.
56 +* The extractor should have copied all the fragments into one file, which we shall call extraction.json.
57 +* The minimum amount of content to extract is the complete set of slides and their comments. We recommend using past students' e-portfolios too. We have had rather good experience with about 1000 fragments for a course.
58 +* If interrupted, the process may create several JSON files. You can combine them using the [[merge-tool>>https://gitlab.com/aisop/aisop-hacking/-/tree/main/merge-json-files]].
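
The annotation step below reads a file named fragments.jsonl, while the extractor produces extraction.json. One possible conversion is sketched here. It rests on assumptions that are not confirmed by this recipe: that extraction.json holds a JSON array whose items are either plain strings or objects with a "text" key, and that fragments.jsonl is derived from it. The helper names are hypothetical. The only fixed point is the target format: Prodigy reads JSONL input with one object per line carrying a "text" field.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def to_prodigy_tasks(fragments: Iterable[Any]) -> Iterator[Dict[str, str]]:
    """Yield one {"text": ...} task per non-empty fragment."""
    for fragment in fragments:
        # Assumption: a fragment is a plain string or an object with a "text" key.
        text = fragment if isinstance(fragment, str) else fragment.get("text", "")
        text = " ".join(text.split())  # collapse line breaks and repeated spaces
        if text:
            yield {"text": text}


def write_jsonl(tasks: Iterable[Dict[str, str]], path: str) -> None:
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as out:
        for task in tasks:
            out.write(json.dumps(task, ensure_ascii=False) + "\n")


if __name__ == "__main__":
    with open("extraction.json", encoding="utf-8") as source:
        fragments = json.load(source)
    write_jsonl(to_prodigy_tasks(fragments), "fragments.jsonl")
```

If the extractor's actual output differs from the assumed structure, only the line that picks out the text needs to change.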
45 45  
46 -(% class="col-xs-12 col-sm-4" %)
47 -(((
48 -{{box title="**Contents**"}}
49 -{{toc/}}
50 -{{/box}}
60 +=== 1.3: Annotate Text Fragments ===
51 51  
52 -[[image:Templates.Article.Template.WebHome@image1.jpg]]
53 -//Figure 1: [[Sea>>https://commons.wikimedia.org/wiki/File:Isle_of_Icacos_II.jpg]]//
62 +It is time to assign topics to the fragments so that we can later recognize the topics of students' paragraphs. In AISOP, we have used the (commercial) [[prodigy>>https://prodi.gy/]] for this task in two steps, both of which iterate through all fragments to give them topics.
54 54  
55 -[[image:Templates.Article.Template.WebHome@image2.jpg]]
56 -//Figure 2: [[Waves>>https://commons.wikimedia.org/wiki/File:Culebra_-_Playa_de_Flamenco.jpg]]//
57 -)))
58 -)))
64 +**The first step: top-level labels:** This is the simple [["text classifier" recipe>>https://prodi.gy/docs/recipes#textcat]] of prodigy. It is invoked with the following command: prodigy textcat.manual the-course-name ./fragments.jsonl ~-~-label labels-depth1.txt which will offer a web-interface on which each fragment is annotated with the (top-level) label. This web-interface can be left running for several days.
65 +
66 +**The second step is the hierarchical annotation** [[custom recipe>>https://gitlab.com/aisop/aisop-nlp/-/tree/main/hierarchical_annotation?ref_type=heads]] (link to become public soon): The same fragments are now annotated with the top-level annotation and all their children. E.g. using the command xxx
67 +
68 +The resulting data-set can be extracted out of prodigy using the db-out recipe, e.g. prodigy db-out the-course-name-l2 the-course-name-l2
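
Before training, it can help to check how many fragments each label received. The following is a minimal sketch, assuming the export is in JSONL form with one annotation per line, that accepted annotations carry "answer": "accept", and that the chosen labels are listed under an "accept" key, as is the case for Prodigy's multiple-choice interface. The function name count_labels is hypothetical, and records produced by the custom hierarchical recipe may be structured differently.

```python
import json
import sys
from collections import Counter
from typing import Iterable


def count_labels(jsonl_lines: Iterable[str]) -> Counter:
    """Count how often each label was selected in accepted annotations."""
    counts: Counter = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if record.get("answer") != "accept":
            continue  # skip rejected or ignored fragments
        # Assumption: selected labels are listed under the "accept" key.
        counts.update(record.get("accept", []))
    return counts


if __name__ == "__main__":
    # Reads the exported annotations from standard input.
    for label, count in count_labels(sys.stdin).most_common():
        print(f"{count}\t{label}")
```

Labels with very few fragments are candidates for further extraction or for merging with a sibling topic.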
69 +
70 +
71 +----
72 +
73 +== 2) Deployment ==
74 +
75 +=== 2.1 Train a Recognition Model ===
76 +
77 +...
78 +
79 +=== 2.2 Create a Pipeline ===
80 +
81 +...
82 +
83 +=== 2.3 Create a Seminar and Import Content ===
84 +
85 +...
86 +
87 +=== 2.4 Interface with the Composition Platform ===
88 +
89 +...
90 +
91 +----
92 +
93 +== 3) Usage ==
94 +
95 +=== 3.1 Invite Users ===
96 +
97 +...
98 +
99 +=== 3.2 Verify Imports and Analyses ===
100 +
101 +...
102 +
103 +=== 3.3 Observe Usage and Reflect on Quality ===
104 +
105 +...
106 +
107 +=== 3.4 Gather Enhancements ===
108 +
109 +... on the web-app, on the creation process, and on the course
