Abstract
Introduction
Background and Related Literature
Data Collection and Preprocessing
Proposed Approach
Experimental Results
Discussion
Conclusion
References
Abstract
Computer programming has attracted considerable attention in the development of real-world information and communication technologies (ICT). Meeting the growing demand for highly skilled programmers in the ICT industry is one of the major challenges. In this context, online judge (OJ) systems enhance programming learning and practice opportunities in addition to classroom-based learning. Consequently, OJ systems have accumulated large archives of problem-solving data (solution codes, logs, and scores) that can be valuable raw material for programming education research. In this paper, we propose an educational data mining framework to support programming learning using unsupervised algorithms. The framework consists of the following sequence of steps: (i) problem-solving data (logs and scores) are collected from the OJ and preprocessed; (ii) the MK-means clustering algorithm is used to cluster the data in Euclidean space; (iii) statistical features are extracted from each cluster; (iv) the frequent pattern (FP)-growth algorithm is applied to each cluster to mine data patterns and association rules; (v) a set of suggestions is provided on the basis of the extracted features, data patterns, and rules. Different parameters are adjusted to achieve the best results for the clustering and association rule mining algorithms. For the experiment, approximately 70,000 real-world problem-solving records from 537 students of a programming course (Algorithm and Data Structures) were used. In addition, synthetic data were used in experiments to demonstrate the performance of the MK-means algorithm. The experimental results show that the proposed framework effectively extracts useful features, patterns, and rules from problem-solving data. Moreover, these extracted features, patterns, and rules highlight the weaknesses and the scope of possible improvements in programming learning.
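Steps (ii) and (iii) can be illustrated with a minimal sketch. It assumes plain Lloyd's k-means as a stand-in, since the details of MK-means are not given in this text, and it uses made-up two-dimensional feature vectors (for example, problems solved and average score per student). The function names `kmeans`, `inertia`, and `cluster_summary` are hypothetical and are not taken from the paper.

```python
import math
import statistics


def kmeans(points, k, iterations=50):
    """Plain Lloyd's k-means in Euclidean space (a stand-in, not MK-means)."""
    # Deterministic initialisation for the sketch: first k points are centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [statistics.fmean(dim) for dim in zip(*members)]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
    return labels, centroids


def inertia(points, labels, centroids):
    """Within-cluster sum of squared distances, the quantity an elbow plot uses."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


def cluster_summary(points, labels, k):
    """Per-cluster size and feature means (a simple example of step iii)."""
    summary = {}
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        mean = [statistics.fmean(dim) for dim in zip(*members)] if members else None
        summary[c] = {"size": len(members), "mean": mean}
    return summary


# Toy data, invented for illustration only.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 1.5), (9.0, 8.0)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(cluster_summary(points, labels, k=2))
```

Comparing `inertia` for several values of k gives the curve whose bend is usually read as the elbow; in this toy data the value for k = 2 is far smaller than for k = 1.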
Introduction
Today’s information and communication technology (ICT) industry demands highly skilled programmers for further development. The conventional computer programming learning environment is insufficient to prepare highly skilled programmers due to the limited number of exercise classes, limited practice opportunities, and lack of individual tutoring. In addition, most educational institutions, such as schools, colleges, and universities, are struggling to build more educational facilities to increase academic activity (e.g., additional exercise classes, practice, and individual tutoring) due to logistical and organizational constraints [1]. The growing number of people enrolled in educational institutions and the large number of students per class complicate the individual tutoring process, and some lectures in massive open online courses are conducted with more than a thousand participants [2]. Furthermore, the growing ratio of students to educators raises the question of how to provide individual support to students to improve their problem-solving skills. In particular, when learning computer programming, students need a lot of practice and individual tutoring to improve their programming knowledge and skills. Computer programming is one of the fundamental courses in the ICT discipline. Programming practice and competition can play an important role in acquiring good programming skills, and more programming practice can help to improve a student’s problem-solving skills. Because of these logistical and institutional constraints, the traditional classroom-based approach to learning programming is a major obstacle to the development of students’ programming skills.
Conclusion
In this paper, we proposed an EDM framework for data clustering and for pattern and rule mining using real-world problem-solving data. A mathematical model for data preprocessing and the MK-means and FP-growth algorithms were used to conduct this study. For programming education, OJ systems have been adopted by many institutions as academic tools. As a result, a huge number of programming-related resources (source codes, logs, scores, activities, etc.) are regularly accumulated in OJ systems. In this study, a large amount of real-world problem-solving data collected from the AOJ system was used in the experiments. Preprocessing the problem-solving data is one of the main tasks in achieving accurate EDM results. Therefore, a mathematical model for problem-solving data preprocessing was developed. Then, the processed data were clustered using the Elbow and MK-means algorithms. Various statistical features, data patterns, and rules were extracted from each cluster based on different threshold values (K, minConf, minSup). These results can effectively contribute to the improvement of overall programming education. Moreover, based on the experimental results, some pertinent suggestions have been made. Furthermore, the proposed framework can be applied to other practical/exercise courses to reveal data patterns, statistical features, and rules. In addition, any third-party application with similar data resources, such as AlgoA, ProgA, FCT, and FPT, can use the proposed approach for EDM and analysis.
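The roles of the minSup and minConf thresholds can be illustrated with a minimal sketch. It assumes brute-force itemset enumeration rather than FP-growth (both yield the same frequent itemsets, FP-growth just finds them more efficiently), and it uses made-up transactions of submission verdicts. The function names and the item labels are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_sup):
    """Return {itemset: support} for every itemset with support >= min_sup.

    Brute-force enumeration for illustration; this is not FP-growth.
    """
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            # Support: fraction of transactions that contain the itemset.
            support = sum(cset <= t for t in transactions) / n
            if support >= min_sup:
                result[cset] = support
                found = True
        if not found:
            break  # no frequent itemset of this size, so none of a larger size
    return result


def association_rules(itemsets, min_conf):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, support in itemsets.items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                a = frozenset(antecedent)
                # Confidence: support(A and B) / support(A).
                confidence = support / itemsets[a]
                if confidence >= min_conf:
                    rules.append((a, itemset - a, support, confidence))
    return rules


# Toy transactions, invented for illustration only.
transactions = [
    {"wrong_answer", "time_limit"},
    {"wrong_answer", "accepted"},
    {"wrong_answer", "time_limit", "accepted"},
    {"accepted"},
]
itemsets = frequent_itemsets(transactions, min_sup=0.5)
for a, b, sup, conf in association_rules(itemsets, min_conf=0.8):
    print(sorted(a), "->", sorted(b), "support", sup, "confidence", conf)
```

Raising minSup shrinks the set of frequent itemsets, and raising minConf keeps only the more reliable rules, which is why both thresholds need tuning per cluster.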