User Interfaces Development in openDIEL

Argens Ng August 5, 2016

Abstract

This paper reports progress on enhancing the user interfaces of the workflow engine openDIEL. While openDIEL has the potential to become a powerful workflow engine, development of its user-facing interface has lagged behind, which makes this project all the more pressing. In this paper, I describe the progress of modMaker, a Python script that generates modules, as well as the development tool timer.

1 Introduction

OpenDIEL stands for open Distributive Interoperable Executive Library. It is a lightweight software framework that aims to combine different interoperable computational components to simulate system-wide scientific applications. It uses the Message Passing Interface (MPI) to facilitate cooperation between loosely coupled modules and produces a single executable.

To use openDIEL, the user needs to provide:

1. Modules

2. Configuration File (using libconfig)

3. Driver (driver.c)

Figure 1: The structure of openDIEL.

Our ultimate goal is to automate the generation of all three of the above files, or sets of files. In this project, we focus on generating the modules first. This is done by a Python script called modMaker.

2 ModMaker

2.1 What is modMaker?

At the current stage, modMaker is a package of two Python scripts: modMaker.py and worker.py. Together they can transform a C file or a directory of C files (and other files from the C family) into a module (or modules). This is done by a series of string pattern matches together with the addition of static supporting files.

2.2 What is a module?

For user-defined code or simulation models to run in openDIEL, they have to be in the format of an openDIEL module. There are a few requirements. For example, the code must not contain a main program. It must also be free of MPI_COMM_WORLD, MPI_Init and MPI_Finalize, so that different modules can cooperate. modMaker performs this intricate reformatting in the following manner.

Figure 2: Illustration of modules.

2.3 How to transform a module?

To transform code into a module, we first have to understand the syntactic structure of the language. The C family is our first target, so we focus on it here.

I started off by focusing on the target "int main". As "int main" is now located in driver.c, we need to replace the main program of each individual module with a function header. To do that, we have to accurately identify "int main" and modify it. (To be more precise, we would also need to consider "void main", but we focus on "int main" only for the purpose of this paper.)

Thinking as a human reader would, I quickly realized that as long as "int" and "main" appear as two separate, standalone strings, the match is unambiguous and is the target we want to change. The same holds for all other targets that we wish to change, and hence the problem becomes one of identifying standalone strings that are correctly separated. From this insight, the following flow diagram was constructed.
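The idea of treating "int" and "main" as two separate, standalone strings can be sketched by tokenizing the source and looking for the two tokens side by side. This is only an illustration, not the modMaker implementation: the function name `find_token_pair` and the token pattern are assumptions, and the sketch does not skip comments or string literals.

```python
import re

# Assumed token pattern: a run of word characters, or one punctuation character.
TOKEN = re.compile(r"\w+|[^\w\s]")

def find_token_pair(source, first="int", second="main"):
    """Return the offset of every `first` token directly followed by `second`.

    Any amount of whitespace (spaces, tabs, newlines) may separate the two.
    """
    tokens = [(m.group(), m.start()) for m in TOKEN.finditer(source)]
    return [
        pos
        for (tok, pos), (nxt, _) in zip(tokens, tokens[1:])
        if tok == first and nxt == second
    ]

# "mainframe" and "print" are single tokens, so neither line below matches.
print(find_token_pair("int mainframe;"))
print(find_token_pair("print main;"))
# A tab between the two words still yields a match.
print(find_token_pair("int\tmain ()"))
```

Because whole tokens are compared, longer identifiers that merely contain "int" or "main" are rejected without any extra checks.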

Figure 3: ModMaking workflow.

2.3.1 Locating identifying feature

Two approaches were considered for locating the identifying feature. The first is "matching by character" and the other is "matching by string". To illustrate the difference, take a look at Fig. 4 below.

Figure 4: Matching-By-Character vs Matching-By-String

The upper diagram illustrates the process of matching-by-character. By reading in and comparing one character at a time, we can determine whether it matches any of the patterns we are looking for. Since the first character is a '/', which matches the head of neither pattern 1 nor pattern 2, we continue by comparing the next character with the heads of both patterns, without advancing within either pattern.

The bottom diagram illustrates the process of matching-by-string. By reading in one line at a time, we can use the Python function string.find(string) to see whether the line contains the target patterns, looping over the patterns. Again, four comparisons establish that neither pattern matches at position 1 or position 2.

The benefit is not significant in this case, but matching-by-character can lead to a performance boost if the patterns share a common leading sub-string. For example, when matching the patterns "MPI_Init" and "MPI_Finalize", which start with the same 4 characters, we can determine whether the first 4 characters of both patterns match with 4 comparisons instead of 8.

However, it was soon realized that this requires extra effort to identify the common leading sub-string. As long as the built-in string.find(string) stops comparing as soon as it meets a mismatched character, the performance boost would be insignificant, especially when compared with the time spent waiting for user input. Hence matching-by-string was used.
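A minimal sketch of matching-by-string, assuming the source is read line by line and each line is searched with Python's built-in `str.find`, looping over the patterns. The function name `match_by_string` and the pattern list are illustrative assumptions, with the MPI identifiers written with underscores as in the MPI standard.

```python
# Assumed pattern list for illustration.
PATTERNS = ["MPI_Init", "MPI_Finalize"]

def match_by_string(lines, patterns=PATTERNS):
    """Return (line_number, column, pattern) for every occurrence found."""
    hits = []
    for line_no, line in enumerate(lines, start=1):
        for pattern in patterns:
            col = line.find(pattern)  # -1 means the pattern is absent
            while col != -1:
                hits.append((line_no, col, pattern))
                col = line.find(pattern, col + 1)  # look for further hits
    return hits

source = [
    "// demo",
    "  MPI_Init(&argc, &argv);",
    "  MPI_Finalize();",
]
for hit in match_by_string(source):
    print(hit)
```

Each hit is only an identifying feature at this point; whether it stands alone as a word and how far the replacement should extend are decided in later steps.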

2.3.2 Finding Replacement Candidate

First we need to understand the difference between "locating the identifying feature" and "finding the replacement candidate". An identifying feature is a pattern such as "int[space]main" or "int[tab]main" (notice that these are no different from the compiler's perspective). However, we would like to replace much more than the identifying feature itself. For example, we might want to replace "int main(int argc, char** argv)" or simply "int main ()". This is where the syntactic freedom of C becomes a barrier.

Luckily, C depends heavily on separators. In contrast to Python, which is indentation based, and Fortran, which has its own set of strict formatting rules, C allows a whole program to be written on one line with minimal separation. Its extensive use of special characters is both a threat and an opportunity for our module transformation. In this case, it is the solution to the above problem.

Figure 5: Finding replacement candidate

It soon became apparent that the closing parenthesis ")" marks the end of our replacement candidate. To tackle this problem more systematically, we divide the statements that we need to convert in C into 3 categories.

1. Function Title

2. Statement

3. Variable

Function title refers to cases like "int main". These typically end with a closing parenthesis, followed by an opening brace "{". A statement such as "MPI_Init()" can be a function call or a variable assignment. Luckily, in C these usually end with a semicolon ";". Lastly, variables are usually not enclosed in separators and are identifiable on their own. An example is "MPI_COMM_WORLD".
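The three categories can be sketched as a lookup from category to the separator that closes the replacement candidate. The names `TERMINATORS` and `replacement_candidate` are hypothetical, and the sketch simply stops at the first closing separator, so nested parentheses (for example a function-pointer parameter) are not handled.

```python
# Hypothetical mapping from category to the separator ending the candidate.
TERMINATORS = {
    "function_title": ")",  # e.g. int main(int argc, char** argv)
    "statement": ";",       # e.g. MPI_Init(&argc, &argv);
    "variable": None,       # e.g. MPI_COMM_WORLD, identifiable on its own
}

def replacement_candidate(text, start, feature, category):
    """Extend the identifying feature found at `start` to its closing separator.

    Returns the candidate string, or None if the separator is missing.
    """
    end = start + len(feature)
    terminator = TERMINATORS[category]
    if terminator is None:
        return text[start:end]
    stop = text.find(terminator, end)
    if stop == -1:
        return None
    return text[start:stop + 1]

line = "int main(int argc, char** argv) {"
print(replacement_candidate(line, 0, "int main", "function_title"))
```

Keeping the terminators in a table means a further category only needs a new entry rather than a new code path.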

By locating the separator in front of an "Identifying Feature" and the one after, we can now locate with high accuracy the "Replacement Candidate" for our modules. After checking the spaces within and making sure each token is in fact a word by itself, we can pass on the results for the user to verify.
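Checking that a token is "a word by itself" can be sketched as a look at the characters immediately before and after a hit. The helper name `is_whole_word` is an assumption for illustration; it treats letters, digits and underscores as identifier characters, as C does.

```python
def is_whole_word(text, start, end):
    """Return True if text[start:end] is not part of a longer identifier."""
    def is_ident_char(ch):
        return ch.isalnum() or ch == "_"

    before_ok = start == 0 or not is_ident_char(text[start - 1])
    after_ok = end == len(text) or not is_ident_char(text[end])
    return before_ok and after_ok

target = "MPI_COMM_WORLD"
for line in ("f(MPI_COMM_WORLD)", "MY_MPI_COMM_WORLD_COPY"):
    start = line.find(target)
    print(line, is_whole_word(line, start, start + len(target)))
```

A hit that fails this check would be dropped before any prompt is shown to the user.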

Figure 6: Modular approach to various checking method

2.3.3 User Participation

Notice that we have been using the term "Replacement Candidate". This is because we believe that some factors might still be missed even after careful investigation, and we do not wish to alter the program without user consent. Doing so could cause serious problems that are hard to debug.

Hence every replacement we would like to make appears on screen in highlighted form, together with the suggested replacement, so that the user can decide whether it is appropriate.
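One way to picture such a prompt is the sketch below. It assumes a terminal that understands ANSI escape codes; the function name `format_prompt`, the prompt wording and the suggested header `int my_module()` are all hypothetical and are not taken from modMaker.

```python
REVERSE = "\033[7m"  # ANSI reverse video, assumed to be supported
RESET = "\033[0m"

def format_prompt(line, start, end, suggestion):
    """Build the text shown to the user for one replacement candidate."""
    highlighted = line[:start] + REVERSE + line[start:end] + RESET + line[end:]
    return (
        f"{highlighted}\n"
        f"  suggested replacement: {suggestion}\n"
        f"  apply? [y/n] "
    )

# Hypothetical candidate and suggestion, for illustration only.
print(format_prompt("int main() {", 0, 10, "int my_module()"))
```

The source file is left untouched until the user answers, which matches the aim of never altering the program without consent.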

This soon raised another problem: the huge number of prompts generated. While the modMaker script was created using test cases of 2 files and around 40 lines in total, it was soon discovered that hundreds of files and tens of thousands of lines of code are common in real-life scenarios. Hence 3 tactics were employed to combat this problem.

The first is the combination of similar prompts. In our case, MPI_COMM_WORLD is the most common token and hence generates the most prompts. We therefore decided to count its occurrences, combine them, and ask for confirmation once for all of them, as illustrated in Fig. 8.
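Counting and combining identical candidates can be sketched with `collections.Counter`. The function name `combine_prompts` and the prompt wording are assumptions for illustration.

```python
from collections import Counter

def combine_prompts(candidates):
    """Collapse identical replacement candidates into one prompt each.

    The most frequent candidate is listed first.
    """
    counts = Counter(candidates)
    return [
        f"Replace all {n} occurrence(s) of '{text}'? [y/n]"
        for text, n in counts.most_common()
    ]

found = [
    "MPI_COMM_WORLD",
    "MPI_Init(&argc, &argv);",
    "MPI_COMM_WORLD",
    "MPI_COMM_WORLD",
]
for prompt in combine_prompts(found):
    print(prompt)
```

With this grouping, the number of prompts grows with the number of distinct candidates rather than with the size of the code base.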

The second tactic is to use extension matching to prevent going into un-
