Our laboratory's research interests lie in the areas of software engineering and compilers; I relate these two subjects through reverse engineering techniques. For the time being, the main focus of my research is software testing and reverse engineering, using a mixture of compiler and statistical techniques together with highly practical software engineering methods. I am interested in automated test data generation and fault localization. I have developed new methods and algorithms for detecting the domain of a program's inputs rather than the input data themselves. I believe that statistical approaches to fault localization are biased by the test sets, and that this bias could be avoided by applying the domain coverage criterion as a basis for evaluating the data sets. We have designed and developed several components to facilitate test data generation and automated fault localization. My main research areas are as follows.
- Software Engineering: Including software methodologies, software refactoring, dashboard system design, key performance indicators, and graph analysis.
- Software Testing: Including test data generation, domain testing, fuzz testing, and performance testing.
- Software Debugging: Including automatic fault localization, defect prediction, and automatic software repair.
For a brief description of my research activities, please refer to our research laboratory. If you are interested in pure research in software testing, especially at the Ph.D. level, you may contact me through my email, email@example.com.
Saeed Parsa, Associate professor at Iran University of Science and Technology.
Refactoring is a maintenance activity intended to restructure code to improve quality attributes such as readability and usability. The term first appeared in Opdyke's Ph.D. thesis, which addressed restructuring at the code level. Refactoring is defined as "a change made to the internal structure of software to make it easier to understand and cheaper to modify without changing its observable behavior". It is an important activity in test-driven development (TDD) and Agile software development.
Kent Beck coined the term "code smell" in the context of identifying quality issues in code that can be refactored to improve the maintainability of software. He emphasized that the presence of an excessive number of smells in a software system makes the software hard to maintain and evolve. Code smells are not technically erroneous, but their presence points toward weaknesses in design that could lead to system malfunctions and a heightened risk of bugs in the near future.
Recently, we encountered several projects in software testing and quality assurance that contained messy code, design, architecture, and documentation. We had problems applying automatic software testing tools to these repositories, so we decided to provide tools that detect smells in software and perform automatic refactoring. Our aim is to improve the testability of the software by applying these tools before testing. For the time being, we have focused on automatic refactoring techniques for C++ applications.
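As a toy illustration of what a smell detector does, the sketch below checks for the "Long Method" smell. It uses Python's ast module for brevity (our actual tools target C++, which would need a real parser front end), and the 15-line threshold is an illustrative assumption, not a standard value:

```python
import ast

LONG_METHOD_THRESHOLD = 15  # illustrative assumption, not a standard value

def find_long_functions(source: str, threshold: int = LONG_METHOD_THRESHOLD):
    """Return (name, length) for every function whose source spans more
    lines than `threshold` -- a minimal "Long Method" smell check."""
    tree = ast.parse(source)
    smells = []
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            length = node.end_lineno - node.lineno + 1
            if length > threshold:
                smells.append((node.name, length))
    return smells
```

A refactoring tool would follow such a report with an Extract Method transformation on the flagged functions.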
Business dashboard design and implementation is a neglected area of software engineering. In our laboratory, we are trying to provide methodologies for this area of research and to establish it as a new field within software engineering. Some research areas in this scope are:
- Dashboard system design
- Process mining
- Graph analysis to model various aspects of the system based on its data.
A missing part in the field of system analysis and design is the treatment of management requirements and of system goals and objectives. There has not been much emphasis on this area in system analysis and design. In this respect, we are trying to use techniques applied in management, such as strategic planning and roadmaps, to improve goal modeling techniques. Performance evaluation and key performance indicators can be naturally extracted from the goal models. Business intelligence also provides a suitable basis for analyzing management requirements to evaluate the performance of a system.
Statement coverage, branch coverage, and, more importantly, path coverage are the three criteria commonly applied for evaluating an input data set. We have formally introduced a new criterion called domain coverage. Domain coverage has previously been referred to implicitly as the input space partitioning (ISP) technique, but it has not been used as a criterion for evaluating test data. For the time being, we are working on polyhedral algebra and the techniques applied for solving systems of inequalities to find the input subspace satisfying a given path constraint.
Test data adequacy is a major challenge in the software testing literature. The difficulty is to provide sufficient test data to assure the correctness of the program under test. In particular, in the case of latent faults, a fault does not reveal itself unless specific combinations of input values are used to run the program. In this respect, detecting the subdomains of the input domain that cover a specific execution path seems promising. A subdomain covers an execution path provided that every test datum taken from the subdomain satisfies the path constraint. Dynamic Domain Reduction (DDR) is a well-known test data generation procedure targeted at detecting the subdomains of the input domain that satisfy a given path constraint.
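To make the idea concrete, the following is a highly simplified, hypothetical sketch of DDR-style reduction for the single constraint x < y over integer interval domains; the published DDR procedure handles full path constraints, midpoint search, and backtracking, none of which is shown here:

```python
def reduce_domains(x_dom, y_dom):
    """Shrink the interval domains (lo, hi) of x and y so that EVERY
    pair drawn from the reduced subdomains satisfies x < y.
    A toy, single-constraint stand-in for Dynamic Domain Reduction."""
    x_lo, x_hi = x_dom
    y_lo, y_hi = y_dom
    if x_hi < y_lo:
        return x_dom, y_dom  # constraint already holds everywhere
    # split the overlapping region at its midpoint: x keeps the lower
    # half, y keeps everything strictly above the split point
    mid = (max(x_lo, y_lo) + min(x_hi, y_hi)) // 2
    return (x_lo, min(x_hi, mid)), (max(y_lo, mid + 1), y_hi)
```

Any test datum sampled from the returned subdomains is then guaranteed to drive execution down the path guarded by x < y, which is exactly the property the domain coverage criterion evaluates.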
Fuzz testing, or fuzzing, is a well-known and practical software testing technique. By repeatedly generating malformed test data and injecting it into the software under test (SUT), we look for possible errors and vulnerabilities. To this end, fuzz testing requires a wide variety of test data. Fuzzing is the art of automatic bug finding; its role is to find software implementation faults and to identify them where possible. Fuzz testing was developed at the University of Wisconsin–Madison in 1989 by Professor Barton Miller and his students. A fuzzer is a program that automatically injects semi-random data into a program/stack and detects bugs.
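A minimal mutation-based fuzzer can be sketched in a few lines. The target function below is a hypothetical stand-in for the SUT; real fuzzers add coverage feedback, corpus management, and richer mutation operators:

```python
import random

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Simplest possible mutation operator: overwrite one random byte."""
    if not data:
        return bytes([rng.randrange(256)])
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)

def fuzz(target, seed: bytes, trials: int = 2000, rng=None):
    """Run `target` on mutated inputs, collecting every input that
    crashes it (raises an exception)."""
    rng = rng or random.Random(0)
    crashes = []
    for _ in range(trials):
        data = mutate(seed, rng)
        try:
            target(data)
        except Exception:
            crashes.append(data)
    return crashes

def buggy_parser(data: bytes) -> None:
    # Hypothetical SUT: rejects any input not starting with 'h'.
    if not data or data[0] != ord("h"):
        raise ValueError("malformed header")
```

Running `fuzz(buggy_parser, b"hello")` yields a list of crashing inputs, each a concrete witness for the fault; triaging and deduplicating those witnesses is where the real work begins.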
Performance testing is mainly concerned with stress and load testing of websites. The main difficulty is interpreting and analyzing the reports provided by the listeners. This analysis requires broad knowledge of web servers, communication networks, database systems, and the impact of operating system services and programming languages on the performance of websites. In this respect, we are trying to build a recommender system that proposes solutions to the inefficiencies observed in the listeners' reports. Performance testing is also a significant concern in systems modeling and evaluation; as a practical laboratory, we are also trying to give real meaning to the abstract models built in the system modeling field.
To eliminate a bug, programmers employ all available means to identify the location of the bug and figure out its cause. This process is referred to as software fault localization, and it is one of the most expensive activities in debugging. Due to the intricacy and inaccuracy of manual fault localization, an enormous amount of research has been carried out to develop automated techniques and tools that assist developers in finding bugs. In our laboratory, we are working on improving these techniques by applying various methods such as statistical models, machine learning, and information theory.
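One widely used statistical family is spectrum-based fault localization, which ranks statements by how strongly their coverage correlates with failing tests. Below is a minimal sketch using the Ochiai similarity metric; the coverage and result data are hypothetical:

```python
import math

def ochiai_ranking(coverage, results):
    """Rank statements by Ochiai suspiciousness.
    coverage: {stmt: set of test ids covering that statement}
    results:  {test id: True if the test passed}"""
    failed = {t for t, passed in results.items() if not passed}
    total_failed = len(failed)
    scores = {}
    for stmt, tests in coverage.items():
        ef = len(tests & failed)   # failing tests that cover stmt
        ep = len(tests) - ef       # passing tests that cover stmt
        denom = math.sqrt(total_failed * (ef + ep))
        scores[stmt] = ef / denom if denom else 0.0
    return sorted(scores.items(), key=lambda kv: -kv[1])
```

Statements executed mostly by failing tests float to the top of the ranking, which is where a developer (or an automatic repair tool) starts looking. Note the test-set bias mentioned above: the ranking is only as good as the test suite that produced the spectra.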
Defect prediction methods can be used to identify fault-prone software modules through software metrics, so that testing activities can be focused on them. Aiming to alleviate the present challenges in this context, we are focusing on metaheuristic approaches and classification methods.
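As a minimal illustration of metric-based classification, the toy predictor below uses nearest-centroid classification over hypothetical two-dimensional metric vectors (say, lines of code and cyclomatic complexity); real studies use richer metric suites and stronger learners:

```python
def train_centroids(samples):
    """Compute one centroid per class from (metric_vector, is_defective)
    pairs -- a toy stand-in for a real defect prediction model."""
    sums, counts = {}, {}
    for vec, label in samples:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {lbl: [s / counts[lbl] for s in acc] for lbl, acc in sums.items()}

def predict(centroids, vec):
    """Label a new module by its nearest class centroid."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda lbl: sq_dist(centroids[lbl], vec))
```

Modules predicted defective would then receive the bulk of the testing effort; metaheuristic search can tune which metrics and thresholds feed such a classifier.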
Automatic software repair consists of automatically finding a solution to software bugs without human intervention. The key idea of these techniques is to repair software systems automatically by producing an actual fix that can be validated by testers before it is finally accepted, or that can be adapted to fit the system properly.
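The dominant generate-and-validate scheme can be sketched as a loop over candidate patches that stops at the first one passing the whole test suite. The candidate list here is a hypothetical stand-in; real repair tools derive candidates from mutation operators or fix templates:

```python
def attempt_repair(candidate_sources, tests):
    """Generate-and-validate repair sketch: try each candidate patch
    (alternative source texts defining a function `f`) and return the
    first one that passes every (input, expected_output) test."""
    for src in candidate_sources:
        namespace = {}
        exec(src, namespace)              # build the patched function
        candidate = namespace["f"]
        if all(candidate(arg) == expected for arg, expected in tests):
            return src                    # validated fix
    return None                           # no candidate survived the suite
```

Because validation relies entirely on the test suite, a returned patch is only plausibly correct; this is why the produced fix is reviewed by testers before acceptance, as noted above.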
Cyber-physical systems are at the heart of bringing intelligence and automation to the real world. These systems consist of both hardware and software, and quality assurance for them is a challenging task. Especially in the case of software, adapting the existing testing and debugging methods for traditional software to the new cyber-physical and embedded software has not been well studied.
We have focused on new concepts and criteria for testing such systems, such as domain analysis, while exploiting commodity problem-solving tools including artificial intelligence and compilers. Most cyber-physical systems are safety-critical (for example, autonomous driving) and should be dependable and secure; hence, providing tools for automatically finding faults in such systems is important. We aim to present novel and efficient algorithms that can be used in practice.
More about research
Previous Research Topics
- Compiler Optimization & Programming Languages
- Reverse Engineering & Refactoring
- Grid and Distributed Computing
- Formal Methods
- Algorithms (Heuristic & Deterministic)
Previous Research Plans
- Automatic Distribution of Sequential Code, 2008.
- Automatic Parallelization of Sequential Code, 2007.
Supervision of Ph.D. Dissertation
- Genetic Approach to Automatic Parallelization of Nested Loops, 2008.
- Synetic Approach to Automatic Parallelization of Nested Loops, 2008.
Supervision of M.Sc. Theses
- Task graph scheduling
- Reverse engineering
- Compiler Optimizations
- Workflow engines and web services
- Grid computing
Theories, Techniques, and Tools Used in Our Research
- Statistics and Applied Mathematics
- Machine Learning
- Neural Networks
- Deep Learning
- Information Theory
- Game Theory
- Language Models
- Graph Theory
- Data Mining