(» hide abstracts)
[40] Tim Furche, Giovanni Grasso, Giorgio Orsi, Christian Schallhart, and Cheng Wang. Automatically Learning Gazetteers from the Deep Web. In International Word Wide Web Conference (WWW'12), 2012. accepted for publication.
[ » bib | » pdf ]
Wrapper induction faces a dilemma: To reach web scale, it requires automatically generated examples, but to produce accurate results, these examples must have the quality of human annotations. We resolve this conflict with AMBER, a system for fully automated data extraction from result pages. In contrast to previous approaches, AMBER employs domain specific gazetteers to discern basic domain attributes on a page, and leverages repeated occurrences of similar attributes to group related attributes into records rather than relying on the noisy structure of the DOM. With this approach AMBER is able to identify records and their attributes with almost perfect accuracy (> 98%) on a large sample of websites. To make such an approach feasible at scale, AMBER automatically learns domain gazetteers from a small seed set. In this demonstration, we show how AMBER uses the repeated structure of records on deep web result pages to learn such gazetteers. This is only possible with a highly accurate extraction system. Depending on its parametrization, this learning process runs either fully automatically or with human interaction. We show how AMBER bootstraps a gazetteer for UK locations in 4 iterations: From a small seed sample we achieve 94.4 locations in the 4th iteration.

[39] Xiaonan Guo, Jochen Kranzdorf, Tim Furche, Giovanni Grasso, Giorgio Orsi, and Christian Schallhart. OPAL: A Passe-partout for Web Forms. In International Word Wide Web Conference (WWW'12), 2012. accepted for publication.
[ » bib | » pdf ]
Web forms are the interfaces of the deep web. Though modern web browsers provide facilities to assist in form filling, this assistance is limited to prior form fillings or keyword matching.

Automatic form understanding enables a broad range of applications, including crawlers, meta-search engines, and usability and accessibility support for enhanced web browsing. In this demonstration, we use a novel form understanding approach, OPAL, to assist in form filling even for complex, previously unknown forms. OPAL associates form labels to fields by analyzing structural properties in the HTML encoding and visual features of the page rendering. OPAL interprets this labeling and classifies the fields according to a given domain ontology. The combination of these two properties, allows OPAL to deal effectively with many forms outside of the grasp of existing form filling techniques. In the UK real estate domain, OPAL achieves > 99 form understanding.

[38] Jochen Kranzdorf, Andrew Sellers, Giovanni Grasso, Christian Schallhart, and Tim Furche. Visual OXPath: Robust Wrapping by Example. In International Word Wide Web Conference (WWW'12), 2012. accepted for publication.
[ » bib | » pdf ]
Good examples are hard to find, particularly in wrapper induction: Picking even one wrong example can spell disaster by yielding overgeneralized or overspecialized wrappers. Such wrappers extract data with low precision or recall, unless adjusted by human experts at significant cost.

Visual OXPath is an open-source, visual wrapper induction system that requires minimal examples and eases wrapper refinement: Often it derives the intended wrapper from a single example through sophisticated heuristics that determine the best set of similar examples. To ease wrapper refinement, it offers a list of wrappers ranked by example similarity and robustness. Visual OXPath offers extensive visual feedback for this refinement which can be performed without any knowledge of the underlying wrapper language. Where further refinement by a human wrapper is needed, Visual OXPath profits from being based on OXPath, a declarative wrapper language that extends XPath with a thin layer of features necessary for extraction and page navigation.

[37] Tim Furche, Georg Gottlob, Giovanni Grasso, Xiaonan Guo, Giorgio Orsi, and Christian Schallhart. Forms form Patterns: Reusable Form Understanding. In International Word Wide Web Conference (WWW'12), 2012. accepted for publication.
[ » bib | » pdf ]
Forms are our gates to the web. They enable us to access the deep content of web sites. Automatic form understanding unlocks this content for applications ranging from crawlers to meta-search engines and is essential for improving usability and accessibility of the web. Form understanding has received surprisingly little attention other than as component in specific applications such as crawlers. No comprehensive approach to form understanding exists and previous works disagree even in the definition of the problem.

In this paper, we present OPAL, the first comprehensive approach to form understanding. We identify form labeling and form interpretation as the two main tasks involved in form understanding. On both problems OPAL pushes the state of the art: For form labeling, it combines signals from the text, structure, and visual rendering of a web page, yielding robust characterisations of common design patterns. In extensive experiments on the ICQ and TEL-8 benchmarks and a set of 200 modern web forms OPAL outperforms previous approaches by a significant margin. For form interpretation, we introduce a template language to describe frequent form patterns. These two parts of OPAL combined yield form understanding with near perfect accuracy (> 98%).

[36] Andreas Holzer, Daniel Kroening, Christian Schallhart, Michael Tautschnig, and Helmut Veith. Proving Reachability using FShell (Competition Contribution). In International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS), 2012. accepted for publication.
[ » bib | » pdf ]
FShell is an automated white-box test-input generator for C programs, computing test data with respect to user-specified code coverage criteria. The pillars of FShell are the declarative specification language FQL (FShell Query Language), an efficient back end for com- puting test data, and a mathematical framework to reason about cover- age criteria. To solve the reachability problem posed in SV-COMP we specify coverage of ERROR labels. As back end, FShell uses bounded model checking, building upon components of CBMC and leveraging the power of SAT solvers for efficient enumeration of a full test suite.

[35] Tim Furche, Georg Gottlob, Giovanni Grasso, Giorgio Orsi, Christian Schallhart, and Cheng Wang. Little Knowledge Rules The Web: Domain-Centric Result Page Extraction. In International Conference on Web Reasoning and Rule Systems (RR'11), pages 61-76, 2011.
[ » bib | » pdf ]
Web extraction is the task of turning unstructured HTML into structured data. Previous approaches rely exclusively on detecting repeated structures in result pages. These approaches trade intensive user interaction for precision.

In this paper, we introduce the Amber (“Adaptable Model-based Extraction of Result Pages”) system that replaces the human interaction with a domain ontology applicable to all sites of a domain. It models domain knowledge about (1) records and attributes of the domain, (2) low-level (textual) representations of these concepts, and (3) constraints linking representations to records and attributes. Parametrized with these constraints, otherwise domain-independent heuristics exploit the repeated structure of result pages to derive attributes and records. Amber is implemented in logical rules to allow an explicit formulation of the heuristics and easy adaptation to different domains.

We apply Amber to the UK real estate domain where we achieve near perfect accuracy on a representative sample of 50 agency websites.

[34] Tim Furche, Georg Gottlob, Xiaonan Guo, Christian Schallhart, Andrew Sellers, and Cheng Wang. How the Minotaur turned into Ariadne: Ontologies in Web Data Extraction. In International Conference on Web Engineering (ICWE'11), pages 13-27, 2011. invited.
[ » bib | » pdf ]
Humans require automated support to profit from the wealth of data nowadays available on the web. To that end, the linked open data initiative and others have been asking data providers to publish structured, semantically annotated data. Small data providers, such as most UK real-estate agencies, however, are overburdened with this task-often just starting to move from simple, table- or list-like directories to web applications with rich interfaces.

We argue that fully automated extraction of structured data can help resolve this dilemma. Ironically, automated data extraction has seen a recent revival thanks to ontologies and linked open data to guide data extraction. First results from the DIADEM project illustrate that high quality, fully automated data extraction at a web scale is possible, if we combine domain ontologies with a phenomenology describing the representation of domain concepts. We briefly summarise the DIADEM project and discuss a few preliminary results.

[33] Tim Furche, Georg Gottlob, Giovanni Grasso, Christian Schallhart, and Andrew Sellers. OXPath: A Language for Scalable, Memory-efficient Data Extraction from Web Applications. Proceedings of the VLDB Endowment/International Conference on Very Large Databases (VLDB'11), 4(11):1016-1027, 2011.
[ » bib | » pdf ]
The evolution of the web has outpaced itself: The growing wealth of information and the increasing sophistication of interfaces necessitate automated processing. Web automation and extraction technologies have been overwhelmed by this very growth.

To address this trend, we identify four key requirements of web extraction: (1) Interact with sophisticated web application interfaces, (2) Precisely capture the relevant data for most web extraction tasks, (3) Scale with the number of visited pages, and (4) Readily embed into existing web technologies.

We introduce OXPath, an extension of XPath for interacting with web applications and for extracting information thus revealed. It addresses all the above requirements. OXPath’s page-at-a-time evaluation guarantees memory use independent of the number of visited pages, yet remains polynomial in time. We validate experimentally the theoretical complexity and demonstrate that its evaluation is dominated by the page rendering of the underlying browser.

Our experiments show that OXPath outperforms existing commercial and academic data extraction tools by a wide margin. OXPath is available under an open source license.

[32] Andrew Sellers, Tim Furche, Georg Gottlob, Giovanni Grasso, and Christian Schallhart. Exploring the Web with OXPath. In International Workshop on Linked Web Data Management (LWDM'11), 2011.
[ » bib | » pdf ]
OXPath is a careful extension of XPath that facilitates data extraction from the deep web. It is designed to facilitate the large-scale extraction of data from sophisticated modern web interfaces with client-side scripting and asynchronous server communication. Its main characteristics are (1) a minimal extension of XPath to allow page navigation and action execution, (2) a set-theoretic formal semantics for full OXPath, (3) and a sophisticated memory management that minimizespagebuffering.Inthisposter,webrieflyreview the main features of the language and discuss ongoing and future work.

[31] Tim Furche, Georg Gottlob, Giovanni Grasso, Xiaonan Guo, Giorgio Orsi, and Christian Schallhart. Real Understanding for Real Estate Forms. In International Conference on Web Intelligence, Mining and Semantics (WIMS'11), 2011.
[ » bib | » pdf ]
Finding an apartment is a lengthy and tedious process. Once decided, one can never be sure not to have missed an even better offer which would have been just one click away. Form understanding is key to automatically access and process all the relevant—and nowadays readily available—data.

We introduce opal (ontology-based web pattern analysis with logic), a novel, purely logical approach to web form un- derstanding: opal labels, structures, and groups form fields according to a domain-specific ontology linked through phe- nomenological rules to a logical representation of a DOM. The phenomenological rules describe how ontological con- cepts appear on the web; the ontology formalizes and struc- tures common patterns of web pages observed in a domain. A unique feature of opal is that all domain-independent as- sumptions about web forms are represented in rules, whereas domain-specific assumptions are represented in the ontology. This yields a coherent logical framework, robust in face of changing web trends.

We apply opal to a significant, randomly selected sample of UK real estate sites, showing that straightforward rules suffice to achieve high precision form understanding.

[30] Andrew Sellers, Tim Furche, Georg Gottlob, Giovanni Grasso, and Christian Schallhart. OXPath: Little Language, Little Memory, Great Value. In International Word Wide Web Conference (WWW'11), pages 261-264, 2011.
[ » bib | » pdf ]
Data about everything is readily available on the web—but often only accessible through elaborate user interactions. For automated decision support, extracting that data is es- sential, but infeasible with existing heavy-weight data ex- traction systems. In this demonstration, we present OX- Path, a novel approach to web extraction, with a system that supports informed job selection and integrates informa- tion from several different web sites. By carefully extend- ing XPath, OXPath exploits its familiarity and provides a light-weight interface, which is easy to use and embed. We highlight how OXPath guarantees optimal page buffering, storing only a constant number of pages for non-recursive queries.

[29] Andreas Holzer, Visar Januzaj, Stefan Kugele, Boris Langer, Christian Schallhart, Michael Tautschnig, and Helmut Veith. Seamless Testing for Models and Code. In Fundamental Approaches to Software Engineering (FASE'11), Lecture Notes in Computer Science (LNCS), pages 278-293, 2011.
[ » bib | » pdf ]
This paper describes an approach to model-based testing where a test suite is generated from a model and automatically con- cretized to drive an implementation. Motivated by an industrial project involving DO-178B compliant avionics software, where the models are UML activity diagrams and the implementation is ANSI C, we devel- oped a seamless testing environment based on our test specification lan- guage FQL. We demonstrate how to apply FQL to activity diagrams in such a way that FQL test specifications easily translate from UML to C code. Our approach does not require any additional glue or auxil- iary code but is fully automatic except for straightforward source code annotations that link source and model. In this way, we can check for modeled but unimplemented behavior and vice versa, and we can also evaluate the degree of abstraction between model and implementation.

[28] Andrew Sellers, Tim Furche, Georg Gottlob, Giovanni Grasso, and Christian Schallhart. Taking the OXPath down the Deep Web. In International Conference on Extending Database Technology (EDBT'11), pages 542-545, 2011.
[ » bib | » pdf ]
Although deep web analysis has been studied extensively, there is no succinct formalism to describe user interactions with AJAX-enabled web applications.

Toward this end, we introduce OXPath as a superset of XPath 1.0. Beyond XPath, OXPath is able (1) to fill web forms and trigger DOM events, (2) to access dynamically computed CSS attributes, (3) to navigate between visible form fields, and (4) to mark relevant information for extrac- tion. This way, OXPath expressions can closely simulate the human interaction relevant for navigation rather than rely exclusively on the HTML structure. Thus, they are quite resilient against technical changes.

We demonstrate the expressiveness and practical efficacy of OXPath to tackle a group flight planning problem. We use the OXPath implementation and visual interface to access the popular, highly-scripted travel site Kayak. We show, both visually and manually, how to formulate OXPath ex- pressions to extract all booking information with just a few lines of code.

[27] Andreas Holzer, Michael Tautschnig, Christian Schallhart, and Helmut Veith. An Introduction to Test Specification in FQL. In Haifa Verfication Confernce (HVC'10), volume 6504 of Lecture Notes in Computer Science (LNCS), pages 9-22, 2010. invited.
[ » bib | » pdf ]
In a recent series of papers, we introduced a new framework for white-box testing which aims at a separation of concerns between test specications and test generation engines. We believe that establishing a common language for test criteria will have similar benets to testing as temporal logic had to model checking and SQL had to databases. The main challenge was to nd a specication language which is expressive, simple, and precise. This paper gives an introduction to the test specication language FQL and its tool environment.

[26] Visat Januzaj, Stefan Kugele, Boris Langer, Christian Schallhart, and Helmut Veith. New Challenges in the Development of Critical Embedded Systems-An “aeromotive” Perspective. In International Symposium on Leveraging Applications (ISoLA'10), volume 6415 of Lecture Notes in Computer Science (LNCS), pages 1-2, 2010.
[ » bib ]
[25] Andreas Holzer, Michael Tautschnig, Christian Schallhart, and Helmut Veith. How did you specify your test suite ? In Automated Software Engineering (ASE'10), pages 407-416, 2010.
[ » bib | » pdf ]
Although testing is central to debugging and software certification, there is no adequate language to specify test suites over source code. Such a language should be simple and concise in daily use, feature a precise semantics, and of course, it has to facilitate suitable engines to compute test suites and assess the coverage achieved by a test suite.

This paper introduces the language FQL designed to fit these purposes. We achieve the necessary expressive power by a natural extension of regular expressions which matches test suites rather than individual executions. To evaluate the language, we show for a list of informal requirements how to express them in FQL. Moreover, we present a test case generation engine for C programs and perform practical experiments with the sample specifications.

[24] Andreas Bauer, Martin Leucker, and Christian Schallhart. Runtime Verification for LTL and TLTL. ACM Transactions on Software Engineering and Methodology (TOSEM), 20(4), 2010.
[ » bib | » pdf ]
This paper studies runtime verification of properties expressed either in lineartime temporal logic (LTL) or timed lineartime temporal logic (TLTL). It classifies runtime verification in identifying its distinguishing features to model checking and testing, respectively. It introduces a three-valued semantics (with truth values true, false, inconclusive) as an adequate interpretation as to whether a partial observation of a running system meets an LTL or TLTL property.

For LTL, a conceptually simple monitor generation procedure is given, which is optimal in two respects: First, the size of the generated deterministic monitor is minimal, and, second, the monitor identifies a continuously monitored trace as either satisfying or falsifying a property as early as possible. The feasibility of the developed methodology is demontrated using a collection of real-world temporal logic specifications. Moreover, the presented approach is related to the properties monitorable in general and is compared to existing concepts in the literature. It is shown that the set of monitorable properties does not only encompass the safety and co-safety properties but is strictly larger.

For TLTL, the same road map is followed by first defining a three-valued semantics. The corresponding construction of a timed monitor is more involved, yet, as shown, possible.

[23] Somesh Jha, Stefan Katzenbeisser, Christian Schallhart, Helmut Veith, and Stephen Chenney. Semantic Integrity in Large-Scale Online Simulations. ACM Transactions on Internet Technology (TOIT), 10(1), 2010.
[ » bib | » pdf ]
As large-scale online simulations such as Second Life and World of Warcraft are gaining economic significance, there is a growing incentive for attacks against such simulation software. We focus on attacks against the semantic integrity of the simulation. This class of attacks exploits the client-server architecture and is specific to online simulations which, for performance reasons, have to delegate the detailed rendering of the simulated world to the clients. Attacks against semantic integrity often compromise the physical laws of the simulated world—enabling the user’s simulation persona to fly, walk through walls, or to run faster than anybody else.

We introduce the Secure Semantic Integrity Protocol (SSIP) which enables the simulation provider to audit the client computations. Then we analyze the security and scalability of SSIP. First, we show that under standard cryptographic assumptions SSIP will detect semantic integrity attacks. Second, we analyze the network overhead, and determine the optimum tradeoff between cost of bandwidth and audit frequency for our protocol.

[22] Andreas Bauer, Martin Leucker, and Christian Schallhart. Comparing LTL Semantics for Runtime Verification. Journal of Logic and Computation (JLC), 20(3):651-674, 2010.
[ » bib | » pdf ]
When monitoring a system w.r.t. a property defined in a temporal logic such as LTL, a major concern is to settle with an adequate interpretation of observable system events; that is, models of temporal logic formulae are usually infinite words of events, whereas at runtime only finite but incrementally expanding prefixes are available.

In this work, we review LTL-derived logics for finite traces from a runtime-verification perspective. In doing so, we establish four maxims to be satisfied by any LTL-derived logic aimed at runtime verification. As no pre-existing logic readily satisfies all of them, we introduce a new four-valued logic RuntimeVerification Linear Temporal Logic RV-LTL in accordance to these maxims. The semantics of Runtime Verification Linear Temporal Logic (RV-LTL) indicates whether a finite word describes a system behaviour which either (i) satisfies the monitored property, (ii) violates the property, (iii) will presumably violate the property, or (iv) will presumably conform to the property in the future, once the system has stabilized. Notably, (i) and (ii) correspond to the classical semantics of LTL, whereas (iii) and (iv) are chosen whenever an observed system behaviour has not yet lead to a violation or acceptance of the monitored property.

Moreover, we present a monitor construction for RV-LTL properties in terms of Moore machines signalizing the semantics of the so far obtained execution trace w.r.t. the monitored property.

[21] Andreas Bauer, Martin Leucker, Christian Schallhart, and Michael Tautschnig. Don't care in SMT-Building flexible yet efficient abstraction/refinement solvers. Software Tools for Technology Transfer (STTT), 12(1):23-37, 2010.
[ » bib | » pdf ]
This paper describes a method for combining “off-the-shelf” SAT and constraint solvers for building an efficient Satisfiability Modulo Theories (SMT) solver for a wide range of theories. Our method follows the abstraction/refinement approach to simplify the implementation of custom SMT solvers. The expected performance penalty by not using an interweaved combination of SAT and theory solvers is reduced by generalising a Boolean solution of an SMT problem first via assigning don’t care to as many variables as possible. We then use the generalised solution to determine a thereby smaller constraint set to be handed over to the constraint solver for a background theory.We showthat for many benchmarks and real-world problems, this optimisation results in considerably smaller and less complex constraint problems. The presented approach is particularly useful for assembling a practically viable SMT solver quickly, when neither a suitable SMT solver nor a corresponding incremental theory solver is available.We have implemented our approach in the ABsolver framework and applied the resulting solver successfully to an industrial case-study: the verification problems arising in verifying an electronic car steering control system impose non-linear arithmetic constraints, which do not fall into the domain of any other available solver.

[20] Johannes Kinder, Stefan Katzenbeisser, Christian Schallhart, and Helmut Veith. Proactive Detection of Computer Worms Using Model Checking. IEEE Transactions on Dependable and Secure Computing (TDSC), 4(7):424-438, 2010.
[ » bib | » pdf ]
Although recent estimates are speaking of 200,000 different viruses, worms, and Trojan horses, the majority of them are variants of previously existing malware. As these variants mostly differ in their binary representation rather than their functionality, they can be recognized by analyzing the program behavior, even though they are not covered by the signature databases of current antivirus tools. Proactive malware detectors mitigate this risk by detection procedures which use a single signature to detect whole classes of functionally related malware without signature updates. It is evident that the quality of proactive detection procedures depends on their ability to analyze the semantics of the binary.

In this paper, we propose the use of model checking—a well established software verification technique—for proactive malware detection. We describe a tool which extracts an annotated control flow graph from the binary and automatically verifies it against a formal malware specification. To this end, we introduce the new specification language CTPL, which balances the high expressive power needed for malware signatures with efficient model checking algorithms. Our experiments demonstrate that our technique indeed is able to recognize variants of existing malware with a low risk of false positives.

[19] Martin Leucker and Christian Schallhart. A Brief Account of Runtime Verification. Journal of Logic and Algebraic Programming (JLAP), 78(5):293-303, 2009.
[ » bib | » pdf ]
In this paper, a brief account of the field of runtime verification is given. Starting with a definition of runtime verification, a comparison to well-known verification techniques like model checking and testing is provided, and applications in which runtime verification brings out its distinguishing features are pointed out. Moreover, extensions of runtime verification such as monitor-oriented programming, and monitor-based runtime reflection are sketched and their similarities and differences are discussed. Finally, the use of runtime verification for contract enforcement is briefly pointed out.

[18] Andreas Holzer, Michael Tautschnig, Christian Schallhart, and Helmut Veith. Query-Dirven Program Testing. In Verification, Model Checking, and Abstract Interpretation (VMCAI'09), volume 5403 of Lecture Notes in Computer Science (LNCS), pages 151-166, 2009.
[ » bib | » pdf ]
We present a new approach to program testing which enables the programmer to specify test suites in terms of a versatile query language. Our query language subsumes standard coverage criteria ranging from simple basic block coverage all the way to predicate complete coverage and multiple condition coverage, but also facilitates on-the-fly requests for test suites specific to the code structure, to external requirements, or to ad hoc needs arising in program understanding/ exploration. The query language is supported by a model checking backend which employs the CBMC framework. Our main algorithmic contribution is a method called iterative constraint strengthening which enables us to solve a query for an arbitrary coverage criterion by a single call to the model checker and a novel form of incremental SAT solving: Whenever the SAT solver finds a solution, our algorithm compares this solution against the coverage criterion, and strengthens the clause database with additional clauses which exclude redundant new solutions. We demonstrate the scalability of our approach and its ability to compute compact test suites with experiments involving device drivers, automotive controllers, and open source projects.

[17] Wei Dong, Martin Leucker, and Christian Schallhart. Impartial Anticipation in Runtime-Verification. In Automated Technology for Verification and Analysis (ATVA'08), volume 5311 of Lecture Notes in Computer Science (LNCS), 2008. 386-396.
[ » bib | » pdf ]
In this paper, a uniform approach for synthesizing monitors checking correctness properties specified in linear-time logics at runtime is provided. Therefore, a generic three-valued semantics is introduced reflecting the idea that prefixes of infinite computations are checked. Then a conceptual framework to synthesize monitors from a logical specification to check an execution incrementally is established, with special focus on resorting to the automata-theoretic approach. The merits of the presented framework are shown by providing monitor synthesis approaches for a variety of different logics such as LTL, the linear-time μ-calculus, PLTLmod, S1S, and RLTL.

[16] Andreas Holzer, Michael Tautschnig, Christian Schallhart, and Helmut Veith. FSHELL: Systematic Test Case Generation for Dynamic Analysis and Measurement. In Computer Aided Verification (CAV'08), volume 5123 of Lecture Notes in Computer Science (LNCS), pages 209-213, 2008.
[ » bib | » pdf ]
Although the principal analogy between counterexample generation and white box testing has been repeatedly addressed, the usage patterns and performance requirements for software testing are quite different from formal verification. Our tool FSHELL provides a versatile testing environment for C programs which supports both interactive explorative use and a rich scripting language. More than a frontend for software model checkers, FSHELL is designed as a database engine which dispatches queries about the program to program analysis tools. We report on the integration of CBMC into FSHELL and describe architectural modifications which support efficient test case generation.Although the principal analogy between counterexample generation and white box testing has been repeatedly addressed, the usage patterns and performance requirements for software testing are quite different from formal verification. Our tool FSHELL provides a versatile testing environment for C programs which supports both interactive explorative use and a rich scripting language. More than a frontend for software model checkers, FSHELL is designed as a database engine which dispatches queries about the program to program analysis tools. We report on the integration of CBMC into FSHELL and describe architectural modifications which support efficient test case generation.

[15] Andreas Bauer, Martin Leucker, Christian Schallhart, and Michael Tautschnig. Don't care in SMT-Building flexible yet efficient abstraction/refinement solvers. In Workshop On Leveraging Applications of Formal Methods, Verification and Validation (ISoLA'07), pages 135-146, 2007.
[ » bib | » pdf ]
This paper describes a method for combining “off-the-shelf” SAT and constraint solvers for building an efficient Satisfiability Modulo Theories (SMT) solver for a wide range of theories. Our method follows the abstraction/refinement approach to simplify the implementation of customSMT solvers. The expected performance penalty by not using an interweaved combination of SAT and theory solvers is reduced by generalising a Boolean solution of an SMT problem first via assigning don’t care to as many variables as possible. We then use the generalised solution to determine a thereby smaller constraint set to be handed over to the constraint solver for a background theory. We show that for many benchmarks and real-world problems, this optimisation results in considerably smaller and less complex constraint problems.

The presented approach is particularly useful for assembling a practically viable SMT solver quickly, when neither a suitable SMT solver nor a corresponding incremental theory solver is available. We have implemented our approach in the ABSOLVER framework and applied the resulting solver successfully to an industrial case-study: The verification problems arising in verifying an electronic car steering control system impose non-linear arithmetic constraints, which do not fall into the domain of any other available solver.

[14] Andreas Bauer, Martin Leucker, and Christian Schallhart. The good, the bad, and the ugly, but how ugly is ugly? In Workshop on Runtime Verification (RV'07), volume 4839 of Lecture Notes in Computer Science (LNCS), pages 126-138, 2007.
[ » bib | » pdf ]
When monitoring a system wrt. a property defined in some temporal logic, e. g., LTL, a major concern is to settle with an adequate interpretation of observable system events; that is, models of temporal logic formulae are usually infinite streams of events, whereas at runtime only prefixes are available. This work defines a four-valued semantics for LTL over finite traces, which extends the classical semantics, and allows to infer whether a system behaves (1) according to the monitored property, (2) violates the property, (3) will possibly violate the property in the future, or (4) will possibly conform to the property in the future, once the system has stabilised. Notably, (1) and (2) correspond to the classical semantics of LTL, whereas (3) and (4) are chosen whenever an observed system behaviour has not yet lead to a violation or acceptance of the monitored property.

Moreover, we present a monitor construction for RV-LTL properties in terms of a Moore machine signalising the semantics of the so far obtained execution trace.

[13] Somesh Jha, Stefan Katzenbeisser, Helmut Veith, and Stephen Chenny. Enforcing Semantic Integrity on Untrusted Clients in Networked Virtual Environments (Extended Abstract). In IEEE Security and Privacy (SP'07), pages 179-186, 2007.
[ » bib | » pdf ]
In the computer gaming industry, large-scale simulations of realistic physical environments over the Internet have at- tained increasing importance. Networked virtual environ- ments (NVES) are typically based on a client-server archi- tecture where part of the simulation workload is delegated to the clients. This architecture renders the simulation vul- nerable to attacks against the semantic integrity of the sim- ulation: malicious clients may attempt to compromise the physical and logical rules governing the simulation, or to alter the causality of events. This paper initiates the sys- tematic study of semantic integrity in NVES from a security point of view. We present a new provably secure semantic integrity protocol which enables the server system to audit the local computations of the clients on demand.

[12] Sagar Chaki, Christian Schallhart, and Helmut Veith. Verification Across Intellectual Property Boqundaries. In Computer Aided Verification (CAV'07), volume 4590 of Lecture Notes in Computer Science (LNCS), pages 82-94, 2007.
[ » bib | » pdf ]
In many industries, the share of software components provided by third-party suppliers is steadily increasing. As the suppliers seek to secure their intellectual property (IP) rights, the customer usually has no direct access to the suppliers’ source code, and is able to enforce the use of verification tools only by legal requirements. In turn, the supplier has no means to convince the customer about successful verification without revealing the source code. This paper presents a new approach to resolve the conflict between the IP interests of the supplier and the quality interests of the customer. We introduce a protocol in which a dedicated server (called the “amanat”) is controlled by both parties: the customer controls the verification task performed by the amanat, while the supplier controls the communication channels of the amanat to ensure that the amanat does not leak information about the source code. We argue that the protocol is both practically useful and mathematically sound. As the protocol is based on well-known (and relatively lightweight) cryptographic primitives, it allows a straightforward implementation on top of existing verification tool chains. To substantiate our security claims, we establish the correctness of the protocol by cryptographic reduction proofs.

[11] Andreas Bauer, Martin Leucker, and Christian Schallhart. Monitoring of Realtime Properties. In Foundations of Software Technology and Theoretical Computer Science (FSTTCS'06), volume 4337 of Lecture Notes in Computer Science (LNCS), pages 260-272, 2006.
[ » bib | » pdf ]
This paper presents a construction for runtime monitors that check real-time properties expressed in timed LTL (TLTL). Due to D’Souza’s results, TLTL can be considered a natural extension of LTL towards real-time. Moreover, a typical obstacle in runtime verification is solved both for untimed and timed formulae, in that standard models of linear temporal logic are infinite traces, whereas in runtime verification only finite system behaviours are at hand. Therefore, a 3- valued semantics (true, false, inconclusive) for LTL and TLTL on finite traces is defined that resembles the infinite trace semantics in a suitable and intuitive manner. Then, the paper describes how to construct, given a (T)LTL formula, a deterministic monitor with three output symbols that reads a finite trace and yields its according 3-valued (T)LTL semantics. Notably, the monitor rejects a trace as early as possible, in that any minimal bad prefix results in false as a return value.

[10] Andreas Bauer, Martin Leucker, and Christian Schallhart. Runtime Reflection: Dynamic model-based analyis of component-based distributed embedded systems. In Modellierung von Automotive Systems, 2006.
[ » bib | » pdf ]
Distributed embedded systems have pervaded the automotive domain, but often still lack measures to ensure adequate behaviour in the presence of unforeseen events, or even errors at runtime. As interactions and dependencies within distributed automotive systems increase, the problem of detecting failures which depend on the exact situation and environment conditions they occur in grows. As a result, not only the detection of failures is increasingly difficult, but also the differentiation between the symptoms of a fault, and the actual fault itself, i. e., the cause of a problem.

In this paper, we present a novel and efficient approach built around the notion of a software component similar to AUTOSAR, for dynamically analysing distributed embedded systems in the testing phase or even in standard operation, in that we provide a framework for detecting failures as well as identifying their causes. Our approach is based upon monitoring safety properties, specified in a language that allows to express dynamic system properties. For such specifications so-called monitor components are generated automatically to detect violations of software components. Based on the results of the monitors, a dedicated diagnosis is then performed in order to identify explanations for the misbehaviour of a system. These may be used to store detailed error logs, or to trigger recovery measures.

[9] Andreas Bauer, Martin Leucker, and Christian Schallhart. Model-Based Runtime Analysis of Distributed Reactive Systems. In Australian Software Engineering Conference (ASWEC'06), pages 243-252, 2006.
[ » bib | » pdf ]
Reactive distributed systems have pervaded everyday life and objects, but often lack measures to ensure adequate behaviour in the presence of unforeseen events or even errors at runtime. As interactions and dependencies within distributed systems increase, the problem of detecting failures which depend on the exact situation and environment conditions they occur in grows. As a result, not only the detection of failures is increasingly difficult, but also the differentiation between the symptoms of a fault, and the actual fault itself, i. e., the cause of a problem.

In this paper, we present a novel and efficient approach for analysing reactive distributed systems at runtime, in that we provide a framework for detecting failures as well as identifying their causes. Our approach is based upon monitoring safety-properties, specified in the linear time temporal logic LTL (respectively, TLTL) to automatically generate monitor components which detect violations of these properties. Based on the results of the monitors, a dedicated diagnosis is then performed in order to identify explanations for the misbehaviour of a system. These may be used to store detailed log files, or to trigger recovery measures. Our framework is built modular, layered, and uses merely a minimal communication overhead—especially when compared to other, similar approaches. Further, we sketch first experimental results from our implementations, and describe how it can be used to build a variety of distributed systems using our techniques.

[8] Christian Schallhart and Luca Trevisan. Approximating Succinct MaxSat. Journal of Logic and Computation (JLC), 15(4):551-557, 2005.
[ » bib | » pdf ]
We study the approximability of the version of MaxSat where exponentially large instances are succinctly represented using circuits. First, we prove that the NP-hardness for approximating MaxSat can be lifted to a corresponding NEXP-hardness for approximating circuit-succinct MaxSat for some constant performance ratio. Second, we consider the approximability of circuit-succinct MaxSat with respect to lower complexity classes: In particular, we prove that computing (2-ε)-approximate solutions for circuit-succinct MaxSat is at least as hard as inverting one-way permutations. On the other hand, a simple randomized approximation algorithm computes a (2+ε)-approximate solution with high probability. Recall that the standard (not succinctly represented) version of the MaxSat problem is approximable to within a .78 factor [?] and that the MaxThreeSat problem is approximable to within a 7/8 factor [?].

[7] Axel Belinfante, Lars Franzen, and Christian Schallhart. Model-Based Testing of Reactive Systems, volume 3472 of Lecture Notes in Computer Science (LNCS), chapter Tools for Test Case Generation, pages 391-438. Springer, 2005.
[ » bib | » pdf ]
[6] Johannes Kinder, Stefan Katzenbeisser, Christian Schallhart, and Helmut Veith. Detecting Malicious Code by Model Checking. In Conference on Detection of Intrusions and Malware and Vulnerability Assessment (DIMVA'05), volume 3548 of Lecture Notes in Computer Science (LNCS), pages 174-187, 2005.
[ » bib | » pdf ]
The ease of compiling malicious code from source code in higher programming languages has increased the volatility of malicious programs: The first appearance of a new worm in the wild is usually followed by modified versions in quick succession. As demonstrated by Christodorescu and Jha, however, classical detection software relies on static patterns, and is easily outsmarted. In this paper, we present a flexible method to detect malicious code patterns in executables by model checking. While model checking was originally developed to verify the correctness of systems against specifications, we argue that it lends itself equally well to the specification of malicious code patterns. To this end, we introduce the specification language CTPL (Computation Tree Predicate Logic) which extends the well-known logic CTL, and describe an efficient model checking algorithm. Our practical experiments demonstrate that we are able to detect a large number of worm variants with a single specification.

[5] Jana Dittmann, Stefan Katzenbeisser, Christian Schallhart, and Helmut Veith. Ensuring Media Integrity on Third-Party Infrastructures. In IFIP International Information Security Conference, pages 493-508, 2005.
[ » bib | » pdf ]
In many heterogeneous networked applications the integrity of multimedia data plays an essential role, but is not directly supported by the application. In this paper, we propose a method which enables an individual user to detect tampering with a multimedia file without changing the software application provided by the third party. Our method is based on a combination of cryptographic signatures and fragile watermarks, i.e., watermarks that are destroyed by illegitimate tampering. We show that the proposed system is provably secure under standard cryptographic assumptions.

[4] Stefan Katzenbeisser, Christian Schallhart, and Helmut Veith. Malware Engineering. In Sicherheit 2005, Sicherheit - Schutz und Zuverlässigkeit, 2. Jahrestagung, Fachbereich Sicherheit der Gesellschaft für Informatik, pages 139-148, 2005.
[ » bib | » pdf ]
Starting from simple hand-crafted viruses, today’s malware has evolved to constitute highly infectious computer diseases. The technical development of malware was mainly driven by the wish to improve and accelerate both attacks and proliferation. Although these programs have incurred significant hazard and financial losses, their mechanisms are relatively simple and are amenable to effective countermeasures— once, the first attack has been launched. From a software technology point of view, malicious software in fact is often very similar to network services with the main difference that security holes are exploited to enforce participation in the protocol.

In this position paper we outline the wide range of possible malware-specific engineering techniques which are not used in known viruses and worms, but are technically feasible and will therefore be realized in the foreseeable future—less likely by hackers than by organized illegal entities. The techniques we describe enable the malware to obfuscate its functionality, monitor and analyze its environment, and modify or extend itself in non-trivial ways. Consequently, future security policies and risk assessments have to account for these new classes of malware.

[3] Markus Holzer, Stefan Katzenbeisser, and Christian Schallhart. Towards a formal semantics for odrl. In First ODRL International Workshop, pages 137-148, 2004.
[ » bib | » pdf ]
We give a brief overview of a new way to model the semantics of ODRL permissions in a formal manner by using finite-automata like structures. The constructed automata capture the sequence of actions that a user is allowed to perform according to a specific permission. In contrast to previous approaches, our semantics is able to model sell and lend permissions.

[2] Christian Schallhart. Transaction Processing for Clustered Virtual Environments. In NATO Advanced Research Workshop on Concurrent Information Processing and Computing, pages 146-158, 2003. invited.
[ » bib | » pdf ]
This paper introduces Massively Multi-Player Online Role Games (MMORGS) which are currently amain focus of the gaming industry. MMORGS are Networked Virtual Environments (NVES) where a player can navigate his or her character through a large game-world which is populated by thousands of other players and non player characters. Based the problems arising in the context of MMORGS, we will motivate the development of a domain-independent middleware to support MMORG in particular and networked NVES in general. Finally, we will introduce APEIRON, a middleware which is based on a flexible transaction processing framework. APEIRON is designed to serve as basis for the next generation MMORG.

[1] Ulrich Neumerkel, Christoph Rettig, and Christian Schallart. Visualizing solutions with viewers. In Workshop on Logic Programming Environment (LPE'97), pages 43-50, 1997.
[ » bib ]

Copyright Notice. This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder. Also, some of these works have been submitted for publication. Copyright may be transferred without further notice and this version may no longer be accessible.

Acknowledgement. This page was generated with » bibtex2html 1.95.