Contact
For all enquiries regarding APLAS 2019, please contact the following address:
Email: aplas2019@cs.ui.ac.id
Abstract: Program calculation, a programming technique to derive efficient programs from naive ones by program transformation, is a challenging activity for program optimization. Tesson et al. have shown that Coq, a popular proof assistant, provides a cheap way to implement a powerful system for verifying the correctness of such transformations, but their applications are limited to the calculation of list functions in the Theory of Lists. In this paper, we prove more advanced calculation rules in Coq for various recursion schemes, which capture recursive programs on an arbitrary algebraic datatype. We construct formal Coq proofs of all the lemmas and theorems about recursion schemes, including the histomorphism and futumorphism proposed by Uustalu and Vene. A distinguishing feature of our proposal is that certified runnable programs can be obtained from definitions written with recursion schemes in Coq scripts. We have succeeded in obtaining a certified runnable program that computes the n-th Fibonacci number from its histomorphic definition.
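To give an intuition for that last example, here is a minimal Haskell sketch (hypothetical names; not the paper's Coq development) of Fibonacci written as a course-of-values recursion, i.e. a histomorphism over the naturals, where the step function may look arbitrarily far back into the history of previously computed results:

    module HistoFib where

    -- Hist carries the value at the current stage plus the history of all
    -- earlier stages.
    data Hist a = Hist { now :: a, earlier :: Maybe (Hist a) }

    -- histoNat f n folds over 0..n, giving f access to the full history
    -- of previously computed results.
    histoNat :: (Maybe (Hist a) -> a) -> Int -> a
    histoNat f = now . go
      where
        go 0 = Hist (f Nothing) Nothing
        go k = let h = go (k - 1) in Hist (f (Just h)) (Just h)

    -- Fibonacci as a histomorphism: look back one and two steps.
    fib :: Int -> Integer
    fib = histoNat step
      where
        step Nothing                           = 0       -- fib 0
        step (Just (Hist _ Nothing))           = 1       -- fib 1
        step (Just (Hist x (Just (Hist y _)))) = x + y   -- fib (n-1) + fib (n-2)

For instance, fib 10 evaluates to 55.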
Abstract: We present new proofs, formalized in the Coq proof assistant, of the correspondence between call-by-need and (various definitions of) call-by-name evaluation of the lambda-calculus with mutually recursive bindings. For non-strict languages, the equivalence between the high-level specification (call-by-name) and the actual implementation (call-by-need) is of foundational interest. A particular milestone is Launchbury’s natural semantics of call-by-need evaluation and his proof of its adequacy with respect to call-by-name denotational semantics, which were recently formalized in Isabelle/HOL by Breitner (2018). The equational theory of Ariola et al. is another well-known formalization of call-by-need. Mutual recursion is especially challenging for their theory: reduction is complicated by the traversal of dependencies (the “need” relation), and the correspondence between call-by-name and call-by-need reductions becomes non-trivial, requiring sophisticated structures such as graphs or infinite trees. In this paper, we give arguably simpler proofs based solely on (finite) terms and natural semantics, which are easier to handle in proof assistants (Coq in our case). Our proofs can be summarized as follows: (1) we prove the equivalence between Launchbury’s call-by-need semantics and a heap-based call-by-name natural semantics, where we define a sufficiently (but not too) general correspondence between the two heaps, and (2) we also show the correspondence among three styles of call-by-name semantics: (i) the natural semantics used in (1); (ii) a closure-based natural semantics that informally corresponds to Launchbury’s denotational semantics; and (iii) the conventional substitution-based semantics.
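For reference, the rule that distinguishes call-by-need in Launchbury-style natural semantics is the variable rule, which updates the heap with the computed value so that a binding is evaluated at most once (standard presentation, omitting Launchbury's renaming of bound variables; recalled here rather than taken from the paper):

                  Γ : e ⇓ Δ : v
    ------------------------------------------
     (Γ, x ↦ e) : x  ⇓  (Δ, x ↦ v) : v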
Abstract: Information-flow security type systems ensure confidentiality by enforcing noninterference: a program cannot leak private data to public channels. However, in practice, programs need to selectively declassify information about private data. Several approaches have provided a notion of relaxed noninterference, supporting selective and expressive declassification while retaining a formal security property. The labels-as-functions approach provides relaxed noninterference by means of declassification policies expressed as functions. The labels-as-types approach expresses declassification policies using type abstraction and faceted types, pairs of types representing the secret and public facets of values. The original proposal of labels-as-types is formulated in an object-oriented setting where type abstraction is realized by subtyping. The object-oriented approach, however, suffers from limitations due to its receiver-centric paradigm. In this work, we consider an alternative approach to labels-as-types that allows us to express more advanced declassification policies, such as extrinsic policies, based on a different form of type abstraction: existential types. An existential type exposes abstract types and operations on them; we leverage this abstraction mechanism to express secrets that can be declassified using the provided operations. We formalize the approach in a core calculus with existential types, define the corresponding notion of relaxed noninterference that accounts for abstract types, and prove that well-typed programs satisfy this form of relaxed noninterference.
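As a rough illustration of the idea (a minimal Haskell sketch with hypothetical names; not the paper's calculus), an existential type can package a secret together with the only operation through which it may be declassified:

    {-# LANGUAGE ExistentialQuantification #-}
    module Declassify where

    -- The concrete representation of the secret is hidden behind the
    -- existentially quantified type variable s; clients can observe it only
    -- through the packaged declassification operation.
    data Secret = forall s. Secret s (s -> String)

    -- Policy: a card number may be declassified only to its last four digits.
    mkCardNumber :: String -> Secret
    mkCardNumber n = Secret n (reverse . take 4 . reverse)

    -- A client can use the provided operation, but the raw representation
    -- cannot escape with its concrete type.
    showPublic :: Secret -> String
    showPublic (Secret s declassify) = "card ending in " ++ declassify s

A holder of a Secret can thus observe only what the packaged operation releases, which mirrors how abstract types confine declassification to the provided operations.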
Abstract: We study a dependently typed extension of a multi-stage programming language à la MetaOCaml, which supports quasi-quotation and cross-stage persistence for manipulating code fragments as first-class values, and eval for executing programs dynamically generated by such code manipulation. Dependent types are expected to bring to multi-stage programming the enforcement of strong invariants—beyond simple type safety—on the behavior of dynamically generated code. Such an extension is, however, not trivial, because the type system has to take the stages of types—roughly speaking, the number of surrounding quotations—into account. To rigorously study the properties of such an extension, we develop lambda^{MD}, an extension of Hanada and Igarashi’s typed calculus lambda^{▷%} with dependent types, and prove its properties, including preservation, confluence, strong normalization for full reduction, and progress for staged reduction. Motivated by code generators in which the type of the generated code depends on a value from outside the quotations, we argue for the significance of cross-stage persistence in dependently typed multi-stage programming and of a certain type equivalence that is not directly derived from the reduction rules.
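To illustrate what quasi-quotation and splicing mean concretely, here is the classic staged power function written in Template Haskell (a Haskell analogue used purely for illustration; the paper's setting is MetaOCaml-style and dependently typed): a quotation [| ... |] builds a code fragment as a first-class value, and a splice $( ... ) inserts generated code.

    {-# LANGUAGE TemplateHaskell #-}
    module StagedPower where

    import Language.Haskell.TH (Q, Exp)

    -- power n x generates code for x * x * ... * x (n factors).
    power :: Int -> Q Exp -> Q Exp
    power 0 _ = [| 1 |]
    power n x = [| $x * $(power (n - 1) x) |]

    -- In another module, $(power 3 [| 5 |]) elaborates to 5 * (5 * (5 * 1))
    -- and is computed at code-generation time.

The paper's concern is what happens when, on top of such staging constructs, the type of the generated code may depend on a value from outside the quotations.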
Abstract: Strings represent one of the most common and most intricate data-types found in software programs, with correct string processing often being a decisive factor for correctness and security properties. This has led to a wide range of recent research results on how to analyse programs operating on strings, using methods like testing, fuzzing, symbolic execution, abstract interpretation, or model checking; increasingly, support for strings is also being added to constraint solvers and SMT solvers. In this paper, we focus on the verification of software programs with strings using model checking. We give a survey of the existing approaches to handle strings in this context, and propose methods based on algebraic data-types, Craig interpolation, and automata learning.
Abstract: The MapReduce framework for data-parallel computation was first proposed by Google and later implemented in the Apache Hadoop project. Under the MapReduce framework, a reducer computes output values from a sequence of input values transmitted over the network. Due to the non-determinism in data transmission, the order in which input values arrive at the reducer is not fixed. The commutativity problem for reducers asks if the output of a reducer is independent of the order of its inputs. There are several advantages to a reducer being commutative, e.g., the verification problem for a MapReduce program can be reduced to the problem of verifying a sequential program. In this paper, we present a tool J-ReCoVer (Java Reducer Commutativity Verifier) that implements effective heuristics for reducer commutativity analysis. J-ReCoVer is the first tool that is specialized in checking reducer commutativity. Experimental results over 118 benchmark examples collected from open repositories are very positive; J-ReCoVer correctly handles over 97% of them.
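As a toy illustration (hypothetical reducers; not the J-ReCoVer tool itself), commutativity of a reducer amounts to order-independence of its output, which can be tested exhaustively on tiny inputs even though J-ReCoVer itself relies on static analysis and heuristics:

    import Data.List (permutations)

    -- A commutative reducer: summing the incoming values.
    sumReducer :: [Int] -> Int
    sumReducer = sum

    -- A non-commutative reducer: keeping whichever value arrives first.
    firstReducer :: [Int] -> Int
    firstReducer = head

    -- Naive check: does every arrival order give the same output?
    isCommutativeOn :: Eq b => ([Int] -> b) -> [Int] -> Bool
    isCommutativeOn reduce xs = all (== reduce xs) [ reduce p | p <- permutations xs ]

    main :: IO ()
    main = do
      print (isCommutativeOn sumReducer   [3, 1, 2])  -- True
      print (isCommutativeOn firstReducer [3, 1, 2])  -- False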
Abstract: Concurrent data structures implemented using software transactional memory (STM) perform poorly when operations that do not conflict in the definition of the abstract data type nonetheless incur conflicts in the concrete state of an implementation. Several works have addressed various aspects of this problem, yet we still lack efficient, general-purpose mechanisms that allow one to readily integrate black-box concurrent data structures into existing STM frameworks. In this paper we take a further step toward this goal, by focusing on the challenge of how to use black-box concurrent data structures in an optimistic transactional manner, while exploiting an off-the-shelf STM for transaction-level conflict detection. To this end, we introduce two new enabling concepts. First, we define data-structure conflict in terms of commutativity but, unlike prior work, we introduce a new format, conflict abstractions, that is kept separate from the object implementation and is suited to optimistic conflict detection. Second, we describe shadow speculation for wrapping off-the-shelf concurrent objects so that updates can be speculatively and opaquely applied—and even return values observed—but then later dropped (on abort) or else atomically applied (on commit). We have realized these concepts in a new open-source transactional system called ScalaProust, built on top of ScalaSTM, and report encouraging experimental results.
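As a toy illustration of a commutativity-based conflict abstraction (a hypothetical set example; not ScalaProust's actual interface), the conflict relation can be specified separately from the data structure: two operations conflict only when they touch the same key and at least one of them is a mutation.

    module ConflictSketch where

    -- Abstract operations of a concurrent set.
    data SetOp = Add Int | Remove Int | Contains Int
      deriving (Eq, Show)

    key :: SetOp -> Int
    key (Add k)      = k
    key (Remove k)   = k
    key (Contains k) = k

    isRead :: SetOp -> Bool
    isRead (Contains _) = True
    isRead _            = False

    -- A sound over-approximation of non-commutativity: operations on
    -- different keys always commute, and two reads always commute.
    conflicts :: SetOp -> SetOp -> Bool
    conflicts a b = key a == key b && not (isRead a && isRead b)

Note that the relation refers only to the abstract operations, never to the concrete state of the set.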
Abstract: Multitasking is a fundamental mechanism of the Android mobile operating system, which has substantially enhanced the user experience of the system. However, because of its complex nature, this mechanism is infamously hard to understand and is plagued by serious security concerns. In this paper we formalize the semantics of the multitasking mechanism and develop efficient static analysis methods with automated tool support. For the formalization, we propose an extension of the existing Android stack machine model to capture all the core elements of the mechanism, in particular the intent flags used in inter-component communication. We define the formal semantics in a succinct and structured way, with some underlying principles identified. Moreover, we pinpoint discrepancies in the semantics between different Android versions. We also validate the semantics by examining its conformance to the actual behavior of the Android system via exhaustive experiments. For the static analysis, we consider the configuration reachability and stack boundedness problems, designing new algorithms and developing a prototype tool, TaskDroid, to fully support automated model construction and analysis of Android apps. The experimental results show that TaskDroid is effective and efficient in analyzing Android apps in practice.
Abstract: Proof assistants, such as Isabelle/HOL, offer tools to facilitate inductive theorem proving. Isabelle experts know how to use these tools effectively; however, they have not had a systematic way to encode their expertise. To address this problem, we present our domain-specific language, LiFtEr. LiFtEr allows experienced Isabelle users to encode their induction heuristics in a style independent of any problem domain. LiFtEr’s interpreter mechanically checks whether a given application of the induction tool matches the heuristics, thus transferring the Isabelle experts’ expertise to new Isabelle users.
Abstract: Modern memory allocators have to balance many simultaneous demands, including performance, security, the presence of concurrency, and application-specific demands depending on the context of their use. One increasing use-case for allocators is as back-end implementations of languages, such as Swift and Python, that use reference counting to automatically deallocate objects. We present mimalloc, a memory allocator that effectively balances these demands, shows significant performance advantages over existing allocators, and is tailored to support languages that rely on the memory allocator as a backend for reference counting. Mimalloc combines several innovations to achieve this result. First, it uses three page-local sharded free lists to increase locality, avoid contention, and support a highly-tuned allocate and free fast path. These free lists also support temporal cadence, which allows the allocator to predictably leave the fast path for regular maintenance tasks such as supporting deferred freeing, handling frees from non-local threads, etc. While influenced by the allocation workload of the reference-counted Lean and Koka programming languages, we show that mimalloc has superior performance to modern commercial memory allocators, including tcmalloc and jemalloc, with speed improvements of 7% and 14%, respectively, on redis, and consistently outperforms them over a wide range of sequential and concurrent benchmarks. Allocators tailored to provide an efficient runtime for reference-counting languages reduce the implementation burden on developers and encourage the creation of innovative new language designs.
Abstract: Meta-interpreters in Prolog are a powerful and elegant way to implement language extensions and non-standard semantics. But how can we bring the benefits of Prolog-style meta-interpreters to systems that combine functional and logic programming? In Prolog, a program can access its own structure via reflection, and meta-interpreters are simple to implement because the “pure” core language is small—not so in larger systems that combine different paradigms. In this paper, we present a particular kind of functional logic meta-programming, based on embedding a small first-order logic system in an expressive host language. Embedded logic engines are not new, as exemplified by various systems including miniKanren in Scheme and LogicT in Haskell. However, previous embedded systems generally lack meta-programming capabilities in the sense of meta-interpretation. Instead of relying on reflection for meta-programming, we show how to adapt popular multi-stage programming techniques to a logic programming setting and use the embedded logic to generate reified first-order structures, which are again simple to interpret. Our system has an appealing power-to-weight ratio, based on the simple and general notion of dynamically scoped mutable variables. We also show how, in many cases, non-standard semantics can be realized without explicit reification and interpretation, but instead by customizing program execution through the host language. As a key example, we extend our system with a tabling/memoization facility. The need to interact with mutable variables renders this a highly nontrivial challenge, and the crucial insight is to extract symbolic representations of their side effects from memoized rules. We demonstrate that multiple independent semantic modifications can be combined successfully in our system, for example tabling and tracing.
Abstract: Networks today achieve robustness not by adhering to precise formal specifications but by building implementations that tolerate modest deviations from correct behavior. This philosophy can be seen in the slogan used by the Internet Engineering Task Force, “we believe in rough consensus and running code,” and in Jon Postel’s famous dictum to “be conservative in what you do, be liberal in what you accept from others.” But as networks have grown in scale and complexity, the frequency of faults has led to new interest in techniques for formally verifying network behavior. This talk will discuss recent progress on practical tools for specifying and verifying formal properties of networks. In the first part of the talk, I will present p4v, a tool for verifying the low-level code that executes on individual devices such as routers and firewalls. In the second part of the talk, I will present NetKAT, a formal system for specifying and verifying network-wide behavior. In the third part of the talk, I will highlight some challenges and opportunities for future research in network verification.
Abstract: Higher-order modal fixpoint logic (HFL) is a higher-order extension of the modal mu-calculus, and strictly more expressive than the modal mu-calculus. It has recently been shown that various program verification problems can naturally be reduced to HFL model checking: the problem of whether a given finite state system satisfies a given HFL formula. In this paper, we propose a novel algorithm for HFL model checking: it is the first practical algorithm in that it runs fast for typical inputs, despite the hyper-exponential worst-case complexity of the HFL model checking problem. Our algorithm is based on Kobayashi et al.’s type-based characterization of HFL model checking, and was inspired by a saturation-based algorithm for HORS model checking, another higher-order extension of model checking. We prove the correctness of the algorithm and report on an implementation and experimental results.
Abstract: We present a manifest contract system PCFv∆H with intersection types. A manifest contract system is a typed functional calculus in which software contracts are integrated into a refinement type system and the consistency of contracts is checked by a combination of compile-time and run-time type checking. Intersection types naturally arise when a contract is expressed as a conjunction of smaller contracts. Run-time contract checking for conjunctive higher-order contracts in an untyped language has been studied, but our typed setting poses an additional challenge: an expression of an intersection type τ1 ∧ τ2 may have to perform different run-time checks depending on whether it is used as τ1 or as τ2. We build PCFv∆H on top of the ∆-calculus, a Church-style intersection type system by Liquori and Stolze. In the ∆-calculus, a canonical expression of an intersection type is a strong pair, whose elements are the same expression except for type annotations. To address the challenge above, we relax strong pairs so that the expressions in a pair are the same except for type annotations and casts, which are a construct for run-time checking. We give a formal definition of PCFv∆H and show its basic properties as a manifest contract system: preservation, progress, and value inversion. Furthermore, we show that run-time checking does not affect essential computation.
Abstract: Widening ensures or accelerates convergence of an analysis, and sometimes contributes a guarantee of soundness that would otherwise be absent. In this paper we propose a generalized view of widening, in which widening operates on values that are not necessarily elements of the given abstract domain, although they must be in a correspondence, the details of which we spell out. We show that the new view generalizes the traditional view, and that at least three distinct advantages flow from the generalization. First, it gives a handle on “compositional safety”, the problem of creating widening operators for product domains. Second, it adds a degree of flexibility, allowing us to define variants of widening, such as delayed widening, without resorting to intrusive surgery on an underlying fixpoint engine. Third, it adds a degree of robustness, by making it difficult for an analysis implementor to make certain subtle (syntactic vs semantic) category mistakes. The paper supports these claims with examples. Our proposal has been implemented in a state-of-the-art abstract interpreter, and we briefly report on the changes that the revised view necessitated.
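For reference, the traditional requirements on a widening operator ∇ over an abstract domain (A, ⊑), which the generalized view relaxes by letting widening operate on values outside the domain itself, are (standard definition, recalled here rather than quoted from the paper):

    (1)  for all a, b in A:  a ⊑ a ∇ b  and  b ⊑ a ∇ b;
    (2)  for every increasing chain a_0 ⊑ a_1 ⊑ a_2 ⊑ ..., the chain defined by
         b_0 = a_0 and b_{i+1} = b_i ∇ a_{i+1} is eventually stable.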
Abstract: Static analysis tools help to detect programming errors but generate a large number of alarms. Repositioning of alarms is a recently proposed technique to reduce the number of alarms by replacing a group of similar alarms with a small number of newly created representative alarms. However, the technique fails to replace a group of similar alarms with fewer representative alarms mainly when the immediately enclosing conditional statements of the alarms are different and not nested. This limitation is due to the conservative assumption that a conditional statement of an alarm may prevent the alarm from being an error. To address this limitation, we introduce the notion of non-impacting control dependencies (NCDs). An NCD of an alarm is a transitive control dependency of the alarm’s program point that does not affect whether the alarm is an error. We approximate the computation of NCDs based on the alarms that are similar, and then reposition the similar alarms by taking the effect of their NCDs into account. The NCD-based repositioning allows more similar alarms to be merged and represented by a smaller number of representative alarms than the state-of-the-art repositioning technique. Thus, it can be expected to further reduce the number of alarms. To measure the reduction obtained, we evaluate the NCD-based repositioning using a total of 105,546 alarms generated on 16 open source C applications, 11 industry C applications, and 5 industry COBOL applications. The evaluation results indicate that, compared to the state-of-the-art repositioning technique, the NCD-based repositioning reduces the number of alarms by up to 23.57%, 29.77%, and 36.09%, respectively; the median reductions are 9.02%, 17.18%, and 28.61%, respectively.
Abstract: Lambda-calculi come with no fixed evaluation strategy. Different strategies may then be considered, and it is important that they satisfy some abstract rewriting property, such as a factorization or normalization theorem. In this paper we provide simple proof techniques for these theorems. Our starting point is a revisitation of Takahashi’s technique to prove factorization for head reduction. Our technique is both simpler and more powerful, as it works in cases where Takahashi’s does not. We then pair factorization with two other abstract properties, defining essential systems, and show that normalization follows. Concretely, we apply the technique to four case studies: two classic ones, head and leftmost-outermost reduction, and two less classic ones, non-deterministic weak call-by-value and least-level reduction.
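For orientation, the classical factorization theorem for head reduction (the standard statement, recalled here rather than quoted from the paper) says that any reduction sequence can be rearranged so that head steps come first:

    if  t →β* s   then   t →head* u →internal* s   for some term u,

where →head performs head steps and →internal performs the remaining (non-head) β-steps.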
Abstract: Many systems use ad hoc collections of files and directories to store persistent data. For most consumers of this data, the process of properly parsing, using, and updating these filestores using standard APIs is cumbersome and error-prone. Making matters worse, many filestores are too big to fit in memory, so applications must process the data incrementally. They must also correctly manage concurrent accesses by multiple users. This paper presents the design of Transactional Forest (TxForest), which builds on earlier work on Forest and Incremental Forest to provide a simpler, more powerful API for managing filestores, together with a mechanism for managing concurrent accesses using serializable transactions. Under the hood, TxForest uses Huet’s zippers to track the data associated with filestores and implements an optimistic concurrency control mechanism. We formalize TxForest in a core calculus, develop a formal proof of serializability, describe our OCaml prototype, and present several realistic applications, which serve as case studies.
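To make the zipper idea concrete, here is a minimal Haskell sketch (hypothetical types; not TxForest's OCaml implementation) of a Huet-style zipper over a filestore tree: a focused subtree paired with enough context to rebuild the whole tree, so a cursor can move down into a directory and back up.

    module FsZipper where

    data FS = File String String            -- name, contents
            | Dir  String [FS]              -- name, entries
      deriving Show

    -- The context records everything surrounding the focused subtree.
    data Ctx = Top
             | InDir String [FS] Ctx [FS]   -- dir name, left siblings, parent ctx, right siblings
      deriving Show

    type Zipper = (FS, Ctx)

    -- Focus on the i-th entry of the current directory.
    down :: Int -> Zipper -> Maybe Zipper
    down i (Dir name entries, ctx)
      | i >= 0 && i < length entries =
          let (before, focus : after) = splitAt i entries
          in Just (focus, InDir name before ctx after)
    down _ _ = Nothing

    -- Rebuild the parent directory from the focus and its context.
    up :: Zipper -> Maybe Zipper
    up (t, InDir name before ctx after) = Just (Dir name (before ++ [t] ++ after), ctx)
    up _                                = Nothing

TxForest additionally layers optimistic concurrency control on top of such cursors, as described above.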
Abstract: Recently, a proper bisimulation equivalence relation for randomized process models was defined in a model-independent way. Model independence clarifies the difference between nondeterministic and probabilistic actions in concurrency and makes the new equivalence relation a congruence. In this paper, we focus on the finite-state randomized CCS model and deepen the previous work in two respects. First, we show that the equivalence relation can be checked in polynomial time. Second, we give a sound and complete axiomatization system for this model. Both the algorithm and the axiomatization system retain the merit of model independence, as they can easily be generalized to the randomized extension of any finite-state concurrent model.
Abstract: Separation logic is successful for software verification in both theory and practice. A decision procedure for symbolic heaps is one of the key issues. This paper proposes a cyclic proof system for symbolic heaps with a general form of inductive definitions called cone inductive definitions, and shows its soundness and completeness. Cone inductive definitions are obtained from bounded-treewidth inductive definitions by imposing some restrictions on existentials, but they still include a wide class of recursive data structures. Completeness is proved by using a proof-search algorithm, which also gives us a decision procedure for entailments of symbolic heaps with cone inductive definitions. The time complexity of the algorithm is nondeterministic doubly exponential. A prototype system for the algorithm has been implemented, and experimental results are also presented.
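As a reminder of what an inductive predicate in a symbolic heap looks like, here is the standard list-segment definition (shown only for orientation; cone inductive definitions are a more general class with restrictions on existentials):

    ls(x, y)  :=  (x = y ∧ emp)  ∨  (∃z. x ↦ z ∗ ls(z, y))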
Abstract: Analyzing and verifying heap-manipulating programs automatically is challenging. A key to fighting this complexity is to develop compositional methods. For instance, existing verifiers for heap-manipulating programs require a user-provided specification for each function in the program in order to decompose the verification problem. This requirement, however, often hinders users from applying such tools. To overcome the issue, we propose to automatically learn heap-related program invariants in a property-guided way for each function call. The invariants are learned based on the memory graphs observed during test execution and improved through memory graph mutation. We implemented a prototype of our approach and integrated it with two existing program verifiers. The experimental results show that our approach effectively enhances existing verifiers in automatically verifying complex heap-manipulating programs with multiple function calls.
Abstract: We present the first machine-checked formalization of Jaffe’s and Ehrenfeucht, Parikh, and Rozenberg’s (EPR) pumping lemmas in the Coq proof assistant. We formulate regularity in terms of finite derivatives, and prove that both Jaffe’s pumping property and EPR’s block pumping property precisely characterize regularity. We illuminate EPR’s classical proof that the block cancellation property implies regularity, and discover that—as best we can tell—their proof relies on the Axiom of Choice. We provide a new proof which eliminates the use of Choice. We explicitly construct a function which computes block cancelable languages from well-formed short languages.
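Roughly, the block pumping property mentioned above says the following (an informal restatement for orientation, not the paper's Coq formulation): a language L is block pumpable if there is a constant k such that for every word w and every splitting of w into blocks w = u x1 x2 ... xk v, some contiguous group of blocks can be pumped without affecting membership in L:

    ∃ 1 ≤ i < j ≤ k.  ∀ n ≥ 0.
        u x1 ... xi (x(i+1) ... xj)^n x(j+1) ... xk v ∈ L   ⟺   w ∈ L

Block cancellation is the special case n = 0, i.e. the group of blocks may be deleted.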
Abstract: Complementation of Büchi automata is an essential technique used in a number of approaches for termination analysis of programs. The long search for an optimal complementation construction culminated in the work of Schewe, who proposed a worst-case optimal rank-based procedure that generates complements of a size matching the theoretical lower bound of (0.76n)^n, modulo a polynomial factor of O(n^2). Although worst-case optimal, the procedure in many cases produces automata that are unnecessarily large. In this paper, we propose several novel ways to use the direct and delayed simulation relations to reduce the size of the automaton obtained by the rank-based complementation procedure. Our techniques are based on either (i) ignoring macrostates that cannot be used for accepting a word in the complement or (ii) saturating macrostates with simulation-smaller states, in order to decrease their total number. We show experimentally that our techniques can indeed considerably decrease the size of the output of the complementation.
Abstract: We propose an efficient algorithm for determinising counting automata (CAs), i.e., finite automata extended with bounded counters. The algorithm avoids unfolding counters into control states, unlike the naive approach, and thus produces much smaller deterministic automata. We also develop a simplified and faster version of the general algorithm for the sub-class of so-called monadic CAs (MCAs), i.e., CAs with counting loops on character classes, which are common in practice. Our main motivation is (besides applications in verification and decision procedures of logics) the application of deterministic (M)CAs in pattern matching with regular expressions with counting, which are very common in, e.g., network traffic processing and log analysis. We have evaluated our algorithm against practical benchmarks from these application domains and concluded that, compared to the naive approach, our algorithm is much less prone to explode, produces automata that can be several orders of magnitude smaller, and is overall faster.
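As a small illustration of the payoff (hypothetical encoding; not the paper's construction), a deterministic matcher for the counting pattern a{2,4} can keep one bounded counter instead of being unfolded into five separate control states:

    module CountMatch where

    -- Deterministic matching of the regex a{2,4} with a bounded counter.
    matchA2to4 :: String -> Bool
    matchA2to4 = go 0
      where
        go :: Int -> String -> Bool
        go n []              = n >= 2 && n <= 4   -- accept after 2..4 repetitions
        go n ('a' : rest)
          | n < 4            = go (n + 1) rest    -- bump the counter, capped at 4
        go _ _               = False              -- anything else rejects

For a bound such as a{1,1000}, the gap between the counter-based representation and its unfolding is what makes the determinized automata much smaller in practice.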
Abstract: We extend recent work in Quantitative Information Flow (QIF) to provide tools for the analysis of programs that aim to implement differentially private mechanisms. We demonstrate how differential privacy can be expressed using loss functions, and how to use this idea in conjunction with a QIF-enabled program semantics to verify differential privacy guarantees. Finally, we describe how to use this approach experimentally using Kuifje, a recently developed tool for analysing information-flow properties of programs.
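For reference, the property targeted by such an analysis is the standard notion of ε-differential privacy (recalled here; the paper recasts it via loss functions within QIF): a randomized mechanism M is ε-differentially private if, for all adjacent datasets D and D' (differing in a single record) and all sets S of outputs,

    Pr[M(D) ∈ S]  ≤  e^ε · Pr[M(D') ∈ S].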